Hi,
I have a LeftHand cluster with 2 storage nodes and 1 Failover Manager. Several virtual machines are running on the VSAN. When I disconnect a storage node to run a failover test, the virtual machines stay powered on and the guests stay pingable. The problem is that the guests stop all other operations. The storage LUNs on the ESX servers all keep the status OK. I can click Browse Storage and see the root folder, but when I click on the folder nothing happens.
When I connect to the SAN using CMC, the SAN reports that it is degraded, but it is up and running.
When I reconnect the storage node, the guests continue their work as if nothing had happened.
I hope somebody can help me.
Thanks in advance,
madmax
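The CMC reporting "degraded but up" fits how quorum would be expected to behave in this layout. A minimal sketch, assuming the management group needs a strict majority of managers to stay online and that each of the 2 storage nodes runs one manager alongside the Failover Manager (the function name `has_quorum` is made up for illustration):

```python
def has_quorum(total_managers: int, reachable_managers: int) -> bool:
    """Return True if a strict majority of managers is still reachable."""
    return reachable_managers > total_managers // 2


# Assumption: 2 storage nodes with one manager each, plus 1 Failover Manager.
TOTAL_MANAGERS = 3

# One storage node disconnected: the other node's manager and the
# Failover Manager are still reachable.
print(has_quorum(TOTAL_MANAGERS, 2))

# Only a single manager left: no majority.
print(has_quorum(TOTAL_MANAGERS, 1))
```

If that assumption holds, the SAN side survives the loss of one node, which would point toward path handling on the ESX hosts rather than the cluster itself.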
Are you running your volumes in 2-way replication between your pair of nodes?
We have the exact same setup at my company. We are not in production yet and are performing the same type of tests. We unplugged the SAN connection to one ESX box and got the same result: the host froze until we plugged the cable back in. We fixed the issue by properly enabling iSCSI multipathing using the vSphere RCLI. Also make sure to enable round robin on the iSCSI paths of each datastore.
http://goingvirtual.wordpress.com/2009/07/17/vsphere-4-0-with-software-iscsi-and-2-paths/ (this covers the concepts and config)
http://goingvirtual.wordpress.com/2009/12/01/vsphere-4-0-update-1-with-software-iscsi-and-2-paths-on... (this shows how to do config using vCenter RCLI instead of ESX CLI)
http://virtualgeek.typepad.com/virtual_geek/2009/09/a-multivendor-post-on-using-iscsi-with-vmware-vs... (this one is just good reading)
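The two steps mentioned above (binding VMkernel ports to the software iSCSI adapter, then switching each device to round robin) can be sketched as a small command generator. This is only a sketch, assuming the vSphere 4.x `esxcli swiscsi` and `esxcli nmp` namespaces described in the linked posts. The helper names and the values `vmk1`, `vmk2`, `vmhba33`, `naa.EXAMPLE` and `esx01.example.local` are placeholders, not values from this thread. It only builds the command strings and does not run anything.

```python
from typing import List, Optional


def esxcli(args: List[str], server: Optional[str] = None) -> str:
    """Build one esxcli command line; add --server when run via the RCLI."""
    prefix = ["esxcli"]
    if server:
        prefix += ["--server", server]
    return " ".join(prefix + args)


def bind_vmknics(vmknics: List[str], hba: str,
                 server: Optional[str] = None) -> List[str]:
    """One 'swiscsi nic add' command per VMkernel port to bind to the HBA."""
    return [esxcli(["swiscsi", "nic", "add", "-n", nic, "-d", hba], server)
            for nic in vmknics]


def set_round_robin(devices: List[str],
                    server: Optional[str] = None) -> List[str]:
    """One 'nmp device setpolicy' command per device, selecting VMW_PSP_RR."""
    return [esxcli(["nmp", "device", "setpolicy",
                    "--device", dev, "--psp", "VMW_PSP_RR"], server)
            for dev in devices]


# Placeholder names only; substitute the real vmknics, HBA and device IDs.
commands = (bind_vmknics(["vmk1", "vmk2"], "vmhba33", "esx01.example.local")
            + set_round_robin(["naa.EXAMPLE"], "esx01.example.local"))
for line in commands:
    print(line)
```

Review the printed commands against the linked posts before running them on a host, since option names can differ between vSphere releases.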
Hi.
Yes, we use 2-way replication between our nodes.
I discovered the following problem. I configured round robin for my storage paths in vCenter Server. Everything seems to be OK until I restart my ESX servers. After the restart, the storage path configuration is reset to its defaults and only one storage path is marked for I/O. The autostart properties for virtual machines seem to have the same problem: they are also deactivated again after a restart of the ESX servers.
I think this behavior has occurred since I started using vCenter Server. The vSphere version is 4.1.
madmax
We are still on vSphere 4.0 and see the same problems with the HP VSA P4000, MPIO, and round robin. If round robin would just stay set after a reboot, I think it would be OK.
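One commonly suggested workaround for a per-device policy that does not stick, not confirmed anywhere in this thread, is to make round robin the default path selection policy for the whole storage array type instead of setting it per device. A minimal sketch, assuming the vSphere 4.x `esxcli nmp satp setdefaultpsp` command and assuming the volumes are claimed by `VMW_SATP_DEFAULT_AA` (check the "Storage Array Type" reported by `esxcli nmp device list` first). The helper name is hypothetical and the code only builds the command string:

```python
def default_psp_command(satp: str, psp: str = "VMW_PSP_RR") -> str:
    """Build the command that sets the default PSP for one SATP."""
    return " ".join(["esxcli", "nmp", "satp", "setdefaultpsp",
                     "--satp", satp, "--psp", psp])


# Assumption: the SATP name below must match what the host actually reports.
print(default_psp_command("VMW_SATP_DEFAULT_AA"))
```

Note that a default set this way applies to every device claimed by that SATP, so it is worth checking what else on the host uses it.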
This document was generated from the following thread: Lefthand P4000 failover problem