We have set up a Dell EqualLogic SAN connected to 2 Dell
switches, and from these 2 switches to 2 ESX servers with NIC teaming. However,
when we tested high availability by powering down one of the Dell switches during
a virtual machine clone, the cloning process paused immediately and did not
fail over to the other live connection. After a while, the cloning simply failed.
Is there anything we missed in the configuration that kept it from working as expected?
How have you configured your path selection for the storage?
Frank
No,
that's not what I wanted to know.
Check your storage path selection in the storage configuration tab.
Frank
Of the two NICs you are using for iSCSI, is one located on switch 1 and the other on switch 2?
And is your storage connected to both of the switches?
Like an X?
Frank
Do you know why it's showing both NICs as Active?
Yes. Each ESX host has 2 NICs connected to 2 switches, which in turn connect to 4 NICs on the SAN, with 1 pair in an X (cross) link.
I was told that instead of setting one as standby, I should set both as active for load balancing to double the bandwidth. So if 1 link goes down, the other link will not take over unless it's set as standby?
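To make the active/standby question concrete, here is a minimal sketch of how a teaming failover order is usually evaluated. It is a simplified model, not VMware's implementation: the function `pick_uplink` and the NIC names `vmnic1` and `vmnic2` are hypothetical, and it ignores link-state detection delays and iSCSI port binding.

```python
from typing import Optional, Sequence, Set


def pick_uplink(active: Sequence[str],
                standby: Sequence[str],
                link_up: Set[str]) -> Optional[str]:
    """Return the uplink that would carry traffic, or None if none is usable.

    Active adapters are tried first, in order; standby adapters are used
    only when no active adapter has link. Unused adapters are never
    considered, so they are simply not passed in.
    """
    for nic in active:
        if nic in link_up:
            return nic
    for nic in standby:
        if nic in link_up:
            return nic
    return None


# Switch 1 is powered off, so only vmnic2 still has link.
surviving = {"vmnic2"}

print(pick_uplink(["vmnic1", "vmnic2"], [], surviving))  # both active
print(pick_uplink(["vmnic1"], ["vmnic2"], surviving))    # active + standby
print(pick_uplink(["vmnic1"], [], surviving))            # vmnic2 unused
```

In this model, two active adapters still fail over to each other; traffic is lost only when the surviving NIC is neither active nor standby for that port.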
I am not 100% sure, but I think you have to create one iSCSI VMkernel port for each adapter and bind each port to one of your physical NICs.
So in fact you have two VMkernel ports with one adapter each, each on a different switch.
I do not use iSCSI, but read this manual:
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf
There is a description of how to configure multipathing for iSCSI.
Frank
I think if you are using the NICs in teaming, then you need to set one NIC to standby mode.
Jay
MCSE,VCP 310,VCP 410
Look at the attachment. You want to make sure each path goes out through only one NIC, but have 2 NICs set up for redundancy. If you have 2 connections, use one NIC per path. I use 4 connections to make use of all 4 gigabit paths on the EqualLogic 6000xv.
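The "one NIC per path" rule can be written down as a small check. This is only a sketch under stated assumptions: the `VmkPort` type, the `binding_problems` function, and the port and NIC names are hypothetical, and the rule that the remaining adapters should be Unused rather than Standby is how I read the VMware iSCSI SAN configuration guide for vSphere 4, not something confirmed in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VmkPort:
    """Failover order of one iSCSI VMkernel port (hypothetical model)."""
    name: str
    active: List[str]
    standby: List[str] = field(default_factory=list)
    unused: List[str] = field(default_factory=list)


def binding_problems(ports: List[VmkPort]) -> List[str]:
    """List every way the ports break the one-NIC-per-path rule."""
    problems: List[str] = []
    owner: Dict[str, str] = {}  # physical NIC -> port that has it active
    for port in ports:
        if len(port.active) != 1:
            problems.append(
                f"{port.name}: expected exactly 1 active NIC, "
                f"found {len(port.active)}")
        # Assumption: for bound iSCSI ports the other NICs should be unused.
        if port.standby:
            problems.append(
                f"{port.name}: standby NICs {port.standby} should be unused")
        for nic in port.active:
            if nic in owner:
                problems.append(
                    f"{port.name}: {nic} is already active for {owner[nic]}")
            else:
                owner[nic] = port.name
    return problems


ports = [
    VmkPort("vmk1", active=["vmnic1"], unused=["vmnic2"]),
    VmkPort("vmk2", active=["vmnic2"], unused=["vmnic1"]),
]
print(binding_problems(ports))  # an empty list means the layout passes
```

With this layout, redundancy comes from having two independent paths that the storage multipathing layer can switch between, rather than from NIC teaming on a single port.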
This video will show you exactly how and why.
Hi,
Sorry for the late reply; I needed to go onsite to test the failover. One of the ESX servers was set up according to Dell's recommendation, the same as what ndd822 and esexon provided. After going through the setup again, I realized that I had misconfigured some of the IP addresses. After reconfiguring iSCSI, I am now able to see 12 paths. However, the other ESX server is showing 13 paths and I don't know where this additional path comes from.
I proceeded to test the failover by cloning a VM from localstore to iSCSI storage and unplugging one of the 2 connections. There was a pause of about 1 minute before the cloning proceeded again. Then I plugged the connection back in and, after a while, unplugged the other connection; there was another pause of about 1 minute before the cloning proceeded again. I did the same on the other ESX server, and this time, when I plugged the first connection back in and unplugged the other, it paused for over 10 minutes and failed.
When I checked the NIC setup, both ESX servers were now showing nic2.png instead of nic1.png (see attached). Why is this so? And how do I restore the original setup without having to remove them and set them up all over again in the ESX console?
By the way, the failover works again when I move the NIC from unused adapters to standby adapters. However, I suspect that the bandwidth is halved, since only 1 adapter is now active instead of both.
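One way to reason about the 12 versus 13 paths is simple arithmetic. The sketch below assumes that each bound VMkernel port opens one session to each volume, so the path count should be a multiple of the number of bound ports. The count of 2 bound ports is an assumption taken from the 2 connections mentioned above, the number of volumes is not stated anywhere in this thread, and both function names are hypothetical.

```python
def paths_per_port(observed_paths: int, bound_ports: int) -> int:
    """Whole number of paths each bound port would account for."""
    return observed_paths // bound_ports


def leftover_paths(observed_paths: int, bound_ports: int) -> int:
    """Paths that cannot be split evenly across the bound ports."""
    return observed_paths % bound_ports


BOUND_PORTS = 2  # assumption: one VMkernel port per physical connection

for host, observed in (("first host", 12), ("second host", 13)):
    print(host,
          "paths per port:", paths_per_port(observed, BOUND_PORTS),
          "leftover:", leftover_paths(observed, BOUND_PORTS))
```

A non-zero leftover suggests a path that does not belong to the bound ports, for example a stale session or a local device, which is worth checking before trusting the failover test on that host.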
A senior VMware support engineer replied:
Please
make sure that you don't use multiple VMkernel NICs on a single vSwitch with
multiple uplinks. This configuration (though suggested in Dell's best
practices guide) is known to cause failures on the ESX server.
You will
need to install at least the patch described in KB http://kb.vmware.com/kb/1019492A
to get the solution working correctly.