VMware Cloud Community
Marcel1967
Enthusiast

no vMotion resiliency on vSphere 5.5

I have a problem creating resiliency for vMotion traffic on vSphere 5.5 installed on Hitachi blades.

The goal is: when one of the blade switches is down for maintenance or broken, vMotion should remain operational.

Multi-NIC vMotion has been configured, and both adapters on the source and target server carry vMotion traffic.

Yet when I disable either of the vMotion NICs in ESXi, or shut down the switch port to which one of the two vMotion NICs is connected, vMotion no longer works. So the other adapter is not used for resiliency.

Configuration:

Multiple-NIC vMotion is configured as described in the VMware KB article here:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200746...

In my case:

Portgroup vMotion-1:

- vmkernel IP address 10.20.120.51 /24
- vmnic6 is the active adapter, connected to switch0 (a switch installed in the Hitachi blade enclosure)
- vmnic1 is the standby adapter
- VLAN ID is set to 1941

Portgroup vMotion-2:

- vmkernel IP address 10.20.120.52 /24
- vmnic1 is the active adapter, connected to switch1
- vmnic6 is the standby adapter
- VLAN ID is set to 1941
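For reference, the equivalent setup from the command line would look roughly like this. This is only a sketch: the portgroup names, uplinks, VLAN and IP addresses are the ones listed above, but the vmk1/vmk2 interface names are assumptions, and the actual configuration was done in the vSphere Client per the KB article (vMotion is enabled on both vmkernel ports there).

# portgroup vMotion-1: vmnic6 active, vmnic1 standby, VLAN 1941
esxcli network vswitch standard portgroup policy failover set --portgroup-name=vMotion-1 --active-uplinks=vmnic6 --standby-uplinks=vmnic1
esxcli network vswitch standard portgroup set --portgroup-name=vMotion-1 --vlan-id=1941
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=vMotion-1
esxcli network ip interface ipv4 set --interface-name=vmk1 --type=static --ipv4=10.20.120.51 --netmask=255.255.255.0

# portgroup vMotion-2: vmnic1 active, vmnic6 standby, VLAN 1941
esxcli network vswitch standard portgroup policy failover set --portgroup-name=vMotion-2 --active-uplinks=vmnic1 --standby-uplinks=vmnic6
esxcli network vswitch standard portgroup set --portgroup-name=vMotion-2 --vlan-id=1941
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=vMotion-2
esxcli network ip interface ipv4 set --interface-name=vmk2 --type=static --ipv4=10.20.120.52 --netmask=255.255.255.0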

When performing a vMotion, both vmnic6 and vmnic1 are used. This is confirmed by the performance graphs in vCenter; vMotion with two NICs is 30% faster than with a single NIC.

When vmnic6 is put into the 'down' state in ESXi, vMotion is not possible. The process hangs at 14%, and the error says 10.20.120.51 cannot be reached.

Initially I tried another configuration:

A single portgroup named vMotion, with vmnic6 set as the active adapter and vmnic1 as the standby adapter. When vmnic6 was disabled, vMotion no longer worked, so there was no resiliency there either.

Any help is very much appreciated!

4 Replies
brunofernandez1

Do you remove the vmnic while vMotion is running?

Marcel1967
Enthusiast

No. The procedure was:

1. Perform a vMotion over multiple NICs. This worked perfectly: the performance graphs in vCenter show data transfer on all NICs, and the vMotion was successful.

2. Disable a NIC in the source ESXi host using esxcli network nic down -n vmnicX

3. Start vMotion

4. It hangs at 14%. Error: 'The ESX hosts failed to connect over the vMotion network'

The same happens when the switch port is disabled/shut down; this is done before the vMotion starts.
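As commands on the source host, the test looked roughly like this (a sketch; the NIC name is the one from my configuration):

# confirm both vMotion uplinks and vmkernel ports are up
esxcli network nic list
esxcli network ip interface list

# take one vMotion uplink down, then start the vMotion from vCenter
esxcli network nic down -n vmnic6

# bring it back up afterwards
esxcli network nic up -n vmnic6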

brunofernandez1

Alright, now I understand.

What does your vSwitch0 NIC teaming configuration look like? Can you take a screenshot of it?

Marcel1967
Enthusiast

Problem solved.

I added the VLAN ID used by vMotion to the uplink switch ports of both blade enclosure switches. The VLAN ID was also added to the core switches.

After that, failover works.
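On the uplink trunk ports the change amounts to allowing VLAN 1941 on the trunk. On a Cisco-style switch that would look something like the lines below; this is only an illustration, the interface name is hypothetical, and the actual blade enclosure and core switches may use a different syntax.

interface TenGigabitEthernet1/1
 switchport mode trunk
 switchport trunk allowed vlan add 1941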

I am not sure why this needed to be done.

Before the VLAN ID was added to the uplink trunk (blade switch -> core switch):

- ping from vmnic6/host1 to vmnic6/host2 was possible
- ping from vmnic1/host1 to vmnic1/host2 was possible
- after disabling vmnic6, ping from vmnic1/host1 to vmnic1/host2 was still possible
- however, vMotion failed at 14%
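Those reachability tests were done per vmkernel interface, roughly like this (a sketch; vmk1/vmk2 are assumed names for the vMotion-1/vMotion-2 vmkernel ports, and the targets are the other host's vMotion addresses):

# from host1, ping host2's vMotion addresses out of the matching local vmkernel port
vmkping -I vmk1 <host2 vMotion-1 IP>
vmkping -I vmk2 <host2 vMotion-2 IP>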