VMware

This Question is Not Answered

4 Replies Last post: Nov 27, 2006 12:47 PM by mcallistera  

Vmotion, VM loses network route posted: Aug 28, 2006 1:16 PM

mcallistera (Novice, 22 posts since Jan 27, 2006)
We've got 4 ESX 3.0 boxes in two different datacenters. We can vmotion a VM from either of our two ESX boxes in our primary datacenter to either ESX box in the standby, and we only lose a packet or two. However, when we vmotion the machine back, we lose connectivity for 5 minutes (almost exactly 5 minutes each time). Obviously, we span our network between the two datacenters.

The network engineer can see the MAC address move from interface to interface between the two buildings in the same manner regardless of the direction the VM is moving. According to him, all the switch/router interfaces are updated and the "network" knows that the MAC address has moved properly. Yet, for some reason, the VM cannot send or receive data outside its own VLAN for 5 minutes after moving back to the primary datacenter.

We ran tcpdump inside the VM and can see that it is still receiving IP and ARP broadcast traffic (that would be local VLAN traffic), but established TCP and ICMP traffic outside the VLAN is gone for 5 minutes.

Here's another REALLY ODD thing: when the network engineer puts a sniffer on the ESX port with port mirroring, the vmotion works perfectly, with no outage. If the port mirror is off (no sniffer) when the VM is moving, we lose connectivity. If we turn the mirror/sniffer on during the connectivity loss, all connectivity is immediately restored.

The network is Cisco. We're doing 802.1q VLAN tagging and 802.3ad teaming, with ESX configured to use IP-hash.

At this point I'm fairly certain this is a network issue, but it still doesn't make sense to us. Anyone out there seen anything remotely like this or have advice?

Update: we physically unplugged one of the teamed ports on each node in the two datacenters to rule out adapter-teaming IP-hash/MAC-hash issues. That didn't help. Traffic was OK prior to vmotion and died for 5 minutes after.

It would appear that this is a problem in the Cisco switch/routers. Anyone out there seen this with Cisco gear?

Message was edited by:
mcallistera
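To pin down the "almost exactly 5 minutes" figure, it can help to log timestamped probe results from a host outside the VM's VLAN and compute the outage length from the log. The following is a minimal sketch of that calculation only; the function name `outage_windows` and the sample data are hypothetical, and how the probes are collected (ping, TCP connect, etc.) is left to whatever tool is already in use.

```python
from typing import Iterable, List, Tuple


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (start, end) pairs for each run of failed probes.

    Each sample is (timestamp_in_seconds, probe_succeeded).
    start is the timestamp of the first failed probe in a run;
    end is the timestamp of the first successful probe after it,
    or the last sample's timestamp if connectivity never returned.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                      # outage begins
        elif ok and start is not None:
            windows.append((start, ts))     # outage ends
            start = None
        last_ts = ts
    if start is not None:
        windows.append((start, last_ts))    # still down at end of log
    return windows


if __name__ == "__main__":
    # Hypothetical log: one probe per second, failing from t=10 up to t=310.
    log = [(float(t), not (10 <= t < 310)) for t in range(400)]
    for begin, end in outage_windows(log):
        print(f"outage from t={begin:.0f}s to t={end:.0f}s ({end - begin:.0f}s)")
```

Running the same calculation for each vmotion direction would show whether the outage is consistently the same length, which is the detail that points toward a timer expiring somewhere in the path.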

Re: Vmotion, VM loses network route

1. Aug 30, 2006 4:45 PM in response to: mcallistera
Mike_Laverick (Virtuoso, 4,063 posts since Jan 5, 2004)
Perhaps there is a delay in updating the MAC address across multiple IP devices...

Have you tried setting the Notify Switches option to Yes in the NIC Teaming section of the vSwitch?

That might help?

Regards
Mike

Re: Vmotion, VM loses network route

2. Aug 30, 2006 7:17 PM in response to: mcallistera
wcrahen (Expert, 353 posts since Sep 24, 2004)
5 minutes is interesting because that is also the default CAM table timeout. While the VM is migrated and not communicating, are you able to ping your default gateway from it, and does that restore the connection?
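The reasoning behind the CAM timeout hint can be illustrated with a toy model of a switch's MAC address table. This is only a sketch under stated assumptions, not how any Cisco device is implemented: the `MacTable` class, the port names, and the MAC address are hypothetical, and the 300-second aging value is the 5-minute default mentioned above. It shows that if some device in the path never sees a frame from the VM after the move, it keeps forwarding to the old port until the entry ages out, whereas a single frame from the VM (such as a ping to the gateway) corrects the entry immediately.

```python
from typing import Dict, Optional, Tuple


class MacTable:
    """Toy MAC (CAM) table with per-entry aging."""

    def __init__(self, aging_seconds: float = 300.0) -> None:
        self.aging_seconds = aging_seconds
        self._entries: Dict[str, Tuple[str, float]] = {}  # mac -> (port, last_seen)

    def learn(self, mac: str, port: str, now: float) -> None:
        """Record that a frame with this source MAC arrived on this port."""
        self._entries[mac] = (port, now)

    def lookup(self, mac: str, now: float) -> Optional[str]:
        """Return the port for this MAC, or None if unknown or aged out."""
        entry = self._entries.get(mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen >= self.aging_seconds:
            del self._entries[mac]  # entry expired; a real switch would flood
            return None
        return port


if __name__ == "__main__":
    vm_mac = "00:50:56:aa:bb:cc"  # hypothetical VM MAC address
    table = MacTable()

    # The VM was last heard via the link to the standby datacenter at t=0.
    table.learn(vm_mac, "to-standby", now=0.0)

    # The VM moves back at t=10, but this device sees no frame from it.
    print(table.lookup(vm_mac, now=60.0))   # still the stale port
    print(table.lookup(vm_mac, now=300.0))  # aged out after 5 minutes

    # A frame sourced from the VM would have fixed the entry right away.
    table.learn(vm_mac, "local-esx", now=310.0)
    print(table.lookup(vm_mac, now=311.0))
```

If pinging the default gateway from the VM during the outage restores connectivity at once, that would be consistent with a stale table entry of this kind rather than a routing problem.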
