VMware Cloud Community
donikatz
Enthusiast

ESX(i) 4.1 -> ESXi 5.0: some vMotioned VMs lose network

When vMotioning VMs from ESX(i) 4.1 to any of my new 5.0 hosts, some VMs lose network connectivity: I can't ping them, they can't ping out, and the guest NIC shows no network connectivity. Rebooting sometimes helps; otherwise I have to upgrade VMware Tools, even though the 4.x Tools should be compatible. This is obviously putting a major crimp in my upgrade planning.

I'm seeing both affected and unaffected machines on the same VLANs, guest OSes (RHEL, W2k3, W2k8), and hosts, so I haven't found a pattern yet.
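One way to hunt for a pattern is to tabulate the VMs' attributes and look for values that appear only on affected machines. The following is a minimal sketch of that idea, not a diagnosis: the attribute names (`os`, `vlan`, `tools`) and every value in `inventory` are placeholders invented to show the input shape, and would need to be replaced with whatever is actually recorded for each VM.

```python
from collections import defaultdict


def attribute_split(vms, flag="affected"):
    """Count affected/unaffected VMs for every (attribute, value) pair."""
    counts = defaultdict(lambda: [0, 0])  # (attr, value) -> [affected, unaffected]
    for vm in vms:
        idx = 0 if vm[flag] else 1
        for attr, value in vm.items():
            if attr in (flag, "name"):
                continue  # skip the label and the identifier
            counts[(attr, value)][idx] += 1
    return dict(counts)


def candidate_factors(vms, flag="affected"):
    """Return (attribute, value) pairs seen on affected VMs but on no unaffected VM."""
    hits = [
        key
        for key, (bad, good) in attribute_split(vms, flag).items()
        if bad > 0 and good == 0
    ]
    return sorted(hits, key=lambda k: (k[0], str(k[1])))


# Placeholder inventory: hypothetical values, NOT data from the real environment.
inventory = [
    {"name": "vm-a", "affected": True, "os": "RHEL", "vlan": 10, "tools": "build-a"},
    {"name": "vm-b", "affected": False, "os": "RHEL", "vlan": 10, "tools": "build-b"},
    {"name": "vm-c", "affected": True, "os": "W2k8", "vlan": 20, "tools": "build-a"},
    {"name": "vm-d", "affected": False, "os": "W2k3", "vlan": 20, "tools": "build-b"},
]

for attr, value in candidate_factors(inventory):
    print(attr, value)
```

With only a handful of VMs, a value can show up as a candidate by coincidence, so anything this flags is a lead to test with another migration rather than a conclusion.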

Has anyone else experienced this? Any ideas? Thanks.

3 Replies
a_p_
Leadership

To me this sounds as if one or more of the uplink ports may not be configured correctly. Did you already double-check the physical switch port configuration, e.g. (in the case of a Cisco switch) switchport mode access/trunk, spanning-tree portfast [trunk], ...?

André

donikatz
Enthusiast

Thanks, that was my first thought too, but the switch config looks good. In fact, these are the same switch ports and vSphere network config that had just been in use for the ESX 4.1 hosts; in a couple of cases it's even the same host hardware, just with a fresh ESXi install. Really strange. There must be a common factor I'm not seeing between the VMs that are affected and those that aren't. Still looking...

donikatz
Enthusiast

Update: I tested a few more migrations and found a couple more VMs with this problem. As soon as a VM is vMotioned from 4.1 to 5.0, network connectivity dies. vMotion to another 5.0 host: same problem. vMotion back to a 4.1 host: network connectivity returns.

I decided to test further with the VMs running on the 5.0 host. As I was troubleshooting, after several minutes networking suddenly started working again on its own! And now that networking has kicked in, I can vMotion those VMs back and forth between 4.1 and 5.0 without issue. Strange!
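Since "several minutes" is only an estimate, it may help to time the outage precisely on the next test migration. Here is a rough sketch of a poller that reports how long a VM takes to become reachable again. It assumes a TCP connect to some port the guest is known to listen on can stand in for ping; the function names, the host name, and the port in the usage comment are all hypothetical.

```python
import socket
import time


def tcp_probe(host, port, connect_timeout=2.0):
    """Build a probe that returns True if a TCP connection to host:port succeeds."""
    def probe():
        try:
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            return False
    return probe


def time_to_recover(probe, interval=1.0, timeout=900.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the seconds elapsed since polling started, or None if the
    timeout passes without a successful probe.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Example usage (hypothetical host and port; start it right after the vMotion completes):
#   elapsed = time_to_recover(tcp_probe("vm-under-test.example", 22))
#   print("no recovery within timeout" if elapsed is None else f"recovered after {elapsed:.0f}s")
```

If the recovery time turns out to be roughly the same on every run, that consistency is itself a clue worth comparing against any timers configured on the switches or in the guests.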

I still have no idea what's going on, and this will still be a big problem for production-critical servers. Continuing to investigate...
