VMware Cloud Community
blu60077
Enthusiast

VMs lose connectivity after vMotion and sometimes after a reboot

Morning,

I have a very strange issue that has been happening for quite some time and that I cannot resolve. I have opened cases with VMware, with no resolution.

Setup…

Running ESXi 5.0

Clustered two-host environment with 18 VMs

Issue… it can be reproduced and happens to every VM (or so I believe)

I have a VM whose IP address is statically assigned as

IP - 10.1.1.38
SN - 255.255.255.0
GW - 10.1.1.1
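
As a quick sanity check on the static settings above, here is a minimal Python sketch (standard library only) that confirms the gateway sits in the same subnet as the VM's address. The function name `gateway_on_link` is just an illustrative helper, not part of any VMware tooling.

```python
import ipaddress


def gateway_on_link(ip: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the subnet defined by ip/netmask."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


if __name__ == "__main__":
    # Values taken from the post: 10.1.1.38 / 255.255.255.0, gateway 10.1.1.1
    print(gateway_on_link("10.1.1.38", "255.255.255.0", "10.1.1.1"))
```

If this reports that the gateway is on-link, the addressing itself is consistent and the loss of connectivity has to come from somewhere below layer 3.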

When a VM is vMotioned from host1 to host2 (or vice versa), it loses connectivity to the network. Connectivity is regained once a second, and sometimes a third, E1000 network card is added to the VM and the newly added NIC obtains an IP from the DHCP server.

After it regains network connectivity, I can remove the newly added NIC(s) and connectivity remains.

VMware had me perform some Wireshark captures and informed me that once the vMotioned VM loses connectivity, it has issues resolving ARP requests. But that is as far as anyone has ever gotten…
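
For anyone reading such a capture: a VM that can reach its gateway should show broadcast "who-has" ARP requests followed by "is-at" replies. The Python sketch below only builds and decodes a standard ARP request in memory to show which fields to look at; it does not send anything on the wire. The MAC address is a made-up placeholder, and the helper names are illustrative.

```python
import socket
import struct


def build_arp_request(src_mac: str, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying a broadcast ARP request (who-has)."""
    mac = bytes.fromhex(src_mac.replace(":", ""))
    # Ethernet header: broadcast destination, source MAC, EtherType 0x0806 (ARP)
    eth = b"\xff" * 6 + mac + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac + socket.inet_aton(src_ip)               # sender MAC and IP
    arp += b"\x00" * 6 + socket.inet_aton(target_ip)    # unknown target MAC, target IP
    return eth + arp


def describe_arp(frame: bytes) -> str:
    """Summarise an ARP frame in the style a capture tool would show."""
    opcode = struct.unpack("!H", frame[20:22])[0]
    sender_mac = frame[22:28].hex(":")
    sender_ip = socket.inet_ntoa(frame[28:32])
    target_ip = socket.inet_ntoa(frame[38:42])
    if opcode == 1:
        return f"who-has {target_ip} tell {sender_ip}"
    return f"{sender_ip} is-at {sender_mac}"


if __name__ == "__main__":
    # 00:50:56:aa:bb:cc is a hypothetical MAC; the IPs come from the post
    frame = build_arp_request("00:50:56:aa:bb:cc", "10.1.1.38", "10.1.1.1")
    print(describe_arp(frame))
```

Repeated who-has lines for 10.1.1.1 with no matching is-at reply would be consistent with the behaviour described above.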

At this point it is unclear if this is a hardware or software issue.

Hardware setup is this…

Two HP DL380 servers, each with an HP 4-port server adapter

They are physically connected to two Cisco 3560 switches with three VLANs: VLAN 1 for the normal network, VLAN 2 for vMotion, and VLAN 3 for SAN

Does anybody have any ideas what might be happening?

If you were me, what would be the next step in troubleshooting? Any help is greatly appreciated.

Thanks

Blu

3 Replies
TBKing
Enthusiast

4 ports, 3 vLANs, 2 switches ... I'd take a closer look at those connections and configs.

Maybe temporarily simplify the setup if you have an opportunity ...

One vLAN1 connection from each host to one switch and see if the problem persists.

... guessing vMotion and SAN are on separate pNICs from Network (where is your vManagement connection?)

Do you have any ghosted NICs in the VMs?

The VMs just have single vNICs and no NLB, correct?

blu60077
Enthusiast

Hello TB,

Thank you for your reply. With VMware's help, we narrowed down this issue. On each of the two host servers there are 4 onboard NICs and an Intel 4-port Ethernet card. The issue appears when VMs use the NICs on the Intel card; that is when the errors occur. When we remove the Intel NICs from the config, things operate fine.

Next step will be to update the drivers via Update Manager, re-add the Intel NICs, and see if the issue is resolved.

a_p_
Leadership

Maybe related: how are the physical switch ports configured? Can you confirm spanning-tree portfast trunk is set?
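
One way to check this across many ports is to scan a saved copy of the switch's running config. The Python sketch below assumes the ESXi uplinks are configured as trunks (access ports would use plain spanning-tree portfast instead). The sample config, interface names, and descriptions are hypothetical, not taken from the poster's switches.

```python
def parse_interfaces(config: str) -> dict:
    """Map each interface name to the list of its (stripped) config lines."""
    blocks = {}
    current = None
    for raw in config.splitlines():
        if raw.startswith("interface "):
            current = raw[len("interface "):].strip()
            blocks[current] = []
        elif current is not None and raw.startswith(" "):
            blocks[current].append(raw.strip())
        else:
            # "!" or any other top-level line ends the interface block
            current = None
    return blocks


def trunk_ports_missing_portfast(config: str) -> list:
    """Return trunk interfaces that lack 'spanning-tree portfast trunk'."""
    return [
        name
        for name, lines in parse_interfaces(config).items()
        if "switchport mode trunk" in lines
        and "spanning-tree portfast trunk" not in lines
    ]


# Hypothetical excerpt in running-config style, for illustration only
SAMPLE = """\
interface GigabitEthernet0/1
 description esx-host1 uplink (hypothetical)
 switchport trunk encapsulation dot1q
 switchport mode trunk
 spanning-tree portfast trunk
!
interface GigabitEthernet0/2
 description esx-host2 uplink (hypothetical)
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet0/3
 switchport mode access
 spanning-tree portfast
!
"""

if __name__ == "__main__":
    print(trunk_ports_missing_portfast(SAMPLE))
```

Any port this reports goes through the normal spanning-tree listening and learning states whenever its link comes up, which can delay forwarding on that uplink.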

André
