We have multiple 2008 R2 SQL servers, file servers, and Exchange guests with in-guest iSCSI connections to volumes on our EqualLogic SAN. Most guests have multiple NICs with the EqualLogic HIT Kit loaded, using MPIO. Everything is the latest and greatest as far as EQ firmware, HIT Kit version, etc., and best practices are followed. When we were on 4.1 I never had issues vMotioning these guests, but since the upgrade to 5 I get random issues. Sometimes when I vMotion a guest the iSCSI NICs will disconnect completely, and I have to edit the guest and disconnect/reconnect the NICs before everything is fine again. Other times a guest will lose its iSCSI connection every few minutes, and if I vMotion it to another host the problem goes away. The guest NIC for non-iSCSI traffic never has issues. The hosts have been checked; the config is good and identical across all of them. The switch config is good (Cisco 3750s). All other guests vMotion with no issues. It's a pretty standard setup, nothing fancy. I have not upgraded the guest NICs to VMXNET3 (still all on VMXNET2), but I can't find anything that says I need to.
Things that are different:
- The EqualLogic multipath driver is loaded on the hosts; I followed EQ's best-practice setup for it.
- Hosts are brand-new Dell R620s with ESXi installed from scratch.
Just wondering if someone else has experienced this and there's something obvious my searching isn't digging up.
How much memory is in the VMs? Are you seeing any packets/pings drop during vMotion? Wondering if the final memory cutover is just taking a smidge longer than it can handle. Also, are VMware Tools and the virtual hardware versions updated now that you are on ESXi 5?
I think it would be useful to see a VM with a similar resource allocation, but with updated Tools/hardware, go through a vMotion.
8-12 GB of RAM. I'll maybe get one ping drop or a long ms response time when the guest moves. But it's not the guest network that's having issues; it's just the guest iSCSI NICs that are either dropping completely or repeatedly losing connection every few minutes. Tools were updated immediately after the guests were moved to the new hosts.
The thought process behind the ping question wasn't questioning the guest network; rather, I am wondering if the switchover phase is taking too long and causing the guest iSCSI initiators to lose their connection to the storage (see http://www.vmware.com/files/pdf/vmotion-perf-vsphere5.pdf).
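One way to check this hypothesis (a sketch of my own, not something from the PDF): run a continuous timestamped ping against the guest's iSCSI IP while you vMotion it, then measure the longest gap between successful replies. If that gap approaches the initiator's request-hold timeout, the stun phase is the likely culprit. Assuming you've captured the ping results as (timestamp, replied) samples, the gap analysis is simple:

```python
# Estimate the longest storage-path outage from a timestamped ping log.
# Hypothetical input format: one (seconds, replied) sample per ping attempt,
# e.g. collected by pinging the guest's iSCSI IP once per second during vMotion.

def longest_outage(samples):
    """Return the longest span in seconds between consecutive successful
    replies; 0.0 if no two successful replies were ever separated by a gap."""
    last_ok = None
    worst = 0.0
    for t, replied in samples:
        if replied:
            if last_ok is not None:
                worst = max(worst, t - last_ok)
            last_ok = t
    return worst

if __name__ == "__main__":
    # Simulated trace: 1 ping/sec for 30 s, replies stop for t = 10..15
    # (six missed pings, so 7 s between the last and next good reply).
    trace = [(float(t), not (10 <= t < 16)) for t in range(30)]
    print(f"longest gap: {longest_outage(trace):.1f} s")  # prints: longest gap: 7.0 s
```

If the measured gap is longer than the Windows iSCSI initiator is configured to tolerate, that would explain connections surviving the guest-network blip but dropping on the storage side.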
We are experiencing the same issue after upgrading to 5.0 Update 1. We never had issues before this. Dell R610 ESXi hosts with an EqualLogic SAN and Dell PowerConnect switches. After vMotion, I sometimes cannot ping servers. There doesn't seem to be any consistency to the problem. I've upgraded VMware Tools, and we use VMXNET3 adapters.