Bitfarmer
Contributor

Guest iSCSI NIC vMotion issues after upgrade to 5.0 Update 1

We have multiple Windows Server 2008 R2 SQL servers, file servers and Exchange guests with internal iSCSI connections to volumes on our EqualLogic (EQ) SAN. Most guests have multiple NICs with the EQ HIT kit loaded using MPIO. Everything is current as far as EQ firmware, HIT kit version, etc., and best practices are followed. When we were on 4.1 I never had issues vMotioning these guests, but since the upgrade to 5 I get random issues. Sometimes when I vMotion a guest the iSCSI NICs disconnect completely, and I have to edit the guest, disconnect and reconnect the NICs, and then everything is fine. Other times a guest loses its iSCSI connection every few minutes, and if I vMotion it to another host the problem goes away. The guest NIC for non-iSCSI traffic never has issues. The hosts have been checked; the config is good and identical across all of them. The switch config is good (Cisco 3750s). All other guests vMotion with no issues. It's a pretty standard setup, nothing fancy. I have not upgraded the guest NICs to VMXNET3; they are still all on VMXNET 2, but I can't find anything that says I need to.

Things that are different:

- The EQ multipath driver is loaded on the hosts, set up following EQ's best-practice guidance.

- The hosts are brand new Dell 620s with ESXi installed from scratch.

Just wondering if someone else has experienced this and whether there's something obvious my searching isn't digging up.

Thanks

6 Replies
jfrappier
Enthusiast

Is all traffic (VM, vMotion, Management, iSCSI) going through the same 3750? Have you tried setting up a test VM using VMXNET3 drivers?

Bitfarmer
Contributor

The traffic is spread across multiple 3750s in my core stack. No, I haven't tried VMXNET3 yet; that was my next step if no one could offer up anything simpler.

jfrappier
Enthusiast

How much memory is in the VMs? Are you seeing any packets/pings drop during vMotion? I'm wondering if the final memory cutover is just taking a smidge longer than the iSCSI connections can handle. Also, were the VMware Tools and virtual hardware versions updated now that you are on ESXi 5?

I think it would be useful to see a VM with a similar resource allocation, but fully updated, go through a vMotion.

Bitfarmer
Contributor

8-12 GB of RAM. I'll maybe get one dropped ping or a long response time when the guest moves. But it's not the guest network that's having issues; it's just the guest iSCSI NICs that are either dropping totally or repeatedly losing connections every few minutes. Tools were updated immediately after the guests were moved to the new hosts.

jfrappier
Enthusiast

The thought process behind the ping question wasn't questioning the guest network; rather, I am wondering if the switchover phase is taking too long and causing the HBAs to lose connection to the storage (see http://www.vmware.com/files/pdf/vmotion-perf-vsphere5.pdf).
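
One way to put a number on that switchover gap is to probe the SAN from inside the guest while it is being vMotioned and record how long the iSCSI portal stays unreachable. The following is only a minimal sketch, assuming Python is available in the guest and that the portal answers on the standard iSCSI TCP port 3260; the function names `probe`, `monitor` and `outages`, and the address in the usage comment, are hypothetical placeholders rather than anything from the EQ or VMware tooling.

```python
import socket
import time
from typing import List, Tuple


def probe(host: str, port: int = 3260, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def monitor(host: str, seconds: float = 120.0,
            interval: float = 0.5) -> List[Tuple[float, bool]]:
    """Probe the portal repeatedly and return (timestamp, reachable) samples."""
    samples = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), probe(host)))
        time.sleep(interval)
    return samples


def outages(samples: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse samples into (start, duration) windows of unreachability."""
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                      # first failed probe opens a window
        elif ok and start is not None:
            windows.append((start, ts - start))
            start = None                    # first good probe closes it
    if start is not None:
        # Still unreachable when sampling stopped: report the time seen so far.
        windows.append((start, samples[-1][0] - start))
    return windows


# Example use (the address is a placeholder for the SAN group IP):
#   for start, duration in outages(monitor("192.0.2.10")):
#       print(f"unreachable for {duration:.1f}s")
```

Start it in the guest on one of the iSCSI NICs' subnets, kick off the vMotion, and compare the reported windows against the timeout configured on the guest's iSCSI initiator. A TCP connect test only shows that the portal is reachable, so it will not catch session-level problems on an otherwise healthy path.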

alexherweyer2
Contributor

We are experiencing the same issue after upgrading to 5.0 Update 1. We never had issues before this. We run Dell R610 ESXi hosts with an EqualLogic SAN and Dell PowerConnect switches. After a vMotion, I sometimes cannot ping servers. There doesn't seem to be any consistency behind the problem. I've upgraded VMware Tools and we use VMXNET3 adapters.
