VMware Cloud Community
PaulFF
Contributor

iSCSI slows to a halt after high IO

This problem is driving me completely mad. Hopefully someone has some ideas.

Just for an overview of how the iSCSI is set up:

ESXi -> dumb switch -> NetApp

When things are working, throughput is fantastic, with no slowdowns at all. I can't reliably reproduce the issue. During high IO to the iSCSI device, I watched with esxtop and saw it go from a high operation rate to absolutely ZERO. No reads or writes. Nothing in the queue. Just absolutely nothing. It's almost as if ESXi completely forgets the LUN is there at all. The other LUNs appear to keep working while this happens. The issue isn't specific to one LUN either; it happens with all of them.

Sometimes high IO works, and it just flies through with no issues. Sometimes it drops to near zero. Running dd when it gets into this state, I usually get only ~5MB/sec. When everything is running smoothly I get about 80MB/sec. This is independent of how many VMs are running.
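For anyone who wants to log the difference between the good state and the stalled state over time, here is a rough Python sketch of a dd-style sequential write test. It is only an illustration: the function name `timed_write`, the target path, and the sizes are assumptions, and it is meant to run inside a guest or on any host that has the iSCSI-backed storage mounted, not as a replacement for esxtop.

```python
import os
import time


def timed_write(path, total_mb=64, block_kb=1024):
    """Write roughly total_mb of zeros to path and return the rate in MB/s.

    Hypothetical helper for illustration; sizes are arbitrary defaults.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push data to the device, not just the page cache
    elapsed = time.monotonic() - start
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Assumed path on the iSCSI-backed filesystem; change it to suit your setup.
    target = "/mnt/iscsi_test/throughput.bin"
    for _ in range(10):
        rate = timed_write(target)
        print(time.strftime("%H:%M:%S"), f"{rate:.1f} MB/s")
        time.sleep(5)
```

Running it in a loop like this gives a timestamped record, so a drop from around 80MB/sec to around 5MB/sec can be lined up against what esxtop and the storage logs show at the same moment.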

Please help!

2 Replies
Lightbulb
Virtuoso

If you are using the software initiator, troubleshoot it like a regular network issue. Check for firmware updates for your NIC. As your switch is unmanaged, there is not much troubleshooting data to be gained there. Easy things to try would be switching to a different port on the switch and replacing the cable. As your other systems are not having issues, that more or less rules out a switch utilization problem, but you should check with your switch vendor for any updates or vendor-specific alerts (kind of a stretch, I know). It could be that your ESX host is hitting a limit on your switch; it is hard to tell without management info.
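One cheap way to treat it like a regular network issue is to log TCP connect times to the iSCSI target from another machine on the same switch. The sketch below is a rough example under a few assumptions: the target listens on the standard iSCSI port 3260, the address `192.0.2.10` is a placeholder, and `probe` is a made-up helper name. A successful connect only shows that the path and the listener are reachable; it says nothing about the health of an existing iSCSI session.

```python
import socket
import time


def probe(host, port=3260, timeout=2.0):
    """Return the TCP connect time in ms, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return None


if __name__ == "__main__":
    target = "192.0.2.10"  # placeholder address; use your filer's iSCSI IP
    while True:
        ms = probe(target)
        status = f"{ms:.1f} ms" if ms is not None else "FAILED"
        print(time.strftime("%H:%M:%S"), status)
        time.sleep(1)
```

If the connect times spike or fail at the same moments the IO drops to zero, that points at the network path; if they stay flat, the initiator or the target is the more likely suspect.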

PaulFF
Contributor

We are using a Cisco Catalyst 2960 for the switch. I've tried both an onboard NIC and another NIC. I'm running ESXi on 2 separate boxes, and they both show the same issue, regardless of network configuration.

It's a very strange problem...
