krik011
Contributor

ESX / iSCSI high Disk Latency Slow

Hello all, I hope I can get some insight into what is causing this. We are experiencing high disk latency, which is causing a noticeable slowdown on ALL our guests. Here is an outline:

  • 2 - ESX 3.5 hosts (48 GB RAM, dual-socket quad-core Xeon 2.5 GHz, PowerEdge 4 Series)

  • Connected to the ESX hosts via iSCSI is an MDI ExpressStor 1450 with expansion array: 2 - RAID 5 arrays, each with 15+1 spindles of 7200 RPM 400 GB SATA, running Server 2003 Storage

  • The iSCSI network was on a VLAN on a Cisco 3750; it now uses a dedicated 3Com 2830 Gigabit switch for iSCSI only (the latency is still there)

  • 27 guests, nothing too I/O intensive; our total I/O peaks during production at about 900 IOPS (measured as cmd/s in esxtop on both hosts)

  • Using Intel 82571EB Ethernet Controllers for iSCSI and Microsoft iSCSI Soft Initiator

  • 6 LUNs/datastores at 1 TB each (split between both RAID 5 arrays)

Issue: Disk latency in vCenter stays consistently at 20-40 ms, with peaks at 120-150 ms. The latency and slow response are felt on all guests.

Steps tried: isolating the iSCSI network on its own switch/subnet; changing the team on the iSCSI NICs to 4 x 1 Gb NICs with adaptive teaming.

Things noticed: When IOmeter is run on a guest during production hours for 30 seconds, the results are 1100 I/Os per second, 42 ms average I/O response time, and 1005 ms maximum I/O response time (run with the IOmeter config file from the Unofficial Results thread: 100% read, 32 KB, 100% sequential).

When IOmeter is run on a guest during non-production hours for 30 seconds, the results are 2269 I/Os per second, 26 ms average, and 1000 ms maximum, so by these results it seems we are not maxing out our storage at the I/O level.

When we look at network utilization on the team on the storage array, it does not peak over 9%; however, when copying a VMDK from one datastore to another, we see it peak at around 70%.
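
As a rough cross-check of the figures above, the IOmeter results can be converted to throughput and to an implied number of I/Os in flight (Little's law). This is a minimal sketch, not a diagnosis: the 125 MB/s figure for one 1 Gb link is an assumed raw line rate that ignores Ethernet, TCP, and iSCSI overhead, and the function names are made up for illustration.

```python
# Figures quoted in the post: IOmeter, 100% sequential read, 32 KB blocks.
BLOCK_SIZE_BYTES = 32 * 1024

# Assumption: raw line rate of a single 1 Gb link in decimal MB/s,
# with no allowance for protocol overhead.
GIGABIT_LINK_MB_S = 125


def throughput_mb_s(iops, block_bytes=BLOCK_SIZE_BYTES):
    """Payload throughput in decimal MB/s for a given IOPS figure."""
    return iops * block_bytes / 1_000_000


def outstanding_ios(iops, avg_latency_ms):
    """Little's law: average I/Os in flight = rate x average time in system."""
    return iops * avg_latency_ms / 1000


# (IOPS, average latency in ms) for the two runs described in the post.
runs = {
    "production": (1100, 42),
    "off-hours": (2269, 26),
}

for label, (iops, latency_ms) in runs.items():
    mb_s = throughput_mb_s(iops)
    share = mb_s / GIGABIT_LINK_MB_S
    in_flight = outstanding_ios(iops, latency_ms)
    print(f"{label}: {mb_s:.1f} MB/s ({share:.0%} of one 1 Gb link), "
          f"about {in_flight:.0f} I/Os in flight")
```

Under these assumptions neither run fills a single gigabit path, which is consistent with the low utilization seen on the storage team.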

I know these issues are hard to troubleshoot and I also know this is a long post but we don't know at this point where to go next. Any help would be immensely appreciated!

32 Replies
Chuck8773
Hot Shot

We had the same problems in our environment last year. We resolved them by using 15K SAS disks instead of SATA. We used to get about 800-900 IOPS; now we get about 4000-5000. ESX 3.5 using iSCSI is limited to 1 Gbps of throughput because it cannot use MPIO. SATA is good enough for the system drive of one server, but it is not good enough for many VMs. We started having serious issues when many servers booted up together, with boot times of 30-40 minutes. Now we don't have any of those issues. That is where I would look first.

Charles Killmer, VCP

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

krik011
Contributor

Thanks a bunch for your input!

We have an inkling that our storage is causing the issue, but I would like to pinpoint it exactly before making an investment. Is there any way we can say for sure that it is our storage?

krik011
Contributor

We are also being told by a vendor that going to 10 Gb Ethernet will fix our issue. They are so sure of it that they are giving us the equipment, and if it doesn't solve the issue they will take it back at no cost. I am pretty skeptical, based on my findings and reading.

Chuck8773
Hot Shot

You are doing everything right for determining the source of the problem. Ensure that your IOmeter test uses a block range that cannot possibly fit in the cache of your storage appliance. You need to find the IOPS value from the disks, not the cache. Your numbers look close to what the disks alone would deliver, but a little high.

You should focus on how quickly your storage can serve small random I/O from disk. Change your settings to 1 KB, 100% read, random. Make sure the block range is larger than the cache in your storage appliance, typically larger than 2 GB.
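
To size the test file for the advice above, the arithmetic might look like the sketch below. It assumes that IOmeter's "Maximum Disk Size" field is counted in 512-byte sectors, and it uses a hypothetical safety factor of 4 over the cache size. The 2 GB cache is the rule-of-thumb figure from this reply, not a measured value for the ExpressStor.

```python
# Assumption: IOmeter's "Maximum Disk Size" is entered in 512-byte sectors.
SECTOR_BYTES = 512


def iometer_max_disk_sectors(cache_gb, safety_factor=4):
    """Sectors for a test file `safety_factor` times larger than the cache.

    `safety_factor` is a hypothetical margin chosen for illustration.
    """
    test_file_bytes = cache_gb * safety_factor * 1024**3
    return int(test_file_bytes // SECTOR_BYTES)


# 2 GB cache (the figure mentioned in the reply) -> 8 GB test file.
print(iometer_max_disk_sectors(cache_gb=2))
```

A larger factor only makes the test more conservative, at the cost of a longer time to create the test file.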

Judging by how your I/O changes between the production and off-production tests, I think the disk is the problem. Network utilization will not be very high from a MB/sec viewpoint, but it will be high from an I/O-per-second viewpoint. The 1100 that you get during production indicates contention: the VM is capable of 2200 but is only able to use 1100 during the day. That suggests the storage is only capable of around 2200 total, which is not surprising for SATA RAID 5.
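
For a rough sense of what a SATA RAID 5 set can deliver on random I/O, a back-of-the-envelope estimate is sketched below. Everything in it is an assumption rather than a measurement: 80 IOPS per 7200 RPM disk is a common rule of thumb, the write penalty of 4 is the usual RAID 5 figure, all 16 spindles of the "15+1" set are counted, and controller cache is ignored.

```python
def raid_random_iops(spindles, iops_per_disk, read_fraction, write_penalty=4):
    """Estimate host-visible random IOPS for an array.

    Each host write is assumed to cost `write_penalty` back-end disk
    operations (4 for RAID 5: read data, read parity, write data, write parity).
    """
    raw_iops = spindles * iops_per_disk
    cost_per_host_io = read_fraction + (1 - read_fraction) * write_penalty
    return raw_iops / cost_per_host_io


SPINDLES_PER_ARRAY = 16   # assumption: all disks of the "15+1" set serve I/O
IOPS_PER_SATA_DISK = 80   # assumption: rule of thumb for a 7200 RPM drive

for read_fraction in (1.0, 0.7, 0.5):
    estimate = raid_random_iops(SPINDLES_PER_ARRAY, IOPS_PER_SATA_DISK, read_fraction)
    print(f"{read_fraction:.0%} reads: about {estimate:.0f} IOPS per array")
```

The IOmeter runs in this thread were 100% sequential reads, so the 1100 and 2269 figures are not directly comparable to these random-I/O estimates.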

Chuck8773
Hot Shot

You are not maxing out the network, so 10 Gbps will not help. The disk is the more likely culprit. One more thing on my last post: consider splitting the 2200 IOPS among your 30 VMs. That is only about 70 IOPS for each VM, which is not very much at all.

krik011
Contributor

Thanks again Chuck.

I have had this feeling all along that our storage was not keeping up.

We have 2 arrays, so shouldn't it be around 2200 IOPS per array? We have the datastores split between the arrays.

Thanks,

Chuck8773
Hot Shot

Correct. So divide 2200 by half the VMs. You are still at around 150 IOPS per VM, which is just barely better than a single SATA disk.
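
The per-VM figures in this exchange are simple division. A small sketch, assuming an even split with no allowance for bursts; the function name is made up for illustration, and the 27-guest case uses the guest count from the original post rather than the rounded 30:

```python
def iops_per_vm(total_iops, vm_count):
    """Even share of an IOPS budget across VMs (ignores burstiness)."""
    return total_iops / vm_count


MEASURED_IOPS = 2200  # off-hours IOmeter result, rounded as in the replies

# One 2200 IOPS budget shared by all VMs.
print(f"30 VMs sharing 2200: {iops_per_vm(MEASURED_IOPS, 30):.0f} IOPS each")
# 2200 IOPS per array, with half of the VMs on each array.
print(f"15 VMs per array: {iops_per_vm(MEASURED_IOPS, 15):.0f} IOPS each")
# Same budget using the 27 guests stated in the original post.
print(f"27 VMs sharing 2200: {iops_per_vm(MEASURED_IOPS, 27):.0f} IOPS each")
```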

In our environment, I get about 4000-5000 IOPS at all times of the day, which looks more like a maximum for the VM than for the storage. That same storage appliance is shared with 18 other ESX hosts and 300 other VMs. I don't really know what the storage appliance can push at maximum, but I am not hitting it yet. This suggests that each VM is able to use 4000 IOPS all day long.

krik011
Contributor

Thanks, Chuck. With the 15K drives, did you estimate about 120-150 IOPS per spindle?

Any other thoughts from anyone?

Chuck8773
Hot Shot

I tested two 15K SAS disks in RAID 1 and got about 1800 IOPS. That should be very close to what a single 15K SAS disk would get, as that was just a mirror. With SATA I got about 350.

shawchyn
Contributor

May I know what software you guys use to do the IO test?

Thanks for sharing.

Josh26
Virtuoso

Since your storage system is running Windows 2003, you should run Performance Monitor there and look for disk I/O issues. If it looks overloaded, you know that it's not iSCSI, the network, or anything else causing your issues.

My feeling points at those disks; SATA is never a great choice for enterprise storage. Put it in a SAN running VMs and I wouldn't expect much.

J1mbo
Virtuoso

The stated IOPS values seem high. EqualLogic reckon on 170 IOPS per 15K SAS drive for example.

Also, looking at SATA disks like WD's RE3s (7200 RPM, 8.9 ms seek), that gives a total average access time of 13 ms (77 IOPS), which is roughly what I get benchmarking them.
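
The 13 ms and 77 IOPS figures follow from the usual approximation that average access time is average seek time plus half a rotation. A minimal sketch of that arithmetic, with a hypothetical function name; the 3.5 ms seek time used for the 15K drive is an assumed typical value, not a figure from this thread.

```python
def disk_random_iops(rpm, avg_seek_ms):
    """Return (IOPS, access time in ms) for a single disk doing random I/O.

    Access time = average seek + average rotational latency (half a turn).
    Transfer time and command queuing are ignored.
    """
    rotational_latency_ms = 0.5 * 60_000 / rpm
    access_ms = avg_seek_ms + rotational_latency_ms
    return 1000 / access_ms, access_ms


# 7200 RPM SATA with the 8.9 ms seek time quoted above.
sata_iops, sata_access = disk_random_iops(rpm=7200, avg_seek_ms=8.9)
print(f"7200 RPM SATA: {sata_access:.1f} ms access, about {sata_iops:.0f} IOPS")

# 15K SAS; the 3.5 ms seek time is an assumption for illustration.
sas_iops, sas_access = disk_random_iops(rpm=15_000, avg_seek_ms=3.5)
print(f"15K SAS: {sas_access:.1f} ms access, about {sas_iops:.0f} IOPS")
```

With the assumed seek time, the 15K result lands in the same range as the 170 IOPS per drive quoted from EqualLogic.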

Some tests I've seen suggest SAS drives perform up to twice as fast as SATA drives (everything else being equal) due to the native command queuing capabilities, i.e. nearing 400 IOPS for a 15K disk.

HTH

krik011
Contributor

I used IOmeter to get my results. Not sure what they are using for their "benchmarking" applications.

Thanks guys for your continued help.

krik011
Contributor

We went back to the original installer and asked why they recommended this solution.

They are claiming there is still something wrong with our iSCSI network. They are sending 3 dual-port 10 Gb NICs. (I don't think a direct-connect configuration is even supported by VMware, and I don't understand how it can work.) I guess they can team the two 10 Gb ports on the storage array. Anyway, I am leaning strongly toward a limit on the storage array's possible IOPS. They guaranteed me 100% that it is not our MDI storage array and that we are not even scratching the surface of what it can do. I think it is time to contact another VMware consultant.

Any thoughts?

Thanks!

J1mbo
Virtuoso

Bear in mind it's just an Ethernet network: there is nothing to stop you reducing it to a single path at one end, mirroring the port, and running Wireshark to see what's going on.

RussellCorey
Hot Shot

Is there a secondary storage controller in your setup?

Is there a failed drive in the array?

Are the drives at capacity?

10GbE is 99% likely not to give you any tangible benefit.

Can you dump some vmkernel logs somewhere for us to take a peek at?

krowczynski
Virtuoso

I recommend no more than 10 VMs and 3000 IOPS per LUN.

MCP, VCP3, VCP4
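
For reference, the setup described in the original post can be checked against this guideline with the sketch below. It assumes the 27 guests and the roughly 900 IOPS production peak are spread evenly across the 6 LUNs, which the thread does not confirm; the function and constant names are made up for illustration.

```python
import math

# Guideline from the reply above.
MAX_VMS_PER_LUN = 10
MAX_IOPS_PER_LUN = 3000


def lun_load(guests, luns, peak_iops):
    """Return (VMs on the busiest LUN, IOPS per LUN), assuming an even spread."""
    vms_per_lun = math.ceil(guests / luns)
    iops_per_lun = peak_iops / luns
    return vms_per_lun, iops_per_lun


# Figures from the original post: 27 guests, 6 LUNs, ~900 IOPS at peak.
vms, iops = lun_load(guests=27, luns=6, peak_iops=900)
print(f"up to {vms} VMs per LUN (guideline: {MAX_VMS_PER_LUN})")
print(f"about {iops:.0f} IOPS per LUN (guideline: {MAX_IOPS_PER_LUN})")
print("within guideline:", vms <= MAX_VMS_PER_LUN and iops <= MAX_IOPS_PER_LUN)
```

Being within the per-LUN guideline says nothing about whether the disks behind the LUNs can sustain the load, since all 6 LUNs share the same two RAID 5 arrays.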
krik011
Contributor

First of all, thanks everyone for your continued help!

"Is there a secondary storage controller in your setup?" - We have an expansion array connected to the main array with a 3 Gb interface card, BUT I believe both of them share 1 Adaptec 3805 controller card.

"Is there a failed drive in the array?" - There is no failed drive reported by the management software, nor any indication on the array's lights.

"Are the drives at capacity?" - There are 2 - RAID 5 arrays, which appear as 2 logical partitions in Server 2003. The main array has 600 GB free (as reported by Server 2003) and the expansion array has 3 TB free.

I agree about the 10 GbE, but I figure that if we can rule out all their current theories, then hopefully we can move on to the array itself. (Also, if it works we pay for the 10 Gb NICs; if it doesn't, they take them back at no cost.)

I dumped the logs. Is there one particular thing you would like to look at? The kernel log is about 80 MB.

RussellCorey
Hot Shot

Can you shut down one of your ESX hosts and re-run your benchmark?
