VMware Cloud Community
krik011
Contributor

ESX / iSCSI: high disk latency, slow guests

Hello all, I hope I can get some insight into what is causing this. We are experiencing high disk latency which is causing a noticeable slowdown on ALL our guests. Here is an outline:

  • 2 ESX 3.5 hosts (48 GB RAM, dual-socket quad-core 2.5 GHz Xeon, PowerEdge 4 series)

  • Connected to the ESX hosts via iSCSI is an MDI ExpressStor 1450 with an expansion array: 2 RAID 5 arrays, each with 15+1 spindles of 7200 RPM 400 GB SATA, running Windows Storage Server 2003

  • The iSCSI network originally ran on a VLAN of a Cisco 3750; it now uses a dedicated 3Com 2830 gigabit switch for iSCSI (the latency persists)

  • 27 guests, nothing too I/O intensive; total I/O peaks during production at about 900 commands/sec (measured with esxtop CMDS/s on both hosts)

  • Intel 82571EB Ethernet controllers for iSCSI, and the Microsoft iSCSI software initiator

  • 6 LUNs/datastores at 1 TB each (split between the two RAID 5 arrays)

Issue: disk latency in vCenter sits consistently at 20-40 ms, with peaks at 120-150 ms. The latency/slow response is felt on all guests.
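
In case it helps to see where we are looking: the esxtop counters we have been watching are DAVG/cmd (latency at the device/array), KAVG/cmd (latency added by the VMkernel, mostly queuing) and GAVG/cmd (what the guest sees, roughly DAVG + KAVG). A batch capture like the one below makes it easy to graph; the interval, sample count and output path are just examples:

esxtop -b -d 5 -n 12 > /tmp/esxtop-latency.csv (12 samples at 5-second intervals; open the CSV afterwards and look at the disk DAVG/KAVG/GAVG columns)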

Steps tried: isolating the iSCSI network on its own switch/subnet, and changing the iSCSI NIC team to 4 x 1 Gb NICs with adaptive teaming.

Things noticed: when IOMeter is run on a guest for 30 seconds during production hours, the results are 1,100 IOPS, 42 ms average I/O, 1,005 ms max I/O (run with the IOMeter config file from the Unofficial Results thread: 100% read, 32 KB, 100% sequential).

When IOMeter is run on a guest for 30 seconds during non-production hours, the results are 2,269 IOPS, 26 ms average I/O, 1,000 ms max I/O, so by these results it seems we are not maxing out our storage at the I/O level. Network utilization on the team on the storage array does not peak over 9%; however, when copying a VMDK from one datastore to another we see it peak at around 70%.

I know these issues are hard to troubleshoot, and I also know this is a long post, but at this point we don't know where to go next. Any help would be immensely appreciated!

krik011
Contributor

I will give that a shot this weekend.

Are there any procedures people follow when designing their VMware storage? Any documents or best practices would be great.

Thanks,

BTW, we told the installer we would not be pursuing 10 GbE.

s1xth
VMware Employee

Regarding your post on MPIO: you CAN do MPIO with ESXi 4 and use round robin to get 2 Gb of throughput. I agree with what everyone is saying, though: you are going to max your disks out before you max out that 1 Gb link, and the same goes even if you are using MPIO. There have been many posts from people on here about the performance of their SANs with SATA disks. Swap the SATA drives out for 10k or preferably 15k spindles. I don't recommend 7.2k drives to anyone unless they are going to be used for raw storage only.
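
If you do move to vSphere 4, round robin is set per LUN from the CLI. Roughly like this; the device ID below is a placeholder, use the list command to find your own:

esxcli nmp device list (shows each LUN's naa ID and its current path selection policy)

esxcli nmp device setpolicy --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR (switch that LUN to round robin)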

http://www.virtualizationimpact.com http://www.handsonvirtualization.com Twitter: @jfranconi
krik011
Contributor

Yes that is what I have been hearing.

It looks to me like everything is pointing in the direction of a redesign of our storage. It also seems that simply adding new drives would just be putting a band-aid on the situation as we virtualize further. A new SAN (we are looking towards EMC) seems to be the direction we are going to head.

What methods or best practices are used to determine the amount of IOPS a given VM Infrastructure requires?

Thanks everyone!

JeffDrury
Hot Shot

After reading your posts, the biggest red flag is that the MDI ExpressStor is not in the VMware storage compatibility guide: http://www.vmware.com/resources/compatibility/pdf/vi_san_guide.pdf

When VMware certifies a storage device they run it through a suite of tests that ensure the storage can deliver a certain level of performance and support all VMware features. Another concern is that this device is using Windows Storage Server 2003 to present the storage to the ESX cluster. It is very likely that the Windows Storage Server is to blame for the high latency your disk I/O is experiencing. Latency of 20 ms or more is not acceptable for enterprise-class storage; the Microsoft SQL team considers a DB transaction failed if latency exceeds 25 ms. Even if the installers tell you the device is capable of more IOPS, that does not mean it can deliver those IOPS at low (<20 ms) latency. Can you check CPU and RAM usage on the storage device itself? Is the storage server installed on a separate partition from the data?

I would definitely recommend getting your money back on this storage and evaluating something on the compatibility list. Don't be afraid of iSCSI based on the performance of your current storage; any iSCSI device that has passed VMware compatibility should be able to provide lower latency than you are experiencing. As a rule of thumb I would plan for 100 IOPS per VM before taking into account the apps on the VM. For example, if you had a VM running MS Exchange 2003 for 1,000 users, you would allow 100 IOPS for the OS and between 1,000 and 1,500 IOPS for the Exchange DB.
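
To put that rule of thumb into rough numbers for an environment like this one, here is a back-of-envelope sketch. Every figure in it is an assumption or a common rule of thumb (100 IOPS per VM, a RAID 5 write penalty of 4, roughly 75 IOPS per 7,200 RPM SATA spindle), not a measurement, so substitute your own:

vms=27            # guest count from the original post
iops_per_vm=100   # rule-of-thumb front-end IOPS per general-purpose VM
read_pct=70       # assumed 70/30 read/write mix
raid5_penalty=4   # each write costs roughly 4 back-end I/Os on RAID 5
sata_iops=75      # rough IOPS from a single 7,200 RPM SATA spindle
front_end=$(( vms * iops_per_vm ))
reads=$(( front_end * read_pct / 100 ))
writes=$(( front_end - reads ))
back_end=$(( reads + writes * raid5_penalty ))
echo "front-end: ${front_end} IOPS, back-end on RAID 5: ${back_end} IOPS"
echo "roughly $(( (back_end + sata_iops - 1) / sata_iops )) x 7.2k SATA spindles"

With those assumptions you land around 2,700 front-end and over 5,000 back-end IOPS, which two shelves of 7.2k SATA will struggle to deliver at low latency.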

Good luck.

DillonMiller
Contributor

I recently designed a very similar environment: two ESX hosts connected to an Enhance 16 TB iSCSI storage shelf. I'm using basic NICs in the ESX servers and the software initiator that comes with ESX. I think the target is a Linux software target within the Enhance box; it's not Windows but rather something custom. I had similar problems with slow VMs after about the 10th active VM, but was able to determine that the Ethernet frame size limitation of 1,500 bytes was the major cause of my bottleneck. I come from a strong background in VMware running on EMC Fibre Channel SAN, so I was expecting some slowness, but nothing like what I noticed.

Enabling jumbo frames was easy in my case since I had already set up dedicated NICs in the ESX boxes and a dedicated gigabit switch to connect the ESX boxes to the iSCSI enclosure. ESX 3.5 supports jumbo frames for sure, and most iSCSI storage enclosures support them as well. You also have to enable jumbo frames on anything in the middle, like a switch. Some switches can be enabled per port and some have to be enabled globally as well as per port.

Might want to give this a try before you give an arm and a leg to EMC :-)

vm_arch
Enthusiast

We solved the 1 Gb issue simply by spreading our iSCSI load across two or more IP addresses in separate small subnet ranges. True, it isn't MPIO in any form, but it gives us more bandwidth from the SATA iSCSI/NFS disk back to the host machine. It has also meant that when one NIC (cable, switch port or otherwise) goes down, we don't lose all of our iSCSI/NFS eggs from the basket (and it gives us the ability to remap that dev/test VM via another path). A rough sketch of the setup is below.
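
On ESX 3.5 this is just extra VMkernel ports in different subnets; the vSwitch and port group names, IPs and masks below are made up for illustration:

esxcfg-vswitch -A iSCSI-A vSwitch1 (port group on the vSwitch that owns the first uplink)

esxcfg-vmknic -a -i 10.10.1.11 -n 255.255.255.0 iSCSI-A

esxcfg-vswitch -A iSCSI-B vSwitch2 (second port group on a vSwitch with its own uplink)

esxcfg-vmknic -a -i 10.10.2.11 -n 255.255.255.0 iSCSI-B

Then point different iSCSI targets and NFS exports at different subnets so each path rides its own uplink, and a single NIC failure only takes out one of them.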

krik011
Contributor

Do you have any documentation as to how you enabled this in ESX 3.5?

Thanks everyone.

Edit: are jumbo frames only supported with NFS, not iSCSI? If so, this will not be an option for us.

krik011
Contributor

Thanks Jeff.

CPU and RAM usage is very low; we never see the CPU go above 15%, and out of the 2 GB of RAM only 700 MB is used.

Storage Server 2003 is installed on a two-spindle RAID 1 array.

DillonMiller
Contributor

Here are the commands to run. In ESX you have to create a new VMkernel port, but you can use an existing virtual switch and just modify its MTU to jumbo size. The standard jumbo frame size is 9000, but our iSCSI device only supported 4000 for some reason, so I set the ESX boxes to that. The switch setting doesn't really matter as long as you enable it to at least as high as you set the ESX servers, since they are generating the Ethernet frames; switches only have to pass them without trying to fragment, or you'll have even more performance problems. You should also monitor this with a sniffer like Wireshark so you can confirm the frame sizes really are larger and not still 1,500 bytes once you set this.

esxcfg-vswitch -l (list the virtual switches)

esxcfg-vswitch -m 9000 vSwitchX (set the MTU of a virtual switch, where X is the switch number)

esxcfg-vmknic -l (list the VMkernel NICs)

esxcfg-vmknic -a -i 172.16.1.1 -n 255.255.0.0 -m 9000 iSCSI (add a VMkernel NIC on the port group named iSCSI with its MTU set to 9000)

After you set this and verify that your switch and iSCSI device are able to support jumbo frames, you're all set to set up iSCSI in the GUI again, making sure to use the VMkernel NIC created from the command line.

DillonMiller
Contributor

Jumbo frames are supported using iSCSI.

Also, to test whether you're able to send and receive jumbo frames, you can do a special ping from the VMkernel NIC you created, using the command line:

vmkping -d -s 8972 xxx.xxx.xxx.xxx (8,972 bytes of payload plus the 28-byte IP/ICMP header makes a full 9,000-byte frame; -d sets don't-fragment so fragmentation can't hide an MTU mismatch)

Chuck8773
Hot Shot

We actually had four links to the SAN. ESX selects one of those links for each LUN; if a link fails, it logs in over a different link. The overall throughput to the host can reach the link maximum as long as you have more LUNs than links. The problem we were running into is that the SAN was having trouble keeping up: the ESX host was able to accept enough I/O, but the SAN was unable to fill the requests quickly enough. With EqualLogic SANs, your volume can live on up to three arrays. With ESX 3.5, you could only connect to one of the arrays, and any data you need from a different array creates cross-array chatter. With ESX 4, you can connect directly to each array, which reduces the chatter and increases throughput.

ESX 3.5:

With an ESX 3.5 host connected to Array1 and trying to read data that is stored on Array2, the traffic is as follows.

Host -> Switch -> Array1 -> Switch -> Array2 -> Switch -> Array1 -> Switch -> Host

ESX 4:

With an ESX4 host connected to Array1 and Array2 and trying to read data that is stored on Array2, the traffic is as follows.

Host -> Switch -> Array2 -> Switch -> Host

Because the host is connected to each array, it can access different portions of the LUN at the same time and get 1 Gbps on each connection.

In 3.5, it could still access different portions of the LUN, but the path back would be restricted by that single Gbps link.
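
For anyone wondering how ESX 4 makes those extra connections, it is software iSCSI port binding, where more than one VMkernel NIC is tied to the iSCSI adapter. Roughly like this; vmk1/vmk2 and vmhba33 are example names, check yours with esxcfg-vmknic -l and esxcfg-scsidevs -a:

esxcli swiscsi nic add -n vmk1 -d vmhba33

esxcli swiscsi nic add -n vmk2 -d vmhba33

esxcli swiscsi nic list -d vmhba33 (confirm both VMkernel NICs are bound to the adapter)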

Charles Killmer, VCP

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

Chuck8773
Hot Shot

With an outstanding queue length of 64, I got 355.58 IOPS on SATA and 1,744.69 on SAS.

With an outstanding queue length of 4, I got closer to the stated IOPS: 154.06 on SATA and 1,094.5 on SAS.

With an outstanding queue length of 1, I got 123.61 on SATA and 379.6 on SAS.

The SATA test was on a 2-disk mirror of 7,200 RPM SATA disks.

The SAS test was on a 2-disk mirror of 15K 2.5-inch SAS disks.

Hope this clears my numbers up.

Charles Killmer, VCP

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

krik011
Contributor

Interesting and good to know.

In the meantime we are going to go the route of doing a fresh install of vSphere. We don't anticipate this will fix the issue, but it is a step in the right direction as we get everything in line to move up to an "enterprise-class" storage infrastructure.
