VMware Cloud Community
Fleischen
Enthusiast

Really bad 10GbE iSCSI performance with VNX5200

Hi Everyone,

I really need some input from you!

My company recently bought a VNX5200 SAN from EMC and...

Host Setup:

3 x HP DL360p Gen8 (E5-2650)

1 x HP 561T 2-port 10GbE adapter per host

4 x 1GbE Broadcom ports (built in)

Switch:

1 x HP 5406zl

2 x 8-port 10GbE modules

2 x 24-port 1GbE modules

SAN:

1 x VNX5200 with block iSCSI

A storage pool configured as 2 x (4+4) RAID 1/0 (as per best practice)

Operating system:

VMware ESXi 5.5 U1 Enterprise Plus (updated with all the latest patches), build 1746018

Other:

* All NICs and switches are configured with jumbo frames.

     - This has been tested and confirmed working from all hosts using vmkping -d -s 8972 <ip-address> (verification commands are sketched below)

* Delayed ACK has been disabled on all iSCSI targets

* Drivers have all been updated to the latest available from VMware.
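For reference, a minimal sketch of how the jumbo frame settings can be double-checked from the ESXi shell (the target IP below is only a placeholder):

     # Check the MTU configured on the standard vSwitches and on the vmkernel ports (should show 9000)
     esxcli network vswitch standard list
     esxcli network ip interface list

     # End-to-end jumbo frame test with fragmentation disallowed (replace the IP with a real iSCSI target portal)
     vmkping -d -s 8972 192.168.10.10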

The issue(s):

* The sequential write performance when copying files from hard drive A to hard drive B within the guest operating system is around 60 MB/s; I'm expecting at least 900 MB/s.

     - Tested with a 15 GB file on a Windows Server 2008 R2 VM

* The Storage vMotion performance from the SAN to a 6-disk RAID 10 local datastore on the host-01 server is about 75 MB/s.

So what is going on?!!?!?!

Any input is much appreciated!

14 Replies
DITD
Contributor

Hi,

do you have VLANs configured, or a separate LAN for iSCSI? What is your iSCSI configuration (2 x 1GbE with 2 vSwitches)?

I have a similar problem with a customer of mine. Throughput when copying in a W2K8 R2 VM is around 60 MB/s.

Can you have a look at the Windows Perfmon Average Disk Queue Length while copying? And also check with esxtop - see this:

VMware KB: Using esxtop to identify storage performance issues for ESX / ESXi (multiple versions)
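For reference, the same queue length counter can also be watched from a command prompt inside the VM while the copy runs; a quick sketch (the counter name assumes an English locale):

     rem Sample the average disk queue length once per second while the file copy is running
     typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 1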

Regards, Karlheinz
Fleischen
Enthusiast

The iSCSI is configured as follows per host:

vSwitch 1: vmkernel port on a 10GbE uplink

vSwitch 2: vmkernel port on a 10GbE uplink

Multipathing with Round Robin is also configured.
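A rough sketch of how the port binding and path policy can be verified from the ESXi shell (the vmhba name and the naa device ID below are placeholders):

     # List the vmkernel ports bound to the software iSCSI adapter (adapter name is a placeholder)
     esxcli iscsi networkportal list --adapter vmhba33

     # Show the multipathing policy per device; Round Robin appears as VMW_PSP_RR
     esxcli storage nmp device list

     # Optionally lower the Round Robin IOPS limit so the paths alternate more often (device ID is a placeholder)
     esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxxxxxxxxxx --type iops --iops 1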

I'll test esxtop today, I'll let you know.

DITD
Contributor

Hi,

could you also do a test from within Windows (Windows iSCSI Initiator) connected directly to an iSCSI target?

This is to see whether any VMware configuration is the problem.
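As a rough sketch, such a direct test can be set up with the built-in iscsicli tool on 2008 R2 (the portal IP and target IQN below are placeholders):

     rem Register the array's iSCSI portal, list the discovered targets, then log in (IP and IQN are placeholders)
     iscsicli QAddTargetPortal 192.168.10.10
     iscsicli ListTargets
     iscsicli QLoginTarget iqn.1992-04.com.emc:cx.example.a0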

Regards, Karlheinz
Fleischen
Enthusiast

Hi,

I opened a case with EMC and they told me that 60 MB/s throughput is what you can expect from a 2 x (4+4) RAID 10 configuration. If you want to saturate the 10GbE iSCSI you'll need 100+ 10k SAS disks. I'm not happy with their answer and I really do not believe it. So I really do not know what to do now...

They are referring to pages 12-13 in the document below.

http://www.emc.com/collateral/software/white-papers/h10938-vnx-best-practices-wp.pdf

DITD
Contributor

Hi,


I have done some tests so far with my laptop: external HDD (USB 3.0) to internal 7.2k rpm HDD with a 20 GB file, ~45 MB/s.

Internal 7,200 rpm HDD to internal SSD, ~67 MB/s. This means that a single SATA disk on one SATA 3G port has the same performance as your EMC 2 x (4+4) RAID 10! I cannot believe that - but could you add some more disks to your RAID 10 storage pool to see if performance grows?

Regards, Karlheinz
Fleischen
Enthusiast

I can't add more disks to the pool; that would violate the best practice (only 4 disks remaining). If I wanted to add more disks I would need to buy a new DPE, which costs a lot of money.

I also get better performance on my laptop than on the SAN. EMC will not acknowledge the issue, so I'm at a loss.

Not a good start to my weekend.

DITD
Contributor

Can you run esxtop and have a look at the disk values below, and check whether your latency shows up mostly as DAVG or as KAVG?

Display  Metric  Threshold  Explanation
DISK     GAVG    25         Look at "DAVG" and "KAVG", as the sum of both is GAVG.
DISK     DAVG    25         Disk latency most likely caused by the array.
DISK     KAVG    2          Disk latency caused by the VMkernel; high KAVG usually means queuing. Check "QUED".
DISK     QUED    1          Queue maxed out. Possibly queue depth set too low. Check with the array vendor for the optimal queue depth value.
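A minimal way to capture these counters, for reference (the output path is only an example):

     # Interactive: run esxtop, then press 'u' for the disk device view
     # (DAVG/cmd, KAVG/cmd, GAVG/cmd and QUED are shown there)
     esxtop

     # Batch capture for later analysis: 60 samples at 5-second intervals (output path is an example)
     esxtop -b -d 5 -n 60 > /tmp/esxtop-capture.csv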

Regards, Karlheinz
corvettefisher
Enthusiast

One thing you may want to check is the firmware on those Broadcom NICs. Also, are you using the HP ISO image or the standard image from VMware? Generally the vendor image includes additional or better-performing drivers compared to the standard ESXi image.
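A quick sketch of how the NIC driver and firmware versions can be checked from the ESXi shell (vmnic0 is just an example name):

     # List all physical NICs with their drivers
     esxcli network nic list

     # Show driver and firmware details for one NIC (vmnic0 is an example)
     esxcli network nic get -n vmnic0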

stephanph
Contributor

Hi Fleischen,

Sorry to use this old thread to communicate with you.
I can't send a PM because I need 10 VMware points.

Did you ever resolve this problem?

We are having the same issue and VMware/EMC didn't find anything wrong.

With VMware I/O Analyzer and Iometer we are getting a good speed of 250 MB/s, but when copying with Explorer in Server 2012 we are getting really bad speed, between 50 and 70 MB/s.

Kind regards,

Stephan

francescpages
Contributor

Dear Stephan, I'm having the same problem that you are reporting. Did you find a solution?

We are implementing an iSCSI solution with an MSA 1040 and 2 x HP DL360 Gen9, and we have the same problem. Iometer inside a Windows Server 2008 VM obtains 250-300 MB/s, but when we do a file copy inside Windows Server 2008 we only get 50-60 MB/s. (This is the same speed that Iometer obtains with only 1 outstanding I/O per target.)

Thanks a lot,

stephanph
Contributor

Hi francescpages,

We have temporarily disabled our SAN because almost all our EMC VNXe arrays are performing badly...

We are now using local disks, with very good speed.

Kind regards

Stephan

Dan_Johnson
Enthusiast

Run esxtop to see where your I/O constraint is.

[esxtop screenshot]

Duncan Epping has a good list of thresholds on his blog; this will help you identify which component is causing the slowdown:

http://www.yellow-bricks.com/esxtop/

abnoduti11
Contributor

Did any of you ever make any progress with this issue? We have just had a VNX5200 installed and are seeing similar performance to that mentioned above. The main issue is when using Storage vMotion to move VMs from one datastore to another on the same SAN. Running svMotions causes high disk latency on other VMs, and users complain of poor performance. We logged a call with VMware, who looked at esxtop and identified that DAVG/cmd is reaching 40-50. They say this should be less than 5, and therefore blame the storage array.

kastlr
Expert

Hi,

how exactly did you run the tests?

When using Iometer, how many outstanding I/Os did you configure?

A single file copy job performed from a Windows VM is usually a single-threaded application which doesn't send parallel I/Os (or only a limited number of parallel I/Os).

And when using thin VMDKs the values could also differ.

Before Iometer measures the throughput and response time, it creates a test file on the volume being tested.

When that test file resides on a thin VMDK, VMware will zero out the used blocks during the file creation process.

After the test file is created, Iometer starts the measurement.

When Explorer is used on a thin VMDK, any write I/O to an uninitialized block will be intercepted by ESXi and a zero-out I/O will be injected first.

So the total number of write I/Os needed to copy the test file is much smaller than the total number of I/Os generated by the ESXi server.

If you would like to validate how fast your (EMC) array can handle multiple parallel Windows file transfer jobs, you could use RichCopy on an eager zeroed thick VMDK.
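If RichCopy isn't at hand, a multi-threaded copy can also be approximated with robocopy; a sketch (the paths are placeholders, and /MT requires the 2008 R2 or newer robocopy):

     rem Multi-threaded copy with 16 threads, to compare against a single-threaded Explorer copy (paths are placeholders)
     robocopy C:\testdata D:\testdata /E /MT:16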

Without using eager zeroed thick VMDKs you can't trust I/O performance values measured inside a VM.
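For reference, an eager zeroed thick test disk can be created up front from the ESXi shell, roughly like this (size, datastore and file name are placeholders):

     # Create a 20 GB eager zeroed thick VMDK for testing (datastore and path are placeholders)
     vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/datastore1/testvm/testdisk.vmdk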

It's a known issue that svMotion operations can cause high disk latencies; this is explained in the following VMware KB article:

VMware KB: Abnormal DAVG and KAVG values observed during VAAI operations

With VNX, svMotion performance might also be slower if the source and destination LUN aren't owned by the same SP.

If the LUNs are owned by different SPs, the VNX has to move the data from one SP to the other via the internal bus system.

When source and destination are owned by the same SP, the data movement is handled entirely within that SP.
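From the ESXi side, the SP that currently serves a LUN can be inferred from the path states, roughly like this (the naa ID is a placeholder; on an ALUA-configured VNX the paths through the owning SP show as the active/optimized group):

     # Show all paths for a device; "Group State: active" marks the optimized paths via the owning SP (device ID is a placeholder)
     esxcli storage core path list -d naa.xxxxxxxxxxxxxxxx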

Also keep in mind that the I/O response time for a small I/O is much better than for a large I/O.

AFAIK, Windows Explorer uses 1 MB I/Os, so I would expect higher response times.

If you're only testing with a single VM and the test period isn't long, you could also use vscsiStats to measure the performance.
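A minimal vscsiStats run looks roughly like this (the world group ID below is a placeholder taken from the -l output):

     # List running VMs and their world group IDs
     vscsiStats -l

     # Start collection for one VM, print the latency histogram, then stop collection (world group ID is a placeholder)
     vscsiStats -s -w 123456
     vscsiStats -p latency -w 123456
     vscsiStats -x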

Regards,

Ralf


Hope this helps a bit.
Greetings from Germany. (CEST)