VMware Cloud Community
christianZ
Champion

New !! Open unofficial storage performance thread

Hello everybody,

the old thread seems to be sooooo looooong - therefore I decided (after a discussion with our moderator oreeh - thanks Oliver -) to start a new thread here.

Oliver will make a few links between the old and the new one and then he will close the old thread.

Thanks for joining in.

Regards,

Christian

574 Replies
mikeyb79
Enthusiast

Looks like my reply never made it up here for whatever reason. I haven't had a chance to benchmark my Linux host in the remote office; it was running a monitoring package that wasn't needed anymore, so the local guy shut it down. I think I will simply build another one for the purpose of testing.

The array in that location is a PS4000E that is only half-populated with 8x 1TB SATA drives. 2 are hot spares, leaving 6 as data disks in RAID-50 (no longer a best practice for SATA drives 1TB or higher). The numbers are surprisingly good for 6 data disks, especially SATA disks. The RealLife test results work out to 162 IOPS/spindle using the test workload.
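
For anyone checking the math, the per-spindle number is just the RealLife IOPS from the table below divided by the number of data disks. A quick sketch:

reallife_iops = 971.68   # RealLife-60%Rand-65%Read result from the table below
data_disks = 6           # 8 drives minus 2 hot spares
# Ignores the RAID-50 write penalty, so this is IOPS per data spindle
# for this 65% read mix, not a raw drive rating.
print(round(reallife_iops / data_disks))   # -> 162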

Server is a Dell R610 (1x 5600-series, 16GB RAM, 6 Broadcom 5709's - 2 bound to iSCSI and configured by EqualLogic MEM script) running vSphere Essentials Plus 4.1. Virtual guest is Windows 2008 with 1 vCPU, 4GB RAM, and 60GB hard drive.

Dell EqualLogic PS4000E 8x 1TB SATA, RAID-50 (2:1 x2), 2 hot spares
 
Access Specification          IOps      MBps (Binary)   Average Response Time (ms)
Max Throughput-100%Read       6729.23   210.28          8.87
RealLife-60%Rand-65%Read      971.68    7.58            47.55
Max Throughput-50%Read        6419.51   200.61          9.18
Random-8k-70%Read             837.03    6.53            56.82

Will post the Ubuntu results as soon as I get a chance to install and configure.

PinkishPanther
Contributor

Thanks for the post. I'm very interested in your Linux results on that same host/storage with the same CPU/RAM assignment.

Given Windows is getting 7.58MB/sec (this is very similar to our RAID6 2TB SATA local disk config), I predict your Linux result on the same test will be 1.3MB/sec.

Our EQL PS6010E array with 16 x 2TB drives in RAID50 gets 35MB/sec in test 2.

larstr
Champion

PinkishPanther wrote:

Here's a new data point - I went back to 2010 in this thread searching for the keyword "Linux", no luck.

I have recently discovered in our environment that Linux guests have terrible IO and needed a comparison between Linux and Windows, so I went back to IO Meter with the dynamo agent sending the results to the Windows GUI.

I've done some Linux tests in this thread prior to 2010 and I also documented some of it here: http://vmfaq.com/entry/33/

As I noted earlier in this thread (http://communities.vmware.com/thread/197844?start=165&tstart=0), iometer seems to have some problems with IO queueing on newer Linux kernels. I've heard rumours that newer iometer dynamo versions exist that fix this, but the following thread doesn't look too promising: http://forum.commvault.com/forums/thread/21824.aspx

Lars

PinkishPanther
Contributor

Hi, thanks for the response Lars.

Perhaps there is an IO scheduling issue with IO Meter, but we also have two other data points suggesting this IO discrepancy is real at the application level.

One is a database benchmark test run on both Windows and Linux: Windows finished in 1 minute, Linux in 20 minutes.

Also, we have a customer running high-IO Apache/Linux applications who is reporting very slow IO; they are redeploying on Windows and I'll post those results when I know them.

larstr
Champion

PinkishPanther wrote:

One is a database benchmark test run on both Windows and Linux: Windows finished in 1 minute, Linux in 20 minutes.

Also, we have a customer running high-IO Apache/Linux applications who is reporting very slow IO; they are redeploying on Windows and I'll post those results when I know them.

Pinkish Panther,

Storage performance is not normally any slower on Linux than on Windows. Performance may vary depending on filesystem type, block size and scheduler, but it is not significantly different. One example of that is in the link I gave you above (http://vmfaq.com/entry/33/), where a Windows VM was running the iometer test on VMware Server on both a Windows host and a Linux host.
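
If you want to rule out the scheduler, checking which IO scheduler the Linux guest is actually using only takes a second. A small sketch ("sda" is just an example device, adjust for your VM):

# The active scheduler is the one shown in [brackets].
with open("/sys/block/sda/queue/scheduler") as f:
    print(f.read().strip())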

It's also worth noting that most HPC clusters run on Linux, not Windows, so if your Linux application is performing badly it's probably due to a bottleneck within your application, a misconfiguration or a hardware failure.

Lars

PinkishPanther
Contributor

Lars,

Thanks again for the reply.  I had not looked at your link in detail previously because none of the random read/write results were above 15MB/sec; the issue is most obvious when Windows is getting 58 MB/sec and Linux on the same ESXi 5.1 host is getting 11MB/sec or less.

However, this time I did look at those results in more detail.  Unfortunately there is only one test with Linux (Debian), shown below, and unfortunately it is not on VMware but rather XenServer.  However, it does show the problem I'm describing, just at a smaller scale.

Debian gets 2.6 MB/sec and Windows gets 6.3 MB/sec in an apples-to-apples comparison.

So, I would conclude that my environment is NOT the only one with the issue.

Again - ideally, someone else on this thread will run a Windows and Linux IO Meter test in their own environment and post the results!!

Either this is an issue with the IO Meter test, or there is a real Linux IO performance issue at higher speeds, right?

SERVER TYPE: Virtual Windows 2003R2sp2 on XenServer release 4.0.1-4249p (xenenterprise)
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP DL360G5, 4 GB RAM; 2x XEON E5345, 2,33 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: P400i 256MB 50% read cache / 2xSAS 15k rpm / raid 1 / 128KB stripe size

TEST NAME                   Av. Resp. Time ms   Av. IOs/sec   Av. MB/sec
Max Throughput-100%Read     5                   10445         326
RealLife-60%Rand-65%Read    44                  810           6.3
Max Throughput-50%Read      6.46                8896          278
Random-8k-70%Read           55.9                811           6.3

EXCEPTIONS: CPU Util. 92% 52% 83% 37%


SERVER TYPE: Virtual Debian Linux  4.0, kernel 2.6.18.xs4.0.1.900.5799 on XenServer release 4.0.1-4249p (xenenterprise)
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP DL360G5, 4 GB RAM; 2x XEON E5345, 2,33 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: P400i 256MB 50% read cache / 2xSAS 15k rpm / raid 1 / 128KB stripe size

TEST NAME                   Av. Resp. Time ms   Av. IOs/sec   Av. MB/sec
Max Throughput-100%Read     0.36                2773          86.6
RealLife-60%Rand-65%Read    3.04                328           2.6
Max Throughput-50%Read      1.38                724           22.6
Random-8k-70%Read           3.3                 302           2.36

EXCEPTIONS: CPU Util. 0% 0% 0% 0%

zgz87
Contributor

It is known that IOmeter running on Linux does not give reliable measurements. The problem started with a specific kernel revision; apparently, IOmeter uses some Linux libraries incorrectly. I recommend not using IOmeter on Linux. I would use the FIO tool instead.
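
As an example, a fio run that roughly mimics the RealLife-60%Rand-65%Read pattern (8k blocks, 65% read, 60% random) could be started like this. This is only a sketch: the target file path is a placeholder and option availability depends on your fio version.

import subprocess

# Roughly approximates the IOmeter "RealLife" access specification with fio.
cmd = [
    "fio",
    "--name=reallife",
    "--filename=/mnt/testvol/fio.dat",  # placeholder target file
    "--size=4g",
    "--bs=8k",
    "--rw=randrw",
    "--rwmixread=65",          # 65% reads
    "--percentage_random=60",  # 60% random, 40% sequential
    "--iodepth=64",
    "--ioengine=libaio",
    "--direct=1",
    "--runtime=300",
    "--time_based",
]
subprocess.run(cmd, check=True)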

PinkishPanther
Contributor

Appreciate that response.  I chose IO Meter due to the large number of comparable results in this thread.

I have been suspicious of our results for a few days now, as my other negative performance indicators have been attributed to other factors - most specifically a Dell R720/820 BIOS issue with VMware, when the BIOS is set to let VMware control the performance/power ratio.  For anyone else hitting this, the fix is to install the latest BIOS and set the power profile to "Maximum Performance".

Gabriel_Chapman
Enthusiast

Have you looked at the IO Analyzer fling that VMware put out a while back? I think it's running CentOS prepackaged with IOmeter. You can run their pre-defined tests or your own: http://labs.vmware.com/flings/io-analyzer

I would be curious to see if you get the same poor performance with it.

Ex Gladio Equitas
larstr
Champion

Pinkish Panther,

The test you refer to shows a problem in iometer. The tests I was trying to point you to were these:

SERVER TYPE: Virtual Windows 2003R2sp2 on VMware Server 1.0.4 on Windows Server 2003R2sp2
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP DL360G5, 4 GB RAM; 2x XEON E5345, 2,33 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: P400i 256MB 50% read cache / 2xSAS 15k rpm / raid 1 / 128KB stripe size / default ntfs 4096

TEST NAME                   Av. Resp. Time ms   Av. IOs/sec   Av. MB/sec
Max Throughput-100%Read     0.5                 10900         340
RealLife-60%Rand-65%Read    156                 368           2.8
Max Throughput-50%Read      1.22                7472          233
Random-8k-70%Read           88.1                630           4.9

EXCEPTIONS: CPU Util. 99% 17% 98% 22%


SERVER TYPE: Virtual Windows 2003R2sp2 on VMware Server on Debian Linux 4.0 2.6.18 x64
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP DL360G5, 4 GB RAM; 2x XEON E5345, 2,33 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: P400i 256MB 50% read cache / 2xSAS 15k rpm / raid 1 / 128KB stripe size / default jfs (4096)

TEST NAME                   Av. Resp. Time ms   Av. IOs/sec   Av. MB/sec
Max Throughput-100%Read     0.5                 8550          267
RealLife-60%Rand-65%Read    79                  747           5.8
Max Throughput-50%Read      0.63                3804          237
Random-8k-70%Read           97                  609           4.7

EXCEPTIONS: CPU Util. 100% 17% 98% 16%

As you can see, iometer was running within a Windows guest in both tests, but the host OS was different.

Lars

PinkishPanther
Contributor

Larstr,

I see your point, but VMware Server is not a good starting point since it's not designed for high speed.  I agree that the issue I've noticed is most likely 100% IO Meter running on Linux rather than Linux itself having such a major random disk IO speed difference compared with Windows 2008R2.

I had 4 separate indicators of this issue.

2 client benchmark apps showed better results on Windows

1 database vendor benchmark showed significantly better results on Windows (400%)

and IO Meter

At the start it was difficult to say all 4 indicators were flawed; it seemed much more likely there was an issue with the Linux OS.

As of today I believe both my client benchmarks were impacted by the unrelated E5 issue I described earlier; bad timing, since we already believed it was a Linux issue.  The client applications seem to be Java throttled rather than IO bound, and appear to be using only a single thread.  They are still working on that; the E5 fix has helped but not solved it.

The database vendor gave me a new benchmark test this morning (hot off the press); I believe my feedback led them to discover an issue in their test, because now it's "almost" as fast as the Windows test.

And finally, we have confirmation that IO Meter has issues on Linux.

Conclusion... hopefully my environment was optimized all along, other than the E5 issue which is now corrected on 1/2 of our servers.

I hope these notes help someone else in a similar situation - we've invested several hundred hours working on this in the last month.

John

jb42
Contributor

First of all, thanks to the long-time host for providing that service. Will you be back? If not, has anyone worked out another quick way of charting the iometer output?

larstr
Champion

Yep, vmktree.org is down while the server is being upgraded (hw and OS). Will be up and running again sometime during the weekend.

In the meanwhile you can use Excel, LibreOffice Calc, Lotus 1-2-3 or similar to retrieve the values. :-)

Lars

jb42
Contributor

I tried working the output in excel for about 3 minutes. What a mess. I can wait till Monday! Thanks again for providing the service!

jb

jb42
Contributor

I went ahead and created spreadsheets to handle the IOmeter results extraction. vmktree was displaying its parsing script today. I figured there was a 50/50 chance that was just for requests coming from my router, since I must be exceeding reasonable use! So I figured I owed ya.

I added a comparative feature which assumes you take a base test, then make a configuration change and test again and want to see the change between the two results. There is a file that handles a test from a 2 CPU system and a 4 CPU system. You'll need to modify the row reference values in the xExtract worksheets for different numbers of CPUs.

To use: open results.txt as comma-delimited text with no text-qualifiers in Excel. Copy that worksheet into the BaseResults worksheet. Repeat for ExperimentalResults, which holds your results following a configuration change. Evaluate the improvements in the Change worksheet.
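
If anyone would rather script it than spreadsheet it, here's a rough Python sketch of the same extraction. It matches columns by header prefix because the exact header names differ between IOmeter versions, so the strings below may need adjusting for your results file.

import csv
import sys

def col(header, prefix):
    # Header spellings vary (e.g. "MBps" vs "MBps (Binary)"), so match on
    # a prefix instead of the exact column name.
    for i, name in enumerate(header):
        if name.strip().strip("'").startswith(prefix):
            return i
    raise KeyError(prefix)

def extract(path):
    header = None
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if any(c.strip().strip("'").startswith("Target Type") for c in row):
                header = row  # column-name line preceding a results block
            elif header and row and row[0].strip() == "ALL":
                # "ALL" rows hold the totals for one access specification.
                print(row[col(header, "Access Specification")],
                      row[col(header, "IOps")],
                      row[col(header, "MBps")],
                      row[col(header, "Average Response Time")])

if __name__ == "__main__":
    extract(sys.argv[1])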

lpisanec
Contributor

I needed to get some storage for a small business where I work.

We did not want to spend a lot of money, but of course wanted top performance ;-). Space was a concern, as we have around 6TB of data, growing by about 300GB/year (or more).

At first we were looking to get a Dell Powervault MD3600 or MD3620 and a 10GE switch plus some 10GE cards.

That was a bit costly - it would have been around 25-30k EUR for 12*2TB or 24*1TB raw space.

Besides that, we would have been locked in to Dell for spare parts, service and possible upgrades in the future.

So I did some research and came up with a plan to build our own storage with Nexenta, a zfs-based storage appliance.

It is based on a Supermicro chassis, a Supermicro motherboard, 2x Xeon E5 2609 and 256GB RAM. HDDs are cheap nearline SAS drives from Toshiba, 24x 1TB at 7.2k rpm, connected by a SAS expander to a single port SAS controller (LSI 2308).

22 of the drives are in a mirror-stripe configuration (also known as RAID10), the last 2 are hot spares.
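
For illustration, the layout is the usual striped-mirrors arrangement plus spares; roughly along these lines (device and pool names are made up, not the ones actually used on this box):

disks = [f"c0t{i}d0" for i in range(24)]   # 24 hypothetical device names
data, spares = disks[:22], disks[22:]

vdev_args = []
for a, b in zip(data[0::2], data[1::2]):   # 11 two-way mirrors, striped
    vdev_args += ["mirror", a, b]

# Prints the zpool create command this layout corresponds to.
print(" ".join(["zpool", "create", "tank"] + vdev_args + ["spare"] + spares))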

The storage box is connected by InfiniBand (Mellanox ConnectX-2 QDR) using SRP to 2 ESXi 5.1 hosts and will serve as the main storage for 15-20 VMs.

As for redundancy: it was too expensive to remove all single points of failure. I decided it would suffice to get some spare parts in case something dies; we can live happily with an hour or two of downtime to replace stuff. Besides, we can fall back to 1G Ethernet if InfiniBand goes down and takes the IB switch and most cards with it, and our backup server can serve as a new "head" for the storage in a pinch as well.

So far this storage setup has set us back around 15k EUR.

512GB testsize, no writecache at all (readahead caching is too good)
Test name                   Latency   Avg iops   Avg MBps   cpu load
Max Throughput-100%Read     1.07      50625      1582       4%
RealLife-60%Rand-65%Read    93.02     634        4          3%
Max Throughput-50%Read      71.88     823        25         3%
Random-8k-70%Read           91.53     643        5          4%

512GB testsize, with writeback cache
Test name                   Latency   Avg iops   Avg MBps   cpu load
Max Throughput-100%Read     1.11      49724      1553       4%
RealLife-60%Rand-65%Read    36.20     1613       12         3%
Max Throughput-50%Read      1.04      51132      1597       3%
Random-8k-70%Read           40.30     1441       11         3%

32GB testsize, fits into RAM cache, writeback cache
Test name                   Latency   Avg iops   Avg MBps   cpu load
Max Throughput-100%Read     1.11      48925      1528       4%
RealLife-60%Rand-65%Read    1.25      41147      321        39%
Max Throughput-50%Read      1.05      51345      1604       3%
Random-8k-70%Read           1.16      43747      341        40%

Tests to get maximum performance of infiniband, blocksize 128k and 1M, testsize 32GB, writeback enabled
Test name                   Latency   Avg iops   Avg MBps   cpu load
128k 100% random read       3.45      17098      2137       34%
128k 100% random write      22.21     2654       331        5%
128k 100% sequential read   3.33      17731      2216       34%
128k 100% sequential write  2.65      22040      2755       3%
1M 100% random read         26.62     2249       2249       2%
1M 100% random write        150.84    399        399        3%
1M 100% sequential read     25.37     2350       2350       4%
1M 100% sequential write    198.17    302        302        5%
IRIpl
Contributor

lpisanec

where did you get the SRP driver for ESXi 5.1? Or maybe you use iSCSI over InfiniBand (IPoIB)?

Mellanox doesn't provide an SRP driver for ESXi 5.1 on its official web site!

Can you explain?

lpisanec
Contributor

Mellanox does provide official drivers that support SRP. They were released a week or so ago.

http://www.mellanox.com/page/products_dyn?&product_family=36&mtag=vmware_drivers

Driver version 1.8.1.0 is the one you want (or greater, if someone reads this in a few months ;-))

Quote from release notes:

MLNX-OFED-ESX package contains:
o MLNX-OFED-ESX-1.8.1.zip - Hypervisor bundle which contains the following
  kernel modules:
   - mlx4_core (ConnectX family low-level PCI driver)
   - mlx4_ib (ConnectX family InfiniBand driver)
   - ib_core
   - ib_sa
   - ib_mad
   - ib_umad
   - ib_ipoib
   - ib_cm
   - ib_srp

...

  - Storage:
    o NFS over IPoIB
    o GPFS over IPoIB
    o SRP
IRIpl
Contributor

lpisanec

Can you run a performance test with my IOMeter profile -> http://vmblog.pl/OpenPerformanceTest32-4k-Random.icf ?

This profile includes a 4k random read/write test.

And can you show a screenshot of the vSphere Client configuration for the SCSI SRP devices, if they show up under "Storage Adapters"?

Thanks

Tom

lpisanec
Contributor

There you go.

Test name                          Latency   Avg iops   Avg MBps   cpu load
Max Throughput-100%Read            0.51      48395      1512       6%
RealLife-60%Rand-65%Read           0.89      36656      286        46%
Max Throughput-50%Read             0.46      48343      1510       3%
Random-8k-70%Read                  0.67      42661      333        30%
4k-Max Throu-100%Read-100%Random   0.43      50842      198        31%
4k-Max Throu-100%Write-100%Random  4.25      12011      46         10%

Screen Shot 2013-03-17 at 23.58.03.png
