VMware Cloud Community
christianZ
Champion

New !! Open unofficial storage performance thread

Hello everybody,

The old thread seems to be sooooo looooong - therefore I decided (after a discussion with our moderator oreeh - thanks, Oliver) to start a new thread here.

Oliver will make a few links between the old and the new one and then he will close the old thread.

Thanks for joining in.

Reg

Christian

574 Replies
MKguy
Virtuoso

Just to add to the suggestion: for iSCSI with Jumbo Frames, I found that using a byte-based path-switching policy instead of the IOPS-based one can deliver improvements as well.

See:

http://blog.dave.vc/2011/07/esx-iscsi-round-robin-mpio-multipath-io.html
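
For reference, here is a minimal sketch of that setting for a single device (the naa ID is only a placeholder; a per-volume loop appears later in this thread):

# switch one device's round robin trigger from the default IOPS count to a byte count;
# 8800 bytes matches the payload of an MTU-9000 jumbo frame - adjust if you don't use jumbo frames
esxcli storage nmp psp roundrobin deviceconfig set -d naa.xxxxxxxxxxxxxxxx -t bytes -B 8800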

-- http://alpacapowered.wordpress.com
Reply
0 Kudos
JonT
Enthusiast

That is about what I would expect for plain SAS performance. The read-only performance of local SAS is actually quite good, but writes are very poor compared with any of the larger-scale storage solutions you will see in this thread.

Reply
0 Kudos
JonT
Enthusiast

Good suggestion. We use 8Gb FC for all of our storage, but the iSCSI tip is worth noting since most people here seem to prefer iSCSI over costly FC.

Reply
0 Kudos
JonT
Enthusiast

OK, obviously the host is a bit over-sized compared to other posts here, but this is what we consider our "JUMBO" workload host - the one we put very large SQL database production workloads on. These are the results from a 4Gb FC connection in our older datacenter. I will post the results of our 8Gb FC connection version next.

(corrected the VM size)

SERVER TYPE: Windows 2008 R2 VM, 16vCPU 32GB Memory

CPU TYPE / NUMBER: 8x Quad-Core AMD Opteron 8356, 2.3GHz

HOST TYPE: HP DL785 G5, 256GB Memory

STORAGE TYPE / DISK NUMBER / RAID LEVEL: EMC VMAX, Virtual Provisioned, F.A.S.T. enabled across FC and EFD disks.

Test name                | Latency | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 2.44ms  | 20178.01 | 630.56   | 6.6%
RealLife-60%Rand-65%Read | 2.13ms  | 19989.03 | 156.16   | 6.7%
Max Throughput-50%Read   | 4.2ms   | 12987.63 | 405.86   | 5.06%
Random-8k-70%Read        | 1.87ms  | 22080.91 | 172.5    | 6.86%

Reply
0 Kudos
fredlr
Contributor

jb,

your values seem rather disappointing for 22 disks in RAID10, especially on the benchmarks where sequential read/write is applied. Did you try RAID5/6? What kind of disks are you using - NL-SAS 7.2k? Which connectivity, 2x 10Gb or 2x 1Gb?
If you look at the top of the page, the PowerVault 36xx I'm testing seems to deliver more performance for less money.

Reply
0 Kudos
MaxStr
Hot Shot

TEST                       | Average Response Time (ms) | Average IO/s | Average MB/s | CPU load
Max Throughput 100% Read   | 2.05                       | 28,880       | 902          | 15%
RealLife 60% Rand 65% Read | 40.45                      | 1,167        | 9            | 9.8%
Max Throughput 50% Read    | 2.88                       | 20,234       | 632          | 10%
Random 8k 70% Read         | 25.28                      | 1,215        | 9.5          | 9.8%

SERVER TYPE: Windows 2012, ESXi 5.1
CPU TYPE / NUMBER: Xeon E5-2690, 2x sockets, 8x cores per socket

HOST TYPE: Cisco UCS B200 M3 Blade Server
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EMC VNX5300, NL-SAS + SAS, Auto-tiering, (no SSD)

I can see a lot of room for improvement in the random reads, but I'm quite impressed with those max throughput numbers. I believe that adding the SSD cache should greatly improve the random reads.

Any idea why my random tests are so low? It seems a bit lopsided to have sequential reads that fast while the random tests crawl along at ~1,100 IO/s.

Reply
0 Kudos
jb42
Contributor

Thanks all for the input. I ran additional IOmeter tests yesterday and today. There's a 5MBps variation in the Real Life values between runs, so I no longer think I achieved any real difference going from Dell MEM to RR. Just gave the bytes limit a shot; the bytes-setting results were essentially the same too. See below.

MKguy, I updated the command (below) to switch all EQL volumes on the host to the byte setting, using what I guess is new 5.1 syntax compared to the (your?) blog post.

# Switch every EqualLogic volume on this host to the byte-based round robin trigger:
for i in `esxcli storage nmp device list | grep EQLOGIC | awk '{print $7}' | sed 's/(//g' | sed 's/)//g'`;
do
     esxcli storage nmp psp roundrobin deviceconfig set -d $i -B 800 -t bytes;
done

Jon, for reference I was running with an IOPS value of 3, which is what EQL support likes for whatever reason. Here's their full command:

# Make round robin the default PSP for the EqualLogic SATP, then apply it
# (switching paths every 3 I/Os) to each existing EQL volume:
esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_EQL ;
for i in `esxcli storage nmp device list | grep EQLOGIC | awk '{print $7}' | sed 's/(//g' | sed 's/)//g'` ;
do
     esxcli storage nmp device set -d $i --psp=VMW_PSP_RR ;
     esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 3 -t iops ;
done

Fred, it's NL-SAS 7.2k, 2x 1Gb. I'm not advanced enough yet to interpret these results well, but I thought they seemed about right after scanning similarly sized results. I think I am saturating my cache in these tests, if that's relevant in comparison to yours. If the PowerVault is a better value, that'll be good info for the next guy making a purchasing decision - or the next time I am.

Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 8.54         | 6776     | 211      | 6%
RealLife-60%Rand-65%Read | 13.23        | 3502     | 27       | 14%
Max Throughput-50%Read   | 8.58         | 6735     | 210      | 22%
Random-8k-70%Read        | 13.58        | 3411     | 26       | 13%
Reply
0 Kudos
fredlr
Contributor

JB,

you're absolutely right - with NL-SAS 7.2k and 2x 1Gb links it does make sense. I've added two benchmarks (SAS 15k and NL-SAS 7.2k) with your 11+11 RAID10 setup to compare.

Reply
0 Kudos
MKguy
Virtuoso

It's not my blog post, but I implemented the same thing in our environment and saw some (roughly 10-20%) improvement. Anyway, your numbers don't seem bad at all for a 1Gbit network with 7.2k RPM disks.

esxcli storage nmp psp roundrobin deviceconfig set -d $i -B 800 -t bytes;

800 bytes is not where a path switch should occur. With Jumbo Frames (MTU 9000), you should switch paths every 8800 payload bytes. If you're not using Jumbo Frames, then you should stick with your IOPS policy. Apart from that, your commands seem fine, and you can check the result with:

# esxcli storage nmp psp roundrobin deviceconfig get -d naa.6000eb38ccef4544000000000000017d
   Byte Limit: 8800
   Device: naa.6000eb38ccef4544000000000000017d
   IOOperation Limit: 1000
   Limit Type: Bytes
   Use Active Unoptimized Paths: false


# esxcli storage nmp device list | grep "Policy Device Config"
   Path Selection Policy Device Config: {policy=bytes,iops=1000,bytes=8800,useANO=0;lastPathIndex=0: NumIOsPending=1,numBytesPending=4096}
   Path Selection Policy Device Config: {policy=bytes,iops=1000,bytes=8800,useANO=0;lastPathIndex=0: NumIOsPending=0,numBytesPending=0}
   Path Selection Policy Device Config: {policy=bytes,iops=1000,bytes=8800,useANO=0;lastPathIndex=1: NumIOsPending=1,numBytesPending=4096}

Run your tests again with these values (if you're using Jumbo Frames!). What about 802.3x flow control, by the way?
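
If it helps, this is how I'd check the current flow-control state from the ESXi 5.x shell (vmnic0 is a placeholder for your iSCSI uplink, and the exact output fields depend on the NIC driver):

# driver and link details, including the Pause (802.3x flow control) settings
esxcli network nic get -n vmnic0
# ethtool is also available on ESXi 5.x and shows the negotiated pause state
ethtool -a vmnic0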

-- http://alpacapowered.wordpress.com
Reply
0 Kudos
jb42
Contributor

Good catch, MK. I had caught that but neglected to update my command here. Flow control is on.

Fred, you're providing a great service. Do you feel like your result-set can basically be used as a size benchmark on the Real Life results?

I feel like my limiters are spindle count and spin rate. Even during the max throughput test I can see in the EQL performance data that the network ports are only working at 50%. I guess the iSCSI network ports will start to play a role at the point I have many guests pushing big IO? The EQL has 4 ports for iSCSI while the ESXi hosts have 2. Will that give me some room to upgrade if I hit this limit?

One question about the tests vs. real life: in real life yesterday, the devs copied a 133GB backup database directory, containing one 100GB file and a few others, between the production and dev SQL servers. They are on the same host but different LUNs. The IO rate was spot on with Max Throughput-50%Read, but IOPS were significantly lower - about a third. And I saw an initial read latency spike up to 60ms, then sustained write latency averaging 24ms with two 5-minute spikes to 40ms.

The tests don't seem to trigger the latency but I'm using a quiet host and datastore for testing. Is that the difference there?

Reply
0 Kudos
fredlr
Contributor

JB,

Basically I was testing the MD3660F to determine whether this <low-cost, low-cache, high-density, current-generation> SAS storage can replace our previous-generation FC storage (also Engenio-based). In many ways the MD3660F appears to be ahead, taking advantage of its 8Gb FC links on sequential workloads with multiple clients on separate RAID groups, and of its faster electronics for maximum IOPS.
All four benchmarks are useful for comparing and highlighting the characteristics of the hardware. A fifth, 100%-write test could be useful too, but the write-side impact of the RAID hardware and cache can already be guessed for those who are not host-side bandwidth limited.
Many results here omit the exact hardware details (disk count, spindle speed, disabled read-ahead ...), making them hard to compare objectively. I tried to be as exhaustive as possible, showing how small configuration changes can impact the overall results.

For your streaming 100%-read and 50%-read results, if you indeed have 2x 1Gb attachments, don't expect to go any further, as you have already maxed out the available network bandwidth. Or am I wrong about your connectivity? As for the two IOPS benchmarks, that also seems to be the limit for 22 active NL-SAS drives; the network is not limiting you there, and even with more cache I'm seeing similar values on a setup like yours. For our workloads, cumulated small independent RAID5 SAS arrays (1 array = 1 LUN) are preferred over larger shared ones (1 array = n LUNs). That's perhaps the cause of your latency spikes.
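
Rough back-of-the-envelope numbers behind that bandwidth statement (the overhead figures are approximations, not measurements):

1 Gb/s link                            ~ 125 MB/s raw
after Ethernet/IP/TCP/iSCSI overhead   ~ 105-115 MB/s usable per link
2 links under round robin              ~ 210-230 MB/s combined

which is right about where your ~210 MBps Max Throughput results land.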

Reply
0 Kudos
jb42
Contributor

You are right about my connectivity: 2x 1Gb/host, 4x 1Gb/controller, and yes, it looks like I'm hitting about 90% utilization on 2 EQL links during the max throughput tests. I spent an hour looking at performance data to prove that to myself, but I suppose the math is easy: if I'm running a 210MBps IO rate across two links, that's 105MBps per link, or 88%. Duh!

So you've got 2x 8Gb links set up on your NL-SAS 2TB 7.2k rpm / RAID10 11+11 test. Am I right, then, that the link limit is a combined 2048MBps, so that you've established the spindle count+speed ceiling at about 700MBps?

So two more NICs per host may net me an additional 200MBps - in theory, until all three hosts start competing over the same links.

Here's an interesting data point on the file copy operation I asked about. EQL support told me that SMB 2 (Windows 2008+) has a max throughput of about 75MBps. I'm gonna fire off that copy job again so we can study the latency if it recurs.

Reply
0 Kudos
jb42
Contributor

This was a cheap and relatively easy way to repurpose old hardware for shared backup storage. I'm hoping to utilize these Napp-it boxes with vSphere Data Protection. Have a couple of different ZFS pools mounted and am logging performance differences. Performance/cost ratio is shockingly good - and quite serviceable for backups! Still have some optimizations available too. Only running 1x1Gb NIC on the OI boxes right now.

SERVER TYPE: Windows Server 2008 R2 VM

CPU TYPE / NUMBER: Intel Xeon X5672 @ 3.20GHz / 2

HOST TYPE: Dell PowerEdge R510. Essentials Plus ESXi 5.1

STORAGE TYPE / DISK NUMBER / RAID LEVEL: OpenIndiana w/ Napp-it/ZFS. 1xGb MRU/ALUA
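
(For context, a rough sketch of what the pool layouts below look like in ZFS terms - the pool name and device names are placeholders, not my actual commands:)

# 4 disks as striped mirrors, i.e. the "RAID 10" layout
zpool create backuppool mirror c3t0d0 c3t1d0 mirror c3t2d0 c3t3d0
# 5 disks as a single parity group, i.e. the "RAID Z" layout
zpool create backuppool raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0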

4 Discs / RAID 10 / OI Server 1
Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.13        | 3486     | 108      | 8%
RealLife-60%Rand-65%Read | 13.85        | 2722     | 21       | 42%
Max Throughput-50%Read   | 12.67        | 4742     | 148      | 17%
Random-8k-70%Read        | 18.42        | 1967     | 15       | 43%
2 Discs / RAID 1 / OI Server 2
Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.22        | 3467     | 108      | 18%
RealLife-60%Rand-65%Read | 13.51        | 4036     | 31       | 33%
Max Throughput-50%Read   | 12.10        | 4953     | 154      | 11%
Random-8k-70%Read        | 16.38        | 3260     | 25       | 33%

5 Discs / RAID Z / OI Server 2

Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.21        | 3468     | 108      | 15%
RealLife-60%Rand-65%Read | 6.66         | 8426     | 65       | 20%
Max Throughput-50%Read   | 12.00        | 4988     | 155      | 23%
Random-8k-70%Read        | 7.21         | 7662     | 59       | 22%

3 Discs / RAID Z / OI Server 1
Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.25        | 3462     | 108      | 12%
RealLife-60%Rand-65%Read | 84.70        | 664      | 5        | 20%
Max Throughput-50%Read   | 14.02        | 4302     | 134      | 16%
Random-8k-70%Read        | 123.28       | 454      | 3        | 22%
3 Discs / RAID Z / OI Server 1. Same 3 discs as above, smaller partitions.
Makes the variation interesting.
Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.14        | 3480     | 108      | 15%
RealLife-60%Rand-65%Read | 43.23        | 1198     | 9        | 30%
Max Throughput-50%Read   | 14.03        | 4301     | 134      | 14%
Random-8k-70%Read        | 58.14        | 884      | 6        | 27%
Reply
0 Kudos
jb42
Contributor

I feel like I'm falling down the rabbit hole with these Napp-it/OpenIndiana tests. Took some old hardware, clicked a few buttons and my DIY Backup SAN seems to be destroying my purpose-built EQL. But the numbers below don't seem to make sense. Can anyone shed any light?

6x 2TB 7.2K Discs / RAID Z2 / OI Server w/ 48GB RAM, 2x 2.33GHz Quad-Core processors, 1x 1Gb iSCSI NIC.

Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.16        | 3481     | 108      | 13%
RealLife-60%Rand-65%Read | 4.75         | 12240    | 95       | 47%
Max Throughput-50%Read   | 12.13        | 4946     | 154      | 13%
Random-8k-70%Read        | 4.84         | 12024    | 93       | 44%

Same setup, RAID 10.
Test name                | Latency (ms) | Avg iops | Avg MBps | cpu load
Max Throughput-100%Read  | 17.25        | 3461     | 108      | 15%
RealLife-60%Rand-65%Read | 4.47         | 13036    | 101      | 50%
Max Throughput-50%Read   | 11.95        | 5012     | 156      | 12%
Random-8k-70%Read        | 4.56         | 12798    | 99       | 50%
Reply
0 Kudos
JonT
Enthusiast

JB, what multipath provider and policy is currently in use for both solutions? I would imagine that the DIY box has the NMP provider and some form of default PSP. Is there a special PSP for the EQL, or does it use the same one? Also, you may be hitting some form of queue limitation in the EQL testing. I have found that there are almost too many layers at which to check queuing, but here are the ones I focus on:

1. HBA Queue - ESXi default is 30 now, used to be 64 back in the 4.x days

2. VMKernel Queue - not sure of default?

3. Device Queue (LUN) - Default is 30

4. Guest O/S Queue - Windows has a standard queue limit of 30 as well

If you see ANY of these queues maxing out, your performance will never climb higher. Most of my testing shows that once you resolve the LUN queues by using more LUN devices in some sort of striped config, the other queues are mostly hardware limited. The HBA queue problem is usually fixed by adding more HBAs (or NICs in your case). The VMkernel and guest OS queue problems are usually due to CPU or PCI limitations of the hardware.
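
A quick sketch of how I'd watch those queues from the ESXi shell (the field names come from the 5.x esxtop disk views; treat the key bindings as approximate):

esxtop
# then press:
#   d - disk adapter view: AQLEN = adapter (HBA) queue depth
#   u - disk device view:  DQLEN = per-LUN device queue, ACTV = active I/Os,
#       QUED = I/Os waiting in the kernel, %USD = how full the queue is
#   v - virtual machine disk view: per-guest read/write latencies
# A queue that sits at its maximum (QUED > 0, or %USD near 100) is the layer holding you back.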

Hope this helps!!

Reply
0 Kudos
PinkishPanther
Contributor

Hi folks,

I've been a long-time fan and occasional contributor here. While I have not posted in a while, I have continued to use IOmeter with these parameters over the last few years.

Here's a new data point - I went back to 2010 in this thread searching for the keyword "Linux", no luck.

I have recently discovered in our environment that Linux guests have terrible IO, and I needed a comparable measurement between Linux and Windows, so I went back to IOmeter with the dynamo agent sending the results to the Windows GUI.
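
(For anyone wanting to reproduce the Linux side, a minimal sketch of attaching a Linux dynamo worker to the Windows IOmeter GUI - the IP addresses are placeholders and the flags are the stock dynamo ones as I understand them:)

# run on the Linux guest: -i points at the Windows box running the IOmeter GUI,
# -m is the address this dynamo manager registers itself under
./dynamo -i 192.0.2.10 -m 192.0.2.20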

On the second test, all things being equal - same host (R820/512GB RAM/10Gb iSCSI/PS6510ES array), running a Windows and a Linux guest with the same resources (4 vCPU/2GB RAM/10GB ext4) - we found that Windows would get 58MB/sec on test 2 (RealLife-60%Rand-65%Read) but Linux would get 3MB/sec - almost 20x worse.

This was confirmed using a database benchmark program which was able to run on Windows and Linux - same ratio.

Could someone post results from a Windows and Linux guest on the same ESX host?

We have tried dozens of iterations and nothing has made a significant difference.

We have a Sev 3 ticket open with VMware from last Thursday where they wanted to recreate it and were supposed to get back to us today, nothing yet.

So we opened a Sev 2 this morning but no one has talked to us yet.

We are at risk of losing 2 customers to this issue.

Ideas?

Thanks in advance.

Reply
0 Kudos
mikeyb79
Enthusiast

Yeah, that doesn't sound right to me at all. I have a number of Linux guests (Ubuntu) running on an EQL PS4000E and the performance between Linux and Windows VMs is very similar, generally within a few percent.

Can/have you tried creating a volume on the EqualLogic array, creating a new Linux guest, and using the new volume as an RDM?

How about patching to newest on the guest/host OSes and updating VMware Tools?

If I get some time tomorrow I will do some benches on a remote office EQL. Haven't done it in a while anyways.

Reply
0 Kudos
PinkishPanther
Contributor

Good thoughts and feedback!

Yes, we tried an RDM earlier today - better, but still only 1/3 of the Windows speed (compared to 1/20th), and not practical for our needs as a permanent solution.

We've tried versions 5.5-6.3 of CentOS and the latest Ubuntu - same issue in them all.

We are at 5.1b on some systems, and 5.1a on others.

Tomorrow we hope to have an iSCSI volume mapped from a physical CentOS host to compare; I'll post the results.

If necessary we will install 4.1 and test again, but the issue is so extreme it must be some obvious problem staring us in the face. It's just hard to see, since it's so repeatable in our environment, without a working system to compare against.

Reply
0 Kudos
JonT
Enthusiast

Also make sure to have VMware Tools updated, and check your VM hardware version in the guest settings. I have seen these being old or not updated cause problems with disk IO on both Windows and Linux VMs.

JonT

Reply
0 Kudos
PinkishPanther
Contributor

VMware Tools are from the 5.1b release and the VM hardware version is set to 9 on all tested guests.

We have also tried older versions since we don't know when this started.

It will be a big help if someone posts Windows/Linux results from an ESXi 5.1 system to prove it's not impacting everyone!

Ideally the Windows random test numbers should be greater than 30MB/sec.

Reply
0 Kudos