VMware Cloud Community
BradI100
Enthusiast

Help me understand this VMware storage concept

Hi all,
I have been struggling to understand a difference in storage speed that occurs in my VMware environment.

To give an example, yesterday I shut down a VM and did a "cold" storage migration to different iSCSI storage.  The VM moved quickly (for me), at an average write speed of 60MB/s to the new unit.   But once I booted the same VM on the same host and storage, I found the speeds disappointing.  Using IOmeter on that VM, I could only get a max write speed of 5MB/s.

I've seen this play out in the same way with different storage units time and time again.  So why the big difference?  Does it suggest something is configured wrong?
15 Replies
JohnADCO
Expert

Write speed of 5MB/s on exactly what IOmeter test parameters?

Does the VM seem not peppy in general?

Do file copies seem OK from within the VM?

BradI100
Enthusiast

For IOmeter settings I typically start with 4k writes (0% read, 0% random) and 4 targets.  Then I double the targets for each successive test until I hit 32 or so. (Does that sound like the right type of test?)

The copies within the VM are normally so-so, but sometimes the disk latency gets very bad and freezes up the VM for 2-3 seconds.
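In case it helps, here is a rough Python sketch of what that kind of 4k sequential write pattern boils down to (the file name and sizes are just examples, and unlike IOmeter this goes through the guest OS file cache, so treat it as an approximation rather than a substitute):

import os, time

# Rough stand-in for a 4 KB sequential write test (0% read, 0% random).
# PATH, BLOCK, and TOTAL are illustrative values, not anything from IOmeter itself.
PATH = "testfile.bin"
BLOCK = 4 * 1024            # 4 KB writes
TOTAL = 256 * 1024 * 1024   # write 256 MB in total

buf = os.urandom(BLOCK)
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
start = time.time()
written = 0
while written < TOTAL:
    os.write(fd, buf)
    written += BLOCK
os.fsync(fd)                # push the data out of the OS cache to the storage
os.close(fd)
elapsed = time.time() - start
print(f"{written / elapsed / 1024 / 1024:.1f} MB/s")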

kjb007
Immortal

How much data was actually moved?  How full was the VM, and how large was it?

Have you watched esxtop during a file copy or an IOmeter run, to see how the latencies compare between the guest, kernel, and device?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
BradI100
Enthusiast

The VM is 30GB in size with 16GB in use.

I did not run esxtop but that's a good idea. I will do that and post results.

BradI100
Enthusiast

Here are my stats from esxtop.

For the VM:

VMNAME  VDEVNAME  NVDISK  CMDS/s   READS/s  WRITES/s  MBREAD/s  MBWRTN/s  LAT/rd  LAT/wr
Sec2    -         1       1487.30  0.00     1487.30   0.00      5.81      0.00    2.58

Disk adapter:

ADAPTR   PATH  NPTH  CMDS/s   READS/s  WRITES/s  MBREAD/s  MBWRTN/s  DAVG/cmd  KAVG/cmd  GAVG/cmd  QAVG/cmd
vmhba33  -     14    1269.02  0.00     1269.02   0.00      4.95      2.90      0.00      2.91      0.00

Disk device:

DEVICE        PATH/WORLD/PARTITION  DQLEN  WQLEN  ACTV  QUED  %USD  LOAD  CMDS/s   READS/s  WRITES/s  MBREAD/s  MBWRTN/s  DAVG/cmd  KAVG/cmd
naa.6001a...  -                     128    -      4     0     3     0.03  1028.40  0.20     1028.21   0.00      4.01      3.71      0.0   0.0

These seem pretty standard overall, but I did notice that sometimes the read latency would hover at 300+, and at one point it hit 3100.

Do any of these set off alarm bells?

kjb007
Immortal

The numbers look pretty good.  I tend to ignore spikes that are way out in left field, unless they occur regularly and/or for prolonged intervals.

Were you doing a copy at this time, or running IOmeter?  Looks like all writes.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
BradI100
Enthusiast

Thanks.  This was an IOmeter write test.  But I still don't understand why a cold storage migration can write at ~70MB/s while, as you see in my write test above, I only get ~6MB/s.  Why is the speed difference so extreme?

mcowger
Immortal

How many outstanding IOs?

--Matt VCDX #52 blog.cowger.us
BradI100
Enthusiast

This was with 32 outstanding IOs.

BradI100
Enthusiast

Quick follow-up (forgive me for being sort of clueless about this testing).  When I set the number of outstanding IOs to 20 and increased the workers to 10, I started writing at up to 30MB/s! Wow, we're getting someplace.   OK, that leaves me with another question: does ESXi/vSphere cap storage bandwidth on a per-worker basis?
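For reference, here is a rough sketch of the multi-worker version of that test in Python (the worker count and sizes are made-up values, and real IOmeter workers keep a fixed number of IOs outstanding, which plain threads here only approximate):

import os, threading, time

# Several workers writing 4 KB blocks to their own files at the same time,
# so more IO is in flight overall than with a single writer.
BLOCK = 4 * 1024
PER_WORKER = 64 * 1024 * 1024   # 64 MB per worker, illustrative only
WORKERS = 10

def writer(idx):
    buf = os.urandom(BLOCK)
    fd = os.open(f"testfile_{idx}.bin", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    written = 0
    while written < PER_WORKER:
        os.write(fd, buf)
        written += BLOCK
    os.fsync(fd)
    os.close(fd)

start = time.time()
threads = [threading.Thread(target=writer, args=(i,)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
print(f"{WORKERS * PER_WORKER / elapsed / 1024 / 1024:.1f} MB/s aggregate")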

kjb007
Immortal

Not exactly.  You are still limited by queues at every level of the IO path.  The guest maintains disk queues, the host maintains LUN queues, and depending on your array, the storage maintains queues of its own.

You're getting throughput by limiting how many IOs each producer leaves open, while allowing multiple IO producers to do the same thing at once.

While one worker is waiting for a particular IO, another is able to perform more IO before it too has to wait.

If you're able to properly take a job and break it up into pieces, then the job will definitely run faster.

That's why IOmeter is a good tool: you can test all of these things out, but it can get mighty tricky to interpret depending on how you set everything up.
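A back-of-the-envelope way to see the effect is Little's Law (a rough Python sketch; the latencies and block sizes below are illustrative numbers, not measurements from your array):

def throughput_mb_s(outstanding_ios, latency_ms, block_kb):
    # Little's Law: IOs completed per second = IOs in flight / time per IO
    iops = outstanding_ios / (latency_ms / 1000.0)
    return iops * block_kb / 1024.0   # convert to MB/s

print(throughput_mb_s(1, 2.6, 4))    # ~1.5 MB/s: one small IO in flight at a time
print(throughput_mb_s(20, 2.6, 4))   # ~30 MB/s: many small IOs kept in flight
print(throughput_mb_s(8, 8.0, 64))   # ~62 MB/s: fewer but much larger transfers

Roughly speaking, a single-threaded 4k test is bounded by per-IO latency, while a cold migration pushes large sequential transfers with plenty of IO in flight, which is why the MB/s numbers end up so far apart.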

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
BradI100
Enthusiast

This is very helpful - thanks!

What I think is a challenge with this storage unit is that it SEEMS to view each server app as a single worker thread, no matter how many clients are connected. As an example, I have a SharePoint-type app that 40 clients use all day long. Still, the reads and writes never go above ~6MB/s.

So when one client is working on a big PDF, they snag all of the storage bandwidth, thereby timing out the other users.

I do not have this issue with my home-built StarWind server.

Do you know of any way to open up this pipe so the storage allocates bandwidth better on a per-user basis?

kjb007
Immortal

What type of device is it, and how many storage units/drives/LUNs are you presenting as datastores?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
BradI100
Enthusiast

It is a Drobo Elite. I am presenting 3 datastores from it across 2 different LUNs. One server has 2 hard drives; the other is a Windows 7 VM that keeps time and settings for a security system.

kjb007
Immortal

Do you have RAID configured, and if so, how is it configured?  How many drives are part of the RAID set?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB