For IOmeter settings I typically start with a 4K write test (0% read, 0% random) and 4 targets, then double the targets for each successive test until I get to 32 or so. (Does that sound like the right type of test?)
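Just to spell out what I mean, here's a rough sketch (plain Python, not an IOmeter config) of the sweep and how I convert each run's IOPS result into MB/s; the 1500 IOPS figure is only a made-up placeholder:

# Sketch of the sweep: 4 KiB writes, 0% read, 0% random, doubling targets each run.
BLOCK_SIZE = 4 * 1024  # bytes per IO

def mb_per_sec(iops: float) -> float:
    # Convert an IOPS result into MB/s for this block size.
    return iops * BLOCK_SIZE / 1024**2

for targets in (4, 8, 16, 32):
    # e.g. a run that measures 1500 IOPS at this setting:
    print(f"{targets} targets: 1500 IOPS would be ~{mb_per_sec(1500):.2f} MB/s")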
The copies within the vm are normally so-so, but sometimes the disk latency gets very bad and freezes up the vm for 2-3 seconds.
How much data was actually moved? How full was the vm, and how large was it?
Have you watched esxtop during a file copy or an IOmeter run, to see how the latencies compare between the guest, kernel, and device?
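(Rough sketch of what I mean by comparing them; the numbers are placeholders, but the relationship is that the guest-visible latency is the device latency plus the kernel latency, with queueing time counted inside the kernel number.)

# esxtop per-command latencies, in milliseconds (placeholder values)
davg = 2.9   # DAVG/cmd: time the IO spends at the device/array
kavg = 0.0   # KAVG/cmd: time in the VMkernel (queue time, QAVG, is part of this)
gavg = davg + kavg  # GAVG/cmd: total latency as the guest sees it
print(f"guest-visible latency ~= {gavg:.2f} ms")
# Rising KAVG/QAVG points at host-side queueing; rising DAVG points at the array.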
Here are my stats from esxtop.
for the VM: VMNAME VDEVNAME NVDISK CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s LAT/rd LAT/wr
Sec2 - 1 1487.30 0.00 1487.30 0.00 5.81 0.00 2.58
ADAPTR PATH NPTH CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
vmhba33 - 14 1269.02 0.00 1269.02 0.00 4.95 2.90 0.00 2.91 0.00
DEVICE PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd
naa.6001a... - 128 - 4 0 3 0.03 1028.40 0.20 1028.21 0.00 4.01 3.71 0.0 0.0
These seem pretty standard overall, but I did notice the read latency would sometimes hover at 300+ and at one point it hit 3100.
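As a sanity check, the write IOPS and MBWRTN/s on the VM line agree, assuming the 4K block size from the IOmeter test:

writes_per_sec = 1487.30      # WRITES/s from the VM line above
block_size = 4 * 1024         # bytes, assuming the 4K write test
mb_written = writes_per_sec * block_size / 1024**2
print(f"{mb_written:.2f} MB/s")  # ~5.81, matching MBWRTN/s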
Do any of these send off alarm bells?
The numbers look pretty good. I tend to ignore spikes that are way out in left field, unless they occur regularly and/or for prolonged intervals.
Were you doing a copy at this time, or running IOmeter? Looks like all writes.
Thanks. This was an IOmeter write test. But I still don't understand why a cold storage vMotion can write at ~70MB/s while, as you can see in my write test above, I only get ~6MB/s. Why is the difference so extreme?
Quick follow-up (forgive me for being somewhat clueless about this testing): when I set the number of outstanding IOs to 20 and increased the workers to 10, I started writing at up to 30MB/s! Wow, we're getting someplace. OK, that leaves me with another question: does ESXi/vSphere cap storage bandwidth on a per-worker basis?
Not exactly. You are still limited by queues at every level of the IO path: the guest maintains disk queues, the host maintains LUN queues, and, depending on your array, the storage maintains queues of its own.
You're getting throughput by limiting how many IOs each worker leaves outstanding while allowing multiple IO producers to do the same thing at once.
While one worker is waiting for a particular IO, another is able to perform more IO before it too has to wait.
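A back-of-the-envelope way to see it (Little's Law: in-flight IOs = IOPS x latency). The latencies below are assumptions for illustration, roughly anchored to the ~2.6 ms write latency above, and they climb as the queues fill:

BLOCK = 4 * 1024  # bytes per IO (the 4K write test)

def mb_per_sec(in_flight: int, latency_ms: float) -> float:
    # Little's Law: IOPS = IOs in flight / average latency per IO.
    iops = in_flight / (latency_ms / 1000.0)
    return iops * BLOCK / 1024**2

# (IOs kept in flight, assumed average latency in ms)
for oio, lat in [(4, 2.6), (40, 5.2), (200, 26.0)]:
    print(f"{oio:>3} in flight @ {lat:4.1f} ms -> ~{mb_per_sec(oio, lat):5.1f} MB/s")

The last two rows come out the same on purpose: past a point, extra outstanding IOs just sit in a queue and show up as latency rather than throughput.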
If you're able to break a job up into pieces properly, the job will definitely run faster.
That's why IOmeter is a good tool: you can test all of these things out, but it can get mighty tricky to interpret depending on how you set everything up.
This is very helpful - thanks!
What I think is a challenge with this storage unit is that it SEEMS to view each server app as a single worker thread, no matter how many clients are connected. As an example, I have a SharePoint-type app that 40 clients use all day long, yet the reads and writes never go above ~6MB/s.
So when one client is working on a big PDF they snag all of the storage bandwidth, which times out the other users.
I do not have this issue with my home built Starwind server.
Do you know of any way to open up this pipe so that the storage allocates bandwidth better on a per-user basis?
It is a Drobo Elite. I am presenting 3 datastores to it on 2 different LUNs (one server has 2 hard drives). The other is a Windows 7 VM that keeps time and settings for a security system.