VMware Cloud Community
C3LLC
Contributor

Poor Write Performance

All-

We have a 5-node hybrid vSAN Proof of Concept cluster running, with each node configured as follows:

  • Cisco C240 M4SX
  • Dual Intel E5-2690v4 Proc
  • 384GB RAM
  • Cisco UCS 12Gbps Modular SAS Controller (UCSC-SAS12GHBA)
  • Cisco VIC 1227 MLOM (dual 10Gbps uplinks)
  • Two Disk Groups each with
    • 1x Intel DC P3700 NVMe 800GB (cache tier)
    • 5x Cisco 1.2TB HDD (capacity tier)

Right now, there are just 3 small VMs running on the entire cluster, one of which is my virtual desktop.

The issue I'm seeing is what I believe to be very poor write performance in IOMeter. Using a 4KB, 0% read, 0% random test (i.e., all sequential writes) I get about 2,000 IOPS and roughly 8.25 MB/s of throughput. If I flip this around to 4KB, 100% read, the results jump to 39-40K IOPS and about 160 MB/s.
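As a quick sanity check (my own arithmetic, assuming IOMeter's 4KB block size and decimal megabytes), the throughput lines up with the IOPS in both cases, so this looks like a real write ceiling rather than a reporting quirk:

```python
# Sanity check: throughput should be roughly IOPS x block size.
# Assumes a 4KB block size and decimal megabytes (10^6 bytes) in IOMeter's report.

BLOCK_SIZE = 4 * 1024  # bytes

def throughput_mb_s(iops: float, block_size: int = BLOCK_SIZE) -> float:
    """Expected throughput in MB/s for a given IOPS figure."""
    return iops * block_size / 1_000_000

print(f" 2,000 write IOPS -> ~{throughput_mb_s(2_000):.1f} MB/s")   # ~8.2 MB/s
print(f"40,000 read IOPS  -> ~{throughput_mb_s(40_000):.1f} MB/s")  # ~163.8 MB/s
```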

Any ideas why my write performance is so poor? Again, there is basically nothing running on this cluster.

Thoughts?

5 Replies
zdickinson
Expert

Good morning, I assume you're running IOMeter on one VM. If you spin up a few more and run the write test, can each VM do 2,000 IOPS? With that setup I would expect to hit 20,000 write IOPS in a perfect world, with 10,000 to 15,000 write IOPS being realistic. If you run IOMeter on 5 VMs, can you achieve 10,000 total IOPS? Thank you, Zach.

Edit: I would also add that FTT (Failures To Tolerate) and stripe width can greatly affect write performance on a VM.
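To put a rough number on the FTT point (a back-of-the-envelope sketch only, assuming the default FTT=1 with RAID-1 mirroring and ignoring metadata overhead): every guest write has to be committed to FTT+1 replicas, so the backend absorbs a multiple of the frontend write IOPS.

```python
# Rough sketch of vSAN write amplification from the protection policy.
# Assumption: RAID-1 mirroring with FTT failures to tolerate (FTT + 1 data copies);
# checksum/metadata overhead is ignored.

def backend_write_iops(frontend_iops: float, ftt: int = 1) -> float:
    """Backend writes generated by a given frontend write rate under RAID-1."""
    replicas = ftt + 1
    return frontend_iops * replicas

print(backend_write_iops(2_000))         # 4,000 backend writes/s at FTT=1
print(backend_write_iops(2_000, ftt=2))  # 6,000 backend writes/s at FTT=2
```

Stripe width doesn't change the replica count, but it does spread each replica across more capacity disks, which mostly matters once data destages out of the cache tier.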

kastlr
Expert

Hi,

Which disk type did you use for the VM running IOMeter: thin, lazy zeroed thick, or eager zeroed thick?

And did you allow IOMeter to create its test file before the test run?

If you're not using eager zeroed thick, each write to an untouched block is held until the ESXi kernel has finished zeroing that block; only then is the application data written.
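To illustrate roughly how much that first-write penalty can distort a pure-write benchmark, here is a toy model (the latency figures below are made-up example values, not measurements):

```python
# Toy model: impact of zero-on-first-write on average write latency.
# Both latency values are illustrative assumptions, not measured numbers.

NORMAL_WRITE_MS = 0.5     # assumed latency of a write to an already-zeroed block
ZEROING_PENALTY_MS = 5.0  # assumed extra latency when the block is zeroed on demand

def avg_write_latency_ms(first_write_fraction: float) -> float:
    """Average latency when a fraction of writes hit untouched blocks."""
    return NORMAL_WRITE_MS + first_write_fraction * ZEROING_PENALTY_MS

for f in (0.0, 0.5, 1.0):
    lat = avg_write_latency_ms(f)
    print(f"{f:.0%} first writes -> ~{lat:.1f} ms avg, "
          f"~{1000 / lat:.0f} IOPS per outstanding IO")
```

With an eager zeroed thick disk, or a test file IOMeter has already written through once, the benchmark only ever sees the normal case.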

VMware recommends using HCIBench to test vSAN performance.

Regards,

Ralf


Hope this helps a bit.
Greetings from Germany. (CEST)
C3LLC
Contributor

OK, so we installed HCIBench and retested. The results were much better:

VMs = 5
IOPS = 76229.09 IO/s
THROUGHPUT = 297.78 MB/s
LATENCY = 1.3038 ms
R_LATENCY = 1.3424 ms
W_LATENCY = 1.2648 ms

=============================

Resource Usage:
CPU USAGE = 11.96%
RAM USAGE = 7.47%
VSAN PCPU USAGE = 4.0007%

Parameters used were:
10 VMDKs
50% working set
2 threads per disk
4K block size
50% read
100% random
Ran test for 3 minutes
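A small cross-check on those numbers (assuming HCIBench's "MB/s" is really MiB/s): the reported throughput is exactly what the IOPS figure implies at a 4K block size, so the summary is at least internally consistent.

```python
# Cross-check the HCIBench summary: throughput vs. IOPS at the 4K block size.
# Assumption: HCIBench's "MB/s" column is actually MiB/s (2^20 bytes).

iops = 76_229.09
block_size = 4 * 1024  # bytes

throughput_mib_s = iops * block_size / 2**20
print(f"{throughput_mib_s:.2f} MiB/s")  # ~297.77, matching the reported 297.78
```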

A couple of observations:

- Client Cache Hit Rate is always 0?

- All 5 hosts are reporting vmknic errors, specifically DupAckRx and DupDataRx

- All 5 hosts report 0 pnic errors

Thoughts?

C3LLC
Contributor

Anyone?

zdickinson
Expert

Good morning, where do you see the client cache hit rate of zero? What does the bandwidth on the vSAN network look like? If you're close to 100% utilization, that would explain the errors. That throughput and those IOPS look good to me. Thank you, Zach.
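As a rough way to ballpark the vSAN network load from that HCIBench run (assumptions on my part: FTT=1 so each write has two copies, both copies and most reads crossing the wire, and vSAN on a single 10GbE uplink per host):

```python
# Back-of-the-envelope estimate of vSAN network traffic for the HCIBench run.
# Assumptions: FTT=1 (two copies per write), reads generally served from a
# remote host, aggregate throughput of 297.78 MiB/s at a 50/50 read/write mix.

total_mib_s = 297.78
read_ratio = 0.5
link_gbps = 10

read_mib_s = total_mib_s * read_ratio
write_mib_s = total_mib_s * (1 - read_ratio)

# Worst case: every read and both write copies traverse the network.
network_mib_s = read_mib_s + 2 * write_mib_s
network_gbps = network_mib_s * 2**20 * 8 / 1e9

print(f"~{network_gbps:.1f} Gbps of vSAN traffic cluster-wide, "
      f"spread across 5 hosts with {link_gbps} Gbps uplinks each")
```

By that estimate the links are nowhere near 100%, so saturation alone probably doesn't explain the DupAckRx/DupDataRx errors.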
