VMware Cloud Community
Anchie
Contributor

HCIBench best practices

Hey all,

Would anyone be able to share some HCIBench performance best practices? I am new to the VMware world and started using HCIBench recently but getting such crappy performance. :(

I am trying to figure out the best ratio of VMs to VMDKs, along with their sizes, queue depths, etc. I have one large datastore (8 TB) that I am testing against; I have tried from 4 to 32 VMs on it, with an equal or half that number of VMDKs, and was playing with queue depths, from 32 to 8.

When I say crappy, I mean average latencies ranging from over 200 ms in the worst case down to about 1.5 ms in the best case. But I think it can do better than that. :) Would anyone have any recommendations, or any links or documentation I could do some reading on? And here I am only trying to get the best performance, so please disregard any rules regarding maintenance and all that.
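For context, the combinations I have been cycling through look roughly like this (a quick sketch only; the exact VMDK counts and disk sizes varied from run to run):

```python
from itertools import product

# Rough ranges I have been sweeping in HCIBench (illustrative, not a config file)
vm_counts = [4, 8, 16, 32]     # VMs deployed against the single 8 TB datastore
vmdks_per_vm = [2, 4, 8]       # data disks per VM (exact counts varied)
queue_depths = [8, 16, 32]     # outstanding IOs per VMDK in the workload profile

for vms, vmdks, qd in product(vm_counts, vmdks_per_vm, queue_depths):
    total_oio = vms * vmdks * qd   # total IO the sweep can keep in flight
    print(f"{vms:>2} VMs x {vmdks} VMDKs x QD {qd:>2} -> up to {total_oio:>5} outstanding IOs")
```

Even that small sweep spans roughly two orders of magnitude of outstanding IO, which I suspect is part of why the latency swings so much.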

TheBobkin
Champion

Hello @Anchie,

Welcome to Communities.

"I am new to VMWare world"

This forum is a good place to ask questions; storagehub.vmware.com/t/vmware-vsan/ is good for basic to in-depth vSAN information, and labs.hol.vmware.com is a great resource for getting hands-on experience creating (and smashing) clusters.

"started using HCIBench recently but getting such crappy performance "

What are you running this on?

There will obviously be a stark difference between, say, a 4-node all-flash cluster with 2 TB of storage per node spread over 2x 1 TB capacity disks and a decent-sized (600 GB) cache-tier device, versus a 2-node hybrid stretched cluster where each node has 1x 4 TB capacity-tier drive and a 200 GB cache-tier device.

The goal of using HCIBench is not to achieve a 'high score' but to get an estimate of the capabilities of a cluster: which workloads it performs well at and can handle, and which it cannot.

If the cluster is falling apart on an Easy Run, then maybe you need to reconsider its design and whether it is fit for purpose.

"was playing with queue depths, from 32 to 8"

Are you speaking of AQLEN and/or DQLEN here?

Queue depths should be set to the maximum that these devices support (32 here, assuming SATA).
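As a back-of-the-envelope illustration (made-up numbers, not vSAN-specific guidance), the device queue depth together with the per-IO service time caps the IOPS a single device queue can sustain:

```python
# Little's Law at the device queue: IOPS ceiling ~= queue depth / service time.
# Numbers are purely illustrative.

def iops_ceiling(queue_depth: int, service_time_ms: float) -> float:
    """Max IOPS one device queue can sustain at the given average service time."""
    return queue_depth / (service_time_ms / 1000.0)

for qd in (8, 16, 32):
    print(f"QD {qd:>2} at 0.5 ms per IO -> ~{iops_ceiling(qd, 0.5):,.0f} IOPS ceiling")
# Capping the queue at 8 instead of 32 cuts that ceiling to a quarter,
# which is why artificially lowering queue depth rarely helps throughput.
```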

"And here I am only trying to get the best performance"

Is this a pre-production cluster that you are testing?

What are the sizing, IO size, IO profile, and IOPS demand of the workload that is expected to be placed on this cluster?

Bob

Anchie
Contributor

Hi TheBobkin,

Thanks so much for your response. I have one test cluster with all-flash enterprise SSDs, and I am using about 60% of the total storage: I have one large volume which I mapped to one datastore and it is 8TB.

From there, in HCIBench, I decide how many VMs and how many VMDKs per VM to configure. I did not play with any other settings like AQLEN, so I am assuming they are all at defaults and would need to verify what they are. I am trying to arrive at some kind of measurement I can use to size up or down, e.g. how many VMs or VMDKs can each do 10K IOPS per VM while staying in the sub-millisecond latency range.

The best I have achieved so far is 2 VMs with 2 VMDKs each (32 DQLEN - I think that's configurable in HCIBench) and a requested IO rate of 10K per VM. I am trying to understand the right ratio of VMDKs per VM. Obviously, if I reduce the DQLEN to 16 or 8, I can run more VMs and still stay in the 10K/1ms range. Perhaps there is more to tune on the storage side, but I am also wondering what I can do in my configuration to achieve more while still staying under 1 ms.
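To sanity-check my own numbers I have been using Little's Law (outstanding IO = IOPS x latency); the figures below are illustrative, not measurements from this cluster:

```python
# Little's Law sanity check: outstanding IO = IOPS x latency.
# Illustrative numbers only, not measurements from this cluster.

def ios_in_flight(iops: float, latency_ms: float) -> float:
    """Average number of IOs in flight implied by an IOPS/latency pair."""
    return iops * (latency_ms / 1000.0)

# 10K IOPS per VM at 1 ms only needs ~10 IOs in flight per VM on average,
# so 2 VMDKs x 32 threads is far more concurrency than a rate-limited
# workload actually uses until the backend starts to saturate.
print(ios_in_flight(10_000, 1.0))   # -> 10.0

for vms in (2, 4, 8, 16, 32):
    total = vms * ios_in_flight(10_000, 1.0)
    print(f"{vms:>2} VMs x 10K IOPS @ 1 ms -> ~{total:.0f} IOs in flight cluster-wide")
```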

TheBobkin
Champion

Hello Anchie,

"I have one large volume which I mapped to one datastore and it is 8TB."

I don't follow you here.

vSAN storage works like this: each node in the cluster has locally attached drives which are used to create disk groups; each disk group consists of a cache-tier SSD/NVMe device fronting 1-7 capacity-tier drives (either SSD or HDD) that store the data. The storage capabilities of these 'local' disk groups are then pooled as a shared resource for the whole cluster (the vsanDatastore).

As I said in my previous comment: this 8TB being split over 2x 4TB capacity-tier drives on 2 nodes versus the same capacity split over multiple smaller drives on more hosts may play a large part in what can be expected from a performance perspective.
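Purely as an illustration (assumed numbers, not your cluster): the same aggregate demand lands very differently per device depending on how the capacity is laid out:

```python
# Illustrative only: how an assumed aggregate demand spreads across
# different capacity-tier layouts of roughly the same raw capacity.

layouts = {
    "2 hosts x 1 x 4 TB drive each":    2,
    "4 hosts x 2 x 1 TB drives each":   8,
    "4 hosts x 4 x 500 GB drives each": 16,
}

aggregate_iops_demand = 80_000  # hypothetical workload total

for name, devices in layouts.items():
    per_device = aggregate_iops_demand / devices
    print(f"{name:<32} -> ~{per_device:>6,.0f} IOPS per capacity device")
```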

Bob