VMware Cloud Community
srodenburg
Expert

vSAN performance bottleneck is where?

Hello,

I have a customer running a 2-node vSAN 7 U1 all-flash system (direct-connect / back-to-back) at a remote site, and they get poor write latencies (for all-flash) inside the VMs (folks are complaining). Witness Appliance RTT = 13ms. The HBA and flash drives are high-end 12G SAS (all on the HCL), but the CPUs are rather low-spec, and I would like to know if I'm correct in my assumptions:
(I already talked them out of using compression or dedup+compression because of those CPUs.)

I see the following average latencies reported (all from same time-stamp):
vm: 14ms
owner: 11ms
client: 4ms
disk: 2.5ms

The flash drive latencies are not too bad (good enough for their use case), but I'd say the CPUs are simply too light?
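To put the figures above in proportion, here is a minimal Python sketch of the arithmetic. It only uses the four averages quoted in this post; the function name `share_above_disk` is made up for illustration, and the sketch makes no assumption about how the vm/owner/client layers nest inside each other.

```python
# Average latencies (ms) quoted in the post, all from the same timestamp.
latencies_ms = {"vm": 14.0, "owner": 11.0, "client": 4.0, "disk": 2.5}


def share_above_disk(latencies):
    """Return (ms, fraction) of the VM-observed latency not explained by the disk layer."""
    gap = latencies["vm"] - latencies["disk"]
    return gap, gap / latencies["vm"]


gap_ms, fraction = share_above_disk(latencies_ms)
print(f"{gap_ms:.1f} ms ({fraction:.0%}) of the VM latency sits above the disk layer")
# prints: 11.5 ms (82%) of the VM latency sits above the disk layer
```

In other words, roughly four fifths of what the VMs experience is added somewhere above the drives, which is why I am looking at the CPUs rather than the flash.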

7 Replies
Geogee
Enthusiast

What are the CPU specs?

Rated speed and actual speed (sleep states, etc.)?

For all-flash configurations, you can see higher flash latencies when the SSDs are not under consistent load, because the SSDs go to sleep and the wake-up time shows up as higher latency.

If you are seeing VM latencies higher than the witness host RTT, there also seems to be something wrong with your configuration, because the witness is used only during a split-brain scenario, not during normal operation.

What policy are you using?

Is the witness outside the cluster? Is this really configured as a 2-node cluster?

srodenburg
Expert

CPU = Xeon D-1541 @ 2.10GHz (8-core) with a 3:1 vCPU:pCPU overcommit. CPU Ready % is very, very low.
Sleep states are disabled (as they should be), and SSDs don't go to sleep in servers (where did you get THAT from...). They can idle, but when a bunch of VMs are working, storage devices never get any rest anyway, so forget about idling, let alone ever getting a chance to go to sleep...

Policy is FTT=1 (what else) and the witness is outside the cluster, obviously (it's not even possible to have it inside a stretched cluster).

Geogee
Enthusiast

One of my customers has been facing that issue during LOW load on the flash drives. Not going to sleep, but going idle, you are right.

srodenburg
Expert

If there is "no load" to "low load", latency cannot be an issue (because who would care). I'm talking about VMs that are doing a fair amount of work. If you read my opening post again, you can see that the DISK latency is low. However, the VMs are "feeling" much higher latencies, and I'm pretty sure that the "software RAID engine" (vSAN) is suffering from CPUs that are under-spec'd for what is asked of them. This whole talk about SSDs is moot, as I'm sure the issue is much higher up.

I analysed what the controller and disks are doing and know the issue is not "down there".

depping
Leadership

I would indeed expect it to be a CPU issue. Considering it is a Xeon D, it would most definitely not surprise me. Those systems weren't really designed for high load with many VMs. What you could always do is create a test VM with FTT=0 and align the storage with compute, just to see if it gets better or not. When testing our nano-edge appliances we saw similar behavior, mainly caused by the lack of CPU power, unfortunately.

I would also toggle the "site read locality" switch. This should be enabled for stretched, but disabled for 2-node non-stretched.

srodenburg
Expert

Hey Duncan,

Site read locality is enabled (it's a stretchy 2-node). I turned it off temporarily and ran a benchmark (several times with and then without read locality). The writes are not affected (obviously), but the read latency was about 20% worse (as we now have both nodes participating in the reads). As the people at that remote site had already gone home for the evening and it was not backup time yet, I was free to mess about with all those VMs sitting mostly idle, so there was little interference (benchmark results were pretty consistent).

I'll try the FTT=0 test next just to see and chime in with the results.

Something else, out of curiosity: is the processing of checksums offloaded to hardware (AES-NI-like)?
(Checksums are not encryption, but maybe the hardware can be used in the same fashion.)

depping
Leadership

Yes, checksums are normally offloaded (the CRC32 algorithm is used).
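To give a feel for the per-block work that the offload takes away from the cores, here is a small, unoptimised Python sketch of a software CRC. It assumes the CRC-32C (Castagnoli) variant, which is the one the x86 SSE4.2 `crc32` instruction computes; the 4 KiB block size and the function name `crc32c` are illustrative only, and this is in no way vSAN's actual code.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard check value for CRC-32C over the ASCII string "123456789".
assert crc32c(b"123456789") == 0xE3069283

block = bytes(4096)               # hypothetical 4 KiB data block (all zeros)
stored = crc32c(block)            # checksum kept alongside the block on write
corrupted = b"\x01" + block[1:]   # same block with one bit flipped

print(crc32c(block) == stored)      # True: clean read verifies
print(crc32c(corrupted) == stored)  # False: corruption is detected
```

The inner loop runs eight times per byte in software, whereas the hardware instruction processes several bytes per call, so with the offload the checksum itself should be a small part of the CPU cost.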
