I have a customer running a vSAN 7 U1, 2-node all-flash vSAN (direct connect / back-to-back) at a remote site, and they are seeing poor write latencies (for all-flash) inside the VMs (users are complaining). Witness appliance RTT = 13 ms. The HBA and flash drives are high-end 12G SAS (all on the HCL), but the CPUs are rather low spec, and I would like to know if my assumptions are correct:
(I already talked them out of using compression or dedupe + compression because of those CPUs.)
I see the following average latencies reported (all from the same timestamp):
The flash-drive latencies are not too bad (good enough for their use case), but I'd say the CPUs are simply too light?
What are the CPU specs?
Clock speed, and the actual speed it runs at (sleep states, etc.)?
For all-flash configurations you can see higher flash latencies when the SSDs are not under consistent load, because the SSDs go to sleep and the wake-up time shows up as higher latency.
If you are seeing VM latencies higher than the witness host RTT, something also seems off in your configuration, because the witness is only used during a split-brain scenario, not during normal operation.
What policy are you using?
Is the witness outside the cluster? Do you really have this configured as a 2-node cluster?
CPU = Xeon D-1541 @ 2.10 GHz (8-core) with a 3:1 vCPU:pCPU overcommit. CPU Ready % is very low.
Sleep states are disabled (as they should be), and SSDs don't go to sleep in servers (where did you get THAT from...). They can idle, but when a bunch of VMs are working, storage devices never get any rest anyway, so forget about idling, let alone ever getting a chance to go to sleep...
Policy is FTT=1 (what else), and the witness is outside the cluster, obviously (it's not even possible to have it inside a stretched cluster).
If there is "no load" to "low load", there can't be a latency issue (because who would care). I'm talking about VMs that are doing a fair amount of work. If you re-read my opening post, you can see that the DISK latency is low. However, the VMs are "feeling" much higher latencies, and I'm pretty sure the "software RAID engine" (vSAN) is suffering from under-spec'd CPUs for what is being asked of them. This whole talk about SSDs is moot, as the issue is much higher up the stack, I'm sure.
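To put a number on that reasoning: in a direct-connect 2-node setup the inter-node hop is negligible, so the gap between what the device reports and what the guest sees is essentially software-stack time. A back-of-the-envelope sketch (all numbers below are hypothetical, not the customer's actual figures):

```python
# Rough decomposition of guest-observed write latency.
# All values are assumed/illustrative, not measured from this cluster.
guest_write_latency_ms = 4.0   # assumption: what the VM observes
device_latency_ms = 0.5        # assumption: what the flash device reports
network_rtt_ms = 0.1           # assumption: back-to-back inter-node hop

# Whatever is left over is spent in the vSAN software path
# (checksumming, DOM/LSOM processing, etc.), which is CPU-bound.
stack_overhead_ms = guest_write_latency_ms - device_latency_ms - network_rtt_ms
print(f"software-stack overhead: {stack_overhead_ms:.1f} ms")  # -> 3.4 ms
```

If the leftover term dominates like this, faster disks won't help; only more (or faster) CPU will.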
I analysed what the controller and disks are doing and know the issue is not "down there".
I would expect it to be a CPU issue indeed; considering it's a Xeon D, it most definitely wouldn't surprise me. These systems weren't really designed for high load with a variety of VMs. What you could always do is create a test VM with FTT=0 and align the storage with the compute, just to see whether it gets better or not. When testing our nano-edge appliances we saw similar behaviour, mainly caused by a lack of CPU power, unfortunately.
I would also toggle the "site read locality" switch; it should be enabled for stretched clusters but disabled for 2-node non-stretched.
Site read locality is enabled (it's a stretched 2-node). I turned it off temporarily and ran a benchmark (several runs with, then without, read locality). The writes are not affected (obviously), but read latency was about 20% worse, as both nodes now participate in the reads. As the people at that remote site had already gone home for the evening and it wasn't backup time yet, I was free to mess about while all those VMs sat mostly idle, so there was little interference (benchmark results were pretty consistent).
I'll try the FTT=0 test next just to see and chime in with the results.
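For what it's worth, this is roughly how I summarise the with/without runs to get the percentage change (sample values below are made up, not my actual benchmark numbers):

```python
from statistics import mean

# Hypothetical read-latency samples (ms) from repeated benchmark runs.
with_locality = [0.50, 0.52, 0.49, 0.51]
without_locality = [0.61, 0.60, 0.62, 0.59]

m_with, m_without = mean(with_locality), mean(without_locality)
change_pct = (m_without - m_with) / m_with * 100
print(f"read latency change without locality: {change_pct:+.0f}%")  # -> +20%
```

Averaging several runs each way keeps one noisy run from skewing the comparison.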
Something else, out of curiosity: is the processing of checksums offloaded to hardware (AES-NI-like)?
(Checksums are not encryption, but maybe the hardware can be used in a similar fashion.)
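As far as I know, vSAN's end-to-end checksum is CRC-32C, and modern x86 CPUs can accelerate that via the SSE4.2 CRC32 instruction, so it's "hardware-assisted" on the CPU itself rather than offloaded to a separate device like AES-NI-style crypto engines; treat that as my understanding, not gospel. To get a feel for what pure-software checksumming costs per core, here's a quick timing sketch (Python's stdlib only has plain CRC-32 via zlib, used here as a stand-in for CRC-32C):

```python
import time
import zlib

block = b"\xa5" * 4096          # one 4 KiB block, the typical checksum unit
n = 50_000                      # checksum ~195 MiB worth of blocks

start = time.perf_counter()
for _ in range(n):
    zlib.crc32(block)           # plain CRC-32; CRC-32C uses a different polynomial
elapsed = time.perf_counter() - start

mib_per_s = n * len(block) / (1024 * 1024) / elapsed
print(f"software CRC-32 throughput: {mib_per_s:.0f} MiB/s on one core")
```

On a low-clocked core, single-threaded checksum throughput like this is one of the places where an under-spec'd CPU shows up as added write latency.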