Hi All,
We're having some vSAN write performance issues and I'd appreciate your thoughts.
The basic spec:
5x vSAN ready nodes, each with 2x AMD EPYC 7302 16-core processors, 2TB RAM, and 20x NVMe disks across 4 disk groups. 4x Mellanox 25GbE NICs, jumbo frames configured end-to-end.
When running any workload, including HCIBench, we are observing really poor write performance. See below: 30+ ms write latency sustained for 30 minutes. Reads are through the roof at 400k+ IOPS, while writes sit between 20-40k IOPS depending on parameters. It took 12 hours to consolidate a 10TB snapshot the other day!
Things I have tried:
Notes:
Any ideas please? We really expected better.
Hi Brian,
We seem to be facing this exact issue with Splunk on vSAN. I see 512K blocks getting thrown at vSAN, and our Linux admins confirmed that max_sectors_kb still defaults to 512, which seems to be the issue (will confirm next week).
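In case it helps anyone else checking the same thing, here's a minimal sketch for inspecting and capping max_sectors_kb on a Linux guest. The device names, the 64K value, and the udev rule filename are assumptions based on this thread, not an official vSAN recommendation:

```shell
# Inspect the current max I/O size per block device (value is in KB)
for dev in /sys/block/sd*/queue/max_sectors_kb; do
    echo "$dev: $(cat "$dev")"
done

# Temporary change (lost on reboot) - cap sdb's I/O size at 64KB
echo 64 > /sys/block/sdb/queue/max_sectors_kb

# Persistent change via a udev rule (hypothetical filename)
cat <<'EOF' > /etc/udev/rules.d/99-max-sectors.rules
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd*", ATTR{queue/max_sectors_kb}="64"
EOF
udevadm control --reload-rules && udevadm trigger
```

The udev route applies the cap to any matching disk that appears later, which is why we'd prefer it over echoing into sysfs at boot.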
I am curious, though: we default pvscsi in our Puppet configs to the settings in https://kb.vmware.com/s/article/2053145:
vmw_pvscsi.cmd_per_lun=254
vmw_pvscsi.ring_pages=32
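For reference, the way we apply those KB 2053145 queue-depth settings persistently in the guest is roughly the following (a sketch for RHEL-family distros; the modprobe.d filename is our choice, and the initramfs rebuild command differs on other distros):

```shell
# Set the PVSCSI module options from VMware KB 2053145
cat <<'EOF' > /etc/modprobe.d/vmw_pvscsi.conf
options vmw_pvscsi cmd_per_lun=254 ring_pages=32
EOF

# vmw_pvscsi loads from the initramfs, so rebuild it, then reboot
dracut -f
```

The KB also allows passing them on the kernel boot line instead (vmw_pvscsi.cmd_per_lun=254 vmw_pvscsi.ring_pages=32), which is where the dotted form in our Puppet config comes from.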
What made you move away from the large-scale I/O pvscsi settings after setting max_sectors_kb to 64K?