We have recently deployed VSAN at two new datacenter locations. The first will become our primary DC and the second will be our DR site.
Both sites are using new HPE hardware as of May 2016. The primary site contains four identical VSAN hosts that were spec'd very generously:
- dual Xeon E5-2699 v3 (18core/36thread)
- 512GB RAM
- 4x 800GB SSD, 12G SAS (model HP MO0800JEFPB)
- 20x 1.2TB HDD, 10K 10G SAS (model HP EG1200JEHMC)
- 4x 10GbE NIC
VSAN config on each host is four disk groups, each consisting of one 800GB SSD + 5 HDDs.
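For reference, here is a quick sketch of what that layout works out to per host and across the primary site. It only multiplies out the figures listed above; the variable names are mine, and the capacity numbers are raw (before any storage policy or formatting overhead).

```python
# Figures taken from the hardware list above; names are illustrative only.
HOSTS = 4
DISK_GROUPS_PER_HOST = 4
SSD_GB = 800            # one cache SSD per disk group
HDD_GB = 1200           # 1.2TB capacity drives
HDDS_PER_GROUP = 5

hdds_per_host = DISK_GROUPS_PER_HOST * HDDS_PER_GROUP
cache_gb_per_host = DISK_GROUPS_PER_HOST * SSD_GB
raw_capacity_gb_per_host = hdds_per_host * HDD_GB
raw_capacity_gb_cluster = HOSTS * raw_capacity_gb_per_host

# Cache size relative to the raw capacity behind it in a single disk group.
cache_ratio = SSD_GB / (HDDS_PER_GROUP * HDD_GB)

print(f"HDDs per host:            {hdds_per_host}")
print(f"Cache per host:           {cache_gb_per_host} GB")
print(f"Raw capacity per host:    {raw_capacity_gb_per_host} GB")
print(f"Raw capacity, 4 hosts:    {raw_capacity_gb_cluster} GB")
print(f"Cache-to-raw ratio/group: {cache_ratio:.1%}")
```

So all 20 HDDs and all 4 SSDs in each host are accounted for by the four disk groups.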
VSAN traffic on each host is handled by 2x 10GbE NICs working together in an LACP LAG. The NICs connect to the facility switching infrastructure, which the service provider has set up for our organization using a specific VLAN for VSAN traffic. We have no visibility into the switching infrastructure, so I cannot provide data from the switches at this point in time.
When running the built-in VSAN Storage Performance Test using workload type "Basic sanity test, focus on flash cache layer", I'm seeing poor results in the Throughput MB/s and Max Latency. The results below are averaged across all tested components (10 components are displayed per host).
| IOPS | Throughput (MB/s) | Average Latency (ms) | Maximum Latency (ms) |
|---|---|---|---|
| 2224 | 8.7 | 0.66 | 304 |
Maybe I am not fully understanding this test, or the test is not truly representative of real-world performance, but 8.7MB/s seems extremely slow for the hardware and network we have in place. The max latency of 304ms also seems excessively high. The monitoring tools we use start throwing alerts when storage latency exceeds 50-75ms, so these results aren't looking all that good.
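One rough sanity check on those numbers: if throughput is simply IOPS multiplied by the I/O size, the reported figures imply a fairly small I/O size. The sketch below is back-of-the-envelope only. The relation throughput = IOPS × I/O size and the reading of "MB" as 1024 × 1024 bytes are my assumptions; the test output does not state the block size it uses.

```python
def implied_io_size_kib(iops: float, throughput_mb_s: float,
                        bytes_per_mb: int = 1024 * 1024) -> float:
    """Return the average I/O size (KiB) implied by IOPS and throughput.

    Assumes throughput = IOPS * I/O size, which is a simplification.
    """
    bytes_per_second = throughput_mb_s * bytes_per_mb
    return bytes_per_second / iops / 1024


# Values from the results table above.
io_size = implied_io_size_kib(iops=2224, throughput_mb_s=8.7)
print(f"Implied I/O size: {io_size:.2f} KiB")
```

If that assumption holds, the test is issuing I/Os of roughly 4 KiB each, in which case the MB/s column may say more about the I/O size than about what the hardware can sustain with larger blocks. That still would not explain the 304ms maximum latency.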
Can anyone help me understand why this is happening? Again, I'm not familiar with how this test stacks up against real-world performance, so if others have had similarly poor results while using the built-in benchmark, I would be interested to hear about those scenarios as well.
We are currently running vanilla vCenter/ESXi 6.0. I've recommended to management that we upgrade the environment to 6.0 Update 2 to expose the new VSAN Performance Service, which should provide better real-world data, but for now we would like to better understand the results of the built-in benchmark test.