Hello,
What protocol is used in VMware vSAN?
iSCSI? FCoE? FCIP?
If you're referring to a host's IOPS to the physical HDDs/SSDs, then it's all local storage, so you're riding the bus from your storage controller to the disks.
If you're talking about vSAN communications within the cluster, then it uses a proprietary protocol.
https://blogs.vmware.com/virtualblocks/2015/05/29/20-common-vsan-questions/
It's a proprietary protocol called Reliable Datagram Transport (RDT). More info: https://download3.vmware.com/vcat/vmw-vcloud-architecture-toolkit-spv1-webworks/index.html#page/Stor... + https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vcat/architecting-vmware-virtual-s...
Thanks, but where can I read a more detailed description of the protocol?
Does it run on top of Ethernet frames? On top of IP? Or over another protocol?
@vm7user , RDT uses the TCP protocol for data transmission between nodes (port 2233) and UDP for cluster membership (port 12321 in modern versions). As this is a proprietary protocol, you aren't going to find the specification details on external-facing pages; if you have internal VMware access, they are relatively easy to find. If you are working for a third-party developer that wants/needs to understand how this works, then the likely avenue is contacting the SDK support teams to see what information they can provide.
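As a quick sanity check of the port details above, a small sketch like this can verify that the RDT unicast TCP port is reachable between nodes. The node address in the example is a hypothetical placeholder, not from the thread; substitute your own vSAN VMkernel IPs.

```python
import socket

# RDT data traffic uses TCP port 2233 (per the reply above).
RDT_TCP_PORT = 2233

def rdt_port_open(host: str, port: int = RDT_TCP_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder address):
# rdt_port_open("192.168.10.11")
```

This only tells you the port accepts connections, not that RDT itself is healthy, but it is a useful first check when cluster communication looks broken.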
I am very surprised that the slow TCP/IP network stack is used. This introduces high latencies.
@vm7user, if you think microseconds (µs) of network transmission time are the main cause of end-application latency (e.g. at the vSCSI/VM/OS level), then I would advise you to do more research on the entire stack that data has to traverse for end consumers of any storage solution.
Will vSAN utilise/support iWARP (or any other form of RDMA) in the future? Very likely yes, as this is an obvious path forward once enough hardware is in the field supporting it (and enough consumers are using it for it not to be niche).
10-30 µs make no difference in real-life cases, because the end-to-end latency of all-flash storage at the guest level is 0.5-5 ms on average. Also, even the latency deviation of a fast enterprise NVMe SSD (not the mean, but the 95th-99th percentile) is on the order of tens to hundreds of microseconds.
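Putting rough numbers on the argument above: the share of guest-level latency that 10-30 µs of network time represents, using only the figures quoted in the thread.

```python
# Figures from the thread, converted to microseconds.
network_us = (10, 30)          # extra network-stack latency
end_to_end_us = (500, 5000)    # 0.5-5 ms guest-level latency

best_case_share = network_us[0] / end_to_end_us[1]    # 10 us of 5 ms
worst_case_share = network_us[1] / end_to_end_us[0]   # 30 us of 0.5 ms

print(f"network share of end-to-end latency: "
      f"{best_case_share:.1%} to {worst_case_share:.1%}")
# prints: network share of end-to-end latency: 0.2% to 6.0%
```

So even in the worst case quoted, the network stack accounts for a few percent of what the guest actually sees.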
More than that: there were public benchmarks of vSAN on RDMA and TCP/IP a few years ago that didn't show a huge difference - http://www.yellow-bricks.com/2018/09/05/hci2476bu-tech-preview-rdma-and-next-gen-storage-tech-for-vs...
I am planning a vSAN deployment using Optane SSDs with 10 µs latency.
The extra latency in the network stack and software is critical for me.
Hi,
I recommend using PCIe NVMe for the cache tier and standard SSDs for the capacity drives. In real-world tests, all-SSD vSAN configs beat all-Optane-NVMe configs; using all Optane NVMe drives does not bring a big performance difference. One of my customers has an all-Optane-NVMe 6-node cluster and can only get 150K IOPS at 1 ms aggregated latency with a 16K block size. Another customer has 6 all-flash SSD nodes and gets 200K with the same tests at 1 ms. The tests were done by the DellEMC team.
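For context on those benchmark figures, here is the aggregate throughput they imply (IOPS times the 16K block size). Only the numbers from the post are used; everything else is simple arithmetic.

```python
# 16K block size used in the tests described above.
BLOCK_BYTES = 16 * 1024

def throughput_gbps(iops: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Aggregate throughput in GB/s implied by a given IOPS figure."""
    return iops * block_bytes / 1e9

print(f"all-Optane NVMe cluster: {throughput_gbps(150_000):.2f} GB/s")
print(f"all-flash SSD cluster:   {throughput_gbps(200_000):.2f} GB/s")
# prints:
# all-Optane NVMe cluster: 2.46 GB/s
# all-flash SSD cluster:   3.28 GB/s
```

At these rates a 6-node cluster is pushing several GB/s through its network and controllers, which is one reason the drive medium alone doesn't dictate the result.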
@ahmet_kececiler This is surprising. What was the Dell/VMware response to this? Were the configs of each deployment confirmed and verified so that no other issues could be occurring?
They said everything is normal and performance is just as expected. I was really surprised. I have deep knowledge of both the vSAN and Nutanix platforms, and I can't accept the results myself. But they said it's OK, and the customers are happy with it, though I am not.
All the configs were made by DellEMC, the support team checked them, and everything is fine with the installations.
And one more thing about NVMe: none of the Nutanix guys suggests using NVMe drives; they prefer SSD drives instead of NVMe every time.
Just an FYI for @vm7user , starting with 7.0 Update 2, vSAN now also supports RDMA.
Yes, I really want to try RDMA with NVMe drives.
Hi,
when you say data between nodes, does that also include virtual machine file data, i.e. virtual disk files and such?
Lately I have been getting RDT checksum errors on my hosts in the cluster, so I am trying to figure out what it can transfer and where errors can occur (NICs, vmhba adapters, physical switches, ...).