Storage Performance: VMFS and Protocols

Introduction

VMware's customers frequently ask us about the storage stack. The two most common questions about our storage performance are:

  1. Which storage protocol performs best?

  2. Does VMFS scale to meet the demands of many servers and VMs?

This document covers a few key points that help answer both questions.

Storage Protocols

VMware published a paper comparing storage protocols in 2008. This paper detailed the two key characteristics of ESX's storage stack:

  1. The hypervisor is easily able to drive the storage connection to link speed.

  2. Configurations where protocol management happens in the HBA (Fibre Channel and HW iSCSI) are more CPU efficient.

On the first point, consider the following graph from page three of the paper:

[Figure: throughput of the Fibre Channel, hardware iSCSI, software iSCSI, and NFS configurations, from page three of the paper]

Note that all four test cases drive the storage to link speed: 2 Gb/s with the Fibre Channel HBA and 1 Gb/s with the other three. In short, if throughput is your goal, make decisions based on link speed. The rest of the paper shows that response time is similar across all of the configurations, with only slight differences in throughput among some of the protocols.
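To make the link-speed point concrete, here is a minimal back-of-the-envelope sketch (in Python, since the article itself contains no code) of the throughput ceiling each configuration can reach. The link speeds are the ones cited above; the 10-bits-per-byte on-wire factor is an assumption to allow for encoding and framing overhead, not a figure from the paper.

```python
# Rough throughput ceilings implied by link speed alone.
# The 10 bits-per-byte factor is an assumed allowance for encoding/framing
# overhead; it is an illustration value, not a number from the paper.

LINK_SPEEDS_GBPS = {
    "Fibre Channel (2 Gb/s HBA)": 2.0,
    "HW iSCSI (1 GbE)": 1.0,
    "SW iSCSI (1 GbE)": 1.0,
    "NFS (1 GbE)": 1.0,
}

BITS_PER_BYTE_ON_WIRE = 10  # assumed overhead factor

def max_throughput_mbps(link_gbps: float) -> float:
    """Approximate usable throughput ceiling in MB/s for a given link speed."""
    return link_gbps * 1_000_000_000 / BITS_PER_BYTE_ON_WIRE / 1_000_000

for name, gbps in LINK_SPEEDS_GBPS.items():
    print(f"{name:28s} ~{max_throughput_mbps(gbps):4.0f} MB/s ceiling")
```

Run as-is, the sketch simply shows the 2x gap between the 2 Gb/s Fibre Channel configuration (~200 MB/s) and the three 1 GbE configurations (~100 MB/s), which is exactly the gap visible in the graph.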

This brings us to the second point above: the CPU does less work when protocol management is off-loaded to the HBA. FC and HW iSCSI configurations therefore leave additional CPU cycles for the VMs' work, which also explains the slight throughput differences in the paper's other graphs. The paper's CPU-efficiency results quantify this gap.

The increased overhead of running software iSCSI or NFS is due to the VMkernel managing those protocols. It's worth noting that the proliferation of iSCSI in the enterprise has led VMware to spend considerable effort improving the efficiency of SW iSCSI. Expect its efficiency to improve dramatically in future releases.
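As a rough illustration of what offload buys you, the sketch below estimates how much of one CPU core each protocol might spend on storage I/O at a given request rate. The cycles-per-I/O figures, core speed, and IOPS number are hypothetical placeholders, not measurements from the paper; only the relative ordering (offloaded protocols cost less host CPU than SW iSCSI and NFS) reflects its findings.

```python
# Back-of-the-envelope model of host CPU spent on storage I/O.
# All numbers below are HYPOTHETICAL placeholders for illustration only.

HYPOTHETICAL_CYCLES_PER_IO = {
    "Fibre Channel": 20_000,   # protocol handled in the HBA
    "HW iSCSI": 22_000,        # protocol handled in the HBA
    "SW iSCSI": 45_000,        # VMkernel builds iSCSI PDUs and runs TCP/IP
    "NFS": 50_000,             # VMkernel runs the NFS client and TCP/IP
}

CORE_HZ = 2_500_000_000   # one 2.5 GHz core, assumed for illustration
IOPS = 25_000             # assumed workload intensity

for proto, cycles in HYPOTHETICAL_CYCLES_PER_IO.items():
    core_util = cycles * IOPS / CORE_HZ
    print(f"{proto:13s} ~{core_util:4.0%} of one core spent on storage I/O")
```

The point of the model is not the absolute numbers but the headroom: whatever fraction of a core the software-managed protocols consume is capacity that the offloaded configurations hand back to the VMs.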

VMFS Scalability

Many in the industry erroneously believe that VMFS won't scale as storage demands grow. Often SCSI reservations and disk locking are cited as the technical-sounding but vaguely-supported reason for this claim. It's worth sampling data from our scalable storage performance paper to debunk this myth.

This chart is a favorite in our worldwide tours as we address VMFS scalability. It was first introduced in a VMFS scalability blog article that went live in February of 2008. It shows the results of using 64 hosts to generate a variety of traffic on a single VMFS volume, and it offers a wealth of information on VMFS and storage access patterns. For instance:

  • The aggregate random write throughput, shown in cyan in the middle, stays perfectly flat as the host count grows from 1 to 64; adding hosts causes no degradation.

  • The aggregate random read throughput is initially limited by the few disks being accessed, but it ultimately matches the random write throughput as more disks are brought to bear to serve the large number of random reads.

  • The sequential write activity, which highlights the strengths of today's arrays, delivers the largest total throughput, dropping only slightly as the array manages so many connections.

  • But the sequential read activity drops off dramatically as hosts are added.

This last point, the degradation in aggregate sequential read capability, is an artifact of the workload that database administrators should keep in mind: multiple concurrent sequential reads approximate random activity. Why is this? As many hosts request more and more sequential data, the array interleaves those requests to maintain response times. The sequential accesses get "shuffled" together, and the result is an effectively random access pattern.
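A small simulation makes the "shuffling" effect easy to see. The sketch below interleaves several per-host sequential LBA streams, roughly the way a fair array scheduler might, and then measures how few of the transitions in the merged stream remain sequential. The host count, region size, and I/O count are arbitrary illustration values.

```python
# Interleave several sequential streams and inspect the merged access pattern.

HOSTS = 8                    # hosts issuing sequential reads concurrently
IO_BLOCKS = 16               # sequential I/Os issued per host
REGION_BLOCKS = 1_000_000    # each host's data sits in its own LBA region

def host_stream(host_id: int) -> list[int]:
    """One host's purely sequential block addresses."""
    start = host_id * REGION_BLOCKS
    return [start + i for i in range(IO_BLOCKS)]

# Round-robin interleave the per-host streams, roughly what the array sees
# when it services all hosts fairly to keep response times even.
merged = [lba for group in zip(*(host_stream(h) for h in range(HOSTS)))
          for lba in group]

jumps = [abs(b - a) for a, b in zip(merged, merged[1:])]
sequential = sum(1 for j in jumps if j == 1)
print(f"{sequential} of {len(jumps)} transitions in the merged stream are sequential")
print(f"median jump between consecutive requests: {sorted(jumps)[len(jumps) // 2]} blocks")
```

Each host's own stream is perfectly sequential, yet the stream the array actually services contains almost no back-to-back addresses, which is why aggregate sequential read throughput falls toward random-read levels as hosts are added.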

In short, VMFS has no scalability problems as many hosts drive tremendous amounts of traffic to a single volume. If the data isn't convincing enough, consider the following: no SCSI reservations are used during normal data access, so virtual machine storage access imposes no scalability limitations. A word of caution, though: the file system is locked during administrative operations that change the metadata on the volume. This means that virtual machine creation or destruction will result in file system locks, so perform these operations during off-peak hours.
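The sketch below is a toy model of that distinction, not actual ESX or VMFS code: ordinary guest I/O proceeds with no reservation or volume lock, while metadata operations such as VM creation briefly serialize on a volume-wide lock. All class and method names here are illustrative assumptions.

```python
# Toy model of the locking behavior described above. Not VMFS code.

import threading

class ToyVmfsVolume:
    def __init__(self) -> None:
        # Lock taken only for metadata changes, never for guest data I/O.
        self._metadata_lock = threading.Lock()

    def guest_io(self, vm: str, lba: int, data: bytes) -> None:
        # Data-path access: no SCSI reservation and no volume lock, so many
        # hosts can drive I/O to the same volume concurrently.
        pass  # issue the read/write here

    def create_vm(self, vm: str) -> None:
        # Metadata change: allocate files and update on-disk structures under
        # the volume-wide lock, which is why creation and destruction are best
        # scheduled for off-peak hours.
        with self._metadata_lock:
            pass  # allocate descriptors, grow files, update metadata

volume = ToyVmfsVolume()
volume.guest_io("vm01", lba=4096, data=b"\x00" * 512)  # never blocks on the lock
volume.create_vm("vm02")                               # briefly holds the lock
```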
