VMware Cloud Community
amosperrins
Contributor

iSCSI VAAI in Windows guests

I was wowed by the difference between our VAAI & non-VAAI storage and was wondering if guests can leverage these capabilities? We've got the following setup:

  • ESXi 6.0u3
  • vSphere 6.5
  • HP MSA 2040 SAN (iSCSI, 4x1Gbps MPIO, IOPS=1)
  • Windows Server 2012 R2 (guest OS)

We're seeing expected speeds when cloning/copying VMs (all I/O is offloaded to the array, copies are ~260MB/s) but in-guest file copies in Server 2012 R2 are still going via the SAN fabric through the ESXi host (average transfer speed is about 50MB/s, standalone reads & writes are >300MB/s). I'm assuming this is expected behaviour but is there any way to leverage the iSCSI primitives (specifically XCOPY) in the guest OS via the ESXi I/O stack or would I have to ditch my VMFS datastores and expose the LUNs directly to the Windows OS? Have I fundamentally misunderstood how VAAI works?
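For context, the host-side offload status can be confirmed from the ESXi shell (just a sketch; it needs no environment-specific values):

    # Per-device VAAI primitive status (ATS / Clone / Zero / Delete) for every attached LUN
    esxcli storage core device vaai status get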

Thanks for your time!

3 Replies
Rsahi
Enthusiast

I don't think Windows Server 2012 R2 has any native ability to leverage VAAI, because VAAI (vSphere APIs for Array Integration) is specific to the vSphere environment, ESXi to be precise. If you want to improve performance within the VMs, look at enabling VVols; with VVols the offloading is handled by the storage array.

There are currently three areas where VAAI enables vSphere to act more efficiently for certain storage-related operations:

Copy offload. Operations that copy virtual disk files, such as VM cloning or deploying new VMs from templates, can be hardware-accelerated by array offloads rather than file-level copy operations at the ESX server. This technology is also leveraged for the Storage vMotion function, which moves a VM from one data store to another. VMware's Full Copy operation can greatly speed up any copy-related operation, which makes deploying new VMs a much quicker process. This can be especially beneficial to any environment where VMs are provisioned on a frequent basis or when many VMs need to be created at one time.
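As a quick sanity check (a sketch, assuming it is run in the ESXi shell), the Full Copy primitive has a host-side toggle you can inspect or re-enable:

    # Full Copy (XCOPY) offload toggle on the host; "Int Value: 1" means enabled
    esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove

    # Re-enable it if it has been switched off at some point
    esxcli system settings advanced set -o /DataMover/HardwareAcceleratedMove -i 1

With the offload active, a clone or Storage vMotion between datastores on the same VAAI array shows very little read/write traffic on the host's storage adapters, because the array performs the copy internally.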

Write same offload. Before any block of a virtual disk can initially be written to, it needs to be “zeroed” first. (A disk block with no data has a Null value; zeroing a disk block writes a zero to it to clear any data that may already exist on that disk block from deleted VMs.) Default “lazy zeroed” virtual disks (those zeroed on demand as each block is initially written to) do not zero each disk block until it is written to for the first time. This causes a slight performance penalty and can leave stale data exposed to the guest OS. “Eager-zeroed” virtual disks (those on which every disk block is zeroed at the time of creation) can be used instead, to eliminate the performance penalty that occurs on first write to a disk block and to erase any previous VM data that may have resided on those disk blocks. The formatting process when zeroing disk blocks sends gigabytes of zeros (hence the “write same” moniker) from the ESX/ESXi host to the array, which can be both a time-consuming and resource-intensive process. With VMware’s Block Zeroing operation, the array can handle the process of zeroing all of the disk blocks much more efficiently. Instead of having the host wait for the operation to complete, the array simply signals that the operation has completed right away and handles the process on its own without involving the host.
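The Block Zeroing offload is easiest to see when creating an eager-zeroed thick disk. A minimal sketch follows; the datastore and folder names are placeholders:

    # Host-side toggle for the WRITE SAME / Block Zero primitive; "Int Value: 1" means enabled
    esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit

    # Create a 20 GB eager-zeroed thick disk; with the offload, the array zeroes the blocks
    # itself instead of the host streaming zeros over the fabric
    vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/<datastore>/<vm-folder>/ezt-test.vmdk

On a VAAI-capable array the eager-zeroed create completes much faster than it would if the host had to write all of those zeros itself.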

Hardware-assisted locking. The VMFS file system allows for multiple hosts to access the same shared LUNs concurrently, which is necessary for features like vMotion to work. VMFS has a built-in safety mechanism to prevent a VM from being run on or modified by more than one host simultaneously. vSphere employs “SCSI reservations” as its traditional file locking mechanism, which locks an entire LUN using the RESERVE SCSI command whenever certain storage-related operations, such as incremental snapshot growth, occur. This helps to avoid corruption but can delay storage tasks from completing as hosts have to wait for the LUN to be unlocked with the RELEASE SCSI command before they can write to it. Atomic Test and Set (ATS) is a hardware-assisted locking method that offloads the locking mechanism to the storage array, which can lock at individual disk blocks instead of the entire LUN. This allows the rest of the LUN to continue to be accessed while the lock occurs, helping to avoid performance degradation. It also allows for more hosts to be deployed in a cluster with VMFS data stores and more VMs to be stored on a LUN.
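The locking side can be checked the same way; a sketch, with the datastore name as a placeholder:

    # Host-side toggle for hardware-assisted locking (ATS); "Int Value: 1" means enabled
    esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking

    # For a VMFS5 datastore created on a VAAI array, the properties output
    # shows "Mode: public ATS-only" when ATS is being used for locking
    vmkfstools -Ph /vmfs/volumes/<datastore-name>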

Vendor support for VAAI

Currently, the vStorage APIs for Array Integration provide their main benefits for block-based storage arrays (Fibre Channel or iSCSI); NFS arrays rely on the separate VAAI-NAS primitives, which require a vendor-supplied plug-in on the host. Vendor support for VAAI has been varied, with some vendors, such as EMC, embracing it right away and other vendors taking longer to integrate it into all their storage array models. To find out which storage arrays support specific vStorage API features, you can check the VMware Compatibility Guide for storage/SANs.
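If you want to see how a particular host is claiming the array for VAAI, a rough sketch (ESXi shell again):

    # VAAI plugins loaded on the host, and the claim rules that map devices to them
    esxcli storage core plugin list --plugin-class=VAAI
    esxcli storage core claimrule list --claimrule-class=VAAI

Many arrays implement the standard T10 SCSI primitives and don't need a vendor-specific VAAI plug-in at all, so an empty plugin list doesn't by itself mean the offloads are missing; the per-device output of 'esxcli storage core device vaai status get' is the more reliable check.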

amosperrins
Contributor

Thanks for the reply Rsahi. I'm just investigating whether VVols are supported on the MSA 2040 (VASA seems to be) - this is a live system but we've got some quiet time coming up in just over a week, so I might be able to do some testing then. Thanks for pointing me in the right direction!

amosperrins
Contributor

Hi Rsahi,

It turns out our MSA 2040 doesn't support VVols - I'll just have to stick with datastores for now. Thanks for the info though!
