VMware Cloud Community
DSeaman
Enthusiast

VAAI used for FT disk scrubbing?

I've been performing a number of tests in our vSphere 4.1 environment against our 3PAR T400 array to quantify VAAI performance. I've verified that VAAI is enabled on the array, and that vSphere leverages VAAI for some operations, such as creating an eager zeroed thick (EZT) disk and Storage vMotion. I can create EZT disks at 10GB/sec (yes, bytes).

However, today I enabled FT for several powered-off VMs which did not have EZT disks, so vCenter started the conversion process. I was monitoring the array's host-facing ports and vCenter, and a tremendous amount of data (400MB/sec) was being sent from the ESX host to the array during the scrubbing process. On the back end of the array there was almost no I/O, since the 3PAR controllers have zero detect and thus don't physically write all the zeros to disk.

Now I would have expected VAAI to kick in during the scrubbing process, using the write zero command, and thus produce very minimal traffic over the SAN and the disk array's host ports.

Using the vCenter GUI to create a new VMDK disk I can select FT compatibility, and in that case it DOES use VAAI and almost instantly creates an EZT VMDK.

Does the automated FT disk scrubbing process not utilize the VAAI write zero command?

Derek Seaman
14 Replies
lamw
Community Manager

Did you monitor esxtop during the EZT process when enabling FT for the VM? I'm assuming the VM and all its virtual disks reside on your 3PAR T400 array? Did you verify that all 3 VAAI SCSI primitives are in fact enabled on your ESX(i) 4.1 host?

DSeaman
Enthusiast

I have no easy way to run esxtop (ESXi and no usable vMA), so no, I did not do that. Yes, all of the VMs are on datastores hosted on the T400. The ESXi hosts have not had their configuration changed, and I verified under DataMover that both options are set to the default, 1.

I've verified 'write zero' works when creating a new VMDK using FT mode.
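For reference, the settings mentioned above can be read back from the host itself (assuming Tech Support Mode or local shell access on ESX(i) 4.1; these are the standard VAAI advanced option names, so double-check them against your build):

```shell
# Read the VAAI advanced settings back from an ESX(i) 4.1 host.
# A value of 1 means enabled (the default on 4.1).
esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove    # XCOPY (full copy)
esxcfg-advcfg -g /DataMover/HardwareAcceleratedInit    # WRITE_SAME (block zero)
esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking     # ATS (hardware locking)
```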

Derek Seaman
lamw
Community Manager

If you can enable Tech Support Mode, you can use the local esxtop to verify whether or not the Write Same primitive is being used. You can also verify by looking at the logs; here is a VMware KB article regarding VAAI - http://kb.vmware.com/kb/1021976

Confirm in the KB that you don't fall into any of the caveats.

DSeaman
Enthusiast

Like I said, I know VAAI write same works perfectly fine except in the automated disk scrubbing scenario. It's very clear write same is not being used during vCenter's automated disk scrubbing, because it floods the SAN with traffic.

Derek Seaman
lamw
Community Manager

I just confirmed on my 3PAR T800 that the VAAI write same SCSI primitive is in fact utilized when enabling FT on a VM that requires its disks to be scrubbed. You can easily see this by viewing the esxtop VAAI stats. For my test on a dummy 40GB zeroedthick disk, I was getting ~45MB/s for "MBZERO/s" and no I/O was going through the host.

If you're seeing data go through the host, then something may not be set up correctly, or you fall into one of the caveats mentioned in the VMware KB that prevent VAAI use.

DSeaman
Enthusiast

My system does not fall under any of the caveats.

Derek Seaman
lamw
Community Manager

Ok, I think I see what you're seeing ... after I started the scrubbing, I noticed it was taking a while, and when I viewed the I/O stats for the host it was showing both MB/s R/W along with VAAI MBZERO/s. I was seeing ~80 MB/s R/W through the datamover in the host and ~40 MB/s via VAAI, but this is still extremely slow, which tells me it's still not leveraging VAAI 100%.

While this was still running, I created a dummy 10GB eagerzeroedthick disk and watched the MBZERO/s stat jump to ~490MB/s, so I know that operation is completely offloaded to the array.

This is kind of odd indeed, but I don't use FT in my environment, hence my original assumption that VAAI would be used. It is being used, but it seems like not all of the I/O operations are being offloaded.
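To put rough numbers on that split (my own back-of-the-envelope arithmetic, treating the ~80 MB/s datamover traffic and the ~40 MB/s MBZERO/s as the two components of the zeroing work, which is an assumption):

```shell
# Estimate what share of the zeroing went through VAAI WRITE_SAME,
# using the esxtop figures quoted above (both values are assumptions
# read off the display, not precise measurements).
datamover=80   # MB/s observed moving through the host datamover
vaai=40        # MB/s reported by the MBZERO/s counter
awk -v d="$datamover" -v v="$vaai" \
    'BEGIN { printf "VAAI share of total: %.0f%%\n", 100 * v / (d + v) }'
```

which works out to roughly a third of the zeroing being offloaded.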

DSeaman
Enthusiast

Thanks, at least that verifies what I am seeing on my end. We don't use FT either, but for other reasons I wanted all of the VMDKs to be EZT. These VMs were created from a template, and there's no GUI method I can see to specify EZT when cloning. 

Derek Seaman
lamw
Community Manager

I was hoping you could EZT the disk via vmkfstools and then enable FT, but it looks like using vmkfstools -k yields the same result. Now I'm wondering if this is an operation that is not supported by VAAI, or if it's a bug?

DSeaman
Enthusiast

Personally, I think this is a bug. My understanding is that VAAI should kick in whenever the kernel wants to write lots of zeros, and clearly turning a lazy zeroed disk into an EZT disk writes lots of zeros.

Derek Seaman
lamw
Community Manager

So here is some additional testing:

1. Creating a full 10GB eagerzeroedthick disk, which took 24 seconds, and I can verify the EZT operation is being VAAI offloaded:

time vmkfstools -c 10G test1.vmdk -d eagerzeroedthick

Creating disk 'test1.vmdk' and zeroing it out...

Create: 100% done.

real    0m24.193s

user    0m0.000s

sys     0m0.020s

2. Creating a 10GB "zeroedthick" disk, which will then be inflated to an eagerzeroedthick disk:

vmkfstools -c 10G test2.vmdk

3. Inflating the 10GB disk to eagerzeroedthick. This is basically zeroing out the disk, but as you can see it takes ~4 minutes. From esxtop, I can see I/O going through the ESX host while the VAAI write same primitive is being used. It seems like the VAAI operation is somehow gated by the reads, since MBWRITE/s & MBZERO/s seem to match, and both are approximately the same as MBREAD/s.

time vmkfstools -k test2.vmdk

Eagerly zeroing: 100% done.

real    4m15.034s

user    0m0.090s

sys     0m24.250s

4. Next, I took the disk I just inflated and performed a "clone", configuring the destination disk format as "eagerzeroedthick"; this operation is VAAI offloaded and takes ~35 seconds:

time vmkfstools -i test2.vmdk -d eagerzeroedthick test3.vmdk

Destination disk format: VMFS eagerzeroedthick

Cloning disk 'test2.vmdk'...

Clone: 100% done.

real    0m35.379s

user    0m0.010s

sys     0m0.060s

So my assumption is that when it needs to zero out an existing VMDK, it's not able to fully offload to the array and requires the VMkernel to perform some of that work, since I can see VAAI still being used during the inflation process. This is where I'm unclear whether this is by design or a limitation. I agree this sounds like a bug, since everyone has been marketing FT enablement as being much faster with write same. It definitely is super fast when you're talking about creating a new eagerzeroedthick disk, but for the FT/inflate use case I'm just not sure.
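Turning the three wall-clock times above into effective throughput makes the gap plain (simple arithmetic on the 10GB sizes and the `real` times reported by `time`):

```shell
# Effective MB/s for each 10GB operation above:
# create EZT (24s), inflate with -k (4m15s = 255s), clone to EZT (35s).
for entry in "create-ezt:24" "inflate-k:255" "clone-ezt:35"; do
  name=${entry%%:*}
  secs=${entry##*:}
  awk -v n="$name" -v s="$secs" \
      'BEGIN { printf "%-10s %4.0f MB/s\n", n, 10 * 1024 / s }'
done
```

The inflate path runs an order of magnitude slower (~40 MB/s) than either fully offloaded operation, which lines up with the MBZERO/s gating noted above.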

I haven't found anyone else to confirm/deny these claims, but I would be interested to see if anyone else can reproduce this behavior.

DSeaman
Enthusiast

Your EZT test seems to be MUCH slower than the results on my T400.

http://derek858.blogspot.com/2010/12/3par-vaai-write-same-test-results-upto.html

In 24 seconds I could create a 240GB EZT disk, and a 70GB disk in 7 seconds. Basically a consistent 10GB/sec.
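Both data points work out to the same rate (just arithmetic on the sizes and times quoted above):

```shell
# Sanity-check the ~10GB/sec claim from the two timings above.
awk 'BEGIN { printf "%.0f GB/sec\n", 240 / 24 }'   # 240GB EZT disk in 24s
awk 'BEGIN { printf "%.0f GB/sec\n",  70 /  7 }'   # 70GB EZT disk in 7s
```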

Derek Seaman
lamw
Community Manager

I don't have zero detect enabled for the VV; I've seen that give a huge boost as well when you're doing EZT.

DSeaman
Enthusiast

Ah ya, that's the difference! Zero detect kicks butt.

Derek Seaman