VMware Cloud Community
SyenceLabb
Contributor

vSAN deduplication and compression interoperability with OS level deduplication

Hi -

I'm looking for any official guidance from VMware on running a deduplication technology on top of the deduplication that takes place inside the vSAN kernel. All I could find was a Reddit post stating it would just be a waste of CPU cycles and not worth the hassle of turning it on in two places. I went ahead and tested this in a test vSAN 6.6 environment that has 10 hosts and two Windows Server 2016 DFS file servers. Each file server has deduplication turned on at the OS level, running against all the volumes that are serving up data. Each file server holds 40 TB.

What I'm noticing is a lot of temporary overhead (94 TB of 220 TB total before dedupe and compression) during object resync operations that have been running for days. I currently doubt this has anything to do with deduplication being on within Windows (I recently had a lot of data moving around), but I would like to be pointed to some literature on this to rule out any weird quirks.

Thanks.

1 Reply
GreatWhiteTec
VMware Employee

Hi SyenceLabb,

As far as literature goes, you can find a lot of the official documents at StorageHub.vmware.com. Here is some info about DD/C: Storage and Availability Technical Documents

The dedupe ratio/efficiency will depend on many variables, such as the number of VMs, the amount of unique data, the number and size of disk groups, etc. Although DD/C is enabled at the cluster level, it dedupes and compresses at the disk group level during the destage process. If a 4K block cannot be deduped/compressed to a smaller chunk (2K or less), then we don't waste resources trying to dedupe/compress it for a tiny gain in space.
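To make the 4K/2K rule above concrete, here is a rough sketch of that kind of decision in Python. It is only an illustration, not vSAN's actual code: the function name `destage_block` and the constants are made up for this example, and `zlib` is used purely as a stand-in compressor.

```python
import random
import zlib
from typing import Tuple

BLOCK_SIZE = 4096          # 4K block, as described above
COMPRESS_THRESHOLD = 2048  # keep the compressed form only if it is 2K or less


def destage_block(block: bytes) -> Tuple[str, bytes]:
    """Hypothetical helper: decide how a single 4K block would be stored.

    zlib stands in for the real compressor; this is an assumption made
    for illustration only.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("expected a 4K block")
    candidate = zlib.compress(block)
    if len(candidate) <= COMPRESS_THRESHOLD:
        # Worth it: the block shrinks to half its size or better.
        return ("compressed", candidate)
    # Not worth it: store the original 4K block untouched.
    return ("raw", block)


# A block of zeros compresses very well.
zero_block = bytes(BLOCK_SIZE)

# Pseudo-random bytes behave like data that was already compressed upstream.
rng = random.Random(0)
random_block = bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))

print(destage_block(zero_block)[0])
print(destage_block(random_block)[0])
```

The zero-filled block takes the "compressed" path, while the random block is stored as-is. Data that has already been compressed before it reaches the datastore tends to look like the second case, so a second pass over it would be expected to save little.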

Now, the temporary overhead you are seeing is the space "reserved" while objects are being moved. See the KB for more info: VMware Knowledge Base

Hope this helps
