VMware Cloud Community
RKDRnD
Enthusiast

Help with understanding vSAN capacity utilization (sizing)

Please explain my vSAN volume storage utilization.

- Two-node vSAN cluster (with a witness appliance)

- Two Fault domains.

- Primary level of failures to tolerate = 1

- RAID-1

1 running VM (Windows) - 40GB HDD Thin-provisioned (Guest used space = 19GB), 4GB RAM.

vSAN datastore total capacity = 13.97 TB (Capacity tier = 1.9TB SSDs x 8)

Used capacity of the datastore is reported as 1.34 TB. Why?
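
As a quick sanity check on the total capacity figure, here is a minimal sketch of the unit conversion. It assumes the "1.9TB" SSDs are 1.92 TB drives in decimal units and that the UI reports binary units (TiB) under the "TB" label; neither is stated in the post.

```python
# Assumption: each "1.9TB" SSD is 1.92 TB in decimal units (1.92e12 bytes).
DRIVE_BYTES = 1.92e12
DRIVE_COUNT = 8

# Assumption: the datastore figures are reported in binary units (1 TiB = 2**40 bytes).
raw_bytes = DRIVE_BYTES * DRIVE_COUNT
raw_tib = raw_bytes / 2**40

# Share of the datastore reported as used (1.34 out of 13.97, same units).
used_fraction = 1.34 / 13.97

print(f"raw capacity: {raw_tib:.2f} TiB")
print(f"used: {used_fraction:.1%} of the datastore")
```

Under those assumptions the eight drives come out at the reported 13.97, so the total is simply raw capacity across both nodes, before any mirroring is taken into account.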

5 Replies
TheBobkin
Champion

Hello RKDRnD​,

"1 running VM"

Note that it is not only powered-on VMs that use space: VMs that are stored on the vsanDatastore but not registered in inventory also consume space, as does any other data stored there, e.g. ISO images.

"Used capacity of the datastore is reported as 1.34 TB. Why?"

Other things can also use a relatively substantial amount of space, such as Deduplication overhead, which is allocated entirely up-front (this *should* be incremental in the latest versions, e.g. 6.7 U2, but I have not confirmed this). The capacity breakdown in the UI should tell you what is using the space:

Cluster > Monitor > vSAN > Capacity

Bob

RKDRnD
Enthusiast

Thanks for that. It seems like 757GB is reserved for Dedupe overhead. However, even if I add up the sizes of all the files in the vsan datastore, they don't come close to the remainder of the used capacity, neither before dedupe nor, of course, after dedupe.

Please clarify one other thing - what do the files sizes reported in the Files tab of the vSAN datastore represent? Are the file sizes actual or after de-dupe?

For every VM directory, I see a second directory with identical contents; I assume this is my RAID-1 mirror? However, I cannot make sense of the vmdk sizes in each of the directories. Even right after the VM provisioning (before it was powered on for the first time) the vmdk size is more than twice the size of what it is in the template from which the VM was provisioned, which sits on a non-vsan datastore. Why? On both ends the policy is set to thin-provisioning, so I don't expect the vmdk to inflate on vsan; or do I?
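
To put a number on "the remainder of the used capacity", a rough sketch of the subtraction follows. It assumes both figures come from the same UI and use the same binary units (1 TB = 1024 GB), which the thread does not confirm.

```python
# Figures quoted in this thread.
USED_TB = 1.34              # used capacity reported for the datastore
TOTAL_TB = 13.97            # total capacity reported for the datastore
DEDUPE_OVERHEAD_GB = 757    # reported as reserved for dedupe overhead

# Assumption: both figures use binary units, so 1 TB = 1024 GB.
GB_PER_TB = 1024

used_gb = USED_TB * GB_PER_TB
remaining_gb = used_gb - DEDUPE_OVERHEAD_GB
overhead_share = DEDUPE_OVERHEAD_GB / (TOTAL_TB * GB_PER_TB)

print(f"used capacity not explained by dedupe overhead: {remaining_gb:.0f} GB")
print(f"dedupe overhead as a share of total capacity: {overhead_share:.1%}")
```

That leaves roughly 615 GB to be explained by VM objects, unregistered VMs, ISO images and any other overheads listed in the capacity breakdown.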

TheBobkin
Champion

Hello RKDRnD​,

"Please clarify one other thing - what do the files sizes reported in the Files tab of the vSAN datastore represent? Are the file sizes actual or after de-dupe?"

There should be a breakdown for before/after dedupe sizes - though if you have barely anything on it then you are unlikely to be generating much/any savings until more data is added.

"For every VM directory, I see a second directory with identical contents; I assume this is my RAID-1 mirror?"

No - one of these is essentially just a symlink that points to the same content, e.g. folder 'VMname' points to that VM's namespace Object 12345678-1234-5678-90ab-123456789012

"I cannot make sense of the vmdk sizes in each of the directories"

Depending on how/where you are looking, these may not be accounting for replica data, snapshots etc. For example, if you SSH to a host and cd to the directory, it will just show sizes in MB/KB, as there are no -flat.vmdk files on vSAN. The best location for seeing how much data a VM is using, including all associated Objects and their data replicas, is the space usage shown in the VM's Summary.

"the vmdk size is more than twice the size of what it is in the template from which the VM was provisioned, which sits on a non-vsan datastore. Why?"

Presumably because you are now storing it as a redundant FTT=1, RAID-1 Object, whereas before you were storing it as a non-redundant vmdk. FTT=1 using RAID-1 uses twice the space, as two copies of the data are stored.
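
The doubling can be sketched as simple arithmetic. This is a minimal illustration only: the function name is hypothetical, and it ignores witness components, namespace and swap objects, and any dedupe/compression savings.

```python
def raid1_footprint_gb(logical_gb: float, ftt: int = 1) -> float:
    """Raw capacity consumed by a RAID-1 mirrored object.

    RAID-1 with failures-to-tolerate = ftt keeps ftt + 1 full copies of the data.
    Witness components and other per-VM objects are ignored here (assumption).
    """
    return logical_gb * (ftt + 1)


# The thin-provisioned disk from the original post: 19 GB actually written.
guest_used_gb = 19
print(raid1_footprint_gb(guest_used_gb))  # 38
```

So a thin disk with 19 GB written would be expected to show about 38 GB of raw usage on vSAN with this policy, against 19 GB on a non-redundant datastore.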

Bob

RKDRnD
Enthusiast

Thanks again Bob. That makes sense ... I think. So the "Storage Usage" figure in the VMs Summary tab - is actual size or de-duped size?

TheBobkin
Champion

Hello RKDRnD​,

"in the VMs Summary tab - is actual size or de-duped size?"

It doesn't account for dedupe/compression savings as far as I am aware.

There are a number of indicators that this would logically be the case, such as:

- Which VM/vmdk would account for data that other copies were then deduped against? Something would have to own this usage, which would cause confusion, e.g. two perfectly identical VMs, one showing astronomically higher usage than the other.

- Periodic variation in the deduplication/compression ratio, caused by data movement or by changes that make the data dedupe/compress better or worse, could cause constant fluctuation in perceived/reported VM usage.
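
The ownership problem in the first point can be shown with a toy model. This is not how vSAN implements deduplication; it is a hypothetical content-hash store with a "first writer owns the block" rule, and the VM names and block contents are made up for illustration.

```python
import hashlib


def attribute_physical_usage(writes):
    """Charge each unique block to the first VM that wrote it.

    writes: list of (vm_name, block_bytes) tuples in arrival order.
    Returns {vm_name: number of unique blocks charged to that VM}.
    """
    owner_of = {}  # block digest -> VM that first wrote the block
    usage = {}     # VM -> count of physical blocks charged to it
    for vm, block in writes:
        usage.setdefault(vm, 0)
        digest = hashlib.sha256(block).hexdigest()
        if digest not in owner_of:
            owner_of[digest] = vm
            usage[vm] += 1
    return usage


# Two perfectly identical VMs, each consisting of the same ten 4 KiB blocks.
blocks = [bytes([i]) * 4096 for i in range(10)]
vm_a_first = [("vm-a", b) for b in blocks] + [("vm-b", b) for b in blocks]
vm_b_first = [("vm-b", b) for b in blocks] + [("vm-a", b) for b in blocks]

print(attribute_physical_usage(vm_a_first))  # {'vm-a': 10, 'vm-b': 0}
print(attribute_physical_usage(vm_b_first))  # {'vm-b': 10, 'vm-a': 0}
```

Identical VMs end up with very different reported usage depending only on write order, which is the kind of confusion a per-VM figure avoids by not including dedupe savings.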

Bob
