VMware Cloud Community
georgemason
Contributor

Should I enable VSAN sparse swap?

Hi,

I have been looking at ways to optimise our use of storage on our VSAN cluster, as the amount of data being stored seems disproportionately high (we have about 30TB of storage and our "VM overreserved" is running around 3.6TB). On checking, it appears that because we are using the default RAID1 config (FTT=1) with the default settings, we are holding 2x copies of the swap file for each VM. That adds up to a considerable amount of disk space, as some VMs have > 32GB of RAM.

As such I'm considering enabling sparse swap, but I'm not 100% clear on the implications of this, other than the obvious benefits of lower storage use.

Specifically, I am interested to know:

- With only a single copy of swap - what happens if the storage in that host dies and that object is inaccessible? Presumably the VM must crash as it can't access the pages of RAM that have been swapped out?

- We don't over-commit RAM, so presumably regardless of the above, this is a fairly low-risk change for us

- When disabling this cluster-wide, presumably it will result in a large amount of data being removed from VSAN, which will necessitate a disk rebalance. Are there any other considerations?

Thanks very much in advance for any insight into this setting.

George

3 Replies
TheBobkin
Champion

Hello George,

"our "VM overreserved" is running around 3.6TB"

You should check whether anything other than the vswp Objects is Thick-provisioned or has OSR=100 (Object Space Reservation).

This can be checked from the Storage Policies that are in use, but also from the output of # esxcli vsan debug object list

"some VMs have > 32GB of RAM."

"- We don't over-commit RAM, so presumably regardless of the above, this is a fairly low-risk change for us"

If you are not over-provisioning physical memory, you could also consider allocating memory reservations. Thick-provisioned vswp Objects are sized as allocated memory minus reservation, so the more memory you reserve, the less space they consume (and it doesn't have to be a full reservation if you are over-provisioning).
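As a rough illustration of the sizing rule above, here is a minimal Python sketch. It assumes RAID1 mirroring stores FTT+1 full replicas of each thick vswp Object (the "2x copies" George describes for FTT=1) and ignores witness components and any other overhead. The function name and the example VM sizes are hypothetical, not taken from this cluster.

```python
def vswp_footprint_gb(allocated_gb, reserved_gb=0, ftt=1):
    """Estimate raw vSAN capacity reserved by one thick-provisioned vswp Object.

    Assumption: RAID1 mirroring keeps (ftt + 1) full replicas; witness
    components and metadata overhead are ignored.
    """
    if reserved_gb < 0 or reserved_gb > allocated_gb:
        raise ValueError("reservation must be between 0 and the allocated memory")
    # Thick vswp is sized as allocated memory minus the memory reservation.
    swap_size_gb = allocated_gb - reserved_gb
    return swap_size_gb * (ftt + 1)


# Hypothetical VMs: (allocated GB, reserved GB)
example_vms = [(32, 0), (32, 16), (64, 64)]
for allocated, reserved in example_vms:
    footprint = vswp_footprint_gb(allocated, reserved)
    print(f"{allocated} GB RAM, {reserved} GB reserved: {footprint} GB reserved on vSAN")
```

With these assumptions, a 32GB VM with no reservation reserves 64GB of raw capacity for swap, a half reservation cuts that to 32GB, and a full reservation brings it to zero.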

"- With only a single copy of swap - what happens if the storage in that host dies and that object is inaccessible? Presumably the VM must crash as it can't access the pages of RAM that have been swapped out?"

"Sparse" (thin-provisioned) vswp doesn't reduce the FTT of the Objects. It just stops reserving the space your VM would need if it had to swap data to vswp due to memory contention. In other words, it makes the vswp Objects Thin-provisioned, so if the datastore ran out of available space they could be impacted, causing the VM to be stunned or to crash (which is why memory reservations might be beneficial).

"- When disabling this cluster-wide, presumably it will result in a large amount of data being removed from VSAN, which will necessitate a disk rebalance. Are there any other considerations?"

This requires changing the setting on every host (vswp attributes are dictated by host-level policies, not normal Storage Policies) and then power-cycling the VMs for it to take effect. vswp Objects are transient and are deleted when a VM is powered off, so little or no rebalance should be required. They are also relatively small and thus should be fairly well distributed amongst the cluster's disks.
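Because the change has to be made host by host, a small script can help keep it consistent. The sketch below only builds the command strings and does not run anything. The advanced option name /VSAN/SwapThickProvisionDisabled is not mentioned in this thread; it is the option commonly documented for sparse swap, so please verify it against the documentation for your build. The host names and root SSH access are hypothetical.

```python
def sparse_swap_commands(hosts, enable=True):
    """Build one esxcfg-advcfg command per host to toggle sparse swap.

    Assumption: the host advanced option is /VSAN/SwapThickProvisionDisabled
    (1 = vswp Objects are thin/sparse, 0 = thick, the default).
    """
    value = 1 if enable else 0
    return [
        f"ssh root@{host} esxcfg-advcfg -s {value} /VSAN/SwapThickProvisionDisabled"
        for host in hosts
    ]


# Hypothetical host names; review the output before running it anywhere.
for command in sparse_swap_commands(["esx01", "esx02", "esx03"]):
    print(command)
```

The setting only applies to vswp Objects created afterwards, so each VM still needs a power cycle as described above.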

Bob

georgemason
Contributor

Hi Bob,

This is super useful, thanks very much.

I tried to run "esxcli vsan debug object list" but I get an unknown command error. Is there something I need to enable first?

Thanks again

George

TheBobkin
Champion

Hello George,

Happy to help and you are most welcome.

The esxcli vsan debug namespace was only added in ~6.6 (6.5 U1), so it is not available if you are on an earlier version.

You can always use RVC, e.g. from cluster level (with no resource pools in use): > vsan.vm_object_info ./resourcePool/vms/*

Or generate the data on a host using:

# python /usr/lib/vmware/vsan/bin/vsan-health-status.pyc > /tmp/healthOut.txt

Bob
