I have seen references to reserving up to 30% as slack space in VSAN. That seems quite excessive on top of the overhead of erasure coding.
What is the slack space really used for?
Is this slack space in VSAN-AF a tunable? I.e., can a knowledgeable user set it to a lower value as long as the implications are well understood?
The reasoning behind this much-advised slack space is to ensure that there is enough room to rebuild vSAN Objects in the event of a failure or loss of access to some data components, or to reconfigure the data (e.g. applying a new Storage Policy).
Obviously this depends on the configuration. For instance, a 4-node cluster with RAID5 Objects won't be able to rebuild data if 1 node fails permanently (RAID5 needs 4 available fault domains), but you still need temporarily available space for reconfiguring them (e.g. switching to RAID1 or increasing Stripe Width). The larger the cluster, the lower the proportion of capacity required to rebuild a failed node, so from a recovery perspective less than 30% is technically sufficient when the cluster has a larger number of nodes. If going below ~25-30%, though, pay attention to how Storage Policy changes are made (e.g. don't try to change all Objects from R5 to R1 at once).
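To put rough numbers on that, here is a minimal back-of-the-envelope sketch, not an official sizing formula. It assumes equal-sized nodes with evenly spread data, and that a policy change writes the new layout in full before the old one is released. The function names are made up for illustration, and the overhead multiplier is passed in as an input (2.0 for a two-copy RAID1 mirror).

```python
def rebuild_headroom_fraction(nodes: int) -> float:
    """Fraction of raw cluster capacity that must stay free so the surviving
    nodes can absorb one failed node's data.

    Assumes equal-sized nodes and evenly distributed usage (illustrative only).
    """
    if nodes < 2:
        raise ValueError("need at least 2 nodes")
    # Used data must fit on the remaining (nodes - 1) nodes, so at most
    # (nodes - 1) / nodes of raw capacity may be in use: 1 / nodes stays free.
    return 1.0 / nodes


def transient_reconfig_space_gb(data_gb: float, new_overhead: float) -> float:
    """Temporary raw space needed while Objects are rebuilt under a new policy.

    Assumes the new layout is written in full before the old one is freed.
    new_overhead is the raw-to-usable multiplier of the target policy.
    """
    return data_gb * new_overhead


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        pct = rebuild_headroom_fraction(n) * 100
        print(f"{n} nodes: roughly {pct:.2f}% free needed to re-protect after one node loss")

    # Changing 500 GB of data to a two-copy mirror in one go vs. in 5 batches.
    print(transient_reconfig_space_gb(500, 2.0), "GB transient if done at once")
    print(transient_reconfig_space_gb(100, 2.0), "GB transient per 100 GB batch")
```

The second function is why batching policy changes matters when running with little slack: the transient space scales with how much data is being reconfigured at the same time.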
With regard to "slack" at the SSD level, this is used for reliability, to replace dead or dying blocks. It is pre-configured by the manufacturer (and/or by the user) and is not tunable from vSAN (if that is what you were referring to).
Thanks Bob for your response. Very helpful.
I was actually referring to the VSAN's advisory slack space (not the SSD mfg. recommended slack).
I understand from your response that this could vary depending on the number of nodes in the cluster, but beyond that is there a tunable in VSAN to set this threshold to a lower value (and accept the repercussions, like a lower likelihood of being able to rebuild until the failure is repaired)?
"I was actually referring to the VSAN's advisory slack space (not the SSD mfg. recommended slack)."
Yes, I thought as much, but was covering both bases in case there was confusion.
"is there a tunable in VSAN to set this threshold to a lower value"
There is nothing to tune as such: the space is available to use regardless of the recommendation, e.g. it is possible to fill a vsandatastore to 90%+, but in the long term this is not a good idea for the reasons in my previous comment. If you do plan to run a vSAN cluster at utilization levels above 80%, consider setting /VSAN/ClomRebalanceThreshold to a higher value than the default (80%) so that data doesn't get unnecessarily moved around as individual disk usage varies (obviously never set this to something like 99% and leave it there, as this will result in a very bad day).
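As a rough illustration of why the threshold matters at high utilization, the sketch below is a deliberately simplified model and not the actual CLOM logic. It assumes a disk becomes a rebalance candidate once its usage exceeds the threshold, and the disk names and usage figures are made up.

```python
from typing import Dict, List

DEFAULT_REBALANCE_THRESHOLD_PCT = 80.0  # default quoted in this thread


def disks_over_threshold(
    disk_usage_pct: Dict[str, float],
    threshold_pct: float = DEFAULT_REBALANCE_THRESHOLD_PCT,
) -> List[str]:
    """Return the disks whose usage exceeds the rebalance threshold.

    Simplified model: "exceeds" is assumed to mean strictly greater than.
    """
    return sorted(
        name for name, used in disk_usage_pct.items() if used > threshold_pct
    )


if __name__ == "__main__":
    # Hypothetical per-disk usage on a cluster running at ~82% overall.
    usage = {"disk-a": 78.0, "disk-b": 83.5, "disk-c": 86.0}
    print(disks_over_threshold(usage))        # candidates at the default 80%
    print(disks_over_threshold(usage, 85.0))  # candidates with a raised threshold
```

With the cluster sitting just above 80%, most disks hover around the default threshold and keep qualifying for data movement; raising it moderately shrinks that set, while a value close to 100% leaves no room to react at all.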