VMware Cloud Community
andvm
Hot Shot

Change Object format

Hi,

After upgrading the vSAN on-disk format from version 10 to 14, I got a warning to change the Object format for a number of components. Cormac explains this well in:

https://cormachogan.com/2021/02/09/vsan-7-0u1-object-format-health-warning-after-disk-format-v13-upg...

Keep in mind, however, that this process will consume the additional space listed in the warning.

My understanding is that it does so to build the new component structure without compromising data resilience during the process.

What I am not sure about is whether it has built-in checks that fail the task automatically if there is not enough slack space available, rather than proceeding to fill most of the disks and potentially impacting VMs running on the same vSAN cluster?

1 Solution

Accepted Solutions
TheBobkin
Champion

@andvm, CLOM won't start placing components if there is insufficient space or not enough fault domains to do so, so yes, there is a check. Whether the relayout task actually fails or times out on encountering such a state may vary (e.g. I recall that in earlier builds of 7.0 U1, in some circumstances it would just sit at X% when it should have failed or timed out and stated 'need more space').

In addition to this, later builds of vSAN have a 95% disk-full threshold beyond which they won't resync components to a disk.

The only times I have ever seen space issues with this are where there are relatively large Objects compared to the vsanDatastore used/free space in small clusters. E.g. if you had a 2-node cluster with 15TB used of 20TB, and 10TB of that used space was a single large vmdk, then it would fail (as it would need 10TB free to do a deep-reconfig of the Object). Workarounds for this would be to either move this vmdk off the vsanDatastore and back, or temporarily set it to FTT=0.
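
To put numbers on the 2-node example, here is a rough back-of-the-envelope check in Python. It is only an illustration of the arithmetic, not anything vSAN or CLOM exposes: the function name is made up, and it assumes the deep-reconfig needs free space equal to the Object's full consumed size, as described above.

```python
def relayout_fits(capacity_tb, used_tb, object_used_tb):
    """Return True if a full temporary copy of the Object fits in free space.

    Assumption: the deep-reconfig needs as much free space as the Object
    currently consumes. All values are in TB.
    """
    free_tb = capacity_tb - used_tb
    return object_used_tb <= free_tb


# Example from this post: 20TB datastore, 15TB used, one 10TB Object.
# Only 5TB is free, so the check fails.
print(relayout_fits(20, 15, 10))
```

This ignores the 95% disk-full threshold and per-disk/fault-domain placement, so treat a passing result as a necessary condition only.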

5 Replies
andvm
Hot Shot

Good to know that it has these checks.

Yes, in fact I remembered this consumption behaviour after seeing a VM with 9TB consumed grow to 18TB of consumed space while this process took place.

andvm
Hot Shot

Btw, see the cluster backend performance stats for the last 24 hours, while these operations were taking place.

Even though latency was in the red area for a good portion of that time, the VM stats do not show any impact on VM latency.

[Attached screenshots: andvm_0-1646984388138.png, andvm_1-1646984515771.png]

TheBobkin
Champion

@andvm, This Object relayout is actually what vSAN needs in order to do future deep-reconfigs of Objects without temporarily consuming the whole Object's space. It restructures the RAID-tree so that, in any future config change, it can reconfigure one sub-component at a time and so use less space.
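
As a rough illustration of why the relayout helps, the Python sketch below compares peak temporary space when the whole Object is rebuilt at once against rebuilding one sub-component at a time. The function name and the sub-component size used in the example are hypothetical values chosen for illustration, not figures taken from vSAN.

```python
def peak_temp_space_tb(object_used_tb, sub_component_tb=None):
    """Estimate peak extra space (TB) needed during a deep-reconfig.

    sub_component_tb=None models the old layout, where the whole Object
    is rebuilt in one go. Otherwise only one sub-component is rebuilt
    at a time, so the peak is capped at that sub-component's size.
    """
    if sub_component_tb is None:
        return object_used_tb
    return min(sub_component_tb, object_used_tb)


# The 9TB Object mentioned earlier in the thread, old layout:
print(peak_temp_space_tb(9))
# Same Object after relayout, with a made-up 0.25TB sub-component size:
print(peak_temp_space_tb(9, 0.25))
```

The one-off relayout itself still pays the first cost; the saving only applies to config changes made afterwards.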

 

The latency you see there is actually latency being applied to the resync traffic so that VM IOs get priority, i.e. the resync fairness scheduler working as intended.

BurakKutukcu
Contributor

Hello @TheBobkin,

I have the same issue, but my vSAN datastore capacity is 580 TB with 150 TB free. I have multiple VMs with large objects (10 TB, 30 TB and 40 TB vmdks). My policy is RAID 1 and the network speed is 20 Gbit.

Skyline Health shows a warning about the object format upgrade for more than 350 TB.

How should I approach this?

Thanks.
