VMware Cloud Community
ManivelR
Hot Shot

VSAN disk group testing

Hi All,

I am doing some rigorous vSAN testing and have a few more doubts. Could someone please clarify them if possible?

My setup is as follows:-

3 ESXi servers with 6.7.0

7 test Windows/Linux VMs are running, including the VCSA 6.7.0 appliance.

The vSAN version is 6.7.0 with an all-flash configuration.

DG: 2 x 2 TB SSDs per ESXi host (each ESXi host has one disk group, with one 2 TB SSD for the cache tier and the other 2 TB SSD for the capacity tier). Across the 3 ESXi servers, that is 6 TB of cache tier and 6 TB of capacity tier in total.

In the storage policy, FTT is set to 1 with a RAID 1 configuration.
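As a rough sketch, this per-host disk-group layout (which SSD is cache tier and which is capacity tier in each DG) can be double-checked from the ESXi shell with something like the following; treat it as a note rather than a procedure:

# esxcli vsan storage list

# vdq -i -H

The first command lists every device claimed by vSAN with its tier and disk-group UUID; the second prints the cache/capacity mapping per disk group.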

Last week's testing scenario was as follows:

* Removed the 2 TB cache disk from the first ESXi server and inserted a 1 TB cache disk (non-uniform configuration). After inserting the non-uniform 1 TB SSD and configuring the DG, we saw the object sync in progress; however, the 1 TB disk was shown as unhealthy.

We don't know what happened to that 1 TB SSD (the new cache disk). Soon after, the VCSA appliance stopped responding. After rebooting the first ESXi host, all the running VMs were shown as invalid.

* While checking the VM status (getallvms) from the CLI, the VMs were shown as being skipped:

# vim-cmd vmsvc/getallvms

Skipping invalid VM  '24'

Skipping invalid VM '25'

Skipping invalid VM '28'

We don't know the cause behind this; soon after the ESXi reboot, all the VMs went invalid/inaccessible. Is this due to the non-uniform cache disk configuration or not? I'm not sure.
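(Side note, as a rough sketch only: if the underlying vSAN objects become accessible again, an invalid VM can sometimes be refreshed from the ESXi shell without unregistering it, e.g. using the VM ID 24 from the output above:

# vim-cmd vmsvc/reload 24

# vim-cmd vmsvc/getallvms

This does not help while the backing objects are still inaccessible.)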

These are the logs from RVC (Ruby vSphere Console) on the VCSA, and we could not recover any VMs.

/localhost/VC-VSAN-310/computers> vsan.check_state 0

2018-12-14 08:58:54 +0000: Step 1: Check for inaccessible vSAN objects

Detected 41 objects to be inaccessible

vsan.check_state 0 -r

2018-12-14 09:02:22 +0000: Step 1: Check for inaccessible vSAN objects

Detected 2cc80f5c-f242-c40e-fb39-d4ae52886942 to be inaccessible, refreshing state

Detected 0f9f095c-6e37-8624-d45a-d4ae52886942 to be inaccessible, refreshing state

Detected 2cc80f5c-1a5b-1228-c0ee-d4ae52886942 to be inaccessible, refreshing state

Detected 986a095c-7ce5-572b-f5d5-d4ae52886942 to be inaccessible, refreshing state
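As a rough sketch, the same inaccessible-object picture can also be pulled from any host's ESXi shell (on vSAN 6.6 or later):

# esxcli vsan debug object health summary get

# esxcli vsan debug object list

The first gives a count of objects per health state (e.g. inaccessible, reduced availability); the second dumps per-object detail including the owner and component states.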

After this issue, we built a fresh new vSAN setup and would like to test again with the same 7 test VMs, using the FTT=1 and RAID 1 storage policy (3 ESXi servers only).

1) Remove one cache SSD from the first ESXi host and observe the impact. We have enabled dedupe and compression. I would like to know what will happen if we remove any SSD (either cache or capacity) from the first ESXi server.

I guess there should not be any impact to the running virtual machines. Am I correct? If I use deduplication and compression on a 3-node all-flash cluster, will there be any impact?

I saw this community thread from Bob:

https://communities.vmware.com/thread/577526

Any feedback or suggestions?

Thanks,

Manivel RR

IRIX201110141
Champion

1. If you lose the cache/buffer device, the complete disk group goes offline.

2. If you lose a capacity drive, the DG stays online (but with only one cache + one capacity drive, the effect is the same as #1).

3. If you lose a capacity drive with dedupe enabled, the complete DG goes offline.

Some Notes

- The max buffer device size in an AFA (all-flash) setup is 800 GB; anything above that isn't used, so your 2 TB drive is largely a waste.

- I don't like 3-node vSAN clusters because of FTT=1 and planned maintenance mode (MM).

- If we change something within the DG, we remove the DG first, except when adding additional drives.

Regards,

Joerg

ManivelR
Hot Shot

Thanks Joerg for the explanation.

I have a question.

1) For example, if any disk fails (cache or capacity), do I need to remove it properly from the vSphere Client first and then ask the data center team to pull the disk from the server? Is this the correct method? I should not go and remove the disk directly without evacuating data.

Thanks,

Manivel R

TheBobkin
Champion

Hello Manivel,

"* Removed 2 TB(cache disk) from first ESXi server and inserted 1 TB cache disk(non-uniform configuration).After inserting non-uniform 1 TB SSD disk and DG configuration,we saw the object sync was in progress,however the 1 TB disk shown as unhealth."

If you didn't add this to a newly configured Disk-Group then I am not sure what you were expecting to happen here - if the cache-tier device is removed/dies then the Disk-Group (DG) goes with it, one cannot just shove a new SSD in there and expect it to magically recreate the metadata (that describes the data on the capacity-tier drives of this DG) that was on the original cache-tier device.

Additionally there may be implications depending on how these devices are referenced e.g. naa.<numbers> vs mpx.vmhba0:C0:T0:L0 .

"Detected 41 objects to be inaccessible"

If you had inaccessible Objects then they either a) had a double-failure, b) were FTT=0, or c) were non-compliant with Storage Policy when the issue occurred - in this situation it is very important to look closer at the state of the data and what components are absent or stale (and then determine why) e.g. using vsan.obj_status_report -t and vsan.vm_object_info <path_to_vm> and/or vsan.object_info <path_to_cluster> <Object_UUID> .
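For example, from the same RVC path as the earlier output (a rough sketch - substitute your own cluster path, VM path and Object UUIDs; the UUID below is just the first one from the check_state output):

/localhost/VC-VSAN-310/computers> vsan.obj_status_report -t 0

/localhost/VC-VSAN-310/computers> vsan.object_info 0 2cc80f5c-f242-c40e-fb39-d4ae52886942

/localhost/VC-VSAN-310/computers> vsan.vm_object_info <path_to_vm>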

"1) Remove one cache  SSD disk from first ESXi host and will try to see the impact.By default,we have enabled dedupe and compression.I like to know what will happen if we remove any SSD disk(either cache or capacity from first ESXi server)?"

Correct, removing/losing any device in the DG will propagate failure to all devices in the DG.

"1) For example,If any disk getting failed(cache or capacity),then i need to remove properly from vSphere client and then i need to ask data center team to remove the disk from server. Is this correct method ?  i should not go and remove the disk directly without evacuating data.

Thanks,"

If you have a physically failed device in a deduped DG then you can't evacuate the data, only rebuild it from the remaining copy/copies - if it is only a failed capacity-tier device then you can remove the DG and recreate it without the failed device and re-add this later (provided your controller and firmware support hot-adding devices).

@Joerg

"- Max Buffer Device size in AFA setup is 800GB. all from above isnt used so your 2TB drive is a waste"

Actually it's 600GB (applicable only to AF as it is Write-buffer only) - though it is always strongly advised to go a bit bigger to account for dead/dying blocks to be dynamically replaced with wear-leveling.

"- I dont like 3 node vSAN Clusters because of FTT=1 and planed MM"

Amen - one cannot rebuild in place in a 3-node cluster with one host gone.

Bob

ManivelR
Hot Shot

Thanks very much again, Bob, for your detailed input. I got your point a bit, but need some more clarity.

Test Scenario 1:

Our disk group has one cache disk and one capacity disk (all-flash with dedupe and compression enabled). Assuming I am going to remove the cache disk (there is no physical cache SSD failure): before removing the disk group, I need to evacuate all data to other hosts (select “Evacuate all data to other hosts”), and then I can start removing the disk group. Once the DG is removed, I can recreate a disk group with new SSD drives (for example) or with the old SSD drives. The same statement applies to the capacity disk as well.

  • If deduplication and compression are enabled, you can add a capacity drive to a disk group, but you cannot remove an individual drive. You must remove the entire disk group when removing or replacing cache and/or capacity drives. The disk group must be recreated using the new drive(s).

Test Scenario 2:

Assuming there is a physical cache SSD failure (dedupe DG): as per your message, we cannot evacuate any data (since there is a physical cache SSD failure). In this case, we can ask the data center team to remove the correct cache SSD from the server. After that, I need to recreate a new disk group with a new cache SSD. After disk group creation, I can add the existing capacity drive into this disk group. Once everything is done, data sync will start automatically or can be triggered manually.

If it is only a failed capacity-tier device (dedupe disk group), then I can remove the DG and recreate it without the failed device and re-add this later.

Am I right?

Thanks,

Manivel RR

TheBobkin
Champion

Hello Manivel,

"Test Scenario 1

...

I need to evacuate all data to other hosts (select “Evacuate all data to other hosts”), and then I can start removing the disk group. Once the DG is removed, I can recreate a disk group with new SSD drives (for example) or with the old SSD drives. The same statement applies to the capacity disk as well."

If you want to remove a Disk-Group (DG), any device of a deduped DG, or the cache-tier device of any DG, and want to *move* the data, then use the Full Data Migration (FDM) option - note that you can do this at the DG level (via Cluster > Configure > vSAN > Disk Management) or at the host level by putting the host into MM with FDM (and thus all DGs attached to that node are evacuated with FDM).
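If doing this from the command line instead of the UI, the DG-level removal with data evacuation can be sketched roughly like this from the host's ESXi shell (naa.XXXX is a placeholder for the DG's cache-tier device - check the exact option names on your build before relying on this):

# esxcli vsan storage list

# esxcli vsan storage remove -s naa.XXXX -m evacuateAllData

Passing the cache-tier device with -s removes the whole disk group; -m evacuateAllData corresponds to the Full Data Migration option (ensureObjectAccessibility and noAction being the other modes).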

"Test Scenario 2:-

Assuming that there is a physical cache SSD disk failure(dedupe DG).As per your message,we cannot evacuate any data(since there is physical  cache SSD disk failure).In this case, we can ask data center team to remove the correct cache SSD disk from server.Post that, i need to recreate a new disk group with new SSD cache disk.After disk group creation,i can add the existing capacity drive in to this disk group."

Yes, you can swap the cache-tier SSD and recreate the DG with the capacity-tier SSDs, but do be sure to verify whether the cache device failed or a capacity-tier SSD failed - this can be differentiated in vmkernel.log by device failure vs propagated failure (e.g. if a capacity device in a deduped DG failed, it will show as failed and then a propagated failure to the rest of the devices in the DG). As I said before, if the cache-tier is gone then so is the DG; a format of the remaining devices will be necessary before recreation of the DG - make sure the data is accessible and/or the cache-tier has actually physically failed before proceeding.
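Once the replacement cache SSD is in place, recreating the DG can also be done from the ESXi shell - a minimal sketch, with naa.CACHE and naa.CAPACITY as placeholders for the new cache device and the existing (wiped) capacity device:

# esxcli vsan storage add -s naa.CACHE -d naa.CAPACITY

Here -s takes the cache-tier device and -d a capacity-tier device (repeat -d for additional capacity devices).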

"Once everything is over,data sync will start automatically or through manually."

Throughout dealing with any form of device/DG/host failure, one should pay attention to the state of the data: if you have an abrupt DG failure, it should start repairing the data back to the FTT defined in the Storage Policy (SP); if it doesn't (for whatever reason, e.g. abnormal failure behaviour), then a resync of the data back to FTT=1 (or whatever is defined in the SP) can always be initiated via the Health UI (Cluster > Monitor > vSAN > Health > Data > Repair Immediately - or via RVC).
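To keep an eye on the repair while it runs, a rough sketch using the same numbered cluster shorthand as your earlier RVC output:

/localhost/VC-VSAN-310/computers> vsan.resync_dashboard 0

This lists the objects currently resyncing and the bytes left to sync; re-run it (or watch the Resyncing Components view in the UI) until nothing is left.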

"If it is only a failed capacity-tier device(dedupe disk group) then i can remove the DG and recreate it without the failed device and re-add this later."

Yes, check as I said above about propagated failure.

"Am i right?"

In many parts, yes - you are getting there, buddy :smileygrin: . If you are going to be working with this product, I would advise studying for the vSAN equivalent of the VCP (vmware.com/education-services/certification/vsan-2017-specialist.html), reading the Duncan Epping and Cormac Hogan book(s), reading our vSAN docs and the relevant StorageHub pages - we have covered a lot of this already (in more official formats), though of course if you consider any of our documentation unclear, do ask for clarification.

It is my understanding that VMware have also started providing online/instructor-led courses for vSAN to customers (previously just employees) which you may consider looking into if interested.

Bob

ManivelR
Hot Shot

Thanks, Bob, for your detailed explanation. It's all clear now. Sure, I will read the Duncan Epping and Cormac Hogan book(s).

Have a good day.

Thank you very much again.

Cheers,

Manivel R
