M4T
Enthusiast

vSAN failing drive


Hi community,

If you have a vSAN all-flash setup and an SSD fails, either storage or cache, can you just replace it with a new one, and does it rebuild and carry on working? I'm guessing that if it's a VM storage drive, it carries on working: with it being a 2-host all-flash vSAN, I'm guessing it's in RAID 1, so if a drive fails it carries on running on the other host?


Accepted Solutions
TheBobkin
VMware Employee

Hello M4T.

"if you have a vSAN all flash set up. when a SSD fails either storage or cache can you just replace it with a new one and does it rebuild and carry on working? "

Regardless of implementation type (Hybrid or All-Flash), if a Cache-tier device fails then the whole Disk-Group fails and has to be recreated once the device has been replaced. If a Capacity-tier device fails in a cluster with dedupe+compression enabled, then similarly the whole Disk-Group has to be recreated, either without this device or with a replacement device. This isn't the case in a Hybrid implementation or in All-Flash with dedupe+compression disabled.

Either way, whether a single Capacity-tier device or an entire Disk-Group has failed, the data will be rebuilt on the remaining Disks/Disk-Groups, provided there is adequate free space and that the space is on nodes where a rebuild would be useful (e.g. vSAN can't rebuild the 2nd data-replica on the same node as the 1st data-replica, as this would violate the Storage Policy rules).
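To make the rule above easier to check against your own setup, here is a minimal Python sketch of the decision. It is only a model of what is described in this post, not a vSAN API: the `DiskFailure` class and the `disk_group_must_be_recreated` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskFailure:
    """Hypothetical description of a failed device (not a vSAN object)."""
    tier: str                 # "cache" or "capacity"
    dedupe_compression: bool  # dedupe+compression enabled on the cluster


def disk_group_must_be_recreated(failure: DiskFailure) -> bool:
    """Return True if the whole Disk-Group has to be recreated."""
    if failure.tier == "cache":
        # A Cache-tier failure takes out the whole Disk-Group,
        # whether the cluster is Hybrid or All-Flash.
        return True
    if failure.tier == "capacity":
        # A Capacity-tier failure only takes out the whole Disk-Group
        # when dedupe+compression is enabled.
        return failure.dedupe_compression
    raise ValueError(f"unknown tier: {failure.tier!r}")
```

For example, `disk_group_must_be_recreated(DiskFailure("capacity", False))` is `False`, which corresponds to the Hybrid case or All-Flash without dedupe+compression.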

"I'm guessing if its a VM storage it carry's on working as I'm guessing with it being a 2 host flash vSAN its in a RAID1 so if a drive fails it carry's on running the other host?"

Yes, if these are FTT=1 protected VMs then they should keep running from the remaining data-replica.
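As a rough illustration of the FTT=1 behaviour and the placement rule mentioned earlier, the sketch below models an object with one data-replica per host. It is a simplification under stated assumptions: the host names and both functions are hypothetical, and it deliberately ignores the witness component and quorum that real vSAN also takes into account.

```python
def vm_keeps_running(replica_hosts, hosts_with_failed_replica):
    """True if at least one data-replica is still intact (simplified FTT=1 view)."""
    surviving = [h for h in replica_hosts if h not in hosts_with_failed_replica]
    return len(surviving) > 0


def eligible_rebuild_hosts(all_hosts, surviving_replica_hosts):
    """Hosts where the lost replica may be rebuilt.

    The lost replica cannot go on a host that already holds a
    surviving replica, as that would violate the Storage Policy.
    """
    return [h for h in all_hosts if h not in surviving_replica_hosts]


# Hypothetical 2-host cluster: one replica on each host, a drive fails on esx1.
hosts = ["esx1", "esx2"]
running = vm_keeps_running(hosts, {"esx1"})
targets = eligible_rebuild_hosts(hosts, ["esx2"])
```

In this model the VM keeps running from the replica on `esx2`, and the only eligible rebuild target is `esx1` itself, so a rebuild before the drive is replaced would need another Disk or Disk-Group with enough free space on that host.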

Bob
