VMware Cloud Community
winsolo
Enthusiast

vSAN Disk Group - PDL Alarm

VCSA 6.7.0  Build 15976728

VMware ESXi 6.7.0 build-15018017 (Dell EMC VxRail P470F)

[Attached screenshots: Snag_5d817a2.png, Snag_5d84c9c.png, Snag_5da219b.png, Snag_5df3f0e.png]

Just a couple of questions

1. Looks like this is a false alarm as there's no PDL. But in order to fix it, this particular disk group needs to be removed and recreated, correct?

2. If yes, should it be done with the host in maintenance mode (MM) with Ensure accessibility or Full data evacuation?

9 Replies
depping
Leadership

Check the full output of vdq -q to see if there's anything else going on. You could try unmounting and remounting the datastore without moving the data, even. But in situations like these, I'd recommend simply contacting support if you have an active contract.
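
For reference, a quick way to eyeball the device state from the host (just a sketch - the exact fields vary a little between builds) is to pull the affected device out of the vdq -q output and check its State/Reason/IsPDL? fields, e.g.:

# vdq -q | grep -A 10 <naa.ID of the affected disk>

(grep -A 10 simply prints the block of fields that follows the device name.)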

TheBobkin
Champion

Hello winsolo

Are you sure it didn't fail and remount? Or similarly the path/connection to a disk was lost and returned?

If you can attach/PM the vobd.log and vmkernel.log from the host covering the time when this dropped I can take a look - this can *generally* be determined via:

# grep -i perm /var/log/vobd.log

If it doesn't cover the timestamp then check the older logs:

# zcat /var/run/log/vobd.*.gz | grep -i perm

Then find the correlating vmkernel log (e.g. # zcat /var/run/log/vmkernel.*.gz | grep <timestamp of minute/hour from the vobd log>)
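
If it's easier, the two steps can be strung together in one go - rough sketch only (busybox shell, trims the vobd timestamps to minute precision; adjust paths/patterns to suit):

for ts in $(grep -i perm /var/log/vobd.log | awk '{print $1}' | cut -c1-16 | sort -u); do
  echo "=== $ts ==="
  zcat /var/run/log/vmkernel.*.gz | grep "$ts"
done

This just takes each unique permanent-error timestamp (down to the minute) from vobd.log and greps the rotated vmkernel logs for matching lines.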

Bob

winsolo
Enthusiast

Hey Bob,

I think the logs have rolled over. The vmkernel.7.gz log no longer has any entries showing T08:21:50; it only starts at T08:23. vmkernel.7.gz is the last file in /var/run/log/. But I'm not sure if it failed and remounted.

[root@host201:~] grep -i perm /var/log/vobd.log

2020-10-06T08:21:50.106Z: [vSANCorrelator] 2425261291458us: [vob.vsan.lsom.diskerror] vSAN device 5248e7a7-8c24-8b12-13b8-2da7c262cc52 is under permanent error.

2020-10-06T08:21:50.106Z: [vSANCorrelator] 2425264283415us: [esx.problem.vob.vsan.lsom.diskerror] vSAN device 5248e7a7-8c24-8b12-13b8-2da7c262cc52 is under permanent error.

2020-10-06T08:21:50.106Z: [vSANCorrelator] 2425261291476us: [vob.vsan.lsom.diskerror] vSAN device 5248e7a7-8c24-8b12-13b8-2da7c262cc52 is under permanent error.

2020-10-06T08:21:50.106Z: [vSANCorrelator] 2425264283565us: [esx.problem.vob.vsan.lsom.diskerror] vSAN device 5248e7a7-8c24-8b12-13b8-2da7c262cc52 is under permanent error.

2020-10-06T08:21:50.107Z: [vSANCorrelator] 2425261291571us: [vob.vsan.lsom.diskpropagatedpermerror] vSAN device 522e22bd-b897-ec17-1391-1b921e50e3f4 is under propagated permanent error.

2020-10-06T08:21:50.107Z: [vSANCorrelator] 2425264283637us: [esx.problem.vob.vsan.lsom.diskpropagatedpermerror] vSAN device 522e22bd-b897-ec17-1391-1b921e50e3f4 is under propagated permanent error.

2020-10-06T08:21:50.107Z: [vSANCorrelator] 2425261291593us: [vob.vsan.lsom.diskpropagatedpermerror] vSAN device 52913675-2337-4322-093f-4233001081a8 is under propagated permanent error.

2020-10-06T08:21:50.107Z: [vSANCorrelator] 2425264283703us: [esx.problem.vob.vsan.lsom.diskpropagatedpermerror] vSAN device 52913675-2337-4322-093f-4233001081a8 is under propagated permanent error.

[root@host201:~]

[root@host201:~] zcat /var/run/log/vmkernel.7.gz | grep T08 | more

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x2929590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x2929590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x2a29590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x2a29590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x2b29590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x2b29590a prefix len:0x00000020

2020-10-06T08:23:00.200Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x2c29590a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x2c29590a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x0000600a prefix len:0x0000000b

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x0000600a prefix len:0x0000000b

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x1306650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x1306650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x1706650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x1706650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x1806650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x1806650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu29:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x1906650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu36:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x1906650a prefix len:0x00000020

2020-10-06T08:23:00.207Z cpu36:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0x0000800a prefix len:0x0000000b

2020-10-06T08:23:00.207Z cpu36:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0x0000800a prefix len:0x0000000b

2020-10-06T08:23:00.207Z cpu36:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0xf4c3990a prefix len:0x0000001e

2020-10-06T08:23:00.219Z cpu36:2103072)vdrb: VdrUpdateRouteLocked:663: CP:[I:0x13880]: Added rt prefix:0xf4c3990a prefix len:0x0000001e

2020-10-06T08:23:00.219Z cpu37:2103072)vdrb: VdrUpdateRouteLocked:731: CP:[I:0x13880]: Deleted rt prefix:0xc878be0a prefix len:0x00000020

winsolo
Enthusiast

Hey depping

I did check vdq -q but there was nothing suspicious. You meant unmounting and remounting the disk group, right?

TheBobkin
Champion

Hello winsolo​,

I see no 'unmapped from array' in vobd.log there, so it doesn't look to be a H:0x1 event, but we can't really tell if it was a 0x3 0x11 or a 0x4 0xX event (other common sense codes that can trigger a disk being marked as failed) without vmkernel logging - you *might* get more information from vmkwarning.log; while this doesn't log sense codes, it can sometimes provide more info.
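
For reference, if the vmkernel logging still covered that window, the sense codes would normally show up in the SCSI command-failure lines against the device, so a rough check (sketch only) would be something like:

# zcat /var/run/log/vmkernel.*.gz | grep <naa.ID of the disk> | grep -iE "failed H:0x|sense data"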

Anything in vsandevicemonitord.log at this time?

Does dmesg (in-memory kernel logging) go back any farther?

Just an FYI, that is an unusually chatty amount of NSX vdrb logging - I have only once seen such things being indicative of issues that caused knock-on impact on vSAN (IIRC it was something to do with being way over-subscribed on firewall rules), but regardless I would advise getting that checked out by someone more aware of NSX troubleshooting and/or issues.
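
As a quick way to gauge just how chatty it is (counts only, rough sketch), compare the total retained vmkernel lines against the vdrb route-update lines:

# zcat /var/run/log/vmkernel.*.gz | wc -l

# zcat /var/run/log/vmkernel.*.gz | grep -c "vdrb: VdrUpdateRouteLocked"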

Bob

winsolo
Enthusiast

Hey Bob,

This is all the vsandevicemonitord.log file has for today. dmesg has rolled over too; it has nothing around 2020-10-06T08. But vmkwarning has a little bit of info. We're going to upgrade NSX-V to the latest version in the very near future to fix a couple of bugs, and I hope that will also fix the problem we're seeing here.

[root@host201:~] grep -i 2020-10-06 /var/run/log/vsandevicemonitord.log

2020-10-06 03:29:06,334 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8822c are [2, 3, 4, 7, 9, 10].

2020-10-06 03:29:06,437 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8a8a8 are [0, 2, 3, 5, 7, 11].

2020-10-06 03:29:06,543 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb884f8 are [3, 4, 5, 8, 9, 10].

2020-10-06 03:29:06,643 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99b0 are [3, 4, 5, 7, 8, 10].

2020-10-06 03:29:06,754 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9950 are [0, 1, 2, 4, 6, 9].

2020-10-06 03:29:06,854 INFO vsandevicemonitord Sample latency intervals for naa.5002538b497df070 are [0, 1, 4, 6, 7, 8].

2020-10-06 03:29:06,951 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99c0 are [1, 2, 3, 7, 10, 11].

2020-10-06 03:29:07,061 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072cb770 are [0, 1, 3, 5, 6, 7].

2020-10-06 03:29:07,173 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9a20 are [0, 3, 7, 8, 10, 11].

2020-10-06 07:09:27,413 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8822c are [0, 2, 3, 4, 8, 9].

2020-10-06 07:09:27,538 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8a8a8 are [2, 3, 7, 8, 9, 11].

2020-10-06 07:09:27,635 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb884f8 are [0, 1, 3, 8, 10, 11].

2020-10-06 07:09:27,743 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99b0 are [5, 6, 8, 9, 10, 11].

2020-10-06 07:09:27,845 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9950 are [0, 1, 7, 8, 9, 10].

2020-10-06 07:09:27,940 INFO vsandevicemonitord Sample latency intervals for naa.5002538b497df070 are [0, 1, 4, 6, 9, 10].

2020-10-06 07:09:28,042 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99c0 are [3, 4, 5, 9, 10, 11].

2020-10-06 07:09:28,141 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072cb770 are [3, 5, 7, 8, 9, 11].

2020-10-06 07:09:28,239 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9a20 are [1, 2, 3, 7, 8, 10].

2020-10-06 10:49:47,335 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8822c are [2, 4, 5, 6, 9, 11].

2020-10-06 10:49:47,445 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8a8a8 are [2, 3, 6, 8, 9, 10].

2020-10-06 10:49:47,556 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb884f8 are [2, 3, 4, 6, 8, 11].

2020-10-06 10:49:47,669 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99b0 are [1, 3, 5, 6, 9, 11].

2020-10-06 10:49:47,785 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9950 are [0, 2, 3, 8, 9, 11].

2020-10-06 10:49:47,894 INFO vsandevicemonitord Sample latency intervals for naa.5002538b497df070 are [1, 3, 4, 6, 7, 10].

2020-10-06 10:49:48,001 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99c0 are [1, 4, 5, 6, 9, 10].

2020-10-06 10:49:48,108 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072cb770 are [0, 2, 5, 6, 8, 11].

2020-10-06 10:49:48,304 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9a20 are [0, 2, 5, 7, 10, 11].

2020-10-06 14:30:07,716 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8822c are [0, 2, 5, 6, 8, 9].

2020-10-06 14:30:07,842 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8a8a8 are [0, 2, 5, 6, 7, 9].

2020-10-06 14:30:07,953 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb884f8 are [1, 2, 4, 9, 10, 11].

2020-10-06 14:30:08,056 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99b0 are [0, 1, 2, 4, 6, 7].

2020-10-06 14:30:08,180 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9950 are [1, 2, 4, 5, 9, 11].

2020-10-06 14:30:08,298 INFO vsandevicemonitord Sample latency intervals for naa.5002538b497df070 are [0, 1, 5, 9, 10, 11].

2020-10-06 14:30:08,406 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99c0 are [0, 1, 2, 5, 7, 10].

2020-10-06 14:30:08,512 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072cb770 are [1, 2, 4, 7, 9, 11].

2020-10-06 14:30:08,611 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9a20 are [4, 5, 8, 9, 10, 11].

2020-10-06 18:10:29,159 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8822c are [3, 4, 5, 8, 10, 11].

2020-10-06 18:10:29,274 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb8a8a8 are [1, 4, 5, 6, 8, 11].

2020-10-06 18:10:29,382 INFO vsandevicemonitord Sample latency intervals for naa.5000cca04fb884f8 are [0, 1, 3, 4, 7, 8].

2020-10-06 18:10:29,486 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99b0 are [2, 3, 5, 6, 9, 10].

2020-10-06 18:10:29,603 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9950 are [1, 2, 8, 9, 10, 11].

2020-10-06 18:10:29,720 INFO vsandevicemonitord Sample latency intervals for naa.5002538b497df070 are [0, 1, 3, 4, 9, 10].

2020-10-06 18:10:29,835 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c99c0 are [2, 4, 6, 7, 9, 11].

2020-10-06 18:10:29,959 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072cb770 are [1, 3, 4, 5, 7, 11].

2020-10-06 18:10:30,080 INFO vsandevicemonitord Sample latency intervals for naa.5002538a072c9a20 are [1, 2, 5, 8, 9, 10].

[root@host201:~]


[root@host201:~] zcat /var/run/log/vmkwarning.*.gz |grep "T08:2"

2020-10-06T08:21:50.102Z cpu48:2099438)WARNING: PLOG: DDPCompleteDDPWrite:5756: Throttled: DDP write failed I/O error callback PLOGDDPWriteCbFn@com.vmware.plog#0.0.0.1, diskgroup 52913675-2337-4322-093f-4233001081a8 txnScopeIdx 4

2020-10-06T08:21:50.106Z cpu48:2099438)WARNING: PLOG: PLOGDDPWriteCbFn:705: DDP write failed on device 5248e7a7-8c24-8b12-13b8-2da7c262cc52:I/O error (ssdPerm: no)elevIo 8

2020-10-06T08:21:50.106Z cpu7:2099104)WARNING: PLOG: PLOGPropagateError:3061: DDP: Propagating error state from original device 5248e7a7-8c24-8b12-13b8-2da7c262cc52

2020-10-06T08:21:50.106Z cpu7:2099104)WARNING: PLOG: PLOGPropagateError:3103: DDP: Propagating error state to MDs in device 52913675-2337-4322-093f-4233001081a8

2020-10-06T08:21:50.106Z cpu7:2099104)WARNING: PLOG: PLOGPropagateErrorInt:3002: Permanent error event on 5248e7a7-8c24-8b12-13b8-2da7c262cc52

2020-10-06T08:21:50.106Z cpu42:2100619)WARNING: LSOM: LSOMEventNotify:7752: vSAN device 5248e7a7-8c24-8b12-13b8-2da7c262cc52 is under permanent error.

2020-10-06T08:21:50.106Z cpu7:2099104)WARNING: PLOG: PLOGPropagateErrorInt:3018: Error/unhealthy propagate event on 522e22bd-b897-ec17-1391-1b921e50e3f4

2020-10-06T08:21:50.106Z cpu7:2099104)WARNING: PLOG: PLOGPropagateErrorInt:3018: Error/unhealthy propagate event on 52913675-2337-4322-093f-4233001081a8

2020-10-06T08:21:50.106Z cpu42:2100619)WARNING: LSOM: LSOMEventNotify:7763: vSAN device 522e22bd-b897-ec17-1391-1b921e50e3f4 is under propagated permanent error.

2020-10-06T08:21:50.106Z cpu42:2100619)WARNING: LSOM: LSOMEventNotify:7763: vSAN device 52913675-2337-4322-093f-4233001081a8 is under propagated permanent error.

2020-10-06T08:22:37.139Z cpu33:2100619)WARNING: LSOM: LSOM_CapUpdate:338: Throttled: Failed to update disk status for disk 522e22bd-b897-ec17-1391-1b921e50e3f4

2020-10-06T08:22:37.139Z cpu33:2100619)WARNING: LSOM: LSOM_CapUpdate:397: Throttled: Failed to update disk status for disk 522e22bd-b897-ec17-1391-1b921e50e3f4

2020-10-06T08:22:54.521Z cpu45:13020383)WARNING: MonLoader: 734: MonLoaderCallout_GetSharedHostPage: Invalid page offset 0 for region 8 vcpu 0

2020-10-06T08:22:54.521Z cpu45:13020383)WARNING: MonLoader: 734: MonLoaderCallout_GetSharedHostPage: Invalid page offset 0 for region 8 vcpu 1

2020-10-06T08:24:24.458Z cpu6:2100619)WARNING: LSOM: RCDoCachedVmfsIo:4733: VMFS read failed. status: No connection

2020-10-06T08:24:24.458Z cpu6:2100619)WARNING: LSOM: RCEnqueueIo:5137: Failed to do VMFS IO. status: No connection

2020-10-06T08:27:42.143Z cpu10:2100619)WARNING: LSOM: LSOM_CapUpdate:338: Throttled: Failed to update disk status for disk 522e22bd-b897-ec17-1391-1b921e50e3f4

2020-10-06T08:27:42.143Z cpu10:2100619)WARNING: LSOM: LSOM_CapUpdate:397: Throttled: Failed to update disk status for disk 522e22bd-b897-ec17-1391-1b921e50e3f4

[root@host201:~]

TheBobkin
Champion
(Accepted solution)

Hello winsolo​,

So, it has been a long, long time since I noted such things in logs, but out of interest I had a shallow dig into the vdrb log messages, and on some of the cases and PRs I looked at this can be expected behaviour if you have a lot of DLRs (e.g. hundreds). That being said (and provided this is expected logging behaviour in the latest versions etc.), I would advise you take measures to extend how much data your vmkernel.log retains, as only having a few hours of this available will make troubleshooting a variety of issues extremely difficult if not impossible (unless the issue is easily reproduced).
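
For example, something along these lines should extend local retention (sketch only - the size/rotation values are placeholders, and shipping logs to a remote syslog host is generally the more robust option):

# esxcli system syslog config logger set --id=vmkernel --size=10240 --rotate=20

# esxcli system syslog config set --loghost=udp://<your-syslog-host>:514

# esxcli system syslog reload

The first command keeps more and larger vmkernel.log rotations locally; the latter two point the host at a remote syslog target and apply the change.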

The absence of anything error or latency-related logged for the disk in vsandevicemonitord.log basically just tells us that it wasn't DDH that kicked out the disk (e.g. due to relatively long periods of very high latency).
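
If DDH had unmounted it you would normally expect WARNING-level entries against the device in that log rather than just the INFO sampling lines above - a quick generic check (sketch only):

# grep -iE "warning|unhealthy" /var/run/log/vsandevicemonitord.log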

From the vmkwarning.log it informs that the disk (UUID: 5248e7a7-8c24-8b12-13b8-2da7c262cc52 - naa.5002538a072c9a20) was failed due to a dedupe I/O write error. As I mentioned above, (unlike vmkernel.log) it doesn't tell us whether this was due to a returned hardware error, bad blocks being written to, or some other issue, so without knowing this (and if replacing the device is difficult/not possible), I would advise opening a case with my colleagues in vSAN GSS to use an internal tool to perform some actions on the device before adding it back to the Disk-Group (PM me if they are for whatever reason unaware of what the ask is and/or what needs to be done).
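
For context, the eventual remove/re-add of the capacity disk itself is just the standard esxcli workflow - rough sketch only (substitute the real device identifiers; exact flag names can differ slightly by build, and the GSS remediation step in between is not something that can be scripted from our side):

# esxcli vsan storage remove --disk=<naa.ID of the failed capacity disk> --evacuation-mode=noAction

# esxcli vsan storage add --ssd=<naa.ID of the Disk-Group's cache device> --disks=<naa.ID of the remediated capacity disk>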

Bob

winsolo
Enthusiast

Hey TheBobkin​,

As you recommended, I engaged VMware vSAN GSS and had them fix the issue without replacing the SSD.

TheBobkin
Champion

Hello winsolo​,

Thanks for following up.

Just for broader awareness, this is only the case for SSDs marked as failed due to unrecovered read errors - other failure reasons, e.g. H:0x1 or any 0x4 (AKA the sense-code sub-category for 'Hardware error'), or devices with high latency, are not fixable by GSS (nor anyone).

If this failure type is encountered, the process for preventing re-occurrence can be automated in later builds and is documented here:

VMware Knowledge Base

Bob
