Re: ESXi 6.7 vSAN host experiences a purple diagno...

junhao666 · ‎09-25-2020

Setup:

1.vCenter 6.7 build 16708996

2.Hosts 6.7 build 15160138

3.VSAN disks on version 10.0

I've been having trouble with that and i'm no idea how to fix it,.

Nawals · ‎09-26-2020

Here is the kb for same issue. However, they saying are fixed in 6.5 VSAN 6.6.1. Please check with VMware support

VMware Knowledge Base

NKS Please Mark Helpful/correct if my answer resolve your query.

TheBobkin · ‎09-26-2020

Hello junhao666,

Welcome to Communities.

A few questions:

- Is this PSOD recursive? (e.g. it occurs every time you reboot the ESXi host)

- If yes, is the data in the remaining nodes in the cluster currently healthy?

- If no, is this a production cluster or a homelab?

Nawals, this PSOD may be occurring for a completely different reason than in the kb you mentioned, but in the same problem area (SSD log is damaged and failing to be read correctly).

Bob

junhao666 · ‎09-26-2020

It's very similar but it's not this point,but thanks your help.

junhao666 · ‎09-26-2020

Hi,Bob

1.The problem occurred while the host was running normally and I need to restart the host to get back to normal

2.The data in the remaining nodes in the cluster is currently healthy and Everything was healthy when I ran the Cluster vSAN Skyline Healty check.

3.It's a Production cluster but can remove the error host

TheBobkin · ‎09-27-2020

Hello,

"1.The problem occurred while the host was running normally and I need to restart the host to get back to normal"

I understand, but I was specifically asking if it is recursive e.g. occurring every time you reboot the host - if it is (and all data is healthy) then it should just be a case of rebooting the host and then during ESXi preboot press shift+o, disable the vSAN modules and remove the currently problematic Disk-Group, then reboot the host normally:

VMware Knowledge Base

Another temporary alternative to removing the Disk-Group is to detach the Cache-tier SSD in question (either physically, via Controller/BIOS settings or from the vSphere UI storage devices page after temporarily disabling vSAN modules).

Bob

junhao666 · ‎09-27-2020

Hi.

Not all data is healthy when the host is rebooted, the machine needs to be started in 60 minutes, or components will be resynchronized

Did you mean there is something wrong with my SSD cache disk?

All

ESXi 6.7 vSAN host experiences a purple diagnostic screen at bora/modules/vmkernel/lsomcommon/ssdlog/ssdopslog.c:130