VMware Cloud Community
junhao666
Contributor

ESXi 6.7 vSAN host experiences a purple diagnostic screen at bora/modules/vmkernel/lsomcommon/ssdlog/ssdopslog.c:130

Setup:

1. vCenter 6.7 build 16708996

2. Hosts 6.7 build 15160138

3. vSAN disks on version 10.0

I've been having trouble with this and have no idea how to fix it.

[Attached image: 2d41c6feca8b5f3bfaff7348e53dcf7.png]

6 Replies
Nawals
Expert

Here is the KB for the same issue. However, it says the issue is fixed in 6.5 / vSAN 6.6.1. Please check with VMware support.

VMware Knowledge Base

NKS. Please mark as helpful/correct if my answer resolves your query.
TheBobkin
Champion

Hello junhao666,

Welcome to Communities.

A few questions:

- Is this PSOD recurring? (i.e. does it occur every time you reboot the ESXi host?)

- If yes, is the data in the remaining nodes in the cluster currently healthy?

- If no, is this a production cluster or a homelab?

Nawals, this PSOD may be occurring for a completely different reason than the one in the KB you mentioned, but in the same problem area (the SSD log is damaged and cannot be read correctly).

Bob

junhao666
Contributor

It's very similar, but it's not the same issue. Thanks for your help.

junhao666
Contributor

Hi Bob,

1. The problem occurred while the host was running normally, and I need to restart the host to get back to normal.

2. The data in the remaining nodes in the cluster is currently healthy, and everything was healthy when I ran the cluster vSAN Skyline Health check.

3. It's a production cluster, but I can remove the faulty host.

TheBobkin
Champion

Hello,

"1.The problem occurred while the host was running normally and I need to restart the host to get back to normal"

I understand, but I was specifically asking whether it is recurring, i.e. occurring every time you reboot the host. If it is (and all data is healthy), then it should just be a case of rebooting the host, pressing Shift+O during ESXi preboot, disabling the vSAN modules and removing the currently problematic Disk-Group, then rebooting the host normally:

VMware Knowledge Base

Another temporary alternative to removing the Disk-Group is to detach the Cache-tier SSD in question (either physically, via Controller/BIOS settings or from the vSphere UI storage devices page after temporarily disabling vSAN modules).
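
Before removing a Disk-Group or detaching a cache device, it helps to confirm which device is the Cache-tier SSD and which capacity devices belong to the same Disk-Group. Below is a minimal Python sketch of that lookup. It assumes you have saved the output of `esxcli vsan storage list` as text and that the output uses the indented "key: value" layout shown in the sample. The device names and UUIDs in `SAMPLE` are invented for illustration, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample in the "key: value" layout assumed for
# `esxcli vsan storage list`. Device names and UUIDs are invented.
SAMPLE = """\
naa.cache0001
   Device: naa.cache0001
   Is SSD: true
   VSAN UUID: uuid-cache-1
   VSAN Disk Group UUID: uuid-cache-1
   Is Capacity Tier: false

naa.cap0001
   Device: naa.cap0001
   Is SSD: true
   VSAN UUID: uuid-cap-1
   VSAN Disk Group UUID: uuid-cache-1
   Is Capacity Tier: true
"""


def parse_devices(text):
    """Split the listing into one dict of fields per device."""
    devices, current = [], None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new device block.
            current = {}
            devices.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return devices


def group_by_disk_group(devices):
    """Map each Disk-Group UUID to its cache device and capacity devices."""
    groups = defaultdict(lambda: {"cache": None, "capacity": []})
    for dev in devices:
        group = groups[dev["VSAN Disk Group UUID"]]
        if dev.get("Is Capacity Tier", "").lower() == "true":
            group["capacity"].append(dev["Device"])
        else:
            group["cache"] = dev["Device"]
    return dict(groups)


if __name__ == "__main__":
    for uuid, members in group_by_disk_group(parse_devices(SAMPLE)).items():
        print(uuid, "cache:", members["cache"], "capacity:", members["capacity"])
```

The sketch only reads text and never touches the host, so it is safe to run on a workstation against saved output. Check the result against the vSphere UI before acting on it.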

Bob

junhao666
Contributor

Hi.

Not all data is healthy while the host is rebooting. The machine needs to be back up within 60 minutes, or the components will be resynchronized.

Do you mean there is something wrong with my SSD cache disk?
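
The 60-minute window mentioned above is simple arithmetic, sketched below in Python. The 60-minute value is taken from this post. The real delay depends on how the cluster is configured, so treat it as an assumption. The function names and the timestamps in the usage example are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the 60-minute window described in the post above.
REPAIR_DELAY = timedelta(minutes=60)


def resync_deadline(host_down_at, delay=REPAIR_DELAY):
    """Time after which components on the absent host would be rebuilt elsewhere."""
    return host_down_at + delay


def minutes_left(host_down_at, now, delay=REPAIR_DELAY):
    """Whole minutes remaining before the deadline, never below zero."""
    remaining = resync_deadline(host_down_at, delay) - now
    return max(0, int(remaining.total_seconds() // 60))


if __name__ == "__main__":
    down = datetime(2021, 1, 1, 10, 0)  # hypothetical time the host went down
    print(resync_deadline(down))
    print(minutes_left(down, datetime(2021, 1, 1, 10, 45)))
```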
