VMware Cloud Community
ajlassi123
Contributor

Purple screen on ESXi 6.7 u3

Dear experts,

Recently, I got several purple screens on our production ESXi 6.7 U3 server.

After a restart the server is functional, but about 2 days later it crashes again with the same error.

Version: Cisco-ESXi-6.7U3 build 13806683

Server: Cisco UCS 5108 Server Blade

Has anybody faced the same error, and is there a resolution?

Thanks.

The purple screen and the blade server's log file are attached.

e_espinel
Virtuoso

Hello.
According to the hardware log you sent, the problem could be with the memory DIMMs, the CPU, and/or the mainboard.
There are many events like these in the log:
read 1 correctable ECC errors on CPU1 DIMM B1
Processor P_CATERR #0x50
In this case I recommend:
1. Perform a thorough internal physical cleaning of the blade server (memory slot connectors, CPU heatsink, replace the CPU thermal paste). This should be done by experienced technical personnel.
2. Update all of the blade server's internal firmware (BIOS, board controller, CIMC, SAS, VIC) and clear the internal hardware log.
These two activities are good for your blade server in any case.

Run the blade server and monitor the hardware log for several days. If events related to DIMM B1 occur again, that DIMM should be replaced.
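
If it helps with the monitoring, the counting can be scripted. Below is a minimal sketch in Python, assuming the hardware log has been exported as plain text and that the ECC events look like the line quoted above. The function name `count_ecc_errors` and the sample lines are only illustrative; this is not a Cisco or VMware tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the event quoted in this thread:
#   "read 1 correctable ECC errors on CPU1 DIMM B1"
ECC_RE = re.compile(r"read (\d+) correctable ECC errors? on (CPU\d+) DIMM (\w+)")


def count_ecc_errors(lines):
    """Sum the correctable ECC error counts per (CPU, DIMM) pair."""
    totals = Counter()
    for line in lines:
        match = ECC_RE.search(line)
        if match:
            count, cpu, dimm = match.groups()
            totals[(cpu, dimm)] += int(count)
    return totals


if __name__ == "__main__":
    # Hypothetical sample; in practice read the lines from the exported log file.
    sample = [
        "read 1 correctable ECC errors on CPU1 DIMM B1",
        "Processor P_CATERR #0x50",
        "read 1 correctable ECC errors on CPU1 DIMM B1",
    ]
    for (cpu, dimm), total in count_ecc_errors(sample).most_common():
        print(f"{cpu} DIMM {dimm}: {total} correctable ECC errors")
```

If you run it against the log once a day, a total that keeps growing for the same DIMM would point at that module rather than at a one-off event.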

As a test, you could also directly replace DIMM B1 with a DIMM from another, less important blade server.

Enrique Espinel
Senior Technical Support on IBM, Lenovo, Veeam Backup and VMware vSphere.
VSP-SV, VTSP-SV, VTSP-HCI, VTSP
Please mark my comment as Correct Answer or assign Kudos if my answer was helpful to you, Thank you.