Hi All,
I have a Dell R430 with an ESXi PSOD issue. The installed ESXi 6.5.0 U3 hits a PSOD as it starts to boot, and sometimes it gets as far as powering on some VMs and then ends up with a PSOD.
I tried updating the BIOS, firmware, and drivers with the platform-specific bootable ISO from Dell but still got the same PSOD. Sometimes it changes to GP exception 13.
I also tried reinstalling ESXi with version 6.5.0 U3 (DellEMC customized) and with version 7.0, but both also hit a PSOD during the state below, with a different module each time. The last time, it got as far as the upgrade/installation stage and hit a PSOD during installation. It says bad c-state. Looks like a faulty CPU?
Many PSODs are due to a memory DIMM going bad. As soon as an unrecoverable memory error is discovered, the CPU throws an exception to the OS and ESXi instantly dumps to a purple screen of death to protect your data from corruption. That usually shows up as a machine check exception, but corrupted memory can also surface as a General Protection (GP) fault like the exception 13 you're seeing.
If your server can POST and boot normally - and the PSOD happens randomly after that - I'd bet you have a bad DIMM somewhere. It could also be a faulty CPU or even a voltage regulator that can't maintain enough power to the CPU.
Run MEMTest on your server for a few hours and I bet you'll find the culprit. Dell also has a utility they call the 32-bit diagnostic tool that they recommend running to verify the health of the hardware. Here's a link to the kbase article.
Arggh... I disabled C-states and C1E in the BIOS and it's back to working like normal... not sure what the hell is going on...
That’s ok, they should be disabled for ESXi to get the most consistent performance. I couldn’t find a Dell document that states this, but HPE and Cisco both have BIOS guides that recommend disabling C-states to prevent the CPU from dropping into deeper power-saving idle states.
I think you solved your own issue.
