There are existing KBs on this
That one describes what you are seeing, but recommends you collect and analyse the logs to find the root cause using
VMware KB: Extracting the log file after an ESX or ESXi host fails with a purple screen error
However, it does appear to think that one of your CPUs is faulty.
Do you have hardware monitoring/logs such as an IML on an HP server?
That may also tie in with the above logs to show when a hardware failure/issue/overheat occurred.
Regards
Chris
I will make sure to grab a core dump and extract the logs the next time this happens, per your link. In the meantime, I will run some I/O-intensive tasks to see if I can get it to 'purple screen'.
I had also looked at the KB article before for the error I am experiencing. My 'purple screen' is much smaller. I have downgraded to ESXi 5.1u1, and will step it up to the new u2 once I can assume it is stable.
We're seeing the identical PSOD. We've gotten it on 2 blades (BL460c Gen6) in our server cluster in the past 4 weeks. We've opened a ticket with VMware regarding it. Did you ever manage to get a resolution? Our other datacenter server cluster is on the same version of ESX, but (so far) hasn't had the issue. Those are G7s though.
I would suggest patching your systems to the latest patch level and also opening an SR with VMware to perform an RCA. A PSOD is not something you can easily troubleshoot, and the messages can be cryptic. But looking at the attached screen, you may be having some issues with your memory or memory lookup.
VMware KB: Understanding a Failed to ack TLB invalidate purple diagnostic screen
vfk
Hi,
If you have an HP server, then maybe this:
Thanks for the link. It doesn't QUITE match ours, but who knows, maybe it's quite similar. Here's what we have seen: