VMware Cloud Community
vmroyale
Immortal

PSOD - Anyone seen this one?

Came in to work this morning to be greeted by my first-ever PSOD. I checked out KB 1004250, but I can't find much info on this one. Has anyone seen this before, or have any guesses about it?

Thanks,

Brian

Brian Atkinson | vExpert | VMTN Moderator | Author of "VCP5-DCV VMware Certified Professional-Data Center Virtualization on vSphere 5.5 Study Guide: VCP-550" | @vmroyale | http://vmroyale.com

Troy_Clavell
Immortal

MCEs (machine check exceptions) are always hardware related. I would check the CPU first, but it could also be a memory issue. VMware Support won't help with MCEs, as they are hardware-vendor specific.

Rohail2004
Enthusiast

What were you doing on the host before it crashed? Were you doing storage scanning?

vmroyale
Immortal

Thanks, Troy. The Dell OpenManage agents are reporting no problems, the hardware status tab looks clean for this host, and I can't find anything of real interest in the logs either. I guess I'll move a few unimportant VMs onto this host and continue to monitor it for a while.

Troy_Clavell
Immortal

Also, when the ESX host comes back up, you should have a dump file in the root directory (/).

I usually open them with WinSCP. Sometimes it's obvious what the error is, sometimes not, but look for the keyword MCE.
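
For anyone who would rather script that keyword search, here is a minimal sketch. It assumes the dump has already been extracted to readable text, and both the helper name `find_mce_lines` and the search patterns are illustrative guesses rather than anything VMware documents.

```python
import re

# Assumption: the dump is available as plain text.
# The patterns are a guess at useful keywords, not an official list.
MCE_PATTERN = re.compile(r"\bMCE\b|machine[ -]check", re.IGNORECASE)


def find_mce_lines(text):
    """Return (line_number, line) pairs for lines that mention an MCE."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if MCE_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

To use it, read the extracted dump into a string (for example with `open(path, errors="replace").read()`, where `path` is wherever you copied the file) and print the pairs it returns.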

vmroyale
Immortal

From what I can tell, nothing was happening on the host. This happened before work hours began, so there were no manual admin actions under way. In the hour leading up to the HA event, every VM on the host had CPU alarms issued as CPU usage went from green to yellow and then red.

Rohail2004
Enthusiast

I had the same problem a couple of weeks ago, but in our case the storage guy was performing storage scanning when the ESX host crashed. I rebooted it, it came back up, and it has been running fine since. I sent the logs to VMware Support, and they said it could be caused by an old HBA firmware issue.

vmroyale
Immortal

I used the dump info along with some info I found in KB 1005184. It appears to be something with the CPU data cache on CPU 2. I'm moving some "less important" VMs back onto the host now and will keep watching it. Good times on a Friday going into a holiday weekend!!
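
For readers taking the same route, here is a rough sketch of how a machine-check status value can be broken down into a "data cache" style diagnosis. It assumes an Intel CPU and the IA32_MCi_STATUS layout described in the Intel Software Developer's Manual; the thread does not say which CPU this host has. The helper name `decode_mci_status` is hypothetical, and the sample value is made up to exercise the decoder, not taken from this host's dump.

```python
# Assumption: Intel IA32_MCi_STATUS layout; not confirmed for the host in this thread.
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN", 59: "MISCV", 58: "ADDRV", 57: "PCC"}
CACHE_TYPE = {0b00: "instruction", 0b01: "data", 0b10: "generic"}
CACHE_LEVEL = {0b00: "L0", 0b01: "L1", 0b10: "L2", 0b11: "generic"}


def decode_mci_status(status):
    """Split a 64-bit status value into flags and, if present, cache error fields."""
    code = status & 0xFFFF  # bits 15:0 hold the MCA error code
    result = {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "mca_error_code": code,
    }
    # Cache hierarchy errors have the form 000F 0001 RRRR TTLL (F = filter bit).
    if (code & 0xEF00) == 0x0100:
        result["cache_type"] = CACHE_TYPE.get((code >> 2) & 0b11, "reserved")
        result["cache_level"] = CACHE_LEVEL[code & 0b11]
    return result


# Hypothetical status value, chosen only for illustration.
example = decode_mci_status(0xB200000000000135)
print(example)
```

The CPU number itself is not part of the status value; it comes from whichever CPU's registers the purple screen or dump reports.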

Troy_Clavell
Immortal

Nothing like rolling into the holiday weekend smoothly.
