Welcome to the Community,
I'd suggest you run a hardware diagnostic on your server. If the system manufacturer does not provide such a tool, then at least run a memory check. A lot of errors are caused by defective memory. (http://kb.vmware.com/kb/831)
André
Hi,
I tried to decode the MCA, but I don't think I got it right. Perhaps someone could help me. I have attached the kernel dump.
It would be nice if we could work through this together. This is the first time I have done this.
Thx for your help.
Some System information for you:
System: ICO (Intel S5000PSL)
CPU: Dual Intel Xeon E5410 @ 2.33 GHz
RAM: 16 GB
Running VMs: 6 (all Windows 2003 R2 32-bit)
I told my colleagues to run Memtest asap.
Klaus
The dumps aren't always the easiest to decode. MCEs are typically hardware-related, usually DIMMs or CPU. The first thing I would do is ensure all your firmware is current. Then take the VMkernel dump and send it to your hardware vendor.
I would also suggest generating a vm-support dump, which will include the VMkernel dump, and sending that to your hardware vendor as well.
Hi,
I ran Memtest86+ for 72 hours. No errors.
Now I'm searching for a (freeware) CPU stress test util.
Do you know a good one?
Thx,
Klaus
Hi,
If you look carefully at the second line of the PSOD screen itself, it explains to a certain degree what the problem is.
Your server has experienced what is known as a Machine-Check exception (#MC). Near the bottom of the screen you should see the information from the registers of the Machine-Check Architecture of the CPU that generated the exception.
Going back to the second line, we also see that VMware ESX has decoded the "MCA Error Code" field of the status register: it reports that a Bus and Interconnect error was seen. Because this class of error is not limited to the DIMMs, memory tests alone may not identify the problematic hardware. Take the data from the screen and provide it to your hardware vendor for review so they can take the required action to correct the hardware problem that caused this crash in the first place.
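For anyone who wants to see roughly how such a status value breaks down, here is a minimal Python sketch. It assumes the architectural IA32_MCi_STATUS layout from the Intel manuals (flag bits 63 to 57, MCA error code in bits 15:0) and the compound error code patterns listed there. The function names and the sample value are made up for illustration; the sample is not taken from the attached dump, and model-specific details for a given CPU may differ.

```python
# Rough decoder for a 64-bit IA32_MCi_STATUS value (illustrative sketch only).
# Function names and the sample value are hypothetical, not from the dump.

def classify_mca_code(code: int) -> str:
    """Classify the 16-bit MCA error code by its compound-code pattern."""
    c = code & 0x0FFF            # ignore bit 12 (filter bit) and above
    if c & 0x800:                # 0000 1PPT RRRR IILL
        return "bus and interconnect"
    if c & 0x100:                # 0000 0001 RRRR TTLL
        return "cache hierarchy"
    if c & 0x080:                # 0000 0000 1MMM CCCC
        return "memory controller"
    if c & 0x010:                # 0000 0000 0001 TTLL
        return "TLB"
    if (c & 0x00C) == 0x00C:     # 0000 0000 0000 11LL
        return "generic cache hierarchy"
    return "simple or other"


def decode_bus_fields(code: int) -> dict:
    """Split a bus and interconnect code (0000 1PPT RRRR IILL) into fields."""
    return {
        "participation": (code >> 9) & 0x3,  # PP: 0 source, 1 responder, 2 observer, 3 generic
        "timeout": bool((code >> 8) & 0x1),  # T
        "request": (code >> 4) & 0xF,        # RRRR: 0 generic, 1 read, 2 write, ...
        "mem_or_io": (code >> 2) & 0x3,      # II: 0 memory, 2 I/O, 3 other
        "level": code & 0x3,                 # LL: 0 to 2 are levels, 3 is generic
    }


def decode_mci_status(status: int) -> dict:
    """Extract the architectural flags and error codes from the status value."""
    code = status & 0xFFFF
    return {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC
        "enabled": bool((status >> 60) & 1),          # EN
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": code,
        "error_class": classify_mca_code(code),
    }


if __name__ == "__main__":
    sample = 0xB200000000000E0F  # hypothetical value, for illustration only
    for name, value in decode_mci_status(sample).items():
        print(f"{name}: {value}")
    print(decode_bus_fields(sample & 0xFFFF))
```

Substitute the status value shown in the register dump near the bottom of your own PSOD for the sample. The result only narrows down the class of error; the hardware vendor is still the one to interpret the model-specific part.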
I hope this helps.
Faisal Akber