VMware Cloud Community
nics30
Contributor

Machine Check Error message on ESX Server.

We have a relatively new X4200 Mseries server running ESX Server 3.5 Update 2. It is logging the following message in both /var/log/vmkernel and /var/log/vmkwarning:

cpu4:1128)WARNING: MCE: 230: Machine Check Error: Bank 4

I searched the VMware Communities and this looks like a hardware-related error. I am running the VTS hardware diagnostics through the ALOM and will post my findings on what caused this message.

Does anyone have any relevant info to add onto this?

5 Replies
AndreTheGiant
Immortal

Use memtest86 to check if memory is fine.

See also:

Decoding Machine Check Exception (MCE) output after a purple screen error

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1005184&sl...

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
pramodupadhyay5
Enthusiast

This error clearly shows that the RAM in bank 4 is faulty, so replace it or try to power on the server with the bank 4 RAM.

If you found this or any other answer useful please consider the use of the Helpful or Correct buttons to award points

nics30
Contributor

I ran extensive diagnostic tests on the memory (10 hours) and no errors were reported. The hardware appears to be fine.

Is there anything else that could be causing this issue?

Datto
Expert

If Memtest86 or Memtest86+ reports that your memory is good after six full test iterations, you might need a BIOS or firmware update instead. This assumes all the memory in the machine is identical in manufacturer, speed, and model.

Datto

BenConrad
Expert

This is incorrect: the bank number does not correspond to a memory bank. Because this server has AMD CPUs, bank 4 would be 'Northbridge and DRAM'. You would need to (attempt to) decipher the info in KB 1005184 to see what the status message means. Here is an example MCE entry:

vmkernel: 0:09:55:02.520 cpu0:1024)WARNING: MCE: 196: Machine Check Error: Bank 8, Status 8c0000400001009f

According to the KB, you are supposed to be able to decode the status message. I just got the message above and I'm considering sending it off to support. BTW, I have an Intel CPU and the KB says "In this case the CPU manufacturer is Intel and thus, you are not able to determine the significance of the bank number".... great....
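For anyone who wants a head start before digging through the KB, here is a rough Python sketch that pulls the CPU, bank, and status out of a vmkernel MCE line and splits the status into its top-level fields. It assumes the 64-bit status follows the usual x86 MCi_STATUS layout (flag bits 63 down to 57, an error code in bits 15:0, and bits 31:16 treated as a model-specific code as on Intel; AMD subdivides those bits differently). The function names and the regular expression are my own, not anything from VMware or the KB, and the meaning of the error code itself still depends on the CPU vendor, so treat the result as a starting point only.

```python
import re

# Pattern for vmkernel MCE warnings like the ones quoted in this thread.
# The ", Status ..." part is optional because not every line carries it.
MCE_LINE = re.compile(
    r"cpu(?P<cpu>\d+):\d+\)WARNING: MCE: \d+: Machine Check Error: "
    r"Bank (?P<bank>\d+)(?:, Status (?P<status>[0-9a-fA-F]+))?"
)

# Flag bits at the top of an MCi_STATUS value (assumed standard x86 layout).
FLAG_BITS = {
    "VAL": 63,    # register contains valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # the matching MISC register holds valid data
    "ADDRV": 58,  # the matching ADDR register holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def parse_mce_line(line):
    """Return cpu, bank and status (int or None) from one log line."""
    match = MCE_LINE.search(line)
    if match is None:
        return None
    status = match.group("status")
    return {
        "cpu": int(match.group("cpu")),
        "bank": int(match.group("bank")),
        "status": int(status, 16) if status is not None else None,
    }


def decode_status(status):
    """Split a 64-bit status value into flags and raw code fields."""
    decoded = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    decoded["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    decoded["mca_error_code"] = status & 0xFFFF               # bits 15:0
    return decoded


if __name__ == "__main__":
    sample = ("vmkernel: 0:09:55:02.520 cpu0:1024)WARNING: MCE: 196: "
              "Machine Check Error: Bank 8, Status 8c0000400001009f")
    entry = parse_mce_line(sample)
    print(entry["cpu"], entry["bank"], hex(entry["status"]))
    for field, value in decode_status(entry["status"]).items():
        print(field, value)
```

Run against the example entry above, this shows which flags are set (for instance whether the error was uncorrected) and gives you the raw code fields to look up in the KB or the CPU vendor's documentation.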

Ben
