VMware Cloud Community
Rprice
Contributor

Recent Upgrade to Build-2135321 then PSOD 1-2 weeks later

I have two Dell PowerEdge R805s in the datacenter, and about a month ago I used VMware Update Manager to upgrade them both to build 2135321.

One of the two hosts has hit the following PSOD twice since the remediation.

VMware ESX Server 3i

Machine Check Exception: Unable to continue

cr2=0x251cc cr3=0xac8ea000 cr4=0x668

Frame=0x3c4fdc ip=0x62b3e5

Es=0xffffffff ds=0xffffffff fs=0xffffffff gs=0xffffffff

Eax=0xffffffff ebx=0xffffffff ecx=0xffffffff edx=0xffffffff

Ebp=0x3c4fed4 esi=0xffffffff edi=0xffffffff err=-1 eflags=0xffffffff

0 : 5269/vmm1 : ZH-A *1 : 4243/vmm0 : ZH-A 2 : 4798/vmm0 : ZZ-A 3 : 3983/vmm0 : ZZ-S

4 : 4501/vmm0 : ZZ-E 5 : 4014/mks : Pass4 6 : 5260/vmm1 : ZZ-T 7 : 4505/vmm1 : ZZ-E

0x3c4fed4 : Panic+0x10 stack : 0x84ea58, 0x0, 0x0

0x3c4feec : MCE_HandleException+0x6b stack: 0x3c4ff58, 0x0, 0x248ebd8

0x3c4ff84: IDT_VMMIntOrMCE+0x7e stack: 0x2d, 0x82000000, 0x3af77d50

0x3c4ffd8: VMKCall+0x12c stack: 0x2d, 0x3af77d50, 0x82000000

0x3c4fffc: VMKVMMEnterVMKernel+0x8e stack: 0x0, 0x0, 0x0

VMK uptime: 10:09:16:19.271 TSC: 1794830457833848

10:09:16:19.269 cpu1:4243)MCE: 169 Machine Check Exception: General Status 0000000000000004

10:09:16:19.269 cpu1:4243)MCE: 193 Machine Check Exception: Bank 0, Status f600000000010015

10:09:16:19.269 cpu1:4243)MCE: 226: Machine Check Exception: Bank 0, Addr 0000f80001780000, Valid TRUE

Starting coredump to disk using slot 1 of 1… 987666665432110 Disk dump successful.

Waiting for debugger… (world 4243)

Debugger is listening on serial port …

Press Escape to enter local debugger

Is this a memory issue, or should I wait until it happens again and have a serial cable plugged in to get to the bottom of it? The most annoying part of this PSOD is that HA does not kick in and fail the VMs over to my other host, even though HA is technically fully functional.

2 Replies
Rprice
Contributor

Still looking for help on this issue. This PSOD seems to happen every 2-3 weeks.

marcelo_soares
Champion

If you have a "Machine Check Exception", you are running into hardware problems for sure. Contact your hardware vendor. Usually this problem means a damaged memory bank.
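Before calling the vendor, it can help to decode the two status values from the PSOD so you can tell them what the CPU actually reported. The snippet below is only a rough sketch: it assumes the standard x86 machine-check register layout (flag bits 63 to 57 of the bank status, error code in the low 16 bits, and the RIPV/EIPV/MCIP bits of the general status). The function names are made up for illustration, and the meaning of the error code itself is CPU-specific, so check it against the processor vendor's documentation.

```python
# Sketch: decode x86 machine-check status values copied from a PSOD.
# Assumes the architectural MCG_STATUS / MCi_STATUS bit layout; the
# helper names here are hypothetical, not part of any VMware tool.

def decode_mcg_status(value: int) -> dict:
    """Decode the 'General Status' (MCG_STATUS) value."""
    return {
        "RIPV": bool(value & 0x1),  # restart IP valid
        "EIPV": bool(value & 0x2),  # error IP valid
        "MCIP": bool(value & 0x4),  # machine check in progress
    }

def decode_mci_status(value: int) -> dict:
    """Decode a per-bank status (MCi_STATUS) value."""
    flag_bits = {
        "VAL": 63,    # register holds a valid error
        "OVER": 62,   # an earlier error was overwritten
        "UC": 61,     # error was not corrected by hardware
        "EN": 60,     # reporting was enabled for this error
        "MISCV": 59,  # MISC register holds extra information
        "ADDRV": 58,  # ADDR register holds a valid address
        "PCC": 57,    # processor context may be corrupt
    }
    decoded = {name: bool((value >> bit) & 1) for name, bit in flag_bits.items()}
    decoded["mca_error_code"] = value & 0xFFFF              # bits 15:0
    decoded["model_specific_code"] = (value >> 16) & 0xFFFF  # bits 31:16
    return decoded

# Values taken from the PSOD in the first post.
general = decode_mcg_status(0x0000000000000004)
bank0 = decode_mci_status(0xF600000000010015)

print(general)
print(bank0)
```

With the values from this PSOD, the bank 0 status decodes as a valid, uncorrected error with a valid address, which agrees with the "Valid TRUE" on the Addr line. Which component bank 0 maps to depends on the CPU model, so this does not by itself say whether a DIMM or the processor is at fault.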

Marcelo Soares

VMware Certified Professional 310/410

Virtualization Tech Master

Globant Argentina
