Hi Forum, my host experienced a purple screen (PSOD) with a Machine Check Exception (MCE). I took a picture of the screen and collected the zdump, but I am not able to work out the root cause because of my limited troubleshooting experience. I have attached the picture and the extracted log file. Could you please help me understand the root cause? Part of the log is below.
2016-09-04T10:45:22.687Z cpu3:33122)<6>hub 2-0:1.0: suspended
2016-09-04T10:46:17.228Z cpu0:32782)World: 9740: PRDA 0x418040000000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0
2016-09-04T10:46:17.228Z cpu1:33079)World: 9740: PRDA 0x418040400000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0
2016-09-04T10:46:17.228Z cpu2:32781)World: 9740: PRDA 0x418040800000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0
2016-09-04T10:46:17.228Z cpu3:35426)World: 9740: PRDA 0x418040c00000 ss 0x4018 ds 0x4018 es 0x4018 fs 0x0 gs 0x0
2016-09-04T10:46:17.228Z cpu1:33079)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)
2016-09-04T10:46:17.228Z cpu0:32782)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)
2016-09-04T10:46:17.228Z cpu2:32781)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)
2016-09-04T10:46:17.228Z cpu1:33079)World: 9743: CR0 0x80050031 CR3 0x17b6bb000 CR4 0x42668
2016-09-04T10:46:17.228Z cpu0:32782)World: 9743: CR0 0x80050031 CR3 0x17ba56000 CR4 0x42668
2016-09-04T10:46:17.228Z cpu3:35426)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)
2016-09-04T10:46:17.228Z cpu2:32781)World: 9743: CR0 0x80050031 CR3 0x16908a000 CR4 0x42668
2016-09-04T10:46:17.228Z cpu3:35426)World: 9743: CR0 0x80050031 CR3 0x17c0e3000 CR4 0x42668
2016-09-04T10:46:17.259Z cpu1:33079)Panic: 634: Panic from another CPU (cpu 1, world 33079): ip=0x418011c780a0 randomOff=0x11c00000:
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3
System has encountered a Hardware Error - Please contact the hardware vendor
2016-09-04T10:46:17.259Z cpu1:33079)Backtrace for current CPU #1, worldID=33079, rbp=0x410014740008
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9ba80:[0x41801246093b]e1000_intr@<None>#<None>+0x77 stack: 0x0, 0x41801231005d, 0x101c, 0x
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9baa0:[0x41801231005d]Linux_IRQHandler@com.vmware.driverAPI#9.2+0x25 stack: 0x418012310054
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bad0:[0x418011c5a3d6]IntrCookie_DoInterrupt@vmkernel#nover+0x41e stack: 0x780, 0x0, 0x430
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bb80:[0x418011c56940]IDT_IntrHandler@vmkernel#nover+0x104 stack: 0x0, 0x418040400200, 0x0
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bbb0:[0x418011cc7044]gate_entry_@vmkernel#nover+0x0 stack: 0x0, 0x0, 0x0, 0x0, 0x41804040
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bc70:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180405
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bcc0:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159ff62b, 0x10
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bd40:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0xffffffffffffffff, 0x
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9be60:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x43054b304240, 0x5d01
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bee0:[0x418011e141be]CpuSched_TimedWaitIRQ@vmkernel#nover+0x7e stack: 0x43054b304240, 0x4
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bf30:[0x418011c503ce]helpFunc@vmkernel#nover+0x5f2 stack: 0x0, 0x43054b3037e0, 0x27, 0x0,
2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0
2016-09-04T10:46:17.259Z cpu1:33079)Panic: 769: Halting PCPU 1.
2016-09-04T10:46:17.290Z cpu3:35426)Panic: 634: Panic from another CPU (cpu 3, world 35426): ip=0x418011c780a0 randomOff=0x11c00000:
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU3 in world 35426:vmm0:win7
System has encountered a Hardware Error - Please contact the hardware vendor
2016-09-04T10:46:17.290Z cpu3:35426)Backtrace for current CPU #3, worldID=35426, rbp=0x0
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bcf8:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x418040d
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bd48:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159b4ea2, 0x10
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bdc8:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x4391135a7b00, 0x4391
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bee8:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x410014a2cde0, 0x0, 0xa000
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bf68:[0x418011e140da]CpuSched_VcpuHalt@vmkernel#nover+0x11e stack: 0xffffffff00002001, 0x
2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bfb8:[0x418011cabe39]VMMVMKCall_Call@vmkernel#nover+0x139 stack: 0x418011cab988, 0x0, 0x4
2016-09-04T10:46:17.290Z cpu3:35426)Panic: 769: Halting PCPU 3.
2016-09-04T10:46:17.352Z cpu2:32781)Panic: 634: Panic from another CPU (cpu 2, world 32781): ip=0x418011c780a0 randomOff=0x11c00000:
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU2 in world 32781:coalesceWorl
System has encountered a Hardware Error - Please contact the hardware vendor
2016-09-04T10:46:17.352Z cpu2:32781)Backtrace for current CPU #2, worldID=32781, rbp=0x0
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bbe0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180409
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bc30:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159b79d1, 0x10
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bcb0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0xef, 0x439080d04001,
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bdd0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x0, 0x80069be78, 0x0,
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069be50:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x400002001, 0x4390
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069beb0:[0x418011db9b64]NetCoalesceDefaultWorldCB@vmkernel#nover+0x190 stack: 0x0, 0x0, 0x0,
2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0
2016-09-04T10:46:17.352Z cpu2:32781)Panic: 769: Halting PCPU 2.
2016-09-04T10:46:17.383Z cpu0:32782)Backtrace for current CPU #0, worldID=32782, rbp=0x0
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bcd0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180401
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bd20:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159bc891, 0x10
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bda0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x43018bafc634, 0x4180
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bec0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x0, 0x80071bf68, 0x0,
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bf40:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x20c49ba500002001,
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bfa0:[0x418011dbd5ba]NetCoalesce2WorldCB@vmkernel#nover+0xde stack: 0x4390c00a7100, 0x439
2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0
2016-09-04T10:46:17.413Z cpu0:32782)VMware ESXi 6.0.0 [Releasebuild-4192238 x86_64]
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU0 in world 32782:netCoalesce2
System has encountered a Hardware Error - Please contact the hardware vendor
2016-09-04T10:46:17.416Z cpu0:32782)cr0=0x8001003d cr2=0x1bc1ef78 cr3=0x8001b000 cr4=0x216c
2016-09-04T10:46:17.416Z cpu0:32782)Last branch from 0x418011f02533 to 0x418011f025eb
2016-09-04T10:46:17.417Z cpu0:32782)frame=0x4390c071bc10 ip=0x418011f0263a err=18 rflags=0x10202
2016-09-04T10:46:17.418Z cpu0:32782)rax=0x0 rbx=0x418040000000 rcx=0x0
2016-09-04T10:46:17.418Z cpu0:32782)rdx=0x0 rbp=0x0 rsi=0x27f2159bb9c6
2016-09-04T10:46:17.418Z cpu0:32782)rdi=0x43004d0f51f0 r8=0x15 r9=0x0
2016-09-04T10:46:17.419Z cpu0:32782)r10=0x0 r11=0x43004d0c8438 r12=0x418040000200
2016-09-04T10:46:17.419Z cpu0:32782)r13=0x0 r14=0x40 r15=0x0
2016-09-04T10:46:17.419Z cpu0:32782)pcpu:0 world:32782 name:"netCoalesce2World" (S)
2016-09-04T10:46:17.420Z cpu0:32782)pcpu:1 world:33079 name:"helper39-3" (SH)
2016-09-04T10:46:17.420Z cpu0:32782)pcpu:2 world:32781 name:"coalesceWorld-0" (S)
2016-09-04T10:46:17.420Z cpu0:32782)pcpu:3 world:35426 name:"vmm0:win7" (V)
2016-09-04T10:46:17.420Z cpu0:32782)@BlueScreen: Machine Check Exception: Fatal (unrecoverable) MCE on PCPU0 in world 32782:netCoalesce2
System has encountered a Hardware Error - Please contact the hardware vendor
2016-09-04T10:46:17.420Z cpu0:32782)Code start: 0x418011c00000 VMK uptime: 0:04:53:32.503
2016-09-04T10:46:17.421Z cpu0:32782)0x4390c071bcd0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0
2016-09-04T10:46:17.422Z cpu0:32782)0x4390c071bd20:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159bc891
2016-09-04T10:46:17.423Z cpu0:32782)0x4390c071bda0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x43018bafc634
2016-09-04T10:46:17.424Z cpu0:32782)0x4390c071bec0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0
2016-09-04T10:46:17.425Z cpu0:32782)0x4390c071bf40:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x20c49ba500002001
2016-09-04T10:46:17.426Z cpu0:32782)0x4390c071bfa0:[0x418011dbd5ba]NetCoalesce2WorldCB@vmkernel#nover+0xde stack: 0x4390c00a7100
2016-09-04T10:46:17.427Z cpu0:32782)0x4390c071bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0
2016-09-04T10:46:17.435Z cpu0:32782)base fs=0x0 gs=0x418040000000 Kgs=0x0
2016-09-04T10:46:17.435Z cpu0:32782)3 other PCPUs are in panic.
2016-09-04T10:46:17.228Z cpu3:35426)MC:PCPU3 B:5 S:0xb200000080200e0f M:0x0 A:0x0 5
2016-09-04T10:46:17.228Z cpu0:32782)MC:PCPU0 B:5 S:0xb200001044100e0f M:0x0 A:0x0 4
2016-09-04T10:46:17.228Z cpu2:32781)MC:PCPU2 B:5 S:0xb200000084200e0f M:0x0 A:0x0 5
2016-09-04T10:46:17.228Z cpu1:33079)MC:PCPU1 B:5 S:0xb200001040100e0f M:0x0 A:0x0 4
Thanks in advance!!
Hi,
I checked the logs. The cause of this problem is a hardware fault: all four PCPUs report a fatal (unrecoverable) machine check exception. Can you run a hardware check on the host?
Thanks.
Thanks for the reply! Is it possible to pinpoint the component like RAM, CPU, storage etc.?
I think this is a CPU issue, but you will have to run the vendor's hardware diagnostics on the server to confirm it.
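If it helps, the MC:PCPUn lines at the end of the log can be decoded a little further. Below is a minimal Python sketch, assuming an Intel CPU and assuming that B: is the machine-check bank number and S: is the raw IA32_MCi_STATUS value (that reading of the ESXi line format is my interpretation). The bit positions follow my reading of the machine-check chapter of the Intel SDM. The function name decode_mc_line and the dictionary keys are made up for this example.

```python
import re

# Flag bits of IA32_MCi_STATUS (assumed layout, per the Intel SDM).
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Assumed field meaning: B = bank, S = status, M = misc, A = address.
MC_LINE = re.compile(
    r"MC:PCPU(\d+) B:(\d+) S:(0x[0-9a-fA-F]+) M:(0x[0-9a-fA-F]+) A:(0x[0-9a-fA-F]+)"
)

def decode_mc_line(line):
    """Decode one 'MC:PCPUn B:.. S:.. M:.. A:..' log line, or return None."""
    m = MC_LINE.search(line)
    if m is None:
        return None
    status = int(m.group(3), 16)
    mca_code = status & 0xFFFF  # bits 15:0, MCA error code
    return {
        "pcpu": int(m.group(1)),
        "bank": int(m.group(2)),
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "mca_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        # Compound codes with bit 11 set are the bus/interconnect class.
        "bus_interconnect": bool(mca_code & 0x0800),
    }

if __name__ == "__main__":
    sample = "cpu3:35426)MC:PCPU3 B:5 S:0xb200000080200e0f M:0x0 A:0x0 5"
    print(decode_mc_line(sample))
```

For the four MC lines in your log this gives bank 5 with VAL, UC, EN and PCC set and MCA error code 0x0e0f on every PCPU, which falls in the bus and interconnect class. That class alone does not identify the failing part, so the vendor diagnostics are still needed.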
Thanks, one more question: what does "world 33079:helper39-3" mean in "Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3"? Did this error originate from a virtual machine or from the host itself?
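One way to look at this from the log itself is to list which world each PCPU was running when it reported the MCE. The sketch below is only an illustration in Python. The function name parse_mce_worlds is made up, and the pattern simply matches the "MCE on PCPUn in world id:name" text as it appears in the log above.

```python
import re

# Matches e.g. "MCE on PCPU1 in world 33079:helper39-3"
MCE_LINE = re.compile(r"MCE on PCPU(\d+) in world (\d+):(\S+)")

def parse_mce_worlds(log_text):
    """Return {pcpu: (world_id, world_name)} for every MCE line found."""
    result = {}
    for m in MCE_LINE.finditer(log_text):
        result[int(m.group(1))] = (int(m.group(2)), m.group(3))
    return result

if __name__ == "__main__":
    sample = (
        "Machine Check Exception: Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3\n"
        "Machine Check Exception: Fatal (unrecoverable) MCE on PCPU3 in world 35426:vmm0:win7\n"
    )
    for pcpu, (world_id, name) in sorted(parse_mce_worlds(sample).items()):
        print(pcpu, world_id, name)
```

In the log above all four PCPUs report the MCE, three of them while running vmkernel worlds (helper39-3, coalesceWorl, netCoalesce2) and one while running the win7 virtual machine (vmm0:win7). That spread suggests the world name only records what happened to be scheduled on that PCPU when the exception arrived, not the source of the error.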