VMware Cloud Community
Virtuali3ed
Enthusiast

Purple Screen - Requesting help to understand root cause

Hi Forum, my host experienced a purple screen (PSOD) with a Machine Check Exception (MCE). I took a picture of the screen and collected the zdump, but I am not able to interpret the root cause because of my limited troubleshooting experience. I have attached the picture and the extracted log file. Could you please help me understand the root cause? A portion of the log is below.

2016-09-04T10:45:22.687Z cpu3:33122)<6>hub 2-0:1.0: suspended

2016-09-04T10:46:17.228Z cpu0:32782)World: 9740: PRDA 0x418040000000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0

2016-09-04T10:46:17.228Z cpu1:33079)World: 9740: PRDA 0x418040400000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0

2016-09-04T10:46:17.228Z cpu2:32781)World: 9740: PRDA 0x418040800000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0

2016-09-04T10:46:17.228Z cpu3:35426)World: 9740: PRDA 0x418040c00000 ss 0x4018 ds 0x4018 es 0x4018 fs 0x0 gs 0x0

2016-09-04T10:46:17.228Z cpu1:33079)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)

2016-09-04T10:46:17.228Z cpu0:32782)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)

2016-09-04T10:46:17.228Z cpu2:32781)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)

2016-09-04T10:46:17.228Z cpu1:33079)World: 9743: CR0 0x80050031 CR3 0x17b6bb000 CR4 0x42668

2016-09-04T10:46:17.228Z cpu0:32782)World: 9743: CR0 0x80050031 CR3 0x17ba56000 CR4 0x42668

2016-09-04T10:46:17.228Z cpu3:35426)World: 9742: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)

2016-09-04T10:46:17.228Z cpu2:32781)World: 9743: CR0 0x80050031 CR3 0x16908a000 CR4 0x42668

2016-09-04T10:46:17.228Z cpu3:35426)World: 9743: CR0 0x80050031 CR3 0x17c0e3000 CR4 0x42668

2016-09-04T10:46:17.259Z cpu1:33079)Panic: 634: Panic from another CPU (cpu 1, world 33079): ip=0x418011c780a0 randomOff=0x11c00000:

Machine Check Exception: Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3

System has encountered a Hardware Error - Please contact the hardware vendor

2016-09-04T10:46:17.259Z cpu1:33079)Backtrace for current CPU #1, worldID=33079, rbp=0x410014740008

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9ba80:[0x41801246093b]e1000_intr@<None>#<None>+0x77 stack: 0x0, 0x41801231005d, 0x101c, 0x

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9baa0:[0x41801231005d]Linux_IRQHandler@com.vmware.driverAPI#9.2+0x25 stack: 0x418012310054

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bad0:[0x418011c5a3d6]IntrCookie_DoInterrupt@vmkernel#nover+0x41e stack: 0x780, 0x0, 0x430

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bb80:[0x418011c56940]IDT_IntrHandler@vmkernel#nover+0x104 stack: 0x0, 0x418040400200, 0x0

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bbb0:[0x418011cc7044]gate_entry_@vmkernel#nover+0x0 stack: 0x0, 0x0, 0x0, 0x0, 0x41804040

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bc70:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180405

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bcc0:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159ff62b, 0x10

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bd40:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0xffffffffffffffff, 0x

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9be60:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x43054b304240, 0x5d01

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bee0:[0x418011e141be]CpuSched_TimedWaitIRQ@vmkernel#nover+0x7e stack: 0x43054b304240, 0x4

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bf30:[0x418011c503ce]helpFunc@vmkernel#nover+0x5f2 stack: 0x0, 0x43054b3037e0, 0x27, 0x0,

2016-09-04T10:46:17.259Z cpu1:33079)0x4390c9b9bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

2016-09-04T10:46:17.259Z cpu1:33079)Panic: 769: Halting PCPU 1.

2016-09-04T10:46:17.290Z cpu3:35426)Panic: 634: Panic from another CPU (cpu 3, world 35426): ip=0x418011c780a0 randomOff=0x11c00000:

Machine Check Exception: Fatal (unrecoverable) MCE on PCPU3 in world 35426:vmm0:win7

System has encountered a Hardware Error - Please contact the hardware vendor

2016-09-04T10:46:17.290Z cpu3:35426)Backtrace for current CPU #3, worldID=35426, rbp=0x0

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bcf8:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x418040d

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bd48:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159b4ea2, 0x10

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bdc8:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x4391135a7b00, 0x4391

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bee8:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x410014a2cde0, 0x0, 0xa000

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bf68:[0x418011e140da]CpuSched_VcpuHalt@vmkernel#nover+0x11e stack: 0xffffffff00002001, 0x

2016-09-04T10:46:17.290Z cpu3:35426)0x43911311bfb8:[0x418011cabe39]VMMVMKCall_Call@vmkernel#nover+0x139 stack: 0x418011cab988, 0x0, 0x4

2016-09-04T10:46:17.290Z cpu3:35426)Panic: 769: Halting PCPU 3.

2016-09-04T10:46:17.352Z cpu2:32781)Panic: 634: Panic from another CPU (cpu 2, world 32781): ip=0x418011c780a0 randomOff=0x11c00000:

Machine Check Exception: Fatal (unrecoverable) MCE on PCPU2 in world 32781:coalesceWorl

System has encountered a Hardware Error - Please contact the hardware vendor

2016-09-04T10:46:17.352Z cpu2:32781)Backtrace for current CPU #2, worldID=32781, rbp=0x0

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bbe0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180409

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bc30:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159b79d1, 0x10

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bcb0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0xef, 0x439080d04001,

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bdd0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x0, 0x80069be78, 0x0,

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069be50:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x400002001, 0x4390

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069beb0:[0x418011db9b64]NetCoalesceDefaultWorldCB@vmkernel#nover+0x190 stack: 0x0, 0x0, 0x0,

2016-09-04T10:46:17.352Z cpu2:32781)0x4390c069bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

2016-09-04T10:46:17.352Z cpu2:32781)Panic: 769: Halting PCPU 2.

2016-09-04T10:46:17.383Z cpu0:32782)Backtrace for current CPU #0, worldID=32782, rbp=0x0

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bcd0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0, 0x4180401

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bd20:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159bc891, 0x10

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bda0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x43018bafc634, 0x4180

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bec0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x0, 0x80071bf68, 0x0,

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bf40:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x20c49ba500002001,

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bfa0:[0x418011dbd5ba]NetCoalesce2WorldCB@vmkernel#nover+0xde stack: 0x4390c00a7100, 0x439

2016-09-04T10:46:17.383Z cpu0:32782)0x4390c071bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

2016-09-04T10:46:17.413Z cpu0:32782)VMware ESXi 6.0.0 [Releasebuild-4192238 x86_64]

Machine Check Exception: Fatal (unrecoverable) MCE on PCPU0 in world 32782:netCoalesce2

System has encountered a Hardware Error - Please contact the hardware vendor

2016-09-04T10:46:17.416Z cpu0:32782)cr0=0x8001003d cr2=0x1bc1ef78 cr3=0x8001b000 cr4=0x216c

2016-09-04T10:46:17.416Z cpu0:32782)Last branch from 0x418011f02533 to 0x418011f025eb

2016-09-04T10:46:17.417Z cpu0:32782)frame=0x4390c071bc10 ip=0x418011f0263a err=18 rflags=0x10202

2016-09-04T10:46:17.418Z cpu0:32782)rax=0x0 rbx=0x418040000000 rcx=0x0

2016-09-04T10:46:17.418Z cpu0:32782)rdx=0x0 rbp=0x0 rsi=0x27f2159bb9c6

2016-09-04T10:46:17.418Z cpu0:32782)rdi=0x43004d0f51f0 r8=0x15 r9=0x0

2016-09-04T10:46:17.419Z cpu0:32782)r10=0x0 r11=0x43004d0c8438 r12=0x418040000200

2016-09-04T10:46:17.419Z cpu0:32782)r13=0x0 r14=0x40 r15=0x0

2016-09-04T10:46:17.419Z cpu0:32782)pcpu:0 world:32782 name:"netCoalesce2World" (S)

2016-09-04T10:46:17.420Z cpu0:32782)pcpu:1 world:33079 name:"helper39-3" (SH)

2016-09-04T10:46:17.420Z cpu0:32782)pcpu:2 world:32781 name:"coalesceWorld-0" (S)

2016-09-04T10:46:17.420Z cpu0:32782)pcpu:3 world:35426 name:"vmm0:win7" (V)

2016-09-04T10:46:17.420Z cpu0:32782)@BlueScreen: Machine Check Exception: Fatal (unrecoverable) MCE on PCPU0 in world 32782:netCoalesce2

System has encountered a Hardware Error - Please contact the hardware vendor

2016-09-04T10:46:17.420Z cpu0:32782)Code start: 0x418011c00000 VMK uptime: 0:04:53:32.503

2016-09-04T10:46:17.421Z cpu0:32782)0x4390c071bcd0:[0x418011f0263a]Power_HaltPCPU@vmkernel#nover+0x1f2 stack: 0x417fd1e83ea0

2016-09-04T10:46:17.422Z cpu0:32782)0x4390c071bd20:[0x418011e0fc68]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x27f2159bc891

2016-09-04T10:46:17.423Z cpu0:32782)0x4390c071bda0:[0x418011e133bd]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0x43018bafc634

2016-09-04T10:46:17.424Z cpu0:32782)0x4390c071bec0:[0x418011e13f84]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0

2016-09-04T10:46:17.425Z cpu0:32782)0x4390c071bf40:[0x418011e144bf]CpuSched_SleepUntilTC@vmkernel#nover+0x8f stack: 0x20c49ba500002001

2016-09-04T10:46:17.426Z cpu0:32782)0x4390c071bfa0:[0x418011dbd5ba]NetCoalesce2WorldCB@vmkernel#nover+0xde stack: 0x4390c00a7100

2016-09-04T10:46:17.427Z cpu0:32782)0x4390c071bfd0:[0x418011e14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0

2016-09-04T10:46:17.435Z cpu0:32782)base fs=0x0 gs=0x418040000000 Kgs=0x0

2016-09-04T10:46:17.435Z cpu0:32782)3 other PCPUs are in panic.

2016-09-04T10:46:17.228Z cpu3:35426)MC:PCPU3 B:5 S:0xb200000080200e0f M:0x0 A:0x0 5

2016-09-04T10:46:17.228Z cpu0:32782)MC:PCPU0 B:5 S:0xb200001044100e0f M:0x0 A:0x0 4

2016-09-04T10:46:17.228Z cpu2:32781)MC:PCPU2 B:5 S:0xb200000084200e0f M:0x0 A:0x0 5

2016-09-04T10:46:17.228Z cpu1:33079)MC:PCPU1 B:5 S:0xb200001040100e0f M:0x0 A:0x0 4


Thanks in advance!!

IMG_0285.JPG

4 Replies
tayfundeger
Hot Shot

Hi,

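I checked the logs. The cause of this problem is hardware: every PCPU reports a fatal Machine Check Exception, and the log itself says "System has encountered a Hardware Error - Please contact the hardware vendor". Can you check the hardware?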
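
To see this at a glance, here is a minimal Python sketch that pulls the fatal MCE messages out of the log text and lists which PCPU and world reported each one. The sample lines are copied from the log above; the function name `mce_by_pcpu` and the regular expression are only illustrative, not part of any VMware tool.

```python
import re

# Lines copied from the log posted above.
LOG = """\
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU3 in world 35426:vmm0:win7
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU2 in world 32781:coalesceWorl
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU0 in world 32782:netCoalesce2
"""

# Hypothetical pattern, written to match the message format seen in this log.
MCE_RE = re.compile(r"Fatal \(unrecoverable\) MCE on PCPU(\d+) in world (\d+):(\S+)")


def mce_by_pcpu(text):
    """Map each PCPU number to the (world id, world name) that reported a fatal MCE."""
    found = {}
    for match in MCE_RE.finditer(text):
        found[int(match.group(1))] = (int(match.group(2)), match.group(3))
    return found


if __name__ == "__main__":
    for pcpu, (world_id, name) in sorted(mce_by_pcpu(LOG).items()):
        print(f"PCPU{pcpu}: world {world_id} ({name})")
```

All four PCPUs show up, each running a different world, which fits a hardware fault better than a problem in one particular world.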

Thanks.

--
Blog: https://www.tayfundeger.com
Twitter: https://www.twitter.com/tayfundeger

vBlogger, vExpert, Cisco Champions

If this reply helped, please mark it "Helpful"; if it solved your problem, please mark it "Correct Answer".
Virtuali3ed
Enthusiast

Thanks for the reply! Is it possible to pinpoint the failing component (RAM, CPU, storage, etc.)?

tayfundeger
Hot Shot

I think this is a CPU issue, but you will have to run hardware diagnostics on the server to confirm it.
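
One way to narrow it down before running diagnostics is to decode the `MC:` lines at the end of the log. The sketch below assumes that `B:` is the machine-check bank number and `S:` is the raw IA32_MCi_STATUS value for that bank; that is my reading of the line format, not something the log states. The flag bit positions and the bus/interconnect code pattern follow the machine-check architecture chapter of the Intel SDM. Bits 31:16 are model-specific, so the sketch only extracts them and does not interpret them.

```python
import re

# MC lines copied from the end of the log posted above.
MC_LINES = """\
MC:PCPU3 B:5 S:0xb200000080200e0f M:0x0 A:0x0 5
MC:PCPU0 B:5 S:0xb200001044100e0f M:0x0 A:0x0 4
MC:PCPU2 B:5 S:0xb200000084200e0f M:0x0 A:0x0 5
MC:PCPU1 B:5 S:0xb200001040100e0f M:0x0 A:0x0 4
"""

# Assumption: B = bank number, S = IA32_MCi_STATUS for that bank.
MC_RE = re.compile(r"MC:PCPU(\d+) B:(\d+) S:(0x[0-9a-fA-F]+)")

# Architectural flag bits of IA32_MCi_STATUS.
FLAGS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
         "MISCV": 59, "ADDRV": 58, "PCC": 57}


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    out = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    out["mca_code"] = status & 0xFFFF              # bits 15:0, architectural
    out["model_code"] = (status >> 16) & 0xFFFF    # bits 31:16, model-specific
    # Compound code 0000 1PPT RRRR IILL marks a bus/interconnect error.
    out["bus_interconnect"] = (out["mca_code"] & 0xF800) == 0x0800
    return out


def decode_lines(text):
    """Return {pcpu: (bank, decoded status)} for every MC line in the text."""
    return {int(m.group(1)): (int(m.group(2)), decode_status(int(m.group(3), 16)))
            for m in MC_RE.finditer(text)}


if __name__ == "__main__":
    for pcpu, (bank, info) in sorted(decode_lines(MC_LINES).items()):
        set_flags = [name for name in FLAGS if info[name]]
        print(f"PCPU{pcpu} bank {bank}: flags={set_flags} "
              f"mca_code={info['mca_code']:#06x} bus={info['bus_interconnect']}")
```

Under those assumptions, all four status values decode to an uncorrected error (UC) with processor context corrupt (PCC) in bank 5, with no address or misc information recorded, and the MCA code 0x0e0f falls in the generic bus/interconnect class. What bank 5 maps to depends on the CPU model, so the vendor's documentation or diagnostics are still needed to name the exact component.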

Virtuali3ed
Enthusiast

Thanks, one more question: what does "world 33079:helper39-3" mean in "Fatal (unrecoverable) MCE on PCPU1 in world 33079:helper39-3"? Did this error originate from a virtual machine or from the host itself?
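
For context, the world list near the end of the log shows what was scheduled on each PCPU when the panic was raised. The rough Python sketch below sorts those worlds into guest and host worlds. It assumes the usual ESXi naming where `vmmN:<vmname>` is vCPU N of a virtual machine and anything else is a vmkernel (host) world; the `classify` helper is hypothetical and only covers the names seen in this log.

```python
import re

# World list copied from the log posted above.
WORLD_LIST = """\
pcpu:0 world:32782 name:"netCoalesce2World" (S)
pcpu:1 world:33079 name:"helper39-3" (SH)
pcpu:2 world:32781 name:"coalesceWorld-0" (S)
pcpu:3 world:35426 name:"vmm0:win7" (V)
"""

WORLD_RE = re.compile(r'pcpu:(\d+) world:(\d+) name:"([^"]+)"')
# Assumption: "vmm<N>:<vmname>" names a guest vCPU world.
VCPU_RE = re.compile(r"vmm(\d+):(.+)")


def classify(name):
    """Label a world name as a guest vCPU world or a host (vmkernel) world."""
    match = VCPU_RE.fullmatch(name)
    if match:
        return f"vCPU {match.group(1)} of VM '{match.group(2)}'"
    return "host (vmkernel) world"


def worlds_by_pcpu(text):
    """Return {pcpu: (world id, world name, classification)}."""
    return {int(m.group(1)): (int(m.group(2)), m.group(3), classify(m.group(3)))
            for m in WORLD_RE.finditer(text)}


if __name__ == "__main__":
    for pcpu, (world_id, name, kind) in sorted(worlds_by_pcpu(WORLD_LIST).items()):
        print(f"PCPU{pcpu}: world {world_id} {name!r} -> {kind}")
```

By that reading, helper39-3 on PCPU1 is a host world and only PCPU3 was running a VM (vmm0:win7). Since every PCPU reported the MCE while running a different world, the world name most likely tells you what happened to be running at the time rather than what caused the error.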
