VMware Cloud Community
YordanV
Contributor

ESX Crash with error: PCPU 30: no heartbeat (2/2 IPIs received)

Hello,

My ESXi server crashed with a PSOD (purple screen of death), and I cannot find any information about this error. Could you please help me?

From vmkernel-zdump.1:

01:02:54.807Z cpu26:2118715)WARNING: Heartbeat: 781: PCPU 30 didn't have a heartbeat for 8 seconds; *may* be locked up.

01:02:54.807Z cpu30:33460)ALERT: NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0x2fc4fb(0x418015000000):0x1:0x4010](Src 0x1, CPU30)

01:02:54.807Z cpu30:33460)0x4390d5a1b578:[0x4180152fc4fb]NUMALatency_GetNodeHops@vmkernel#nover+0x7 stack: 0x418015055ed8

01:02:54.807Z cpu30:33460)0x4390d5a1b580:[0x418015341f5b]MemDistributeNUMAPolicy@vmkernel#nover+0x333 stack: 0x0

01:02:54.807Z cpu30:33460)0x4390d5a1b6c0:[0x4180153429ed]MemDistribute_Alloc@vmkernel#nover+0x299 stack: 0x6643f6b7

01:02:54.807Z cpu30:33460)0x4390d5a1b820:[0x418015017a60]PagePool_AllocCustom@vmkernel#nover+0x2f0 stack: 0x43004d0c9d38

01:02:54.807Z cpu30:33460)0x4390d5a1b8e0:[0x4180150203d4]vmk_MemPoolAlloc@vmkernel#nover+0x37c stack: 0x4180157a88ad

01:02:54.807Z cpu30:33460)0x4390d5a1bd90:[0x4180157a88ad]fusion_get_seq_num@<None>#<None>+0xd9 stack: 0x4304c5e5ec40

01:02:54.807Z cpu30:33460)0x4390d5a1bea0:[0x41801579dadb]megasas_hotplug_work@<None>#<None>+0x16b stack: 0x0

01:02:54.807Z cpu30:33460)0x4390d5a1bf20:[0x418015021baf]VmkTimerQueueWorldFunc@vmkernel#nover+0x21f stack: 0x0

01:02:54.807Z cpu30:33460)0x4390d5a1bfd0:[0x41801521231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0

..........................

1:03:07.877Z cpu45:33948)VMware ESXi 6.0.0 [Releasebuild-2494585 x86_64]

PCPU 30: no heartbeat (2/2 IPIs received)

01:03:07.877Z cpu45:33948)cr0=0x80010031 cr2=0xe5d6b0 cr3=0x10bac49000 cr4=0x42768

.................................

01:03:07.878Z cpu45:33948)@BlueScreen: PCPU 30: no heartbeat (2/2 IPIs received)

01:03:07.878Z cpu45:33948)Code start: 0x418015000000 VMK uptime:

01:03:07.878Z cpu45:33948)Saved backtrace from: pcpu 30 Heartbeat NMI

01:03:07.879Z cpu45:33948)0x4390d5a1b580:[0x418015341cdb]MemDistributeNUMAPolicy@vmkernel#nover+0xb3 stack: 0x0

01:03:07.879Z cpu45:33948)0x4390d5a1b6c0:[0x4180153429ed]MemDistribute_Alloc@vmkernel#nover+0x299 stack: 0x6f5c4fc8

01:03:07.879Z cpu45:33948)0x4390d5a1b820:[0x418015017a60]PagePool_AllocCustom@vmkernel#nover+0x2f0 stack: 0x43004d0c9d38

01:03:07.880Z cpu45:33948)0x4390d5a1b8e0:[0x4180150203d4]vmk_MemPoolAlloc@vmkernel#nover+0x37c stack: 0x4180157a88ad

01:03:07.880Z cpu45:33948)0x4390d5a1bd90:[0x4180157a88ad]fusion_get_seq_num@<None>#<None>+0xd9 stack: 0x4304c5e5ec40

01:03:07.881Z cpu45:33948)0x4390d5a1bea0:[0x41801579dadb]megasas_hotplug_work@<None>#<None>+0x16b stack: 0x0

01:03:07.881Z cpu45:33948)0x4390d5a1bf20:[0x418015021baf]VmkTimerQueueWorldFunc@vmkernel#nover+0x21f stack: 0x0

01:03:07.881Z cpu45:33948)0x4390d5a1bfd0:[0x41801521231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0

01:03:07.884Z cpu45:33948)base fs=0x0 gs=0x41804b400000 Kgs=0x0

01:03:07.807Z cpu30:33460)NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x341cdb(0x418015000000):0x3:0x4010](Src0x1, CPU30)

01:02:54.807Z cpu30:33460)NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0x2fc4fb(0x418015000000):0x1:0x4010](Src 0x1, CPU30)

01:03:07.807Z cpu30:33460)NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x341cdb(0x418015000000):0x3:0x4010](Src0x1, CPU30)

01:02:54.807Z cpu30:33460)NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0x2fc4fb(0x418015000000):0x1:0x4010](Src 0x1, CPU30)

01:03:07.887Z cpu45:33948)Backtrace for current CPU #45, worldID=33948, rbp=0x0

01:03:07.887Z cpu45:33948)0x4390e4e1b3a0:[0x418015076eea]PanicvPanicInt@vmkernel#nover+0x37e stack: 0x4390e4e1b438, 0x43004d1

01:03:07.887Z cpu45:33948)0x4390e4e1b430:[0x41801507725e]Panic_WithBacktrace@vmkernel#nover+0x56 stack: 0x4390e4e1b4a0, 0x439

01:03:07.887Z cpu45:33948)0x4390e4e1b4a0:[0x41801533c77b]Heartbeat_DetectCPULockups@vmkernel#nover+0x4f7 stack: 0x300000000,

01:03:07.887Z cpu45:33948)0x4390e4e1b510:[0x4180150881b6]Timer_BHHandler@vmkernel#nover+0xea stack: 0x4390924db098, 0x0, 0x54

01:03:07.887Z cpu45:33948)0x4390e4e1b5a0:[0x418015032254]BH_DrainAndDisableInterrupts@vmkernel#nover+0x78 stack: 0xef, 0x4300

01:03:07.887Z cpu45:33948)0x4390e4e1b630:[0x418015055fa2]IDT_IntrHandler@vmkernel#nover+0x1ce stack: 0x0, 0x41804b400200, 0x0

01:03:07.887Z cpu45:33948)0x4390e4e1b660:[0x4180150c6044]gate_entry_@vmkernel#nover+0x0 stack: 0x0, 0x0, 0x0, 0x0, 0x41804b40

01:03:07.887Z cpu45:33948)0x4390e4e1b720:[0x4180152fdd6c]Power_HaltPCPU@vmkernel#nover+0x1e8 stack: 0x417fd5281ea0, 0x41804b5

01:03:07.887Z cpu45:33948)0x4390e4e1b770:[0x41801520d5b2]CpuSchedIdleLoopInt@vmkernel#nover+0x3f2 stack: 0x5409679d2c4ea5, 0x

01:03:07.887Z cpu45:33948)0x4390e4e1b7f0:[0x418015210af2]CpuSchedDispatch@vmkernel#nover+0x1576 stack: 0x41804d901e00, 0x4180

01:03:07.887Z cpu45:33948)0x4390e4e1b920:[0x418015211710]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0, 0x43121572e221, 0x3201

01:03:07.887Z cpu45:33948)0x4390e4e1b9a0:[0x4180150b556e]WorldWaitInt@vmkernel#nover+0x28e stack: 0x2001, 0x431215727060, 0x8

01:03:07.887Z cpu45:33948)0x4390e4e1ba20:[0x4180155bc616]UserObj_Poll@<None>#<None>+0x106 stack: 0x100004180, 0x54096829651fe

01:03:07.887Z cpu45:33948)0x4390e4e1ba90:[0x4180155dd73e]LinuxFileDesc_Poll@<None>#<None>+0x1ca stack: 0x100000004, 0x4305000

01:03:07.887Z cpu45:33948)0x4390e4e1bef0:[0x4180155b633b]User_LinuxSyscallHandler@<None>#<None>+0xd7 stack: 0xffad4db8, 0x0,

01:03:07.887Z cpu45:33948)0x4390e4e1bf20:[0x41801508d745]User_LinuxSyscallHandler@vmkernel#nover+0x1d stack: 0x0, 0x13b, 0x0,

01:03:07.887Z cpu45:33948)0x4390e4e1bf30:[0x4180150c6044]gate_entry_@vmkernel#nover+0x0 stack: 0x0, 0xa8, 0x1, 0x3e8, 0x1f000

.........................................
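In case it helps anyone reading a dump like this, here is a minimal sketch for pulling the symbol, module, and offset out of each backtrace frame so the two traces can be compared side by side. It assumes the frame layout seen in the lines above (`)<frame>:[<address>]<symbol>@<module>+<offset> stack: ...`); the names `FRAME_RE`, `Frame`, and `parse_frames` are made up for this example and are not part of any VMware tool.

```python
import re
from typing import List, NamedTuple

# Assumed frame layout, taken from the log lines in this post:
#   ...)0x4390d5a1bd90:[0x4180157a88ad]fusion_get_seq_num@<None>#<None>+0xd9 stack: ...
FRAME_RE = re.compile(
    r"\)(?P<frame>0x[0-9a-f]+):\[(?P<addr>0x[0-9a-f]+)\]"
    r"(?P<symbol>[^@\s]+)@(?P<module>[^+\s]+)\+(?P<offset>0x[0-9a-f]+)"
)


class Frame(NamedTuple):
    address: int   # return address shown in square brackets
    symbol: str    # function name before the '@'
    module: str    # text between '@' and '+', e.g. 'vmkernel#nover'
    offset: int    # offset into the function


def parse_frames(log_text: str) -> List[Frame]:
    """Return every backtrace frame found in log_text, in log order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:  # lines that are not backtrace frames are skipped
            frames.append(
                Frame(
                    address=int(match["addr"], 16),
                    symbol=match["symbol"],
                    module=match["module"],
                    offset=int(match["offset"], 16),
                )
            )
    return frames


if __name__ == "__main__":
    sample = (
        "01:03:07.880Z cpu45:33948)0x4390d5a1b8e0:[0x4180150203d4]"
        "vmk_MemPoolAlloc@vmkernel#nover+0x37c stack: 0x4180157a88ad\n"
        "01:03:07.880Z cpu45:33948)0x4390d5a1bd90:[0x4180157a88ad]"
        "fusion_get_seq_num@<None>#<None>+0xd9 stack: 0x4304c5e5ec40\n"
    )
    for frame in parse_frames(sample):
        print(f"{frame.symbol:<24} {frame.module:<16} +{frame.offset:#x}")
```

Printing only the symbol and module columns makes it easier to see which frames belong to `vmkernel#nover` and which are reported as `<None>#<None>`; the script only extracts what is in the log and does not say anything about the cause.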

Thank you!

Regards

Yordan

3 Replies
hussainbte
Expert

I suggest opening a case with VMware support.

There are no assumptions that can be made here.

If you found my answers useful, please consider marking them as Correct or Helpful.

Regards,
Hussain
https://virtualcubes.wordpress.com/
YordanV
Contributor

Thank you for your response!

Unfortunately, I cannot open a case :).

cfizz34vmware
Enthusiast

I've had an open ticket with VMware at the highest severity ("all systems down") for over a week, and they have not yet provided anything to solve this. Pretty disappointed.

[Attachment: PSOD.jpg]
