VMware Cloud Community
Boblu
Contributor

ESXi 6.0 host fails with a purple diagnostic screen

I have seen a couple of similar purple diagnostic screens with an "NMI IPI recvd" log entry, for example the KB article "ESXi 6.0 host fails with a purple diagnostic screen reporting the FTCptWriterFunc function call (212...

In my case, could it be related to an APD (All Paths Down) event? This issue has been observed on a Dell R730xd and on other hardware platforms, all of which use only iSCSI (over a 10 Gb network) for storage access.

Has anyone run into this issue before, and can you give some hints about the root cause?

Thanks in advance.

panic.jpg

8 Replies
LucianoPatrão

Hi,

Were those ESXi hosts working properly before?

Did you apply any update or patch?

Also, are those 10Gb adapters compatible with ESXi 6.0? Did you check them against the HCL?

Did you use the customized ESXi ISO from Dell to install ESXi?

Even if you get PSOD screens, can you power up the ESXi host after a reboot? If yes, export the dump file using vmkdump -l dumpfilename, then give the log file a name and post the log here so that we can take a look.

Luciano Patrão

VCP-DCV, VCAP-DCV Design 2023, VCP-Cloud 2023
vExpert vSAN, NSX, Cloud Provider, Veeam Vanguard
Solutions Architect - Tech Lead for VMware / Virtual Backups

________________________________
If helpful Please award points
Thank You
Blog: https://www.provirtualzone.com | Twitter: @Luciano_PT
hussainbte
Expert

All I can say is that the PSOD is not related to the linked KB.

I suggest you open a ticket with VMware if there is a support agreement.

If you found my answers useful please consider marking them as Correct OR Helpful Regards, Hussain https://virtualcubes.wordpress.com/
Boblu
Contributor

Were those ESXi hosts working properly before?

----This is the first time on this platform.

Did you apply any update or patch?

----It's 6.0.0 U2.

Also, are those 10Gb adapters compatible with ESXi 6.0? Did you check them against the HCL?

Did you use the customized ESXi ISO from Dell to install ESXi?

----For the Dell hardware, yes; for other vendors, no.

Even if you get PSOD screens, can you power up the ESXi host after a reboot? If yes, export the dump file using vmkdump -l dumpfilename, then give the log file a name and post the log here so that we can take a look.

----Yes, it was just a one-time crash. I will try to get the log and post it here.

LucianoPatrão

Hi,

If you used the same ISO for all of the failed installations (on different hardware), have you tried a different ESXi ISO image? That one could be damaged.

JagadeeshDev
Hot Shot

Hi

I don't think the stack is the same as the one on the PSOD screen. Can you please post the complete stack here?

Also make sure that the driver, firmware, and BIOS versions are compatible with your version of ESXi, or upgrade them.

Thanks

Jagadeesh

http://www.myitblog.in/
Boblu
Contributor

I think different ISO images were used. It is also hard to reproduce; we have only observed it 2 or 3 times.

Boblu
Contributor

Thanks for the info. Can you tell me which log contains this stack information? The vmkernel.log was lost while I was trying to capture the logs, but the core dump should still be there.

Boblu
Contributor

2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu7:33172)WARNING: Heartbeat: 781: PCPU 21 didn't have a heartbeat for 21 seconds; *may* be locked up.
2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>
2016-04-11T07:22:09.195Z cpu21:32839)ALERT: NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x300070(0x41801d400000):0x4390c239bd24:0x4010](Src0x1, CPU21)
2016-04-11T07:22:09.195Z cpu7:33172)World: 9728: PRDA 0x418041c00000 ss 0x0 ds 0x10b es 0x10b fs 0x0 gs 0x13b
2016-04-11T07:22:09.195Z cpu7:33172)World: 9730: TR 0x4020 GDT 0x4390cca21000 (0x402f) IDT 0x41801d4c8000 (0xfff)
2016-04-11T07:22:09.195Z cpu7:33172)World: 9731: CR0 0x80010031 CR3 0x387c7bb000 CR4 0x42768
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bc18:[0x41801d700070]RTC_MkRTCTimeVal@vmkernel#nover+0x228 stack: 0x41801d49ad82
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bc20:[0x41801d48e817]Util_FormatTimestampUTC@vmkernel#nover+0x3f stack: 0x4390c239bcd8
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bca0:[0x41801d4604c6]LogFormatStringV@vmkernel#nover+0xc2 stack: 0x100
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bd20:[0x41801d460d5d]LogWarning@vmkernel#nover+0x55 stack: 0x2e39303a32323a37
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239be50:[0x41801d461318]_Warning@vmkernel#nover+0x50 stack: 0x4390c239beb0
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239beb0:[0x41801dffed5e]LVMHandleDeviceAPDEvent@<None>#<None>+0x62 stack: 0x430289d7d540
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bf00:[0x41801e000fc1]LVMAPDHelper@<None>#<None>+0x4d stack: 0x430289d8e680
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bf30:[0x41801d44f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bfd0:[0x41801d61231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0
2016-04-11T07:22:09.234Z cpu7:33172)Panic: 798: Saved backtrace: pcpu 21 Heartbeat NMI
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bc18:[0x41801d700070]RTC_MkRTCTimeVal@vmkernel#nover+0x228 stack:
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bc20:[0x41801d48e817]Util_FormatTimestampUTC@vmkernel#nover+0x3f s
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bca0:[0x41801d4604c6]LogFormatStringV@vmkernel#nover+0xc2 stack: 0
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bd20:[0x41801d460d5d]LogWarning@vmkernel#nover+0x55 stack: 0x2e393
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239be50:[0x41801d461318]_Warning@vmkernel#nover+0x50 stack: 0x4390c23
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239beb0:[0x41801dffed5e]LVMHandleDeviceAPDEvent@<None>#<None>+0x62 st
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bf00:[0x41801e000fc1]LVMAPDHelper@<None>#<None>+0x4d stack: 0x4302
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bf30:[0x41801d44f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0, 0x4
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bfd0:[0x41801d61231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack
2016-04-11T07:22:09.251Z cpu7:33172)VMware ESXi 6.0.0 [Releasebuild-2494585 x86_64]
PCPU 21: no heartbeat (2/2 IPIs received)
2016-04-11T07:22:09.252Z cpu7:33172)cr0=0x80010031 cr2=0x4613b000 cr3=0x387c7bb000 cr4=0x42768
2016-04-11T07:22:09.252Z cpu7:33172)pcpu:0 world:33174 name:"vmsyslogd" (U)
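
In case it helps anyone reading a capture like this one, here is a rough Python sketch that strips the terminal colour codes, counts repeated warnings, and pulls the function names out of the backtrace of the stalled CPU. It only assumes the line layout visible in the paste above; the names summarize and SAMPLE are made up for this example, and SAMPLE is a shortened excerpt of the log, not the full capture. It does not diagnose anything, it just makes the log easier to read.

```python
import re
from collections import Counter

# Colour codes as pasted in caret notation ("^[[7m", "^[[31;1m"), plus raw
# ESC-prefixed sequences in case the capture was not converted.
ANSI_RE = re.compile(r"(?:\^\[|\x1b)\[[0-9;]*m")
# Assumed line prefix: "<timestamp>Z cpuN:WORLD)" followed by the message.
PREFIX_RE = re.compile(r"^(?P<ts>\S+Z) cpu(?P<cpu>\d+):(?P<world>\d+)\)(?P<msg>.*)$")
# Backtrace frame: "0xSP:[0xPC]Function@module#version+0xOFF ..."
FRAME_RE = re.compile(r"^0x[0-9a-f]+:\[0x[0-9a-f]+\](?P<func>[^@]+)@")
HEARTBEAT_RE = re.compile(r"PCPU (\d+) didn't have a heartbeat for (\d+) seconds")

# Shortened excerpt of the log posted above (illustration only).
SAMPLE = """\
^[[7m2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>^[[0m
^[[7m2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>^[[0m
^[[7m2016-04-11T07:22:09.195Z cpu7:33172)WARNING: Heartbeat: 781: PCPU 21 didn't have a heartbeat for 21 seconds; *may* be locked up.^[[0m
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bd20:[0x41801d460d5d]LogWarning@vmkernel#nover+0x55 stack: 0x2e39303a32323a37
2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239beb0:[0x41801dffed5e]LVMHandleDeviceAPDEvent@<None>#<None>+0x62 stack: 0x430289d7d540
2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bf00:[0x41801e000fc1]LVMAPDHelper@<None>#<None>+0x4d stack: 0x4302
"""


def summarize(text):
    """Return (stalled_pcpu_info, backtrace_functions, warning_counts)."""
    warnings = Counter()
    frames = []
    stalled = None
    for raw in text.splitlines():
        line = ANSI_RE.sub("", raw).strip()
        m = PREFIX_RE.match(line)
        if not m:
            continue  # e.g. banner lines without a timestamp prefix
        msg = m.group("msg")
        hb = HEARTBEAT_RE.search(msg)
        if hb:
            # (PCPU number, seconds without a heartbeat)
            stalled = (int(hb.group(1)), int(hb.group(2)))
            continue
        frame = FRAME_RE.match(msg)
        if frame:
            # Only the first copy of the backtrace matches here; the
            # "pcpu N Heartbeat NMI:" repeat does not start with an address.
            frames.append(frame.group("func"))
        elif msg.startswith("WARNING:"):
            warnings[msg] += 1
    return stalled, frames, warnings


if __name__ == "__main__":
    stalled, frames, warnings = summarize(SAMPLE)
    print("stalled PCPU / seconds:", stalled)
    print("backtrace:", " <- ".join(frames))
    for msg, count in warnings.most_common():
        print(count, "x", msg)
```

Running it over the whole capture instead of SAMPLE should make it easier to see how many times the same LVM warning was logged before the heartbeat NMI fired.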
