8 Replies Latest reply on Aug 23, 2016 12:05 AM by Boblu

    ESXi 6.0 host fails with a purple diagnostic screen

    Boblu Novice

      Yes, I can see a couple of similar diagnostic screens with an "NMI IPI received" log, for example: ESXi 6.0 host fails with a purple diagnostic screen reporting the FTCptWriterFunc function call (2127997) | VMware KB

       

      In my case, is it related to an APD (All Paths Down) event? This issue has been observed on a Dell R730xd and other hardware platforms, all of which use only iSCSI (10 Gb network) for storage access.

      Has anyone met this issue before, and can you give some hints about the root cause?

      Thanks in advance.

      panic.jpg

        • 1. Re: ESXi 6.0 host fails with a purple diagnostic screen
          Luciano Patrao Hot Shot
          vExpert

          Hi,

           

          Were those ESXi hosts working properly before?

           

          Did you apply any update or patch?

           

          Also, are those 10Gb adapters compatible with ESXi 6.0? Did you check them against the HCL?

           

          Did you use the customized ESXi ISO from Dell to install ESXi?

           

          Even with the PSOD, can you power up the ESXi host after a reboot? If yes, export the dump file using vmkdump -l dumpfilename, then give the log file a name and post the log here so that we can take a look.

          • 2. Re: ESXi 6.0 host fails with a purple diagnostic screen
            hussainbte Expert
            vExpert

            All I can say is that the PSOD is not related to the linked KB.

            I suggest you open a ticket with VMware if there is a support agreement.

            • 3. Re: ESXi 6.0 host fails with a purple diagnostic screen
              Boblu Novice

              Were those ESXi hosts working properly before?

              ---- This is the first time on this platform.

              Did you apply any update or patch?

              ---- It's 6.0.0 U2.

              Also, are those 10Gb adapters compatible with ESXi 6.0? Did you check them against the HCL?

               

              Did you use the customized ESXi ISO from Dell to install ESXi?

              ---- For Dell hardware, yes; for other vendors, no.

              Even with the PSOD, can you power up the ESXi host after a reboot? If yes, export the dump file using vmkdump -l dumpfilename, then give the log file a name and post the log here so that we can take a look.

              ---- Yes, it was just a one-time crash. I will try to get the log and paste it here.

              • 4. Re: ESXi 6.0 host fails with a purple diagnostic screen
                Luciano Patrao Hot Shot
                vExpert

                Hi,

                 

                If you used the same ISO for all the failed installations (on different hardware), have you tried a different ESXi ISO image? That one could be damaged.

                • 5. Re: ESXi 6.0 host fails with a purple diagnostic screen
                  JD Hot Shot
                  vExpert

                  Hi

                   

                  I don't think the stack is the same as the one on the PSOD screen. Can you please post the complete stack here?

                   

                  Also make sure that the driver, firmware, and BIOS are compatible with your version of ESXi, or upgrade them.

                   

                  Thanks

                  Jagadeesh

                  http://www.myitblog.in/
                  • 6. Re: ESXi 6.0 host fails with a purple diagnostic screen
                    Boblu Novice

                    It's a different ISO image, I guess, and it's hard to reproduce; we have only observed it 2-3 times.

                    • 7. Re: ESXi 6.0 host fails with a purple diagnostic screen
                      Boblu Novice

                      Thanks for the info. Can you tell me which log contains this stack information? The vmkernel.log was lost when I tried to capture the logs, but the coredump should still be there.

                      • 8. Re: ESXi 6.0 host fails with a purple diagnostic screen
                        Boblu Novice

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu7:33172)WARNING: Heartbeat: 781: PCPU 21 didn't have a heartbeat for 21 seconds; *may* be locked up.

                        2016-04-11T07:22:09.195Z cpu21:32839)WARNING: LVM: 5001: Unknown APD event type (0) for device <naa.60000000000000e00000000000010004:1>

                        2016-04-11T07:22:09.195Z cpu21:32839)ALERT: NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x300070(0x41801d400000):0x4390c239bd24:0x4010](Src0x1, CPU21)

                        2016-04-11T07:22:09.195Z cpu7:33172)World: 9728: PRDA 0x418041c00000 ss 0x0 ds 0x10b es 0x10b fs 0x0 gs 0x13b                                         

                        2016-04-11T07:22:09.195Z cpu7:33172)World: 9730: TR 0x4020 GDT 0x4390cca21000 (0x402f) IDT 0x41801d4c8000 (0xfff)                                     

                        2016-04-11T07:22:09.195Z cpu7:33172)World: 9731: CR0 0x80010031 CR3 0x387c7bb000 CR4 0x42768                                                          

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bc18:[0x41801d700070]RTC_MkRTCTimeVal@vmkernel#nover+0x228 stack: 0x41801d49ad82                       

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bc20:[0x41801d48e817]Util_FormatTimestampUTC@vmkernel#nover+0x3f stack: 0x4390c239bcd8                 

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bca0:[0x41801d4604c6]LogFormatStringV@vmkernel#nover+0xc2 stack: 0x100                                 

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bd20:[0x41801d460d5d]LogWarning@vmkernel#nover+0x55 stack: 0x2e39303a32323a37                          

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239be50:[0x41801d461318]_Warning@vmkernel#nover+0x50 stack: 0x4390c239beb0                                

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239beb0:[0x41801dffed5e]LVMHandleDeviceAPDEvent@<None>#<None>+0x62 stack: 0x430289d7d540                 

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bf00:[0x41801e000fc1]LVMAPDHelper@<None>#<None>+0x4d stack: 0x430289d8e680                       

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bf30:[0x41801d44f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0                                    

                        2016-04-11T07:22:09.195Z cpu21:32839)0x4390c239bfd0:[0x41801d61231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0                                

                        2016-04-11T07:22:09.234Z cpu7:33172)Panic: 798: Saved backtrace: pcpu 21 Heartbeat NMI                                                                

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bc18:[0x41801d700070]RTC_MkRTCTimeVal@vmkernel#nover+0x228 stack:                

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bc20:[0x41801d48e817]Util_FormatTimestampUTC@vmkernel#nover+0x3f s               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bca0:[0x41801d4604c6]LogFormatStringV@vmkernel#nover+0xc2 stack: 0               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bd20:[0x41801d460d5d]LogWarning@vmkernel#nover+0x55 stack: 0x2e393               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239be50:[0x41801d461318]_Warning@vmkernel#nover+0x50 stack: 0x4390c23               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239beb0:[0x41801dffed5e]LVMHandleDeviceAPDEvent@<None>#<None>+0x62 st               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bf00:[0x41801e000fc1]LVMAPDHelper@<None>#<None>+0x4d stack: 0x4302               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bf30:[0x41801d44f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0, 0x4               

                        2016-04-11T07:22:09.234Z cpu7:33172)pcpu 21 Heartbeat NMI: 0x4390c239bfd0:[0x41801d61231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack               

                        2016-04-11T07:22:09.251Z cpu7:33172)VMware ESXi 6.0.0 [Releasebuild-2494585 x86_64]

                        PCPU 21: no heartbeat (2/2 IPIs received)                                                                                                             

                        2016-04-11T07:22:09.252Z cpu7:33172)cr0=0x80010031 cr2=0x4613b000 cr3=0x387c7bb000 cr4=0x42768                                                        

                        2016-04-11T07:22:09.252Z cpu7:33172)pcpu:0 world:33174 name:"vmsyslogd" (U)
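
                        For anyone triaging a paste like the one above, a small script can make the pattern easier to see: how often each message repeats, and which functions appear in the backtrace. The following is only a rough sketch in Python. The line layout (timestamp, cpuN:world, message) and the frame layout are assumed purely from the lines in this post, not from any VMware specification, and the names `clean`, `parse`, and `summarize` are hypothetical helpers made up for illustration.

```python
import re
from collections import Counter

# Caret-notation ANSI colour codes as they can appear in pasted console output,
# e.g. "^[[7m" or "^[[31;1m" (assumption: literal caret + brackets, not a raw ESC byte).
ANSI_CARET = re.compile(r"\^\[\[[0-9;]*m")

# Assumed prefix layout: "<timestamp>Z cpu<N>:<world>)<message>"
LINE = re.compile(r"^(?P<ts>\S+Z) cpu(?P<cpu>\d+):(?P<world>\d+)\)(?P<msg>.*)$")

# Assumed backtrace frame layout: "0x<stack>:[0x<code>]<symbol>@<module>..."
FRAME = re.compile(r"0x[0-9a-f]+:\[0x[0-9a-f]+\](?P<sym>\w+)@")


def clean(line):
    """Remove caret-notation colour codes and surrounding whitespace."""
    return ANSI_CARET.sub("", line).strip()


def parse(line):
    """Return a dict with ts, cpu, world, msg, or None if the line does not fit."""
    m = LINE.match(clean(line))
    return m.groupdict() if m else None


def summarize(lines):
    """Count repeated messages and collect backtrace symbols in order.

    Only lines whose message starts with a frame are treated as backtrace
    frames; everything else that parses is counted as a plain message.
    """
    counts = Counter()
    frames = []
    for raw in lines:
        rec = parse(raw)
        if rec is None:
            continue  # e.g. console-only lines without the timestamp prefix
        frame = FRAME.match(rec["msg"])
        if frame:
            frames.append(frame.group("sym"))
        else:
            counts[rec["msg"]] += 1
    return counts, frames
```

                        Feeding the saved log lines to `summarize` returns a message counter and the list of frame symbols, so `counts.most_common(5)` would show whether one warning dominates the log just before the heartbeat NMI. This does not diagnose the root cause; it only condenses what is already in the paste.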