I am new to this blog and have a quick question that I hope you can help with. Shortly after I installed a network card in my ESXi server, the host started crashing with a PSOD. The new hardware is the likely cause of the fault, but I would like to confirm that with your help.
For PSOD situations you must collect the dump files. I explain how to do this here: What is the VMKernel Core Dump - Part I
If you want to check the ESXi log files, check vmksummary.log in addition to vmkernel.log. (Also, please identify the time window in which the failure occurred and isolate the entries from that window, so the issue can be investigated correctly.)
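If it helps with isolating the time window, here is a minimal Python sketch. It assumes you have copied vmkernel.log and vmksummary.log off the host and that each entry starts with a UTC timestamp in the form 2019-10-02T03:20:15.510Z. The function names parse_ts and lines_in_window are my own, not part of any VMware tool.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Return the UTC timestamp at the start of a log line, or None."""
    token = line.split(" ", 1)[0]
    try:
        ts = datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        # Continuation lines and banners carry no leading timestamp.
        return None
    return ts.replace(tzinfo=timezone.utc)


def lines_in_window(lines, center, minutes=10):
    """Keep only lines stamped within +/- `minutes` of `center`."""
    delta = timedelta(minutes=minutes)
    kept = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and abs(ts - center) <= delta:
            kept.append(line)
    return kept
```

Pass the timestamp of the crash as center and run both log files through lines_in_window so you can compare what each one recorded around the failure.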
If you have a screenshot of the PSOD, can you send it?
When I reviewed the log files, I saw the following error. Which iLO version are you using? The following KB addresses this problem.
I don't know which iLO version you are using, so I can't say anything definite, but the "LINT1 / NMI" error is usually caused by the interrupt remapper functionality. Your ESXi version is very new and the hardware is Gen9, so I think that one of the two links above will solve your problem.
2019-10-02T03:20:15.510Z cpu0:2102719)@BlueScreen: LINT1/NMI (motherboard nonmaskable interrupt), undiagnosed. This may be a hardware problem; please contact your hardware vendor.
2019-10-02T03:20:15.510Z cpu0:2102719)Code start: 0x41800cc00000 VMK uptime: 15:15:40:00.924
2019-10-02T03:20:15.511Z cpu0:2102719)0x450980002c60:[0x41800cd0ac15]PanicvPanicInt@vmkernel#nover+0x439 stack: 0x1
2019-10-02T03:20:15.511Z cpu0:2102719)0x450980002d00:[0x41800cd0ae48]Panic_NoSave@vmkernel#nover+0x4d stack: 0x450980002d60
2019-10-02T03:20:15.511Z cpu0:2102719)0x450980002d60:[0x41800cd078cd]NMICheckLint1@vmkernel#nover+0x196 stack: 0x0
2019-10-02T03:20:15.512Z cpu0:2102719)0x450980002e20:[0x41800cd07982]NMI_Interrupt@vmkernel#nover+0xb3 stack: 0x0
2019-10-02T03:20:15.512Z cpu0:2102719)0x450980002ea0:[0x41800cd43ffc]IDTNMIWork@vmkernel#nover+0x99 stack: 0xa13f6a6db8ce2
2019-10-02T03:20:15.512Z cpu0:2102719)0x450980002f20:[0x41800cd454f0]Int2_NMI@vmkernel#nover+0x19 stack: 0x0
2019-10-02T03:20:15.513Z cpu0:2102719)0x450980002f40:[0x41800cd60066]gate_entry@vmkernel#nover+0x67 stack: 0x0
2019-10-02T03:20:15.513Z cpu0:2102719)0x451a2499bcc0:[0x41800cc9b272]Power_ArchSetCState@vmkernel#nover+0x106 stack: 0x0
2019-10-02T03:20:15.513Z cpu0:2102719)0x451a2499bcf0:[0x41800cf04222]CpuSchedIdleLoopInt@vmkernel#nover+0x333 stack: 0x450185588000
2019-10-02T03:20:15.514Z cpu0:2102719)0x451a2499bd60:[0x41800cf06ef5]CpuSchedDispatch@vmkernel#nover+0x12aa stack: 0x451a00000001
2019-10-02T03:20:15.514Z cpu0:2102719)0x451a2499beb0:[0x41800cf083db]CpuSchedWait@vmkernel#nover+0x2f4 stack: 0x450100000000
2019-10-02T03:20:15.514Z cpu0:2102719)0x451a2499bf40:[0x41800cf08b54]CpuSched_VcpuHalt@vmkernel#nover+0x12d stack: 0x0
2019-10-02T03:20:15.515Z cpu0:2102719)0x451a2499bfa0:[0x41800cd356e6]VMMVMKCall_Call@vmkernel#nover+0xf7 stack: 0x0
2019-10-02T03:20:15.515Z cpu0:2102719)0x451a2499bfe0:[0x41800cd59e5d]VMKVMM_ArchEnterVMKernel@vmkernel#nover+0xe stack: 0x41800cd59e50
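To make the backtrace easier to read, here is a small Python sketch that pulls the function names out of lines like the ones above. The regular expression assumes the frame format shown in this log, and the names FRAME_RE, parse_frame and call_chain are my own illustration, not a VMware utility.

```python
import re

# Matches one backtrace frame such as
#   ...0x450980002d60:[0x41800cd078cd]NMICheckLint1@vmkernel#nover+0x196 stack: 0x0
FRAME_RE = re.compile(r"\[0x[0-9a-fA-F]+\](\w+)@([^+\s]+)\+0x([0-9a-fA-F]+)")


def parse_frame(line):
    """Return (function, module, offset) for a backtrace line, or None."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), int(match.group(3), 16)


def call_chain(lines):
    """List function names in log order, skipping non-frame lines."""
    frames = (parse_frame(line) for line in lines)
    return [frame[0] for frame in frames if frame is not None]
```

Run over this trace, call_chain starts with PanicvPanicInt and ends with VMKVMM_ArchEnterVMKernel. The frames listed after gate_entry (Power_ArchSetCState, CpuSchedIdleLoopInt and so on) appear to be what cpu0 was executing when the NMI arrived, that is, the idle loop.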
Thanks for the prompt response. The iLO version is 2.10; evidence is attached. I meet the requirements.
I found this KB: https://kb.vmware.com/s/article/56357 but I'm not sure whether it is the right solution.
My current adapter
Hp QLogic Inc. QLogic 57810 10 Gigabit Ethernet Adapter
Version: 126.96.36.199-1 OEM.6188.8.131.5235516
I think you gave me the wrong information. You wrote that you are using Gen9, but the server you're showing is Gen8. Gen8 servers can hit this PSOD with the LINT1 / NMI error.
Can you post the exact ESXi version you are using, including the build number (for example, the output of vmware -vl)? I also see that the firmware versions are very old. You need to update the firmware.
Your System ROM version looks very old (2016). ESXi 6.7 had not even been released in 2016.
If you want ESXi to run stably, the BIOS and firmware must be kept up to date. HPE can tell you the exact cause of this problem, but a firmware update will certainly solve the problem.
Moderator note: Moved to ESXi