My ESXi host on an HP Gen10 crashes randomly after heavy workloads finish, roughly every 10 to 15 days. At first ESXi was installed on an SD card. After reading the community forums I replaced it with two HPE HDDs in RAID 1, but one week later ESXi started crashing once or twice a day.
I installed VMware vRealize Log Insight on another server to capture the logs, but no log entries were saved for the time of the crash.
What is causing this problem? I am very confused and need help, please.
Gen10, firmware fully upgraded as of 05/0/2021, 512 GB RAM, 2x Gold 6252, HPE RAID, HPE enterprise SSD.
Hi, what is the exact model of the HPE server? Is the host connected to external storage (iSCSI, SAS, etc.)?
Usually, this purple screen is caused by a hardware or software error.
Do you have iLO on the HPE server, and are there any reports in it?
I recommend you open a case with HPE hardware support and with VMware.
I know this comes from software or hardware, but I don't know which problem those codes point to.
iLO reports that everything is OK and green.
I need a VMware expert who can debug the purple screen attached to my first question.
PSODs are very complex and relatively hard to diagnose (especially when there is no clear stack trace, as in your screenshot), so they are time-consuming and might need engineering involvement. Collect the dump with a log bundle and file an SR, as Fabio recommended, for further analysis.
Also, iLO not reporting errors doesn't necessarily mean there are no hardware faults.
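For collecting the dump and log bundle mentioned above, here is a minimal sketch to run from an SSH session on the host. The function name `collect_psod_data` and the default path `/vmfs/volumes/datastore1` are my own placeholders, not something from this thread; `vm-support -w` only sets the directory the bundle is written to. Treat it as a starting point, not an official procedure.

```shell
#!/bin/sh
# Sketch: gather PSOD evidence on an ESXi host for a support request.
# The default destination is a hypothetical datastore path; replace it.

collect_psod_data() {
    dest="${1:-/vmfs/volumes/datastore1}"   # assumed path, not from the thread

    # Refuse to run anywhere but an ESXi shell.
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found: run this on the ESXi host" >&2
        return 1
    fi

    # Show where the host is configured to write core dumps.
    esxcli system coredump partition get
    esxcli system coredump file list

    # Build the support log bundle in the chosen directory.
    vm-support -w "$dest"
}
```

Call it as `collect_psod_data` followed by the path of a datastore with enough free space, then attach the resulting bundle to the SR.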
What brand and model is the physical server?
What disk controller does it have?
Was VMware vSphere installed from a custom image supplied by the manufacturer?
Use the following command to verify whether the image is standard (VMware) or customized by a vendor (Lenovo, IBM, HPE, Dell):
[root@esxi:~] esxcli software profile get
Run the following commands to check how ESXi recognizes the controller and which NVMe drivers are installed on the host:
[root@esxi:~] esxcli storage core adapter list
[root@esxi:~] esxcfg-scsidevs -a
[root@esxi:~] esxcli software vib list | grep -i nvme
Please paste your results in this post.
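If it is easier, the checks above can be captured into one file for pasting. This is only a rough sketch: the function name `gather_host_info` and the output path `/tmp/esxi-host-info.txt` are hypothetical, and the commands are the same ones listed in this post.

```shell
#!/bin/sh
# Sketch: run the checks from this post and save the output to one file.

gather_host_info() {
    out="${1:-/tmp/esxi-host-info.txt}"   # hypothetical output path

    # Refuse to run anywhere but an ESXi shell.
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found: run this on the ESXi host" >&2
        return 1
    fi

    # Group all commands so their output lands in a single file.
    {
        echo "== image profile =="
        esxcli software profile get
        echo "== storage adapters =="
        esxcli storage core adapter list
        esxcfg-scsidevs -a
        echo "== NVMe VIBs =="
        esxcli software vib list | grep -i nvme
    } > "$out" 2>&1

    echo "Saved to $out"
}
```

Run `gather_host_info` on the host, then copy the contents of the output file into your reply.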