Hi,
ESXi 6.7/VCSA 6.7
One of my ESXi 6.7U2 hosts stopped responding and all VMs on it were unreachable early this morning. The on-call person rebooted the host and is not sure whether there was a PSOD. I have coredump configured on the local USB SD card but I don't see anything on the designated partition. Is there a way to look up what caused the unresponsive ESXi host after the reboot? The vmkernel.log file has little information in it beyond showing that the reboot did occur.
Thanks,
Not if you don't have logs or a core dump saved.
Thanks. I guess for whatever reason, the coredump was not generated or maybe there was no purple screen of death.
Are there specific logs and/or keywords to look into? I checked vmkernel.log and the other vmk*.log files but there were no specific error entries right before the time of the event.
Thanks,
It really depends where the error occurred. Your on-call person isn't sure there was a PSOD and you don't know why it stopped responding, so you really have no idea where to begin looking. It's also possible it didn't occur within ESXi at all. This is one of the leading reasons why it's great to have something like Log Insight where you can just ship all logs off to it and do the digging in one big stream rather than hunting and pecking and hoping you stumble onto the cause.
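If you only have the local log files to work with, you can approximate the "one big stream" idea by hand. Below is a rough Python sketch, not a Log Insight replacement: it merges lines from several logs into a single time-ordered list covering the minutes before the reboot. The helper names (`parse_ts`, `merge_window`) are made up for illustration, and it assumes every line of interest starts with a UTC timestamp in the form `2019-06-01T03:10:00.000Z`; lines that don't match are skipped.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Return the leading timestamp of a log line as a UTC datetime, or None.

    Assumes the line starts with e.g. 2019-06-01T03:10:00.000Z (an assumption
    about the log format; adjust the pattern if your logs differ).
    """
    token = line.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None  # continuation line or unexpected format
    return parsed.replace(tzinfo=timezone.utc)


def merge_window(logs, end, minutes=30):
    """Merge lines from several logs into one time-ordered list.

    logs: dict mapping a log name to an iterable of lines.
    end: timezone-aware datetime of the event (e.g. the reboot).
    Returns (timestamp, log name, line) tuples from the `minutes` before `end`.
    """
    start = end - timedelta(minutes=minutes)
    hits = []
    for name, lines in logs.items():
        for line in lines:
            ts = parse_ts(line)
            if ts is not None and start <= ts <= end:
                hits.append((ts, name, line.rstrip("\n")))
    hits.sort(key=lambda hit: hit[0])
    return hits
```

To use it, read each log you care about (vmkernel.log and whichever others you have, copied off the host or pulled from a support bundle) into the `logs` dict and pass the approximate reboot time as `end`. It won't tell you the cause, but it puts everything in one timeline so you aren't flipping between files.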
Thanks for the advice. I will look into Log Insight and establish a process for future events like this.