I'm experiencing an issue: an ESXi 6 host was disconnected from my VCSA 6.0, but all VMs on that host are working fine. The DCUI was hung and I had not enabled SSH, so my only option was to reboot the ESXi host. One week later, another ESXi host in the same cluster hit the same issue.
After checking the vCenter logs, I found these messages:
- Issue detected on 10.1.xx.xxx in it-dc01: Virtual machine creation may fail because agent is unable to retrieve VM creation options from the host (N5Vmomi5Fault17HostCommunication9ExceptionE(vmodl.fault.HostCommunication)).
- Issue detected on 10.1.xx.xxx in it-dc01: Bootbank cannot be found at path '/bootbank'
- Issue detected on 10.1.xx.xxx in it-dc01: hostd detected to be non-responsive
- app1(VM) on host 10.1.12.200 in it-dc01 is disconnected.
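For what it's worth, the "Bootbank cannot be found at path '/bootbank'" message typically means the /bootbank symlink no longer resolves, because the boot volume it points at has dropped away (a well-known symptom when SD/USB boot media stops responding). A dangling-symlink check like the one below illustrates the condition; this is a generic POSIX-shell sketch using a throwaway link, not something run on the ESXi host itself:

```shell
#!/bin/sh
# Check whether a path is a dangling symlink (link exists, target does not) --
# the state /bootbank ends up in when the boot volume drops off the bus.
is_dangling() {
  # -L: the path is a symlink; ! -e: its target does not resolve
  [ -L "$1" ] && [ ! -e "$1" ]
}

# Demo with a throwaway link (on a real host you would test /bootbank):
tmpdir=$(mktemp -d)
ln -s "$tmpdir/missing-volume" "$tmpdir/bootbank"
if is_dangling "$tmpdir/bootbank"; then
  echo "bootbank link is dangling"
fi
rm -rf "$tmpdir"
```

On an affected host the equivalent quick check would be looking at whether `/bootbank` still resolves before deciding between a reboot and reseating the boot media.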
Have you found a solution for this issue? I have a host with the same error. I am on vSphere 6 Update 2.
Is ESXi installed on a local drive or an SD card?
If it is locally installed, can you make sure there are no issues with storage connectivity?
Can you migrate the VMs from this host to other hosts in vCenter and put it into maintenance mode?
Please provide more details on the ESXi install type.
I am also having this issue. Did you ever happen to find a resolution? My hosts have local drives with ESXi 6.0.0 U3 on them (Cisco B200 M4s). No drive issues that I can ascertain. A host reboot fixes the issue, but it comes back.
I am seeing similar issues with ESXi 6.5, but due to the impact it's difficult to troubleshoot as we have to power cycle to recover the disconnected VMs. Our pattern is:
* vCenter marks a host as disconnected. At this point, no issues are observed with VMs
* Some time later (a few hours maybe) VMs get wonky - they stop accepting or making some network connections, but others are fine. For instance, I can connect to a VM with ssh/rdp, but it cannot connect to other VMs.
* Significant time later (12-24 hours) VMs start going offline. At this point I cannot ssh/rdp, they stop responding to pings, etc.
We have tried restarting the management agents with services.sh on the hosts, but only two methods of actual recovery work:
* Rebooting the host via DCUI takes nearly an hour, mostly at the "Shutting down VMs" stage
* Power cycling the host "frees" the VMs within a few minutes at which point they restart elsewhere.
We have HA configured, but whatever is wrong never triggers HA events; VMs do not restart elsewhere, they just float ethereally until they die. Only power cycling the disconnected host fixes it.
No pattern is observed - disconnected hosts can happen just days apart, but it's usually about a month apart. Sometimes it's the same host twice in a row, other times it's a host that hasn't had the issue before. This has been going on since 6.0 days, not sure which update, but since about Spring 2017.
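Since HA never fires in this failure mode, one stopgap is an external liveness probe against each host's management port (902 is the ESXi management-agent port), run periodically from another box, so you get alerted when a host goes dark instead of hours later when the VMs start dying. A minimal bash sketch; the host and port in the demo call are placeholders, not values from this thread:

```shell
#!/bin/bash
# Probe a TCP port using bash's /dev/tcp redirection, with a 2-second
# timeout so a dead host doesn't hang the check.
probe() {
  local host=$1 port=$2
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port UNREACHABLE"
  fi
}

# Demo against a port that is almost certainly closed on this machine;
# in practice you'd loop over your hosts, e.g.: probe esxi01.example 902
probe 127.0.0.1 1
```

Wiring the UNREACHABLE branch into whatever alerting you already have at least shortens the "VMs floating ethereally" window described above.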
In my previous experience, I have seen this in a couple of scenarios. If you are using an SD card as the ESXi boot media, it could be a card issue; you can remove and reseat the SD card (and also update the firmware) on the ESXi host.
Second, if you are booting from SAN, a delay in reaching the SAN can lead to this behavior. You can try the steps in this VMware Knowledge Base article.
We are not booting from SAN, just from SD cards. They have no problems on reboots, so it seems like a spurious error, but I guess that's the nature of bugs. Will look into reseating them soon-ish, though only one node has had multiple failures like this. Thanks.