One of our production hosts is currently disconnected from vCenter 6. I tried restarting the management agents over SSH with /etc/init.d/hostd restart, /etc/init.d/vpxa restart and services.sh restart, but no luck: the host is still disconnected. Then I tried restarting the management agents via the DCUI, but the services didn't start. Now I can't access the host via the vSphere Client, the Web Client or SSH. The host is still reachable on the network: I can ping it and the VMs are still running. Any help will be appreciated. Thanks.
Have you unmapped/removed underlying storage from the ESXi host?
If yes, you can map it back again and rescan the storage. I have seen the hostd service come back to a responsive state after that.
If you cannot undo what has been done, then schedule downtime ASAP and reboot the host.
Hi!
Do you have SSH enabled on the host? If not, try to enable it from the DCUI and connect with SSH.
Then check the hostd and vpxa logs.
You need to have both services up and running.
Can you share the vpxa.log and hostd.log?
Also run the command below and check whether it gives any output:
vim-cmd vmsvc/getallvms
If you cannot connect to the host using the vSphere Client, then the hostd service is not running.
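If you do get a shell on the host (SSH or the DCUI console), the checks above can be run in one go. This is only a rough sketch: it assumes the default ESXi init scripts and log locations (/etc/init.d/hostd, /etc/init.d/vpxa, /var/log/hostd.log, /var/log/vpxa.log), and the check_agent helper is a made-up name for illustration, not a VMware tool.

```shell
#!/bin/sh
# Sketch: report the status of the two management agents and show recent log lines.
# Assumes an ESXi shell; check_agent is a hypothetical helper, not part of ESXi.
check_agent() {
    svc="$1"
    if [ -x "/etc/init.d/$svc" ]; then
        "/etc/init.d/$svc" status
    else
        echo "$svc: init script not found (not an ESXi shell?)"
    fi
}

check_agent hostd
check_agent vpxa

# Show the tail of each agent log, if it is readable at the assumed default path.
for log in /var/log/hostd.log /var/log/vpxa.log; do
    if [ -r "$log" ]; then
        echo "== last 20 lines of $log =="
        tail -n 20 "$log"
    fi
done
```

Both agents should report that they are running; if hostd does not, the vSphere Client and vim-cmd will not respond either.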
From the hostd log I can see that there are some storage errors.
Have you unmapped the underlying storage/LUNs?
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.614187707163ef001efed3c214436e23 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000065 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000066 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e0 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e8 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e9 device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000ea device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000eb device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000ec device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000002ba device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000002bf device not found
2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000089 device not found
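To get a clean list of the affected LUNs for the storage team, something like the following could pull the unique device IDs out of the log. It is a minimal sketch that assumes the message format shown above and the default log path /var/log/hostd.log; missing_devices is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print each device ID that hostd logged as "device not found", once.
# missing_devices is a hypothetical helper; $1 is the path to a hostd log file.
missing_devices() {
    grep 'CreateStorageStructure: .* device not found' "$1" \
        | sed 's/.*CreateStorageStructure: \(naa\.[0-9a-f]*\) device not found.*/\1/' \
        | sort -u
}

# Assumed default log location on ESXi; override with LOG=/path/to/hostd.log
LOG="${LOG:-/var/log/hostd.log}"
if [ -r "$LOG" ]; then
    missing_devices "$LOG"
fi
```

The resulting naa IDs can then be compared against the LUNs that were recently unmapped on the array.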
Is there any other method besides rebooting the host? There are production VMs running on that host.
Have you unmapped/removed underlying storage from the ESXi host?
If yes, you can map it back again and rescan the storage. I have seen the hostd service come back to a responsive state after that.
If you cannot undo what has been done, then schedule downtime ASAP and reboot the host.
Well, I guess I have to reboot the host outside business hours.
Thanks for your suggestions.
