VMware Cloud Community
fadholi
Contributor

Restarting Management Agents hung

Currently one production host is disconnected from vCenter 6. I tried restarting the management agents over SSH with /etc/init.d/hostd restart, /etc/init.d/vpxa restart, and services.sh restart, but no luck; the host is still disconnected. Then I tried restarting the management agents via the DCUI, but the services didn't start. Now I can't access the host via vSphere Client, Web Client, or SSH. The host is still on the network: I can ping it, and the VMs are still running. Any help will be appreciated. Thanks.

Restarting Management Agent Hung.jpg

7 Replies
Finikiez
Champion

Hi!

Do you have SSH enabled on the host? If not, try enabling it from the DCUI and then connect over SSH.

Then check the hostd and vpxa logs.

You need to have both services up and running.
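
For the log check, a minimal sketch that may help, assuming shell access works and the logs are in the default locations (/var/log/hostd.log and /var/log/vpxa.log). The function name recent_errors is made up for illustration; it only filters a log file for error-level entries.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent error-level entries from a log.
# Usage: recent_errors FILE [COUNT]   (COUNT defaults to 20)
recent_errors() {
    log_file="$1"
    count="${2:-20}"
    # hostd/vpxa entries look like "<timestamp> error hostd[...] ...",
    # so match the severity word as its own space-delimited field.
    grep ' error ' "$log_file" | tail -n "$count"
}
```

On the host this could be combined with the init scripts, for example: /etc/init.d/hostd status, /etc/init.d/vpxa status, then recent_errors /var/log/hostd.log and recent_errors /var/log/vpxa.log.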

hussainbte
Expert

Can you share the vpxa.log and hostd.log?

Also run the command below and check whether it returns any output:

vim-cmd vmsvc/getallvms

If you found my answers useful please consider marking them as Correct OR Helpful Regards, Hussain https://virtualcubes.wordpress.com/
fadholi
Contributor

I can't access the host via vSphere Client, Web Client, the ESXi DCUI, or SSH. Here is the log.

hussainbte
Expert

If you cannot connect to the host using the vSphere Client, then the hostd service is not running or not responding.

From the hostd log I can see that there are some storage errors.

Have you unmapped the underlying storage/LUNs?

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.614187707163ef001efed3c214436e23 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000065 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000066 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e0 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e8 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000e9 device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000ea device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000eb device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000000ec device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000002ba device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb0000000000002bf device not found

2018-02-14T03:18:30.007Z error hostd[26080B70] [Originator@6876 sub=Default] CreateStorageStructure: naa.60050764008181adb000000000000089 device not found
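
To get a clean list of the affected LUNs out of a long hostd.log, something like the sketch below could be used. It assumes the entries have exactly the "CreateStorageStructure: naa.<id> device not found" form shown above; the function name missing_devices is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list each device ID that hostd reported as
# "device not found", once per device, sorted.
# Usage: missing_devices /var/log/hostd.log
missing_devices() {
    # Keep only the naa.<hex> identifier from matching lines, then de-duplicate.
    sed -n 's/.*CreateStorageStructure: \(naa\.[0-9a-fA-F]*\) device not found.*/\1/p' "$1" \
        | sort -u
}
```

The resulting IDs can then be compared against what is currently presented to the host from the storage array.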

fadholi
Contributor

Is there any other method besides rebooting the host? There are production VMs running on it.

hussainbte
Expert

Have you unmapped or removed the underlying storage from the ESXi host?

If yes, you can map it back and rescan the storage. I have seen the hostd service come back to a responsive state after that.

If you cannot undo what has been done, then schedule downtime ASAP and reboot the host.
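
A rough sketch of the rescan step, assuming the LUNs have been mapped back on the array and the ESXi shell is reachable. The helper names run and rescan_and_check are made up for illustration. It only prints the commands unless DRY_RUN=0 is set, so they can be reviewed first, and the esxcli calls may themselves hang while hostd is unresponsive.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command by default, run it only if DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Rescan all storage adapters, then query each device ID passed as an argument.
# Usage: rescan_and_check naa.<id> [naa.<id> ...]
rescan_and_check() {
    run esxcli storage core adapter rescan --all
    for dev in "$@"; do
        run esxcli storage core device list -d "$dev"
    done
}
```

For example, rescan_and_check naa.60050764008181adb000000000000065 prints the rescan command followed by one device query. If the devices show up again after a real run, check whether hostd has become responsive before falling back to the reboot.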

fadholi
Contributor

Well, I guess I have to reboot the host outside business hours.

Thanks for your suggestions.
