We have an HP DL380 G7 server with ESXi 5.1 installed. About an hour ago the host turned gray in vCenter and "not responding" appeared next to its name.
I can ping the host and all VMs are running and responding.
I searched and found that "restarting the management agents" may solve the problem (KB1003490), then:
What should I do to solve the problem?
I have faced this before. These are the things I did; please try them on your host:
1. Open the DCUI from an SSH client
type: dcui
then follow the same steps as when you log in at the DCUI console.
2. If that doesn't work, check the hosts file on the ESXi host and make sure the "127.0.0.1 localhost.localdomain localhost" entry exists.
In the SSH client type: vi /etc/hosts
3. Restart the hostd and vpxa services, and make sure both services are running.
In the SSH client type:
/etc/init.d/hostd status (if the status is not running, type: /etc/init.d/hostd start)
/etc/init.d/vpxa status (if the status is not running, type: /etc/init.d/vpxa start)
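Steps 2 and 3 could be scripted roughly as below. This is only a sketch: the helper names has_localhost_entry and ensure_running are made up for illustration, and it assumes the init scripts print a status line containing "running" or "not running", which you should confirm on your own host first.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of ESXi.

# Return 0 if the hosts file maps 127.0.0.1 to the name "localhost".
has_localhost_entry() {
    hosts_file="${1:-/etc/hosts}"
    grep -Eq '^[[:space:]]*127\.0\.0\.1[[:space:]]+(.*[[:space:]])?localhost([[:space:]]|$)' "$hosts_file"
}

# Start a service through its init script unless "status" reports it running.
# Assumption: the status text contains "running" or "not running".
ensure_running() {
    init_script="$1"
    status_output="$("$init_script" status 2>&1)"
    case "$status_output" in
        *"not running"*) "$init_script" start ;;
        *running*)       echo "$init_script: already running" ;;
        *)               echo "$init_script: unrecognised status: $status_output" >&2
                         return 1 ;;
    esac
}
```

On the host this would be called as: has_localhost_entry /etc/hosts || echo "localhost entry missing", followed by ensure_running /etc/init.d/hostd and ensure_running /etc/init.d/vpxa.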
Please try those steps; if they don't work, let us know.
cheers
Can you connect to the host directly with the vSphere Client, I mean without vCenter? Also, check the VMkernel and messages logs...
Thank you for the answers, but the problem happened about 6 months ago and this is what I did:
- I found that the host might crash soon (I searched and found a similar situation which resulted in a PSOD)
- There was no way to reach the host (the SSH service was not running by default, and the DCUI was not responding, as I mentioned in the first post)
- I decided to shut down all the VMs (through RDP to each VM) and reboot the host, but once some of the VMs had been powered off, the error disappeared and everything went back to normal. We had upgraded the host from 4.1 U1 to 5.1 U1, and maybe that caused the issue.
Finally, I migrated all the VMs and did a fresh installation of ESXi 5.1 U1 on the host.
Try the commands below to check and start the hostd service:
/etc/init.d/hostd status, then if it is not running: /etc/init.d/hostd start
Check the hostd status again. If it still has not started, check /var/log/vmkernel.log, /var/log/hostd.log and /var/log/vpxa.log and act on whatever errors the logs show.
When esxcli commands are not working, you can run localcli instead of esxcli.
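A quick way to skim those three logs is sketched here. The function name scan_logs, the keyword list and the 200-line window are my own arbitrary choices, not anything from VMware, so adjust them to whatever you are hunting for.

```shell
# Sketch only: print recent lines from each log that mention a problem keyword.
# scan_logs is a hypothetical helper; keywords and line count are arbitrary.
scan_logs() {
    for log_file in "$@"; do
        if [ ! -r "$log_file" ]; then
            echo "== $log_file: not readable"
            continue
        fi
        echo "== $log_file"
        # Look only at the last 200 lines, case-insensitively.
        tail -n 200 "$log_file" | grep -i -E 'error|fail|panic|timed out' \
            || echo "(no matches)"
    done
}
```

Example call: scan_logs /var/log/vmkernel.log /var/log/hostd.log /var/log/vpxa.log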
# localcli network firewall get
# localcli network firewall set --enabled false
# localcli system maintenanceMode set --enable true
# localcli vm process list
# localcli vm process kill -w <World ID> -t soft (Shutdown VMs)
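To get the World ID for the kill command, the process list output can be filtered. The sketch below assumes each VM block in that output contains a "World ID: N" line and a "Display Name: X" line. That layout is an assumption on my part, so compare it with what your host actually prints. list_world_ids is a hypothetical helper name.

```shell
# Sketch only: read `localcli vm process list` output on stdin and print
# "<World ID><TAB><Display Name>" for each VM block.
# Assumes "World ID: N" and "Display Name: X" lines exist in every block.
list_world_ids() {
    awk '
        /^[ \t]*World ID: /     { sub(/^[ \t]*World ID: /, "");     world = $0 }
        /^[ \t]*Display Name: / { sub(/^[ \t]*Display Name: /, ""); name = $0 }
        world != "" && name != "" { print world "\t" name; world = ""; name = "" }
    '
}
```

Example call: localcli vm process list | list_world_ids, then pass the number for the VM you want to the kill command above.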