I have two ESX 3.5 hosts in a cluster. One of the hosts lost communication with the VC and is displayed as "disconnected" (along with its virtual machines). The VMs are still running fine, and I can manage them when pointing the VIC directly at the affected host.
Cannot SSH to the host (get "Server unexpectedly closed network connection").
Cannot log in at the console (get "INIT: cannot fork, retry..." repeatedly).
Cannot vMotion the VMs to the running host as all VMs in question are "disconnected" and most options are greyed out.
What happens if you right-click the host in question and choose "Connect"?
If that doesn't work, can you access the ESX console directly, or through iLO, DRAC, etc.?
If so, type:
service mgmt-vmware restart
It steps me through the Add Host wizard and then fails with the message "A general system error occured: internal error".
See my reply above; I edited it.
When I try to log in at the ESX console directly (through our KVM system), it displays "INIT: cannot fork, retry..." over and over. It shows this on almost every virtual console (ALT-F1, ALT-F2, ALT-F3, etc.).
We don't have iLO set up in our environment.
Based on the problem you describe, here are some tips:
1) Keep the ESX version maintained.
2) When the error comes from the ESX console, run the following as root:
a) vdf -Thl
b) ps -ef | grep hostd
c) cd /var/run/vmware
d) ls -l vmware-hostd.PID watchdog-cimserver.PID
e) cat vmware-hostd.PID
f) kill -9 <pid> (use the PID printed in step e)
g) rm vmware-hostd.PID watchdog-cimserver.PID
h) service mgmt-vmware restart
i) /etc/init.d/vmware-webAccess restart
j) /etc/init.d/vmkhalt restart
k) service vmware-vpxa restart
l) /etc/init.d/nscd restart (flushes the DNS cache)
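Steps c) through h) and k) above could be wrapped in a small script, roughly like the sketch below. This is only an illustration, not a tested VMware procedure: the function names (run, read_pid, restart_mgmt_agents) and the DRY_RUN and RUN_DIR variables are made up for this example. It defaults to a dry run that only prints the commands, so nothing is killed or restarted unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of steps c) through h) and k). Dry run by default: commands are
# printed, not executed. DRY_RUN and RUN_DIR are names invented for this sketch.
DRY_RUN="${DRY_RUN:-1}"
RUN_DIR="${RUN_DIR:-/var/run/vmware}"   # directory from step c)

# Either print the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Print the PID stored in a PID file. Fails if the file is unreadable or
# does not contain a plain number, so kill never gets a bogus argument.
read_pid() {
    [ -r "$1" ] || return 1
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$pid"
}

restart_mgmt_agents() {
    # Steps e) and f): kill hostd only if its PID file holds a valid PID.
    if pid=$(read_pid "$RUN_DIR/vmware-hostd.PID"); then
        run kill -9 "$pid"
    fi
    # Step g): remove the stale PID files.
    run rm -f "$RUN_DIR/vmware-hostd.PID" "$RUN_DIR/watchdog-cimserver.PID"
    # Steps h) and k): restart the management agent and the VC agent.
    run service mgmt-vmware restart
    run service vmware-vpxa restart
}
```

Calling restart_mgmt_agents after loading the script prints the commands it would run; review them, then set DRY_RUN=0 as root on the affected host to actually execute them.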
All is well again. Not sure what happened. I went back to the KVM console and was able to get a "stable" login and command prompt (no "cannot fork" errors). I successfully restarted the mgmt-vmware services and added the host back to the cluster. The host and its VMs are now connected to VC and manageable.
Thanks all for your assistance.
Glad to see you got it resolved.
Please consider awarding points by marking answers as "correct" or "helpful".