VMware Cloud Community
hohnsie
Contributor

Host cluster network issue

Hi everyone,

On one of our host clusters (four hosts), we are getting the following Configuration Issues message: "vSphere HA agent for this host has an error: the vSphere HA agent is not reachable from the vCenter server."

What we have tried so far:

  1. Turning off HA on the cluster, then turning it back on.
  2. Right-clicking on individual hosts and selecting Reconfigure for vSphere HA.
  3. Running /etc/init.d/vpxa restart and /etc/init.d/hostd restart on the host's command line.  Note that the host now reports “not responding” in vCenter, but is still operational.
  4. Restarting the vCenter.

I haven’t tried rebooting a host yet because I cannot migrate the machines off it.  That is possibly something I’ll look at after hours.

In vCenter I cannot see any of the host configuration for the affected cluster.  But if I connect to a host directly (using the vSphere client and/or web client), I can configure it.

Here are the settings on one of the affected hosts.  (attachment 1)

Here are the settings we expect (from a working host) (attachment 2)

Now the problem I have is the host doesn’t seem to know about the distributed switch/port group for me to add these back in.  It only knows about our main VLAN that it’s currently using. (attachment 3)

Thanks for reading.  Any ideas would be much appreciated.

4 Replies
sk591
Enthusiast

Since the host is marked as not responding, let's first focus on that.

Check the vmkernel.log file on the affected host and search for the entry "hostd detected to be non-responsive". If it is present, the hostd process is hung and a host restart is worth trying.
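
If you want to script the search described above, here is a minimal sketch in Python. It assumes the log is at /var/log/vmkernel.log, which is the usual location on an ESXi host but may differ on yours. The function name find_marker_lines is only illustrative.

```python
from pathlib import Path
from typing import List

# The entry to look for, as quoted in this reply.
MARKER = "hostd detected to be non-responsive"


def find_marker_lines(log_text: str, marker: str = MARKER) -> List[str]:
    """Return every log line that contains the marker (case-insensitive)."""
    needle = marker.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


if __name__ == "__main__":
    # Assumption: default ESXi log location; adjust if your logs live elsewhere.
    log_path = Path("/var/log/vmkernel.log")
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        hits = find_marker_lines(text)
        for line in hits:
            print(line)
        print(f"{len(hits)} matching line(s) found")
    else:
        print(f"{log_path} not found")
```

If the script prints any matching lines, that points to the hung hostd case described above.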

A strong indicator that hostd is hung once the host has been marked as Not Responding is that esxcli commands hang while the same commands run through localcli still work.
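
The esxcli versus localcli comparison can be sketched as a small timeout test. This is only a rough illustration: it assumes it runs on the ESXi host itself, that "system version get" is available under both tools, and that 15 seconds is a reasonable cut-off for "hung". The helper name finishes_within is made up for this example.

```python
import shutil
import subprocess


def finishes_within(cmd, timeout_s=15.0):
    """Run cmd and return True if it exits before timeout_s, False if it hangs."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return False


if __name__ == "__main__":
    # Assumption: the same harmless query is used for both tools.
    for tool in ("esxcli", "localcli"):
        if shutil.which(tool) is None:
            print(f"{tool}: not found on this machine, skipping")
            continue
        ok = finishes_within([tool, "system", "version", "get"])
        print(f"{tool}: {'responded' if ok else 'timed out'}")
```

If esxcli times out while localcli responds, that matches the hung hostd pattern described above.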

sc2317
Enthusiast

Hi,

If the hosts are showing as not responding in vCenter Server, please check whether you can make a connection from vCenter to ESXi, and vice versa, on port 902.
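
One way to test this from the vCenter side is a quick TCP connection attempt. The sketch below is a minimal example, not a full diagnostic: the host name esxi01.example.local is a placeholder, the function name tcp_port_open is illustrative, and it only tests TCP, so it says nothing about UDP traffic on the same port.

```python
import socket


def tcp_port_open(host: str, port: int = 902, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host name; replace with the address of an affected ESXi host.
    target = "esxi01.example.local"
    state = "reachable" if tcp_port_open(target) else "NOT reachable"
    print(f"{target}:902 is {state} over TCP")
```

Run it once from the vCenter machine against each host, and again in the other direction if Python is available there, to cover the "vice versa" part.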

Also try restarting the management agents and then the management network. This will not impact running virtual machines, so you can do it during office hours.

hohnsie
Contributor

Thank you both, I will try these suggestions and report back.

hohnsie
Contributor

Unfortunately, I cannot retrieve the vmkernel.log files from any of the hosts in the affected cluster.  I also tried restarting the management agents and the management network.  I will look to schedule a reboot of the hosts.
