stefanopilla
Contributor

ESXi hosts not responding in vCenter, but hosts and VMs are online

Hi,

I keep seeing the "not responding" message in vCenter at random and I can't figure out what's going on. vCenter seems to lose its connection to the hosts, but the hosts are 100% healthy and work fine (the VMs are also online, with no issues).

It happens randomly and returns to normal just as randomly, without my touching anything (sometimes after 10 seconds, sometimes after 15 minutes, and sometimes after hours).

I have searched through the logs on the host but I can't see anything related to this lost connection.

Is there a specific log file that I can check? I have already checked hostd.log, vpxa.log and syslog.log on the hosts.

I followed this KB (VMware Knowledge Base) but it's not an SSL Timeout issue.

I tried restarting hostd and vpxa on the hosts, with no luck.

In vCenter, the only error I see is the one below, which doesn't give much information to troubleshoot:

[Image: pastedImage_7.png]

Any idea what this could be, or where I can start looking to get more information?

I have VMware ESXi, 6.7.0, 14320388 deployed on a bare metal server in a cloud hosting provider.

Any help or suggestions would be greatly appreciated!

10 Replies
Nawals
Expert

Hi,

This issue occurs when the UDP heartbeat message sent by the ESX/ESXi host is not received by vCenter Server. If vCenter Server does not receive the UDP heartbeat message, it treats the host as not responding. This behavior can indicate a congested network between the ESX/ESXi host and vCenter Server.

Follow this VMware KB https://kb.vmware.com/s/article/1005757
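
To check whether heartbeats are really going missing, one option is to record when they arrive and look for holes. The following is a minimal sketch, assuming you have already collected arrival timestamps for the host's UDP heartbeats on the vCenter side (for example from a packet capture on UDP port 902). The function name `find_heartbeat_gaps` and the sample data are hypothetical; the 60-second default reflects my understanding of the default value of config.vpxd.heartbeat.notRespondingTimeout, so adjust it to whatever you have configured.

```python
from datetime import datetime, timedelta


def find_heartbeat_gaps(timestamps, timeout_s=60.0):
    """Return (previous, current, gap_seconds) for every pair of consecutive
    heartbeats that are more than timeout_s seconds apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = (cur - prev).total_seconds()
        if delta > timeout_s:
            gaps.append((prev, cur, delta))
    return gaps


if __name__ == "__main__":
    start = datetime(2020, 1, 1, 12, 0, 0)
    # Hypothetical arrival times: a heartbeat every 10 s, then a 90 s hole.
    arrivals = [start + timedelta(seconds=s) for s in (0, 10, 20, 110, 120)]
    for prev, cur, delta in find_heartbeat_gaps(arrivals):
        print(f"no heartbeat between {prev.time()} and {cur.time()} ({delta:.0f} s)")
```

If the gaps found this way line up with the times vCenter reports the host as not responding, the heartbeats are being lost or delayed on the way; if there are no gaps, the cause is more likely on the vCenter side.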

NKS Please Mark Helpful/correct if my answer resolve your query.
stefanopilla
Contributor

Hi Nawals,

thank you very much for the response.

Unfortunately, this doesn't solve the issue. I had already tried that solution (I forgot to mention it in my initial post). I set the config.vpxd.heartbeat.notRespondingTimeout key to 120 and to higher values, but I still have the same issue.

Can you think of anything else that could be the root cause of the problem?

Thank you

Nawals
Expert

Have there been any changes on ESXi or vCenter? If yes, please share them so I can look into it from that perspective. Also, are your vCenter and ESXi hosts on the same subnet?

NKS Please Mark Helpful/correct if my answer resolve your query.
stefanopilla
Contributor

There have been no changes on either vCenter or ESXi, and they are not on the same subnet (but there's a good connection between them).

As additional information: I ran a ping test for 24 hours between vCenter and the hosts and had 0 packet loss. I can run it again, but the network is quite stable, including between VMs that are on different networks.
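
Note that a clean ping only shows that ICMP gets through. The heartbeat travels over UDP port 902, and routers or firewalls between subnets can treat it differently from ICMP. The following is a rough sketch of a repeated TCP probe that could run alongside the ping test, for example against TCP port 902 on a host or TCP port 443 on vCenter. The function names are hypothetical. It only tests TCP reachability, so it cannot prove that UDP 902 datagrams arrive; it is just one more data point beyond ICMP.

```python
import socket
import time


def tcp_port_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe(host, port, count=5, interval_s=1.0):
    """Probe host:port `count` times and return a list of (unix_time, ok)."""
    results = []
    for i in range(count):
        results.append((time.time(), tcp_port_open(host, port)))
        if i < count - 1:
            time.sleep(interval_s)
    return results
```

For example, `probe("esxi01.example.com", 902, count=60, interval_s=10)` (the hostname is a placeholder) would check the port every 10 seconds for about 10 minutes; any `False` entries can then be compared with the times the host showed as not responding.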

Thank you

Alex_Romeo
Leadership

Hi,

Something similar happened to me, but on a single host. vCenter could no longer see one of the hosts, but the virtual machines were working fine. After trying various solutions, it was enough to remove the host from vCenter and register it again.

ARomeo

Blog: https://www.aleadmin.it/
stefanopilla
Contributor

Hi,

thank you for your feedback. Indeed, the only workaround to quickly make the hosts available again (I do this when it's urgent) is to remove the hosts and re-add them, but after a while it happens again. In addition, with that workaround I lose the folders that I created in the "VMs and Templates" section, so I have to re-create them from scratch. :(

I have 2 hosts (6.7 and 6.5), and I'm adding a third one (ESXi 6.7) this week. Both existing hosts got disconnected at the same time. They are all on different subnets.

Thank you

berndweyand
Expert

This happened to me when DNS was down. Try running nslookup for the host from the VCSA.
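
If you want to repeat that check for several hosts, or log it over time, here is a minimal sketch of a forward and reverse lookup, assuming Python 3 is available on the machine you run it from. It uses the system resolver through the standard `socket` module rather than nslookup itself, and the function name `check_dns` and the hostname in the usage note are placeholders.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back to a name, and report
    whether the reverse name matches the original hostname."""
    result = {"hostname": hostname, "ip": None, "reverse_name": None, "consistent": False}
    try:
        ip = forward(hostname)
        result["ip"] = ip
        reverse_name = reverse(ip)[0]
        result["reverse_name"] = reverse_name
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return result
    # Compare case-insensitively and ignore a trailing dot.
    result["consistent"] = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return result
```

Calling `check_dns("esxi01.example.com")` returns a dictionary; an `ip` of `None` means the forward lookup failed, and `consistent` being `False` with an `ip` present means the reverse record is missing or points to a different name. The `forward` and `reverse` parameters exist only so the lookups can be swapped out when testing.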

Alex_Romeo
Leadership

Hi,

Bewe's answer seems good to me ... try to verify that DNS works.

ARomeo

Blog: https://www.aleadmin.it/
stefanopilla
Contributor

I checked DNS: both the A record and the reverse record are correctly configured, and vCenter and the hosts are able to resolve each other.

I also checked the DNS configuration, and they are using the same DNS server.

I upgraded vCenter to 6.7.0.44200, hoping that this will solve the issue. It has been stable since the upgrade, but we'll see over the next couple of days!

Thank you for your help, guys

Alex_Romeo
Leadership

Well! let's wait.

ARomeo

Blog: https://www.aleadmin.it/