Re: ESXi host connectivity with vCenter Server

goranmw1 · ‎05-29-2019

I have a health check warning in vCenter 6.7 Health saying:

This issue occurs when the UDP heartbeat message sent by ESX/ESXi hosts is not received by vCenter Server. If vCenter Server does not receive the UDP heartbeat message, it treats the host as not responding. ESX/ESXi hosts send heartbeats every 10 seconds and vCenter Server has a window of 60 seconds to receive the heartbeats. This behavior can be an indication of a congested network between the ESX/ESXi hosts and vCenter Server. Click the Ask VMware link above for more details and a resolution.

I have checked everything, and it seems there are no network connectivity issues.

Which log can help me find exactly which host has this "missing UDP heartbeat message" issue?

Tnx.

MikeStoica · ‎05-30-2019

Check this KB

irvingpop_chef · ‎06-26-2019

Which KB, MikeStoica?

asajm · ‎06-26-2019

goranmw1

VMware Knowledge Base


MikeStoica · ‎06-26-2019

Sorry, this one: VMware Knowledge Base

HappeeDays · ‎08-19-2019

Did you manage to find a resolution? I have a similar issue where the online health monitor under "Network Health Checks" is reporting "ESXi host connectivity with vCenter Server". I do not get any disconnects from the hosts, and the network connectivity seems fine?

maslan81 · ‎08-26-2019

I have the same problem. Any solution?

Alex_Romeo · ‎08-26-2019

Hi,

VMware Knowledge Base

Alessandro Romeo

Blog: https://www.aleadmin.it/

HappeeDays · ‎08-26-2019

As per VMware Knowledge Base, increasing the heartbeat timeout on vCenter does resolve the issue, and the warning disappears. However, this only masks the problem and does not explain why a timeout of 120 instead of the default 60 is required. I've asked my network guy to see if he can spot anything.

bmstewart · ‎12-19-2019

I have the same exact issue. I've got 3 different hosts on 3 different networks, but they're all able to talk to each other.

This is a brand new vSphere 6.7 U3 setup with the latest patches. All hosts are running ESXi 6.7 with the latest patches.

I don't see any issue or loss of connectivity in the vSphere client itself, but the Skyline Health thing always reports the ESXi host connectivity issue.

I've monitored network traffic at one of our firewall appliances, and I see the UDP packets are not being blocked in any way, but I see very few of them. I certainly do not see one every 10 seconds. I see one from a given host IP to the VCSA IP about once every 2 minutes. No, the network is not congested. They're all in the same rack in the same data center, and they're the only things connected to that rack's switch.
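For anyone who wants to repeat this check per host, here is a minimal sketch that takes heartbeat arrival times exported from a packet capture and flags any host whose longest gap exceeds the 60-second window quoted in the health check text. The input format (timestamp in seconds, source IP), the function name, and the sample addresses are assumptions for illustration only; this is not a VMware tool and it does not read any vCenter log.

```python
from collections import defaultdict

# Window quoted in the health check text; not read from vCenter itself.
HEARTBEAT_WINDOW_S = 60.0


def find_heartbeat_gaps(packets, window_s=HEARTBEAT_WINDOW_S):
    """Return {source_ip: longest_gap_seconds} for hosts exceeding window_s.

    packets is an iterable of (timestamp_seconds, source_ip) tuples, for
    example exported from a capture filtered to the host-to-vCenter UDP
    traffic. This input format is hypothetical.
    """
    by_host = defaultdict(list)
    for timestamp, source_ip in packets:
        by_host[source_ip].append(timestamp)

    flagged = {}
    for source_ip, stamps in by_host.items():
        stamps.sort()
        # Longest silence between two consecutive heartbeats from this host.
        longest = max((b - a for a, b in zip(stamps, stamps[1:])), default=0.0)
        if longest > window_s:
            flagged[source_ip] = longest
    return flagged


# Made-up sample: one host every 10 seconds, one host every 2 minutes.
sample = [
    (0.0, "10.0.0.11"), (10.0, "10.0.0.11"), (20.0, "10.0.0.11"),
    (0.0, "10.0.0.12"), (120.0, "10.0.0.12"),
]
print(find_heartbeat_gaps(sample))
```

A host that only shows up once every 2 minutes, as in the capture described above, would be flagged by this check even though nothing is being dropped on the network.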

coopersmith77 · ‎01-10-2020

HappeeDays - I agree with your assessment. I'm having the exact same issue: Phantom Skyline Health Warning? Please let me know if your network guy spots something that might help me with my environment.

coopersmith77 · ‎01-10-2020

bmstewart I have almost the exact same environment (all 3 hosts are on the same network) and am having the exact same issue. See my discussion post: Phantom Skyline Health Warning? If you happen to find a solution, please share it with me.

coopersmith77 · ‎01-10-2020

goranmw1 - Did you figure out which log to look at? I'm having the exact same issue: Phantom Skyline Health Warning? I'd really like to know how to resolve this warning.

HappeeDays · ‎01-10-2020

coopersmith77 Sorry, no cigar unfortunately. It is an intermittent issue, and at the moment it is not showing up in my environment. As there's no actual problem, I'm ignoring it (don't think I've said that before!).

bmstewart · ‎01-10-2020

coopersmith77

I ended up following the steps at VMware Knowledge Base to set the heartbeat timeout to 2 minutes (a value of 120) instead of the default 60.

My only guess is that the hosts aren't actually sending heartbeats as frequently or steadily as they should (once every 10 seconds), or that the default timer before throwing an alert isn't actually 1 minute as it should be.
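A rough way to sanity-check that guess is to work out how many heartbeats should land inside the timeout window for a given send interval. The sketch below uses only numbers quoted in this thread: the nominal 10-second interval and 60-second window from the health check text, the 120-second timeout from the KB change, and the roughly one packet every 2 minutes seen at the firewall. The function name is hypothetical, and how vCenter actually evaluates the window internally is not something this thread confirms.

```python
def heartbeats_per_window(timeout_s, interval_s):
    """Expected number of heartbeats arriving within one timeout window."""
    return timeout_s / interval_s


cases = [
    ("nominal interval, default timeout", 60, 10),
    ("nominal interval, raised timeout", 120, 10),
    ("observed interval, default timeout", 60, 120),
    ("observed interval, raised timeout", 120, 120),
]

for label, timeout_s, interval_s in cases:
    count = heartbeats_per_window(timeout_s, interval_s)
    print(f"{label}: {count:g} heartbeat(s) per window")
```

If a host really does send only once every 2 minutes, fewer than one heartbeat fits in a 60-second window, which would be consistent with the warning appearing at the default timeout and going away at 120.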

After making this change I've received no more of these alerts. All I have now are the following:

An occasional warning within VAMI about memory usage being high (we chose "tiny", have about 30 VMs and 3 hosts, and VAMI sometimes shows up to 83% usage of its 10 GB, even though the host shows the VCSA only taking up 2.5 GB). I could add more memory (from 10 to 12 GB, I guess) and maybe specifically allocate more for the vSphere UI service. (Re: vSphere UI Health Alarm)
An occasional stateless sensor alert for memory (gray/unknown, sensor type -1, etc.). Possibly related to VMware Knowledge Base? I've applied that patch (and did the post-patch work on one host), though, and it didn't help.
The host log having an entry complaining about the scratch partition size (VMware Knowledge Base). This one shows up immediately after installing 6.7. Did VMware just not update the default scratch partition size for whatever new requirements 6.7 has? Is this going to be something I have to live with unless I redo a host?

I mention all of these here simply because they're behaviors I see on a new, clean install of 6.7. Maybe I should have gone with 6.5.

coopersmith77 · ‎01-13-2020

HappeeDays Okay. Thanks for your reply.

coopersmith77 · ‎01-13-2020

bmstewart Thank you for your reply. I'll keep your alerts in mind as I monitor my environment. Yeah, as far as trying to keep our environments current, I guess that's what we get for living on the edge.

BB9193 · ‎12-18-2020

I am also having this issue. We're running a brand new vSAN cluster at 6.7 U3, and after I recently restarted the cluster we started seeing this heartbeat warning. I'm also seeing the warning about vCenter running low on memory, but we're using the "tiny" recommendation of 2 cores and 10 GB of RAM and we only have about 40 VMs. I'm not seeing any congestion and we're not even using the cluster for production yet.

Is this just a reporting anomaly and is the ultimate fix just to increase the timeout?

Mick64 · ‎05-06-2021

This issue can occur when the vSphere appliance is connected to a different switch than the management network.

Ours was doing this because management was on one virtual switch and LAN was on a second virtual switch.

Changing the appliance NIC to the switch with the management network fixed the issue.