503 Service Unavailable errors in ESXi Host client after 7.0 U3

DRAGONKZ · ‎11-01-2021

Is anyone else getting "503 Service Unavailable (Failed to connect to endpoint: [xxxxxxxxxxxxx] _serverNamespace = / action = Allow _port = 8309)" errors on their ESXi hosts after upgrading to vSphere 7.0 U3 or U3a?

I've got multiple ESXi hosts all having the same problem, and even my newly built vSAN witness appliance is having the same problem after upgrading to 7.0 U3/U3a (It was deployed as 7.0 U2).

The hosts appear as disconnected in vCenter and the web client on each host displays the error above during the time the servers have this issue.

There appears to be no impact to VM networks or other functionality during this time.

Every 12-48 hours all of my hosts hit this problem. The only fix is to reboot them, after which they work normally for a while before going back to this state.

There have been no other changes to my environment in the last few months besides going to U3/U3a, so I suspect the upgrade is causing the issue.

Anyone else in the same situation?

Thanks

Lalegre · ‎11-01-2021

Hey @DRAGONKZ,

Around the community I've seen people hitting bugs after upgrading to 7.0 U3, which may or may not be your case. However, most of them were solved by updating to U3a.

Is your hardware listed on the HCL for 7.0 U3?
What do the hostd / vpxa logs say?
Have you updated your vCenter server first?
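
For anyone gathering those logs, here is a minimal sketch of one way to pull the recent problem lines over SSH on the host. The function name `recent_problems` is made up for illustration, and filtering on "error" and "warning" is only an assumption about what is worth reading first. The paths are the usual hostd and vpxa log locations on ESXi.

```shell
#!/bin/sh
# Hypothetical helper: print the last 50 lines of a log file that mention
# "error" or "warning" (case-insensitive). The keyword list is an assumption.
recent_problems() {
    grep -iE 'error|warning' "$1" | tail -n 50
}

# Usual ESXi log locations for the host agent and the vCenter agent.
# Files that are missing or unreadable are skipped.
for f in /var/log/hostd.log /var/log/vpxa.log; do
    if [ -r "$f" ]; then
        echo "== $f =="
        recent_problems "$f"
    fi
done
```

It is probably most useful to run this while the host is in the 503 state, before any reboot.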

DRAGONKZ · ‎11-02-2021

Hi Lalegre,

Correct, my hardware is Cisco B200M4 and C220M5, and all components are on the HCL 🙂

vCenter is on 7.0 U3a; the C220M5 and the vSAN witness appliance are also on 7.0 U3a.

The B200M4 servers are still on 7.0 U3.

Next time one crashes I'll gather those logs and have a look.

Thanks

alantz · ‎11-02-2021

I've read that NTP can be an issue, so time drift could be what's causing your problems. I'm a little leery of upgrading my UCS blades to U3 at the moment. I upgraded vCenter, but not my ESXi hosts; they are still on U2.

--Alan--
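
If time drift is the suspect, a rough way to check is to compare the host's clock with a machine you trust. The sketch below only does the arithmetic on two epoch timestamps. The function name `clock_drift` and the host name `esxi01` in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference in seconds between two epoch times.
# $1 = host time, $2 = reference time (both from `date -u +%s`).
clock_drift() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# Worked example with fixed numbers:
clock_drift 1635811200 1635811245   # prints 45

# Against a real host (esxi01 is a placeholder name), something like:
#   clock_drift "$(ssh root@esxi01 date -u +%s)" "$(date -u +%s)"
# On the host itself, `esxcli system time get` shows the current system time.
```

The SSH round trip adds a second or so of noise, so this can only show drift of several seconds or more.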

kbasshambeta · ‎01-21-2022

I tried many solutions people have posted here and it seems a combination of them worked for me.

I restarted mgmt agents, ran "services.sh restart" a few times, checked the date on the box.

I changed the hostname of the machine (simply added a -). Restarted mgmt; that alone didn't seem to do anything. After that I re-ran "services.sh restart" and the machine was instantly back talking to vCenter. So for me it was a combo of the hostname change (I changed it back, btw) and then running services.sh restart again.
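
For reference, the restart steps above might look roughly like this from an SSH session on the host. This is a sketch, not a verified fix. It assumes "restarted mgmt agents" means restarting hostd and vpxa through their init scripts, and the `run` and `restart_mgmt_agents` names are made up. It prints the commands by default (DRY_RUN=1) and leaves out the hostname change.

```shell
#!/bin/sh
# Dry run by default: print the commands instead of executing them.
# Set DRY_RUN=0 on the ESXi host to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Assumed order: check the clock, restart the two agents, then everything.
restart_mgmt_agents() {
    run date                        # check the date on the box
    run /etc/init.d/hostd restart   # host management agent
    run /etc/init.d/vpxa restart    # vCenter agent
    run services.sh restart         # restart all management services
}

restart_mgmt_agents
```

Restarting these services drops the management connection briefly, so run it from the console or be ready to reconnect over SSH.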

Kinnison · ‎01-21-2022

Comment deleted...
