VMware Cloud Community
kmcd03
Contributor

High CPU utilization on witness ESXi appliance caused connectivity error

Our vSAN 6.6.1 stretched cluster (12+1) reported an error that it had lost connectivity to the witness ESXi host.

We're using the virtual appliance for the witness at a third site. I confirmed I could ping between the hosts and the witness using the vmk interface used by vSAN.

At the same time I noticed the CPU utilization on the witness VM was at 100%. I can see in vROps that the CPU utilization of the witness VM jumped from ~8% to ~65% and stayed there for eight hours. It then jumped to 100%.
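For anyone who wants to catch this kind of jump before the cluster reports a disconnect, here is a minimal sketch in Python. It assumes the witness VM's CPU utilization has already been exported as a plain list of percentage samples (for example from vROps). The function name, the threshold factor, and the sample values are illustrative assumptions, not part of any vROps or vSAN API.

```python
from typing import List, Optional, Tuple


def find_sustained_jump(
    samples: List[float],
    baseline_pct: float,
    factor: float = 2.0,
    min_samples: int = 3,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the first run of at least
    min_samples consecutive samples above baseline_pct * factor,
    or None if no such run exists."""
    threshold = baseline_pct * factor
    run_start: Optional[int] = None
    for i, value in enumerate(samples):
        if value > threshold:
            # Start a new run if we are not already inside one.
            if run_start is None:
                run_start = i
        else:
            # The run ended; report it if it lasted long enough.
            if run_start is not None and i - run_start >= min_samples:
                return (run_start, i - 1)
            run_start = None
    # Handle a run that is still in progress at the end of the data.
    if run_start is not None and len(samples) - run_start >= min_samples:
        return (run_start, len(samples) - 1)
    return None


# Hypothetical samples shaped like the pattern described above:
# a low baseline, a sustained plateau, then saturation.
cpu = [8, 9, 8, 65, 66, 64, 65, 100, 100]
print(find_sustained_jump(cpu, baseline_pct=8.0))
```

Requiring several consecutive samples above the threshold keeps a single short spike from raising an alert, while a plateau like the eight-hour one would be flagged well before the CPU saturates.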

I was going to reboot the witness ESXi host, but first I ran a task to generate a support bundle. Generating the support bundle caused the VM's CPU to drop back to normal, and the host disconnect errors on the cluster are gone. All health checks are green.

I'll open a ticket with GSS, but has anyone else seen this?

Thanks!

bmrkmr
Enthusiast

This will not really help you, but I just saw a similar thing.

The CPU usage of one of our six witness appliances suddenly increased from a steady 40% to over 80% for no apparent reason (nobody made changes on Saturday).

It stayed at that 80% level until Tuesday, when the only suspicious event occurred (/bin/hostd crashed). Only some time later did the vSAN cluster complain about the connection to the appliance.

The appliance was rebooted and has been running steadily at 20% usage since Tuesday.
