VMware Cloud Community
NickMurphyGM
Contributor

Virtual SAN health alarm 'All hosts contributing stats'

Just upgraded to vSAN 6.2 from 6.0 and it is now showing the Virtual SAN health alarm 'All hosts contributing stats'.  There are 8 nodes in the cluster, and 7 are listed in the "Hosts Not Contributing Stats" window.  All other health check items pass, including the proactive tests.  There are currently no VMs in the cluster.  The KB that the Ask VMware button displays is just about useless: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21444...

Is this something that will change over time, maybe once there are VMs consuming resources?

9 Replies
Bleeder
Hot Shot

Same problem here.  Did you also have to manually create the stats object (vsan.perf.stats_object_create)?

NickMurphyGM
Contributor

I did not have to manually create the stats object; after I turned on the performance service, it was created automagically.

kingdonb
Contributor

I have this issue too. In a cluster with 7 nodes, four are reported as not contributing stats under the warning "All hosts contributing stats".

There are other health issues because I have strayed a bit from the HCL, but none of them are unexplained like this one.  I am afraid that my performance statistics for the SAN are incorrect, because some of the hosts are evidently not contributing stats.

"There are no known causes for this health check to fail. If it fails consistently, contact VMware Support and file a Support Request."

That's fine, but I believe my support agreement does not entitle me to debugging assistance or answers for this.  The check has been failing consistently for some time.  I've been expanding my cluster with heterogeneous nodes and at one point built it up to a maximum of 8 nodes.  The 8th node was removed, and its disk group was destroyed, because it was hilariously under-powered by comparison and I suspected keeping it in the cluster was actually dragging down overall performance.  This "hosts contributing stats" failure seems to strike without regard for whether a host is over- or under-powered: the list includes some of my largest nodes as well as the smallest one.

The vSAN seems to work fine, going on over a month now.  I haven't been able to determine what those four of seven hosts have in common; as far as I can tell, nothing.  During peak throughput I do suspect that my cluster's performance is being under-reported because some hosts are not contributing stats.  I would like to find the answers for this though!

A13x
Hot Shot

Same issue here, and there is no known fix for this. I am not sure whether it's worth just disabling the performance service, since most hosts are not contributing to the stats. Rebooting the hosts or vCenter does not seem to resolve this. I am baffled as to what it might be. I am very tempted to disable and re-enable the service, but then I will lose what little stats I have.

elu88
Contributor

Here is how I fixed this problem in my vSAN environment - Eddie's Blog: vSAN Performance Service “Hosts Not Contributing Stats” Fix

Hope it works for others too.

GreatWhiteTec
VMware Employee

If the hosts are not all on the same network partition, they are not checked. I would start by looking for potential network partitioning issues.
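
One rough way to look for a partition from the ESXi shell is to compare the member count a host reports with the number of hosts you expect in the cluster. This is only a sketch: it assumes the output of `esxcli vsan cluster get` contains a `Sub-Cluster Member Count:` line, and the function names `member_count` and `check_members` are made up for illustration, not part of any VMware tooling.

```shell
# Hypothetical helpers (not part of any VMware tooling).
# Reads `esxcli vsan cluster get` output on stdin and prints the member count.
# Assumes the output contains a line such as "Sub-Cluster Member Count: 8".
member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { gsub(/[ \t\r]/, "", $2); print $2; exit }'
}

# Compares the reported member count with the expected number of hosts.
# Usage: esxcli vsan cluster get | check_members 8
check_members() {
    expected="$1"
    actual="$(member_count)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual of $expected hosts in this partition"
    else
        echo "WARNING: host sees ${actual:-0} of $expected hosts; possible partition"
        return 1
    fi
}
```

Run it on each node in turn; a host that reports fewer members than the rest is a reasonable place to start looking.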

admin
Immortal

Greetings!

Please disable the vSAN Performance Service and then re-enable it. This should re-trigger all the hosts to contribute to the vSAN performance stats.

If the problem is not solved, please try restarting the vSAN services on all the vSAN nodes. You can restart all ESXi services using the command below:

# pwd
/
# services.sh restart

This restarts all the services on the ESXi host where it is run (including the vSAN services), so it has to be run on each node. Let me know how it goes.


Cheers!

-Shivam

aaronwsmith
Enthusiast

I just resolved this issue in our environment.  The ESXi hosts had external-CA signed certificates installed in /etc/vmware/ssl/rui.crt, but the /etc/vmware/ssl/castore.pem file was empty (the hosts had been upgraded from ESXi 5.5 to 6.x).  That file needed the base-64/PEM encoded certificate of the CA added to it (in our case the intermediate CA certificate as well).  You can chain them together in the same file.  Once I did this on all ESXi hosts in the cluster, the stats started collecting.

The single ESXi host that is NOT listed as having issues contributing stats is the designated stats master for the cluster.  The RVC command vsan.perf.cluster_info can show you which hosts are the stats and CMMDS masters in the cluster, and whether any issues exist with stats collection.  In this case, it did not identify any issues, but stats couldn't be shared with the master because the SSL certificates from each host couldn't be verified without the CA certs residing in the above-mentioned castore.pem file.
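
For anyone who wants to script the castore.pem step, here is a minimal sketch of one way to chain the CA certificates into a single file. The paths /etc/vmware/ssl/castore.pem and /etc/vmware/ssl/rui.crt are the ones described above; the function name `build_castore` and the CA file names under /tmp are placeholders invented for this example. Nothing here says whether any service needs a restart afterwards.

```shell
# Hypothetical helper: rebuild a CA store from one or more PEM files.
# Usage: build_castore <castore_path> <ca_pem> [<ca_pem> ...]
build_castore() {
    store="$1"
    shift
    # Keep a copy of the current store if it exists, even when it is empty.
    if [ -e "$store" ]; then
        cp "$store" "$store.bak"
    fi
    : > "$store"
    for pem in "$@"; do
        # Skip anything that does not look like a PEM certificate.
        if ! grep -q -e '-----BEGIN CERTIFICATE-----' "$pem"; then
            echo "skipping $pem: no PEM certificate found" >&2
            continue
        fi
        # awk 1 re-emits every line with a trailing newline, so certificates
        # never run together when a file lacks a final newline.
        awk 1 "$pem" >> "$store"
    done
}

# Example on an ESXi host (the file names under /tmp are placeholders):
#   build_castore /etc/vmware/ssl/castore.pem /tmp/intermediate-ca.pem /tmp/root-ca.pem
#   openssl verify -CAfile /etc/vmware/ssl/castore.pem /etc/vmware/ssl/rui.crt
```

The `openssl verify` line in the comment is a sanity check that the host certificate validates against the new store before you move on to the next host.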

I'll publish a blog with details as soon as I can.  Meanwhile hope this quick-summary fix helps everyone here.

mthiha207au
Enthusiast

I had the same issue. I have 3 nodes, and 2 of them were not contributing. I followed "Eddie's blog".

In my case it was caused by certificate issues. I renewed the certificates on the 2 hosts (and rebooted them, which may not be necessary) and everything is back to normal.
