8 Replies Latest reply on Jan 21, 2019 9:37 PM by Beingnsxpaddy

    "All hosts contribution stats" warning

    Wolfman017 Lurker

      Hi,

       

      I have a stretched cluster of 2 ESXi 6.5u2 (SERVER1 and SERVER2) + 1 Witness (WITNESS1).

      When I check the vSAN health screen, everything is OK except one check, which is in Warning: "All hosts contributing stats". The warning says that host SERVER2 is not contributing stats.

       

      I followed this tutorial to determine which server was the stats master: vSAN hosts not contributing stats reports - vSAN Health check fails - vhabit

      My stats master is SERVER1.

       

      If I check the vsanmgmt.log file, I see these warnings:

      2019-01-09T14:21:49Z VSANMGMTSVC: WARNING vsanperfsvc[Collector-1] [statscollector::RetrieveRemoteStats] Error happened during RetrieveRemoteStats of host IP_of_SERVER2, type: <class 'socket.timeout'>, message: timed out

      2019-01-09T14:21:50Z VSANMGMTSVC: WARNING vsanperfsvc[Collector-1] [statscollector::RetrieveRemoteStats] Error happened during RetrieveRemoteStats of host IP_of_WITNESS1, type: <class 'socket.timeout'>, message: timed out
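When the log is long, it can help to tally which hosts fail and with which exception type. The following is a minimal sketch, assuming the warning lines keep exactly the format quoted in this thread; the function name `failed_hosts` is hypothetical and not part of any VMware tooling.

```python
import re
from collections import Counter

# Matches the RetrieveRemoteStats warnings in the format quoted in this thread.
# Assumption: other vSAN builds may word these lines differently.
WARNING_RE = re.compile(
    r"^(?P<ts>\S+) VSANMGMTSVC: WARNING vsanperfsvc\[(?P<collector>[^\]]+)\] "
    r"\[statscollector::RetrieveRemoteStats\] Error happened during "
    r"RetrieveRemoteStats of host (?P<host>[^,\s]+), "
    r"type: <class '(?P<etype>[^']+)'>, message: (?P<msg>.*)$"
)


def failed_hosts(lines):
    """Count RetrieveRemoteStats failures per (host, exception type)."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            counts[(match.group("host"), match.group("etype"))] += 1
    return counts
```

Passing an open log file to `failed_hosts` returns a counter keyed by host and exception type, which makes it easier to see whether every remote host times out or only some of them.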

       

       

      Communications between servers work well. I checked it with vmkping. The rest of the vSAN works perfectly.

       

      Any ideas?

        • 1. Re: "All hosts contribution stats" warning
          TheBobkin Master
          vExpert, VMware Employees

          Hello Wolfman017,

           

           

          First step would be to restart vsanmgmtd and confirm that this is functional:

          # ps | grep vsan

          # /etc/init.d/vsanmgmtd stop

          Check that the processes are gone (in 6.7 there will be other 'vsan' named processes but don't mind these):

          # ps | grep vsan

          # /etc/init.d/vsanmgmtd start

          When you test with ping, are you using vmkping and specifying which interface to go over (-I vmkX)?

           

           

          Bob

          • 2. Re: "All hosts contribution stats" warning
            Wolfman017 Lurker

            Hi,

             

            I already stopped and restarted the service while enabling debug mode. I even rebooted both nodes of the cluster, and the behaviour is the same.

             

            When running vmkping, I specify the vSAN VMkernel port. I did not mention it earlier, but the IP addresses I refer to (in the vmkping tests and in the log) are the vSAN IPs, not the management IPs.

            • 3. Re: "All hosts contribution stats" warning
              Darking Novice

              Aloha!

               

              I would look into the log with the debug settings enabled, in case you haven't already.

               

              Here are the settings you need to apply on the MASTER node:

               

              VMware Knowledge Base

               

               

              I have a very similar case open with support at the moment for our 8-node stretched cluster, and my debug events look like this:

               

              2019-01-10T12:13:41Z VSANMGMTSVC: DEBUG vsanperfsvc[Collector-Main] [statscollector::_FetchAndCalculateStats] waiting for stats readiness

              2019-01-10T12:13:44Z VSANMGMTSVC: WARNING vsanperfsvc[Collector-2] [statscollector::RetrieveRemoteStats] Error happened during RetrieveRemoteStats of host 172.29.242.13, type: <class 'OSError'>, message: [Errno 113] No route to host

               

              For some very odd reason, it is trying to use the interface that carries dedicated witness traffic, which only allows traffic to the witness site and not between sites.

               

              I would have assumed it would use either the VMware management network or the vSAN network, but no: mine is trying the witness traffic network.

               

              When I hear back from support, I'll post it here.

              • 4. Re: "All hosts contribution stats" warning
                Wolfman017 Lurker

                The servers communicate on the vSAN network. What do you call the "witness network"? The "witnessPg" port group?

                         

                 

                Here is the log with debug enabled:

                 

                2019-01-10T13:54:17Z VSANMGMTSVC: WARNING vsanperfsvc[Collector-1] [statscollector::RetrieveRemoteStats] Error happened during RetrieveRemoteStats of host VSAN_IP_of_SERVER2, type: <class 'socket.timeout'>, message: timed out

                2019-01-10T13:54:17Z VSANMGMTSVC: DEBUG vsanperfsvc[Collector-1] [statscollector::RetrieveRemoteStats] Traceback (most recent call last):
                  File "/build/mts/release/bora-10642691/bora/build/vsan/release/vsanhealth/usr/lib/vmware/vsan/perfsvc/statscollector.py", line 676, in RetrieveRemoteStats
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 557, in <lambda>
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 363, in _InvokeMethod
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/SoapAdapter.py", line 1303, in InvokeMethod
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/SoapAdapter.py", line 1369, in GetConne

                2019-01-10T13:54:17Z VSANMGMTSVC: WARNING vsanperfsvc[Collector-2] [statscollector::RetrieveRemoteStats] Error happened during RetrieveRemoteStats of host VSAN_IP_of_WITNESS, type: <class 'socket.timeout'>, message: timed out

                2019-01-10T13:54:17Z VSANMGMTSVC: DEBUG vsanperfsvc[Collector-2] [statscollector::RetrieveRemoteStats] Traceback (most recent call last):
                  File "/build/mts/release/bora-10642691/bora/build/vsan/release/vsanhealth/usr/lib/vmware/vsan/perfsvc/statscollector.py", line 676, in RetrieveRemoteStats
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 557, in <lambda>
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 363, in _InvokeMethod
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/SoapAdapter.py", line 1303, in InvokeMethod
                  File "/build/mts/release/bora-10719125/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/SoapAdapter.py", line 1369, in GetConne

                • 5. Re: "All hosts contribution stats" warning
                  Darking Novice

                  I'm referring to the witness traffic LAN.

                   

                  It's a new function in 6.7 U1, which I'm running. I forgot you are not on this release, sorry about that.

                   

                  In 6.5 the vSAN network is used for both witness and vSAN traffic.

                   

                   

                  Have you tried a vmkping?

                   

                  vmkping -I <interface of VSAN> destination-of-other-host

                   

                  Otherwise, I would check whether a firewall is blocking the traffic (either a physical firewall or the firewall on the ESXi hosts).
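A successful vmkping only shows that ICMP gets through, whereas the logged `socket.timeout` is raised while opening a TCP connection, so a firewall could pass one and drop the other. The following is a minimal sketch of a TCP-level check; the function name `tcp_reachable` is hypothetical, and the port to test is an assumption because the thread does not say which port the stats collector connects to.

```python
import socket


def tcp_reachable(host, port, source_ip=None, timeout=5.0):
    """Try to open a TCP connection; return (True, None) or (False, reason).

    source_ip optionally binds the local end to a specific address, for
    example the vSAN VMkernel IP, so the test leaves from that address.
    """
    source = (source_ip, 0) if source_ip else None
    try:
        conn = socket.create_connection(
            (host, port), timeout=timeout, source_address=source
        )
    except socket.timeout:
        # Same exception type as in the vsanmgmt.log warnings.
        return False, "timed out"
    except OSError as exc:
        # Covers refused connections and "No route to host".
        return False, str(exc)
    conn.close()
    return True, None
```

Calling it with the remote host's vSAN IP, the port under suspicion, and the local vSAN IP as `source_ip` would show whether the TCP connection itself times out even though vmkping succeeds.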

                  • 6. Re: "All hosts contribution stats" warning
                    Darking Novice

                    Hi Wolfman017,

                     

                    Any update on your issue?

                     

                    I opened a case with GSS a week ago, and unfortunately I have not yet received an analysis or a resolution.

                     

                    They tell me they will report back tomorrow, but the level 2 tech and his senior were stumped :/

                    • 7. Re: "All hosts contribution stats" warning
                      Wolfman017 Lurker

                      Hi,

                       

                      No, no news. I sent the issue to our TAM, and it is not a known issue.

                       

                      We have another vSAN Cluster on the same vCenter, and I have the same issue.

                      I just built another vSAN Cluster (exactly the same as the one with the issue) on another vCenter, and I don't have the issue there.

                      • 8. Re: "All hosts contribution stats" warning
                        Beingnsxpaddy Enthusiast

                        Did you try restarting the management services on the ESXi host? Can you also add a new host to the same cluster to see how it behaves?