5 Replies Latest reply on Mar 6, 2018 2:53 PM by wbabineaux

    Old Data Receiving

    wbabineaux Novice

      Hi all,

       

      I have a Large vROps 6.6.1 cluster. We have 5 Master/Data nodes and 4 Remote Collectors. There are 4 vCenter adapters and 1 vSAN adapter.

      I switched from the MPSD to the vSAN adapter this past Monday. I have removed the old MPSD adapter and have confirmed the objects it monitored have been removed as well.

       

      Since switching over to the vSAN adapter, I have noticed that nearly all of the objects in that adapter instance have a collection status of "Old Data Receiving". Some objects are collecting; others have old data. As I understand it, "Old Data Receiving" means the data is not current and is behind by 5 polling cycles.
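To make the "behind by 5 polling cycles" rule concrete, here is a minimal Python sketch of how such a staleness check could be computed. It is not vROps code: the 5-minute collection interval and the names `COLLECTION_INTERVAL`, `MISSED_CYCLES_THRESHOLD`, `missed_cycles`, and `is_old_data` are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# Assumptions for illustration: a 5-minute collection interval and the
# 5-cycle threshold mentioned in the post. Neither is read from vROps.
COLLECTION_INTERVAL = timedelta(minutes=5)
MISSED_CYCLES_THRESHOLD = 5


def missed_cycles(last_sample: datetime, now: datetime,
                  interval: timedelta = COLLECTION_INTERVAL) -> int:
    """Return how many whole collection cycles have elapsed since last_sample."""
    elapsed = now - last_sample
    if elapsed < timedelta(0):
        # A sample timestamped in the future is treated as current.
        return 0
    return elapsed // interval


def is_old_data(last_sample: datetime, now: datetime,
                interval: timedelta = COLLECTION_INTERVAL,
                threshold: int = MISSED_CYCLES_THRESHOLD) -> bool:
    """True when the newest sample is at least `threshold` cycles behind."""
    return missed_cycles(last_sample, now, interval) >= threshold


if __name__ == "__main__":
    now = datetime(2018, 3, 6, 14, 53)
    for minutes_behind in (10, 26):
        last = now - timedelta(minutes=minutes_behind)
        print(minutes_behind, "min behind ->",
              missed_cycles(last, now), "cycles, old data:",
              is_old_data(last, now))
```

Under these assumptions, an object whose newest sample is 26 minutes old would be flagged, while one that is 10 minutes behind would not.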

       

       

      My questions:

      - Does this status simply mean there is no new data?

      - Should I be concerned that alerts will not trigger because current data is not being received?

       

       

      Thanks for any and all help!

        • 1. Re: Old Data Receiving
          jasnyder Expert
          vExpert

          Dumb question, but is the vSAN performance service turned on for the vSAN clusters?

           

          From the 6.5 docs:

           

          When you create a vSAN cluster, the performance service is disabled. Turn on vSAN performance service to monitor the performance of vSAN clusters, hosts, disks, and VMs.

           

          About this task

           

          When you turn on the performance service, vSAN places a Stats database object in the datastore to collect statistical data. The Stats database is a namespace object in the cluster's vSAN datastore.

           

          Prerequisites

           

          • All hosts in the vSAN cluster must be running ESXi 6.5 or later.
          • Before you enable the vSAN performance service, make sure that the cluster is properly configured and has no unresolved health problems.

           

          Procedure

           

          1. Navigate to the vSAN cluster in the vSphere Web Client navigator.
          2. Click the Configure tab.
          3. Under vSAN, select Health and Performance.
          4. Click Edit to edit the performance service settings.
          5. Select the Turn On vSAN performance service check box.
          6. Select a storage policy for the Stats database object and click OK.
          • 2. Re: Old Data Receiving
            wbabineaux Novice

            Sorry for the late response; I wanted to get some information from VMware regarding this.

             

            I checked that the health/performance service was on, and it is. I have spoken to VMware about this, and it seems a number of customers are having the same issue. VMware has stated it is a bug in the code: the metrics are not being collected in a timely manner, and the collection is timing out. This is why the metrics are not up to date.

             

            I have installed the hotfix they provided for the vSAN adapter, but the issue persists. Most, if not all, customers are still seeing it: some metrics are coming in correctly, but most are still behind in collection.

            Engineering is looking into this further, and another hotfix is supposed to be released; there is no date for it yet.

             

            I will update as I get back more information.

            • 3. Re: Old Data Receiving
              vMan Enthusiast
              vExpert

              I have the same issue at a large customer of mine. The new hotfix has been applied there as well, and the issue persists.

               

              Back to engineering to tweak the call to the vSAN API

              • 4. Re: Old Data Receiving
                wbabineaux Novice

                I wanted to come back and revisit this, as we might have it resolved for now. I installed the vSAN hotfix (2.0.0.7192536) for our Large environments. Actually, all of our vROps clusters are Large environments, but we do have some vSAN adapters with 200-900 objects.

                 

                Along with the hotfix above, I also put in some workarounds that were suggested by VMware engineering. We made changes to the file below:

                 

                File:  /usr/lib/vmware-vcops/user/plugins/inbound/VirtualAndPhysicalSANAdapter3/conf/config.properties

                 

                 

                Original File:

                 

                # Frequency the resource collection should take place

                RESOURCE_DISCOVERY_FREQUENCY = 5

                 

                # vCenter resources cache update frequency

                VCENTER_RESOURCE_DISCOVERY_FREQUENCY = 5

                 

                # Pool sizes for discovery & collection

                DISCOVERY_POOL_SIZE = 3

                COLLECTION_POOL_SIZE = 5

                 

                # VIM Client read timeout (ms)

                VIMCLIENT_READ_TIMEOUT = 120000

                 

                 

                 

                 

                Edited File:

                # Frequency the resource collection should take place

                RESOURCE_DISCOVERY_FREQUENCY = 5

                 

                # vCenter resources cache update frequency

                VCENTER_RESOURCE_DISCOVERY_FREQUENCY = 5

                 

                # Pool sizes for discovery & collection

                DISCOVERY_POOL_SIZE = 3

                COLLECTION_POOL_SIZE = 5

                 

                # VIM Client read timeout (ms)

                # VIMCLIENT_READ_TIMEOUT = 120000

                 

                 

                # VMWARE WORKAROUNDS TO VSAN BUG

                CIM_SERVICE_PROTOCOL = none

                VIMCLIENT_READ_TIMEOUT = 900000

                 

                 

                 

                I now have this installed in 2 clusters that were seeing the "Old Data Receiving" issue. I can now say that we seem to be receiving metrics, although some of the vSAN metrics seem to be about 20 minutes behind, which I would expect since we increased VIMCLIENT_READ_TIMEOUT. I will wait for the weekend to pull a full log bundle to confirm the clusters are no longer seeing the errors pertaining to the vSAN bug.
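For anyone wanting to double-check which settings actually differ between the original and edited config.properties, the comparison can be sketched in Python as below. This is only an illustration: the helper names `parse_properties`, `changed_settings`, and `ms_to_minutes` are hypothetical, the parser handles only simple `KEY = value` lines with `#` comments, and how the adapter itself reads the file is not something this sketch claims to reproduce.

```python
# Settings copied from the "Original File" and "Edited File" listings above.
ORIGINAL = """
# Frequency the resource collection should take place
RESOURCE_DISCOVERY_FREQUENCY = 5
# vCenter resources cache update frequency
VCENTER_RESOURCE_DISCOVERY_FREQUENCY = 5
# Pool sizes for discovery & collection
DISCOVERY_POOL_SIZE = 3
COLLECTION_POOL_SIZE = 5
# VIM Client read timeout (ms)
VIMCLIENT_READ_TIMEOUT = 120000
"""

EDITED = """
RESOURCE_DISCOVERY_FREQUENCY = 5
VCENTER_RESOURCE_DISCOVERY_FREQUENCY = 5
DISCOVERY_POOL_SIZE = 3
COLLECTION_POOL_SIZE = 5
# VIMCLIENT_READ_TIMEOUT = 120000
# VMWARE WORKAROUNDS TO VSAN BUG
CIM_SERVICE_PROTOCOL = none
VIMCLIENT_READ_TIMEOUT = 900000
"""


def parse_properties(text: str) -> dict:
    """Parse simple 'KEY = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY = value line
        settings[key.strip()] = value.strip()
    return settings


def changed_settings(original: dict, edited: dict) -> dict:
    """Map each key whose value differs to an (old, new) tuple; None means absent."""
    keys = sorted(set(original) | set(edited))
    return {k: (original.get(k), edited.get(k))
            for k in keys if original.get(k) != edited.get(k)}


def ms_to_minutes(value: str) -> float:
    """Convert a millisecond setting to minutes."""
    return int(value) / 60000


if __name__ == "__main__":
    diff = changed_settings(parse_properties(ORIGINAL), parse_properties(EDITED))
    for key, (old, new) in diff.items():
        print(f"{key}: {old} -> {new}")
    print("Read timeout in minutes:", ms_to_minutes("900000"))
```

Only two settings differ: CIM_SERVICE_PROTOCOL is newly set to none, and VIMCLIENT_READ_TIMEOUT goes from 120000 ms (2 minutes) to 900000 ms (15 minutes).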

                 

                 

                I hope everyone else who was affected has this resolved now. I am looking forward to the new vROps release with fixes.

                • 5. Re: Old Data Receiving
                  wbabineaux Novice

                  Just a note:

                   

                  The above workarounds actually disable the SMART metrics for the vSAN adapters. I have checked here: Metrics for vSAN Cache Disk

                  It looks like disabling the SMART metrics might not be in our best interest, so we might need to revisit this.

                   

                  Has anyone else had any other issues after applying the hotfix or any of the workarounds?

                   

                   

                   

                  Thanks!