3 Replies Latest reply on Feb 1, 2018 5:47 AM by beefy147

    vSphere Replication Add-On Server flooding network

    burchell99 Novice

      We have deployed vSphere Replication 6.1.1 and it has been working with approximately 60 replications for 6 months without any issues.

       

      Last week we deployed additional add-on servers in a remote branch (managed by the same vCenter), and now all hosts within the vCenter globally are sending hundreds of MB per hour to the add-on appliances. This is causing network bandwidth issues due to the volume of traffic from the 50 hosts globally.

       

      I believe an initial mapping exercise takes place for the appliance to understand which hosts have access to which datastores, but it's sending 3 GB per day from each host with no sign of ending!

       

      Has anyone experienced this before? I have a P2 support ticket open but am reaching out to the community to see if this is expected behavior, a bug, or something else entirely.

       

      Any advice greatly appreciated.

        • 1. Re: vSphere Replication Add-On Server flooding network
          vbrowncoat Expert
          vExpertVMware Employees

          Why did you add additional VR Server(s) to your remote site? If you configured replication of VMs to or from the remote site did you select the VR server at the destination when you configured them?

           

          The traffic flow for VR is this: Source VM > Source Host (the host the VM is running on) > Destination VR appliance or server > Destination Host > Destination Storage

           

          What this means is that to replicate to a remote site you want the VR server deployed at the remote site and replication configured to use it.

           

          Please provide some additional details about your topology and configuration so we can get a better idea of your traffic flows.

          • 2. Re: vSphere Replication Add-On Server flooding network
            burchell99 Novice

            Thanks for the reply.

             

            Additional VR servers were added to remote sites because the replication is happening in that country, which is thousands of miles away from the primary data centres.

             

            I configured the replication using the add-on VR server, and it works as expected.

             

            The embedded primary vSphere Replication servers are both in the UK and working fine with 60+ replications. The add-on appliances are in a remote country, with currently one replication configured to use them.

             

            My problem is the network traffic being sent to discover all host-to-datastore mappings within that vCenter (not replication traffic, just the initial discovery).

             

            Support notes that the logs show "750+ repetitions for the same host-datastore combination" and has asked me to increase CPU and RAM resources on the add-on appliance. I can do that, but I find it hard to believe it's the root cause.

             

            Anyone seen this before? I am continuing to work with support. We previously had vSphere Replication 5.5 set up in the same topology and configuration and never saw anything like this.

             

            Here is an extract of the log analysis (host names have been changed):

            ---------------------------------------

             

            Searching for the references to specific hosts shows 750+ repetitions for the same host-datastore combination:

             

            e.g.

              grep "accessible true" _var_log_vmware_hbrsrv*log | awk '{print $7,$9}'| grep host-87| sort  |uniq -c

                 765 host-87 /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be

                 765 host-87 /vmfs/volumes/59369879-25e5f99c-9f65-0025b5621000

                 765 host-87 /vmfs/volumes/5936989f-c24c1d19-5832-0025b5621000

                 765 host-87 /vmfs/volumes/593698d0-435294c6-e47a-0025b5621000

                 765 host-87 /vmfs/volumes/59369909-177fb6d1-f680-0025b5621000

                 765 host-87 /vmfs/volumes/59369957-19dd6dd8-2326-0025b5621000

                 765 host-87 /vmfs/volumes/5936999d-dc17bf42-b1e4-0025b5621000

                 765 host-87 /vmfs/volumes/593699e2-d8eefcf2-f053-0025b5621000

                 765 host-87 /vmfs/volumes/59369a2e-de9ebc18-f769-0025b5621000

                 765 host-87 /vmfs/volumes/59369a7a-602c5e98-6a21-0025b5621000

                 765 host-87 /vmfs/volumes/59369acb-bf8cad98-0401-0025b5621000

                 765 host-87 /vmfs/volumes/59369cd7-24490508-1d64-0025b5621014

                 765 host-87 /vmfs/volumes/59369d15-0947c92c-5c02-0025b5621014

                 765 host-87 /vmfs/volumes/59369d5d-3d62e08a-2502-0025b5621014

                 765 host-87 /vmfs/volumes/59369d95-85aea7f9-0059-0025b5621014

                 765 host-87 /vmfs/volumes/59369dcc-22a7847b-7463-0025b5621014

                 765 host-87 /vmfs/volumes/59369dfc-38dc64dc-b2b5-0025b5621014

                 765 host-87 /vmfs/volumes/59369e28-eead2b5c-c8e7-0025b5621014

                 765 host-87 /vmfs/volumes/59369e53-df715bb8-9c97-0025b5621014

                 765 host-87 /vmfs/volumes/59369e85-f865fd52-37e3-0025b5621014

                 765 host-87 /vmfs/volumes/59369ed9-f78276b8-4d27-0025b5621014

                 765 host-87 /vmfs/volumes/598d610a-cce9c00a-5462-0025b5621154

                 765 host-87 /vmfs/volumes/59e1d92e-ec292b0d-f16c-0025b5621014

                 765 host-87 /vmfs/volumes/59e1d980-188f20ae-6d3b-0025b5621014

                 765 host-87 /vmfs/volumes/59e1da19-34f9854e-6e0a-0025b5621014

                 765 host-87 /vmfs/volumes/59e1da4e-e5d7659b-1bb4-0025b5621014

                 765 host-87 /vmfs/volumes/59e1da7e-fea1c702-f398-0025b5621014

                 765 host-87 /vmfs/volumes/59e1dab2-3261a01f-f7de-0025b5621014

                 765 host-87 /vmfs/volumes/59eba7e6-fdbf9d18-17e6-0025b56210dc

                 765 host-87 /vmfs/volumes/59eba81f-482d1508-256a-0025b56210dc

                 765 host-87 /vmfs/volumes/59eba842-a5935f28-29c7-0025b56210dc

                 765 host-87 /vmfs/volumes/5a09f7e7-95af91e1-3811-0025b5621168

                 765 host-87 /vmfs/volumes/5a09f94d-ee363e1d-d5fc-0025b5621168

                 765 host-87 /vmfs/volumes/5a0a20ad-be9563f0-739f-0025b5621168

             

             

            Entries repeat in this kind of pattern:

                 _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:23.692Z verbose hbrsrv[7F4E524DC760] [Originator@6876 sub=Host opID=hs-init-75f6efae] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true

            _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:27.607Z verbose hbrsrv[7F4E4B123700] [Originator@6876 sub=Host] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true

            _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:32.498Z verbose hbrsrv[7F4E4B42F700] [Originator@6876 sub=Host] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true

            _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:37.162Z verbose hbrsrv[7F4E4B7BD700] [Originator@6876 sub=Host] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true

            _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:41.706Z verbose hbrsrv[7F4E4B4B1700] [Originator@6876 sub=Host] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true

            _var_log_vmware_hbrsrv-216.log:2018-01-16T16:03:45.897Z verbose hbrsrv[7F4E4B268700] [Originator@6876 sub=Host] Host: host-87 Datastore: /vmfs/volumes/59302ddb-6aec7dc4-4b6f-0025b56210be -> 59302ddb-6aec7dc4-4b6f-0025b56210be (name esx01-localstorage) accessible true
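            For anyone wanting to reproduce this analysis on their own appliance, here is a small generalisation of the grep/awk pipeline above. Note that the fixed columns `$7,$9` used in the support command shift by one when an `opID=` token is present (as in the first log line of the extract), so this sketch locates the `Host:` token instead. The log path is assumed to be /var/log/vmware on the VR appliance; adjust to suit.

```shell
#!/bin/sh
# Summarize which hosts generate the most "accessible true" hbrsrv log
# entries, across all rotated logs. Lines with an opID have an extra
# field, so we find the "Host:" token rather than using a fixed column.
count_host_spam() {
  grep -h "accessible true" "$@" \
    | awk '{for (i = 1; i <= NF; i++) if ($i == "Host:") print $(i+1)}' \
    | sort | uniq -c | sort -rn
}

# Usage on the appliance (path is an assumption, adjust as needed):
#   count_host_spam /var/log/vmware/hbrsrv*.log
```

A count far above the number of datastores visible to a host, as in the 765 repetitions above, suggests the same host-datastore pair is being re-reported rather than discovered once.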

            • 3. Re: vSphere Replication Add-On Server flooding network
              beefy147 Enthusiast

              This is now resolved. Working with support, it turned out to be a Storage I/O Control (SIOC) bug.

               

              Disabling SIOC on all datastores (luckily we didn't need to use it) resolved the issue.

               

              "Why did ESX send so many datastore events? It is SIOC issue which is fixed in vsphere60u3.

               

              As a workaround, you can ask customer to disable SIOC in all datastores

               

              Finally restart the vsphere replication appliance"

               

              This workaround was suitable for us! Hopefully this helps someone else one day.
