VMware Cloud Community
AntEvs
Contributor

Replication just sits on Initial Full Sync

Hi All

I have installed a vSphere Replication appliance at my main site and another appliance at my remote site. The problem I am having is that when I try to replicate to the remote site, it just sits there saying Initial Full Sync. Nothing else happens. I have left it overnight to see if it would eventually go, but nothing. I don't get any alarms or anything to show it's failing.

I thought maybe I would need to open the ports on the firewall, but this did nothing. I rebooted both replication appliances but still have the same issue.

Can anyone help with any ideas?

Thanks  

5 Replies
mvalkanov
VMware Employee

Hi,

Please check the Events page at both the source and target vCenter Server. There should be an error message if the sync cannot progress.

Please also grep vmkernel.log on the source host for "Hbr". There could be a network issue, with the source host unable to reach the VR server at the target site.
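
A minimal sketch of that log check, run over SSH on the source ESXi host. The path /var/log/vmkernel.log is the usual location on ESXi 5.x but is an assumption here, and hbr_lines is a hypothetical helper name, not a VMware tool.

```shell
# hbr_lines: hypothetical helper, prints the last 20 log lines mentioning "hbr"
# (case-insensitive, so it also matches "Hbr").
# The default log path is an assumption; pass another path as the first argument.
hbr_lines() {
    log="${1:-/var/log/vmkernel.log}"
    grep -i 'hbr' "$log" | tail -n 20
}
```

Calling hbr_lines with no argument reads the default log. Connection or timeout messages in the output would point towards the network issue described above.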

Regards,

Martin

davelee2126
Contributor

Does the percentage creep up at all?  One thing you could try is checking the status of replication at the command line as you get a little more information that way.

  • SSH onto the ESXi host that is running the virtual machine you're replicating
  • Run the following command "vim-cmd vmsvc/getallvms" which should spit out a list of all VMs running on the host.  Note the ID number against the VM you are replicating
  • Run the following command "vim-cmd hbrsvc/vmreplica.getState x" where x is the ID number of the VM you're replicating

This will print something like the example below, which tells you how far through the checksum is and how much actual data has been transferred.

/vmfs/volumes # vim-cmd hbrsvc/vmreplica.getState 12
Retrieve VM running replication state:
The VM is configured for replication. Current replication state: Group: GID-4b535f47-434d-4068-a0d7-a7e36e8c37cc (generation=38763970093236284)

  Group State: full sync (37% done: checksummed 218.0 GB of 410 GB, transferred 14.2 GB of 216.0 GB)
   DiskID RDID-ab79cc55-642d-4a4b-b261-668425455661 State: full sync (checksummed 10 GB of 10 GB, transferred 7.8 GB of 9.6 GB)
   DiskID RDID-ddb1b0f6-b394-41a3-9edb-c96b0e55190c State: full sync (checksummed 208.0 GB of 400 GB, transferred 6.4 GB of 206.4 GB)
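
To see whether the percentage is creeping up, the "Group State" line can be pulled out of that output. This is only a sketch: sync_percent is a hypothetical helper, and it assumes the output format matches the example above.

```shell
# sync_percent: hypothetical helper; reads getState output on stdin and prints
# the percentage from the "Group State: full sync (NN% done: ...)" line.
# Prints nothing if no such line is present.
sync_percent() {
    sed -n 's/.*Group State: full sync (\([0-9]*\)% done.*/\1/p'
}

# Usage on the host (12 is the VM ID from the example above):
#   vim-cmd hbrsvc/vmreplica.getState 12 | sync_percent
```

Running it a few minutes apart and comparing the numbers shows whether the sync is moving at all.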

Dave

Smoggy
VMware Employee

What version is this? 5.0 or 5.1?

As Martin states, this is almost certainly a routing issue. VR traffic exits the source host via a vmkernel interface that is tagged as supporting the "management" traffic type. Without static routing it is not easy to control which portgroup is used if, for example, more than one is tagged as management. In 5.1 we introduced a "tech preview" feature to make this simpler to select. By default, VR traffic uses the first management vmkernel portgroup it finds.

A common reason customers end up with more than one "management" group is that they have a dedicated group for vMotion and also have the "management" traffic type checked on for that group, when really all it needs is "vMotion". Sod's law dictates that this group gets targeted by VR, and that group's vmknics cannot route to the target site.

At the target site we also select a NIC to use for NFC traffic to copy data down to the target hosts and ultimately the datastores.

Take a look here for info on the issue with having more than one "management" group:

https://www.vmware.com/support/vsphere5/doc/vsphere-replication-51-release-notes.html#knownissues
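
One way to narrow down the routing question is to list the vmkernel NICs and routes on the source host, then ping the target VR appliance from the vmkernel stack. The sketch below is an illustration only: can_reach is a hypothetical helper, and the address in the usage note is a placeholder for your target VR appliance.

```shell
# By hand on the source host:
#   esxcfg-vmknic -l     # list vmkernel NICs and their IP addresses
#   esxcfg-route -l      # list vmkernel routes
#
# can_reach: hypothetical helper; pings an address and reports the result.
# PING_CMD defaults to vmkping; override it if you run this somewhere else.
can_reach() {
    ping_cmd="${PING_CMD:-vmkping}"
    if "$ping_cmd" -c 3 "$1" >/dev/null 2>&1; then
        echo "reachable: $1"
    else
        echo "NOT reachable: $1"
    fi
}
```

For example, can_reach 192.0.2.10 (placeholder address). A failed ping suggests a routing problem; a successful one does not prove that VR picked the same vmkernel interface.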

jkghfjkasdhfjka
VMware Employee

Also, bear in mind that the required ports for the Initial Full Sync and the ongoing syncs are different. Port 31031 will need to be open to allow for the Initial Full Sync, whereas port 44046 is used for ongoing replications. Please see:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100956...
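
A rough way to test both ports from the source ESXi host is a TCP connect check with nc. This is a sketch under assumptions: check_port is a hypothetical helper, the -z and -w options are assumed to be supported by the nc build on your host, and the address in the usage comment is a placeholder for the target VR appliance.

```shell
# check_port: hypothetical helper; tries a TCP connect and reports the result.
# NC_CMD defaults to nc; -z (connect only, send no data) and -w (timeout in
# seconds) are assumed to be available.
check_port() {
    nc_cmd="${NC_CMD:-nc}"
    if "$nc_cmd" -z -w 5 "$1" "$2" >/dev/null 2>&1; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed or filtered"
    fi
}

# Usage: check both replication ports on the target VR appliance
#   for p in 31031 44046; do check_port 192.0.2.10 "$p"; done
```

If 44046 is open but 31031 is not, that would fit a replication stuck at Initial Full Sync.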

mmarinov
VMware Employee

Hi, is there any progress on this issue? If the problem still exists, could you please open an SR about it and provide the corresponding logs? If it is resolved, please mark it as such.

Thanks

Martin Marinov, VMware Software Engineer