VMware Cloud Community
panikfan
Contributor

SRM5 using vSphere replication, status shows 'not active'

I've been using SRM5 with host-based replication pretty extensively and for the most part it has been solid. I am using it to protect 18 VMs at my production site. I am now trying to configure a couple of VMs at the recovery site to replicate back to the production site. I was able to go through the wizard and watched it create the virtual disk, but when I look at the vSphere Replication section in SRM, the status for these VMs is 'not active'. If I highlight one and select 'synchronize now', it either does nothing (status remains 'not active') or I get an error saying that 'An ongoing synchronization task already exists', with the following info in the error stack:

Call "HmsGroup.OnlineSync" for object "GID-5be7451c-c57d-4f6b-bb8e-01cddec9297e" on Server "192.168.XXX.XXX" failed.
An unknown error has occurred.

The server it is referring to is the VRMS server at the recovery site. Additionally, in the recent tasks area I see the message 'vSphere Replication operation error: Virtual machine is in an invalid state.'

I have rebooted everything in the environment at both sites, including the VR servers, VRMS servers, Virtual Center servers (which include all the databases), vSphere host servers, and the VMs. I verified there is plenty of storage on the target datastore. I even removed the VMs from the inventory and re-added them, but I still get this error.

OK, now the best part: when I look at the target datastore, it appears that the VM has actually been replicated successfully. I see all the virtual disks that I'd expect to see, and all the hbrcfg files, with 'modified' timestamps that correspond to the last time I attempted to force replication. So is this just a GUI issue, or what? Has anyone else run into this yet?

13 Replies
VMmatty
Virtuoso

Moved thread to SRM forum.

Matt | http://www.thelowercasew.com | @mattliebowitz
JFitchVMA
Enthusiast

I'm seeing the same "problem". I have 2 boxes set up through VR. One of them updates the sync time like I'd expect it to, but the second one looks the way you describe, and when I try to synchronize now I get the same error. Also, when you try to add that VM to a recovery plan, it gives you a warning. Has anyone found a way to fix it?

Jonathon Fitch blog: atangledweb.net Virtualization enthusiast and advocate
panikfan
Contributor

Well, I was finally able to work around this. I removed replication for the VMs that were not working, then restarted EVERYTHING: all of the VRS and VRMS servers, the vCenter servers at both locations, and all vSphere hosts at both locations. Then I logged into SRM again and set up replication again. So far, no more trouble.

I'm looking forward to the first update for SRM 5. It's a great product but still a little buggy.

rcardia
Contributor

panikfan

I'd like to know if you were able to resolve this problem. I am having the same issue, and every blog I read says it is a bug. Is that true?

Thanks a lot

Renato Cardia

panikfan
Contributor

Yes I did get the issue resolved. I believe it was a problem with the VRMS appliance at the protected site, and restarting that cleared it up. It’s been working fine for new and existing replications since then.

rcardia
Contributor

I shut down the VRMS at my source site, but that didn't work. Did you only reboot the VRMS, and that worked?

panikfan
Contributor

No I rebooted everything end to end… The VC servers (which also run SQL express for all databases), the vSphere servers, and the VRS/VRMS servers at the protected and recovery sites. So I’m not sure exactly what it was, but I think it was something hung up in the protected site VRMS server. When it started working again I didn’t pursue it any further.

I do look forward to SRM 5.1 though… hopefully a few of the bugs will be worked out.

WobblyAdam
Contributor

I can add a bit more to this.

I had the same problem as described above.  Replication from Site2 to Site1 worked fine, but Site1 to Site2 kept stalling with 'Not Active' status.

I trawled the logs on the VR Server (/var/log/vmware/hbrsrv.log) and noticed that it was trying to make a connection to the ESXi host on an inaccessible IP Address.
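For anyone wanting to do the same kind of check, here is a minimal sketch that tallies the IPv4 addresses mentioned in a log file so that an unexpected one stands out. It does not rely on any particular hbrsrv.log line format (it just looks for anything shaped like an IPv4 address), and the function names are made up for illustration.

```python
import re
import sys
from collections import Counter

# Loose IPv4 pattern: it does not validate octet ranges, so dotted
# version strings such as "5.0.0.1" will also match.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_addresses(lines):
    """Count every IPv4-looking string that appears in the given lines."""
    counts = Counter()
    for line in lines:
        counts.update(IPV4.findall(line))
    return counts


def unexpected(counts, expected):
    """Return only the addresses that are not in the expected set."""
    return {ip: n for ip, n in counts.items() if ip not in expected}


# Hypothetical usage: python scan_log.py <logfile> <expected-ip> [<expected-ip> ...]
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        found = count_addresses(handle)
    for ip, n in sorted(unexpected(found, set(sys.argv[2:])).items()):
        print(ip, n)
```

Pass the log path followed by the host addresses you expect the Replication Server to use; anything else that is printed is worth a closer look in the log itself.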

If you don't know the root password for your Replication Server, it is 'vmware' by default.  Change it!

The ESXi host has two Management Networks configured.  One is via a NIC which is connected to a network isolated switch, used only for vMotion and management traffic.  For some reason the Replication Server was trying to use that address for communication to the host.

I did find the address was registered in the SQL Server Database used for vSphere Replication:

SELECT * FROM [vrm_database_name].[dbo].[HbrHostEntity];

However, changing that address by hand didn't solve the problem, even with server reboots and redeployment.  In fact, it reverted back to the wrong address.

In the end I removed the management network from the isolated adapter on the ESXi host, redeployed my Replication Server appliance, and everything sprang into life.  I am just about to put the management network back on the host again.  Hopefully it doesn't break things.

I hope that is helpful to someone...

klabiak
Enthusiast

I had a similar problem, but in my case the gateway had disappeared from the VR appliance for some reason. I am 100% sure that I entered the gateway in the VR deployment wizard. Once I entered it again, the machine started to synchronize without a problem.
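A quick way to confirm whether a default gateway is actually set is to read the kernel routing table. The sketch below assumes the appliance is Linux-based, exposes /proc/net/route, and has Python available; none of that is confirmed in this thread, and the function name is made up for illustration.

```python
import socket
import struct

RTF_GATEWAY = 0x0002  # kernel route flag: this route goes via a gateway


def default_gateways(route_table):
    """Return default-route gateways found in /proc/net/route-style text."""
    gateways = []
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        if destination == "00000000" and flags & RTF_GATEWAY:
            # On x86 the kernel prints the address as little-endian hex.
            packed = struct.pack("<L", int(gateway, 16))
            gateways.append(socket.inet_ntoa(packed))
    return gateways
```

Calling `default_gateways(open("/proc/net/route").read())` on the appliance and getting back an empty list would match the missing-gateway situation described above.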

YacoubIshak
Contributor

Re-register the VRS; that was the solution for me.

Check this URL; it has more details about this problem:

http://virtuallyhyper.com/2012/08/vsphere-replication-shows-inactive-replication/

SCX
Enthusiast

I just had the exact same problem with ESXi 5.1, so it's not been fixed.

I logged a support request, but that was useless. The only thing that worked was a full reboot of both ESXi hosts at the primary and remote sites.

No amount of restarting of VRMs, SRM services or ESXi services worked for me.

Seems like a serious bug that needs fixing.

DAVIDWILHELM
Contributor

Shutting down the SRM services at both the protected site and the recovery site, restarting the replication appliances at both sites, and then starting the SRM services cleared up the problem for me. I didn't have to restart my vCenter or re-register my VMs for replication.

rcardia
Contributor

Please change my e-mail address to cardia_renato@hotmail.com, because the address rcardia@f9c.com.br will be deactivated.

Thanks

Renato Cardia
