VMware Cloud Community
cypherx
Hot Shot

After upgrading vSphere replication 6.1.2 to 8.1, I get "problem occurred with the storage on datastore path" - no replication working

The plan is to move to the latest build of vCenter 6.7. To do this, our Windows vCenter 6.0 U3 servers will be migrated to VCSA 6.7, but a few prerequisites need to be taken care of first, namely vSphere Replication and SRM. The plan is to first upgrade vSphere Replication at the HQ and DR sites, then SRM at the HQ and DR sites. If all is well, regroup and continue moving forward.

I upgraded vSphere replication 6.1.2 to 8.1 at both ends.

Now there is no replication occurring. I'm monitoring bandwidth using two of our internal monitoring tools and I do not see any significant traffic from ESXi servers in the HQ site to the DR site.

All replications are still listed but show an error (RPO violation). Each one reports that a problem occurred with the storage on datastore path '[Tegile02] vmname/hbrdisk-RDID-unique UID.vmdk'.

On the DR vCenter server, the NFS datastore Tegile02 is fully accessible, and I can browse it and see the vmdk files that are referenced in the vSphere Replication errors.

I tried "reconfiguring" one of the VM's replication and paged through it (next, next, next) and it still remains in the same error status. I tried restarting the new vSphere replication appliance at both ends and same result.

6 Replies
sarikrizvi
Enthusiast

Check whether the vSphere Replication appliances show as connected at both the source and destination sites, under vCenter > Config > vSphere Replication.

If they are not connected, try to re-authenticate.

Regards,
SARIK (Infrastructure Architect)
tayfundeger
Hot Shot

This error may be caused by a firewall between the sites; there may be a port restriction. Can you temporarily open the ports on either side of it to test? Also, are you sure there is no problem on the DNS side? Have you added the ESXi, vCenter, and vSphere Replication IPs to the hosts files of the vCenter and vSphere Replication machines at the source and destination sites?
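One quick way to rule out a blocked port is a plain TCP connect test run from a machine at the source site. The sketch below is only an illustration: the hostname is a placeholder, and the port list (31031 and 44046 for replication traffic, 8043 for the appliance management service) is an assumption that should be checked against the vSphere Replication network ports documentation for your version.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example usage (hypothetical hostname, assumed port list):
#   results = check_ports("vr-dr.example.local", [31031, 44046, 8043])
#   for port, ok in results.items():
#       print(port, "open" if ok else "blocked or closed")
```

A False result only says the TCP handshake failed from where the script ran; it does not distinguish a firewall drop from a service that is not listening.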

vFouad
Leadership

Check if the (HBR) replication servers are connected... as mentioned above...

Please share a screenshot from each side, like the one below, so we can see that they are connected and that the replication jobs are configured.

[Image: pastedImage_0.png]

When you upgraded... what method did you use?

The upgrade process is more of a migration than a traditional upgrade...

Please specify your steps so we can validate.

Please let me know (via direct message) if you have a support agreement, so if you have uploaded logs etc, I can review them, outside of communities.

Kind regards,

Fouad

cypherx
Hot Shot

OK, the upgrade was like a migration: deploying the OVF and, upon power-on, choosing the upgrade option and filling out all the details.

It turns out the hbr (GUID) disk was invalid, so the fix is the following procedure, which has to be done for each VM. I have 29 replicated VMs.

1. On the destination site, rename the folder the VM is replicating to. For example, vm.old.
2. On the protected site, disable the replication for the VM.
3. On the destination site, rename the folder the VM was replicating to back to its original name. For example, vm.
4. On the protected site, create a new replication, then browse to and use the seed disks at the destination site.
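With 29 VMs to work through by hand, it can help to generate a checklist and go in small batches. The sketch below only prints the four manual steps above per VM; it does not call any VMware API. The VM names and the batch size of 5 are made up for illustration.

```python
from typing import Iterator, List, Sequence

# The four manual steps from the procedure above, in order.
STEP_TEMPLATES = (
    "DR site: rename the replica folder '{vm}' to '{vm}.old'",
    "Protected site: disable the replication for '{vm}'",
    "DR site: rename the folder '{vm}.old' back to '{vm}'",
    "Protected site: create a new replication for '{vm}' using the seed disks",
)


def reseed_checklist(vm_name: str) -> List[str]:
    """Return the ordered manual steps for one VM."""
    return [template.format(vm=vm_name) for template in STEP_TEMPLATES]


def in_batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical VM names; replace with the real inventory.
    vms = [f"vm{i:02d}" for i in range(1, 30)]
    for number, batch in enumerate(in_batches(vms, 5), start=1):
        print(f"Batch {number}")
        for vm in batch:
            for step in reseed_checklist(vm):
                print("  [ ]", step)
```

Working in batches is just one way to avoid starting all the initial syncs at once; the right batch size depends on the link and the appliance.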

I got through 16 of the 29, but now I'm getting a Java socket timeout. I think the replication appliance is just overwhelmed with traffic. I usually throttle the vSphere Replication ports on the switch to 60 Mbps (out of a 100 Mbps pipe), but I raised it to 90 Mbps to give it more headroom.
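To get a rough feel for what the throttle change means, here is a back-of-the-envelope estimate of transfer time at 60 Mbps versus 90 Mbps. The 500 GB figure is a made-up example, not a number from my environment, and the calculation ignores protocol overhead, compression, and the fact that seeded replications only send changed blocks.

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours to move data_gb (decimal gigabytes) over a link_mbps link."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    seconds = megabits / link_mbps
    return seconds / 3600


if __name__ == "__main__":
    backlog_gb = 500  # hypothetical amount of changed data to send
    for mbps in (60, 90):
        print(f"{mbps} Mbps: {transfer_hours(backlog_gb, mbps):.1f} hours")
```

For the hypothetical 500 GB, this works out to roughly 18.5 hours at 60 Mbps and about 12.3 hours at 90 Mbps.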

Back story:

On Friday I upgraded both sites, but the protected site did not work.

VMware support finally got back to me on Monday. I struggled but got them some logs.

On Monday afternoon, VMware support suggested trying the upgrade again.

On Tuesday morning, on the protected site, I powered off the new appliance and tried the upgrade again. This time it was successful.

Everything was connected and paired, but the replications were still not working.

VMware support suggested that the faulty upgrade, which they see from time to time, may have messed up the hbr disks.

   First they tried deleting the hbr disks, leaving just the base disk. That did not work.

   Next they did the procedure I described above and it worked, so they instructed me to do the same.

So a little over half are in the "initial seed" state, and I'll keep trying every few hours to get the last few back on. I'm now seeing bandwidth usage on the WAN links, and when I dig deeper it's from the ESXi servers, so it is progressing slowly.

Next I need to upgrade SRM. Traditionally SRM has been installed on our Windows vCenter servers, but soon we want to migrate those vCenter servers to the VCSA and upgrade from 6.0 to 6.7. Does VMware have a Photon OS based virtual appliance (OVF) to deploy for SRM? If the goal is to move away from Windows as the OS behind some of these VMware components, what is the solution for SRM?

vFouad
Leadership

There is an SRM appliance. Please check the compatibility matrix for your vCenter versions first.

The upgrade process would be to upgrade to SRM 8.2 on Windows, then migrate to the appliance.

Relevant pages:

Download VMware Site Recovery Manager for IT Disaster Recovery and planned migrations

VMware Product Interoperability Matrices

Migrating from Site Recovery Manager for Windows to the Site Recovery Manager Virtual Appliance

Kind regards,

Fouad

cypherx
Hot Shot

Ah so SRM 8.2 IS compatible with vCenter 6.0 update 3?  I thought 6.0 only went to SRM / VR 8.1.

So I did download VR 8.1 (and I'm upgraded to it now), and also SRM 8.1. I haven't installed SRM 8.1 yet because when I try, it says port 443 is in use on the vCenter Windows server where SRM 6.1.2 exists today. Well, of course port 443 is in use; that's the vCenter web client.

So I'm not entirely sure what to do. Maybe get the 8.2 binaries, install it on some nonstandard port like 8443, and then migrate to the appliance, where it can live on a standard port again (hopefully). Then finally upgrade vCenter to VCSA 6.7.
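Before picking a nonstandard port, it is easy to confirm which candidates are actually free on the server. This is a minimal sketch, meant to be run on the Windows vCenter server itself; the candidate list is just my guess at reasonable alternatives, and on Linux a non-root user would see privileged ports such as 443 reported as unavailable regardless.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Already in use, or not permitted for this user.
            return False
        return True


def first_free_port(candidates, host="0.0.0.0"):
    """Return the first bindable port from candidates, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


# Example usage (candidate ports are assumptions, not a recommendation):
#   print(first_free_port([443, 8443, 9443]))
```

The check only reflects the moment it runs, so a port reported as free could still be claimed by a service that starts later.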

Anyway, I was able to get all of my replications re-seeding. I just needed to give it some time. It will take a day or two to catch up from being down since last Friday, but once it's caught up I'll resume the upgrades and migrations.
