VMware Cloud Community
fgl
Enthusiast

vdr and vcenter integration?

I'd like to know how tightly VDR is integrated with vCenter. The reason I'm asking is that I had 8 backups running in VDR when I had to reboot my vCenter Server. Afterward I logged back in to vCenter and then VDR, and noticed that all 8 running backups had disappeared and a recatalog was running. According to the VDR event log, it thinks a system crash occurred, but the VDR appliance did not crash or reboot.

I thought VDR was independent of vCenter and that backups would still run without it. I also thought I read in a previous post that you could do a VDR restore without vCenter by connecting directly to the host that runs the VDR appliance. But if rebooting vCenter causes a complete crash of VDR and any running tasks, it becomes very difficult to apply updates or patches to vCenter, especially when I have VDR backups running about 20 hours a day.

Update: The VDR appliance stopped doing any more backups after this, and the snapshots from the 8 running backups were all just sitting there, attached to the VDR appliance. I had to shut down and reboot the VDR appliance to give it a kick in the ass to clean up those snapshots and start the backup process all over again. In other words, make sure your VDR appliance is shut down before you reboot your vCenter to save yourself problems.

9 Replies
admin
Immortal

Some clarification

1) For any VDR automated tasks, VDR requires the presence of a vCenter Server. This includes central authentication (otherwise, VDR would prompt you for the password for every ESX host that has a VM it is protecting) and licensing (the dormant license file is stored in the vCenter Server and is what VDR uses to verify that protected VMs reside on ESX hosts that have a valid VDR license). In addition, where there is VM movement (vMotion, HA, etc.), VDR just needs to get the information from the vCenter Server as opposed to tracking this separately.

2) For the vast majority of manual VDR tasks, VDR does not require the presence of a vCenter Server. The classic example is when you need to restore a VM or file and the vCenter Server is not available. You would log in to the ESX host where the VDR appliance resides, open the VDR plug-in, select the Restore tab and begin the restore process.

In #1, once a snapshot for a VM has been initiated, the backups can continue without the presence of the vCenter Server UNTIL we need to delete the snapshot from the VDR appliance, since we need to authenticate. So based on what you described, it looks like VDR was not able to reauthenticate to the vCenter Server, leading to the orphaned snapshots. Or, as you are theorizing, the backups ended exactly when the vCenter Server was rebooted. In that case, it makes sense that we could not reauthenticate, since the vCenter Server was indeed gone.

In terms of what was happening here, it is probably best to look at the ESX log too (on the host where VDR resides) to see what is going on.

Another theory is that the destination disk for the VDR appliance is attached to the vCenter Server (VMDK or CIFS). If that disappears during a backup because of the vCenter Server reboot, there is not much we can do about the backups other than abort. Can you provide more info on the location of the destination disk for the VDR appliance and whether it is a virtual disk or a CIFS share?

About the backups that run 20 hrs a day, can you describe them in more detail?

- Is this the entire job or a single VM? If it is the entire job, how many VMs?

- On average, how big are the VMDKs for these VMs?

- Are they HW4 or HW7 VMs?

- If you look in the logs, when these backups are performed, do they state Hot-Add or Network as the transport mechanism for data transfer?

RParker
Immortal

Or, as you are theorizing, the backups ended exactly when the vCenter Server was rebooted. In that case, it makes sense that we could not reauthenticate, since the vCenter Server was indeed gone.

Not a theory, that's how it works (incorrectly or by design). If you reboot vCenter during a VDR session, even though VDR is NOT actively polling vCenter, you will need to restart the VDR appliance for it to properly authenticate. To me that's an issue, because if I reboot vCenter I ALSO have to reboot ALL the VDRs: even though they are already authenticated, they will NOT connect back to vCenter until they are rebooted.

RParker
Immortal


In other words make sure your vdr is shutdown before you reboot your vcenter to save yourself problems.

Yes, that is confirmed. I see the same thing. REALLY annoying... Not only that, but if there is a network glitch you STILL have to reboot the VDR. It's almost as if they have a unique token to authenticate, so if they lose the connection they need another token...

admin
Immortal

RParker

I believe the original post indicated that the backups ended immediately when the vCenter Server was rebooted. My point is that backups do not end when a vCenter Server is rebooted. You may have a point about VDR reauth issues after a VC reboot, and let's pursue that. The goal of my previous post was to understand more about fgl's environment.

RParker
Immortal

I believe the original post indicated that the backups ended immediately when the vCenter Server was rebooted. My point is that backups do not end when a vCenter Server is rebooted.

Well still not technically true. They may not END but they WILL FAIL. So same result.

vCenter is up. VDR-1 is up.

Start backup. VM-1 runs fine (able to commit snapshot).

reboot vCenter.

VDR-1 is STILL up.

vCenter comes up BEFORE a successive backup completes...

VM-2 fails with a complaint that the snapshot cannot be deleted, and ALL subsequent backups fail as well. So no, they don't stop, but the NEXT backup (VM-3) won't start either (failure to scan disk).

I have LOTS of logs, if you want them. :)

FAILED BACKUP = STOPPED BACKUP because the end result is the backups are aborted... until the VDR is restarted. Semantics aside, the same problem exists, no matter how you label it.

fgl
Enthusiast

Azmir,

RParker is correct that you need to reboot the VDR after any reboot of vCenter. None of the 8 backups that were running ended during the vCenter reboot: they were all around 20-40% complete, and the reboot only took 4 minutes, so I know for sure none of them finished during that time. Afterward I checked the VDR and noticed that all snapshots were still attached to the VDR appliance, but there was no activity (CPU, network, and memory usage were minimal). I let the VDR sit for about an hour to see if anything changed, and nothing did, so I rebooted the VDR. After it came up, it cleaned up all 8 snapshots from the 8 backups, ran its recatalog, and started backing up like normal again.

I have one backup job for all of my VMs (about 40 of them), and it takes about 20 hours to complete one cycle. About half of them are still on version 4 hardware, so I know those VMs will take longer, but I don't have any open window to take them down even for 5 minutes to upgrade the hardware version.

The average size of the VMs is 40GB, each with a single VMDK.

They are all Hot-Add backups.

The VDR appliance itself is on an iSCSI datastore, and the dedup storage for the VDR appliance is a CIFS volume. I'm pretty sure it's not network bandwidth related, because it has a 1GB connection and I rarely see usage above 20MB sustained.
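
A rough back-of-the-envelope check of these numbers, as a sketch only. It assumes every cycle reads all 40 VMs in full (deduplication may mean less data actually moves), that the figures are in decimal units, and that the "1GB connection" is a 1 Gbit/s link. The function name is made up for illustration.

```python
def average_throughput_mb_s(vm_count: int, avg_vm_gb: float, cycle_hours: float) -> float:
    """Average MB/s needed to read vm_count VMs of avg_vm_gb each in cycle_hours."""
    total_mb = vm_count * avg_vm_gb * 1000  # decimal units: 1 GB = 1000 MB (assumption)
    return total_mb / (cycle_hours * 3600)


# Figures from the post: about 40 VMs, 40 GB average, 20-hour cycle.
rate = average_throughput_mb_s(40, 40, 20)
print(f"{rate:.1f} MB/s")  # 22.2 MB/s

# Assumed 1 Gbit/s link: 1000 Mbit/s / 8 = 125 MB/s theoretical maximum.
link_mb_s = 1000 / 8
print(f"{rate / link_mb_s:.0%} of link")  # 18% of link
```

Under those assumptions the job averages roughly 22 MB/s, which lines up with the ~20MB sustained usage observed and is well below what the link could carry.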

fgl
Enthusiast

Does anyone know if VDR 1.2 fixes this problem in which you need to reboot VDR appliances after a vCenter server reboot?

admin
Immortal

This is fixed in 1.2. From the release notes:

  1. If vCenter Server Becomes Unavailable Data Recovery Permanently Loses Connectivity

If a vCenter Server was rebooted or lost network connectivity while Data Recovery was conducting backups, Data Recovery failed to re-establish connectivity with the vCenter Server until after currently running backup jobs completed. This caused all new backup operations to fail and the vSphere Client plug-in could not connect to the engine during this time. Data Recovery now attempts to reconnect to the vCenter Server at regular intervals. This occurs while backups are in progress, thereby minimizing potential failures.
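
To illustrate what "attempts to reconnect at regular intervals" means in general, here is a minimal sketch of a periodic reconnect loop. It is not VDR's actual implementation; the function name, the interval, the attempt limit, and the simulated vCenter are all assumptions made for the example.

```python
import time
from typing import Callable


def reconnect_with_retry(
    connect: Callable[[], bool],
    interval_s: float = 30.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call connect() until it succeeds; return the attempt number that succeeded.

    Raises ConnectionError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            sleep(interval_s)  # wait one interval before trying again
    raise ConnectionError(f"no connection after {max_attempts} attempts")


# Simulated server (hypothetical): unreachable for three attempts, then back up.
_responses = iter([False, False, False, True])
attempts_needed = reconnect_with_retry(lambda: next(_responses), sleep=lambda s: None)
print(attempts_needed)  # 4
```

The point of the pattern is that the client keeps retrying on its own instead of holding a dead session until someone restarts it, which is the behavior described earlier in this thread for the pre-1.2 appliance.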

kcucadmin
Enthusiast

Thanks Azmir,

I think several of my datastore issues have stemmed from failed backups that were caused by a vCenter restart. Damn Windows updates every day at 3:00 am...
