I love the fact that VMware is providing VDR as part of the vSphere package. It's definitely a step in the right direction, although I'm still inclined to think this software hasn't been put through the wringer in terms of proper QA. I'm just putting out a feeler to see how many others have experienced some of the same issues I'm having.
To start, I'm backing up my VMs via a network share on a standalone Windows 2003 server that has a NAS attached to it.
Some of the issues I've noticed:
1) Backups take an inordinate amount of time. I can understand the first backup taking a while, but my VMs don't change very much from day to day. Most of the data being manipulated is located on RDMs, and these are backed up using Tivoli, not VDR (I use VDR solely for the OS partitions). Each partition is approximately 25GB and there are 15 VMs, yet my backup window (10pm - 6pm) isn't sufficient to complete the process.
2) Integrity checks for the backups take a crazy amount of time and will usually stop because my window has closed (see point #1).
3) I'm getting inconsistent "failures" for certain VMs (the report simply states that a VM failed to back up, not much else). The failures vary from night to night and don't always hit the same VMs (I'm not sure whether this is related to #1, where the window closes while VDR is still executing).
4) I had the most difficult time setting up the remote share from the VDR appliance in vSphere. The username and password would never be accepted (even though if I tried the same share with the same user/pass on a Windows machine, it would work fine). I finally narrowed down the problem to the simple fact that the VDR appliance can't handle passwords that have special characters in them (this password had an "@" and a ","). Looking at the console while attempting to mount the share would spit out a CIFS error -22. Changing the password to include only numbers and letters was sufficient to work around this issue.
5) Snapshots are not being created for no apparent reason, which makes the VDR process fail. I'm fully able to take a manual snapshot with or without the memory state, so I'm not sure why VDR can't do it. This issue is very intermittent. I had it often when I first set up VDR, but now it only happens every so often (without any consistency).
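To put point 1 in numbers, here is a rough sketch of the average throughput a backup would need to fit the window. It assumes the worst case, where all 15 partitions of about 25GB are read in full; an incremental pass should need far less, which is why the overrun looks wrong. The function names are made up for illustration, and the 6am end time is only there in case the window was meant to be overnight.

```python
def window_hours(start_hour: int, end_hour: int) -> int:
    """Length of a backup window in hours, wrapping past midnight."""
    return (end_hour - start_hour) % 24 or 24


def required_mb_per_s(vm_count: int, gb_per_vm: float, hours: float) -> float:
    """Average throughput needed to read every partition in full once."""
    total_mb = vm_count * gb_per_vm * 1024
    return total_mb / (hours * 3600)


# Figures from point 1: 15 VMs, roughly 25GB each, window opens at 22:00.
for end_hour in (18, 6):  # 6pm as written, 6am in case overnight was meant
    hours = window_hours(22, end_hour)
    rate = required_mb_per_s(15, 25, hours)
    print(f"{hours}h window: {rate:.1f} MB/s for a full pass")
```

Either way the required rate is modest, so the window overrun is unlikely to be explained by data volume alone.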
I think that's all I can think of for now.
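For anyone hitting the share problem from point 4, a quick way to check a password before pointing VDR at the share is to list any characters that are not plain letters or digits. This is only a sketch of the workaround described above (letters and numbers worked, "@" and "," did not); it does not claim to know exactly which characters the appliance rejects, and the sample password is made up.

```python
import string

# Characters that worked in the workaround above: ASCII letters and digits only.
SAFE_CHARS = set(string.ascii_letters + string.digits)


def unsafe_characters(password: str) -> list:
    """Return the distinct non-alphanumeric characters, in order of first appearance."""
    seen = []
    for ch in password:
        if ch not in SAFE_CHARS and ch not in seen:
            seen.append(ch)
    return seen


print(unsafe_characters("pa@ss,word1"))  # hypothetical password
```

An empty list means the password uses only the characters that were accepted in my case.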
Just wondering if anyone has experienced the following behaviour:
After an outage where the VDR appliance doesn't shut down gracefully, the subsequent backup for all VMs is a full instead of an incremental. I was under the impression that only the initial backup of a VM would be a full.
I'm running VDR version 220.127.116.111 and all the VMs involved have version 7 hardware.
My retention policy is "custom" and set to 5 most recent, 1 weekly, 1 monthly, 1 quarterly and 1 yearly.
Yes, I have gotten many of those errors. However, I resolved them.
Here is a list of the errors I received.
1. Can't access backup set /blah/, error -2246 (wrong destination index found)
2. Trouble writing to destination volume, error -2241 (destination index invalid/damaged)
3. Trouble writing to destination volume, error -1020 (sharing violation)
(not in any given order per se)
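If you are sifting through a lot of these messages, a small lookup can help group them by code. The sketch below only knows the three codes and descriptions quoted in this post; it is not an official VDR error list, and the helper name is made up.

```python
import re

# Codes and descriptions exactly as quoted in this post; not an official list.
VDR_ERRORS = {
    -2246: "wrong destination index found",
    -2241: "destination index invalid/damaged",
    -1020: "sharing violation",
}

ERROR_RE = re.compile(r"error (-\d+)")


def error_code(message: str):
    """Return the integer error code found in a message, or None if there is none."""
    match = ERROR_RE.search(message)
    return int(match.group(1)) if match else None


msg = "Trouble writing to destination volume, error -2241 (destination index invalid/damaged)"
code = error_code(msg)
print(code, VDR_ERRORS.get(code, "not seen in this post"))
```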
We tried everything you can think of: unmounting and remounting drives, integrity checks, recataloging the drive. All were epic wastes of time and resolved nothing.
I finally caved in and called support. After two hours and twenty minutes on the phone we were unable to find a solution to the issue. We tried a dozen workarounds, which all failed. I don't believe this was due to a lack of knowledge on the support end either. Eric, the support tech who helped me, really knew his stuff and is a credit to VMware for sure, and although he was unable to help me, he did inspire an idea that fixed the issue...
Here is what I did to fix the above errors.
Eric tried, at my request, to remove the different backup jobs on the backup tab. None of them would remove! He would click delete and the job would disappear, then the vSphere client would lose connection, then reconnect, and the deleted job would be there again...
So sometime after hanging up with Eric I thought to myself "self, what if you deleted the job, then did a hard disconnect of the vdr client". Seemed like a good idea and the risk wouldn't be high. So I did it and then reconnected and the job stayed gone! Great! So I did this with the rest of the jobs (a dozen or so in total) and once they were all cleared out I rebooted the vdr appliance.
Once the appliance came back up, the drive in question started working again!
I recreated the jobs I wanted and tested them without any failures.
So there is for sure something wrong or corrupted with the way the backup jobs are working, at least on my appliance. The jobs were locking the disk up before it became mounted (although it showed as mounted on the configuration tab).
In a nutshell, I believe the error codes above were way too general, or the problem doesn't have an error code and the system is in a way running back to momma and showing any code that would seem correct (/shrug).
So with that, my vdr issues are now resolved. I'm waiting for a call back from Eric so I can explain in great detail to him how this worked for me and I HOPE the many, many hours I spent on this today will help one of you...
Recreating the jobs always worked for a while when I used this buggy product in the past. That fixes nothing...
I assume your problems will return when the datastore grows... Wait for the first full check to run.
But maybe you are one of the few lucky customers who are able to use it...
Our problem was the jobs could NOT be recreated. They simply refused to delete and the destination drive was unusable. Deleting the jobs got the drive back working. So simply making NEW jobs wasn't going to work.
I will take a 30 second fix any day of the week to keep my backups going.
Sure, it would be epic if the product was perfect, but as this thread suggests, it's not, plain and simple.
I'm having pretty good luck with it, and maybe, as I stated before, the quick fix that helped me might help someone else...
Lots of problems with VDR 18.104.22.1681.
We use an iSCSI target as the destination.
Backups do not occur anymore because of these errors:
10/27/2010 1:54:17 PM | Can't access Backup Set /SCSI-0:2/, error -2261 (can't use Backup Set until integrity check succeeds) | Execution Event | 0
and lots of ...
10/26/2010 10:01:48 PM: Integrity check failed for the restore point created on 10/12/2010 1:00:13 AM for
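For anyone collecting these events, here is a rough sketch that pulls the timestamp, message, and error code out of a line shaped like the -2261 event above. The field layout is inferred from that single line, so treat the pattern as an assumption rather than a documented VDR log format; it tolerates the timestamp running straight into the message.

```python
import re
from datetime import datetime

# Pattern inferred from the single event line quoted above; an assumption,
# not a documented format.
EVENT_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"  # timestamp
    r"\s*(.*?), error (-\d+) "                              # message and code
    r"\(\s*(.*?)\s*\)"                                      # detail in parentheses
)


def parse_event(line: str):
    """Return (timestamp, message, code, detail), or None if the line does not match."""
    match = EVENT_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
    return stamp, match.group(2), int(match.group(3)), match.group(4)


line = ("10/27/2010 1:54:17 PMCan't access Backup Set /SCSI-0:2/, "
        "error -2261 ( can't use Backup Set until integrity check succeeds)")
print(parse_event(line))
```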
Many backups are marked (damaged).
I ran several integrity checks but it does not help.
I marked the damaged backups for deletion but they are still there.
What can I do?
You don't need to delete the job. Just shut down the VDR appliance, edit the hard disk which you use to back up the VMs, and change its Virtual Device Node,
for example from SCSI (0:1) hard disk 2 to SCSI (1:0) hard disk 2.
To reply to the original question - no, you're not the only one having issues with VDR. I've been running 22.214.171.1241 for about a month and have come to the conclusion that it simply cannot be relied upon. I don't think more than 3 days have ever passed without an issue with a deduplication store of some sort that requires my intervention. What I would say is that the 2 appliances that back up the VDI farm (160+ VMs) have issues much less frequently than the appliance that backs up the file & application servers. My guess is that this is because the VDIs have much smaller vdisks that are not modified as frequently as the servers. Let's hope v2.0 works better.
I have nothing to add - VDR has more than once corrupted complete backup chains for different VMs, i.e. out of VM1-VM10, all of a sudden all backups of VM3 and VM6 were corrupted and needed to be removed to make VDR work again. I have experienced this on more than one system now and I cannot understand how something like that can happen. That was with VDR 2.0. Currently I'm updating to 2.0.1 as I have been severely spammed again by VDR appliances... (>10000 notification mails in my inbox this morning)