VMware Cloud Community
aWhat_
Contributor

Deduplication store integrity and recatalog problems

I have had Data Recovery running for just over a week, and the integrity check completed OK each day until now. Now it seems to hang at 33%. The current task name is 'Recatalog' and the status is 'Checking integrity of deduplication store...'. Progress is 33%.

How can I get more information about what is happening? I have tried restarting the Data Recovery appliance. The destination is a 250GB SAN VMFS store which shows 96GB free. The log just indicates 'Executing Integrity Check'. There are no errors on the restore points of the backups.

18 Replies
admin
Immortal

If you log into the appliance and go to /var/vmware/datarecovery/, there should be a bunch of log files named chunkDedupe-*. Is there any activity going on in these logs (i.e. is anything currently being written to them)? Note that these logs roll over, so the number isn't necessarily an indication of which one is the most current.
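
If the logs can be read from somewhere with Python available, a small script can show which chunkDedupe-* file was written to most recently. This is only a sketch: the directory and file pattern come from the note above, while the function name `recent_log_activity` and the availability of Python on the appliance are assumptions.

```python
import glob
import os
import time


def recent_log_activity(log_dir, pattern="chunkDedupe-*", now=None):
    """Return (path, seconds_since_last_write) pairs, most recently written first.

    `pattern` follows the log naming mentioned in the post; adjust it if the
    files on your appliance are named differently.
    """
    now = time.time() if now is None else now
    entries = []
    for path in glob.glob(os.path.join(log_dir, pattern)):
        # Modification time is used because the file numbers roll over
        # and do not say which log is current.
        entries.append((path, now - os.path.getmtime(path)))
    return sorted(entries, key=lambda entry: entry[1])


if __name__ == "__main__":
    for path, age in recent_log_activity("/var/vmware/datarecovery/"):
        print(f"{age:8.0f}s since last write  {path}")
```

Running it twice a few minutes apart and seeing the newest file's age reset would suggest the integrity check is still doing work.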

asitdesai
VMware Employee

The appliance performs a full integrity check once a week and an incremental integrity check every day. In some cases the full integrity check can take a long time. We are working on a fix for it. For now, do not stop or restart the appliance during an integrity check, even if it appears hung. It will eventually finish.

aWhat_
Contributor

Thanks for the input. It has progressed from 33 to 38% overnight. There are chunkdedup-0 to -9 and -index so I guess things are still happening.

aWhat_
Contributor

My weekly integrity checks are taking up to 4 days. Should I log a support call on this? This is unworkable as a backup solution. Last week we had to rebuild 2 machines from scratch as we couldn't restore during the integrity check period. I am backing up 27 VMs to a 256GB store.

I am regularly getting messages like the following: Integrity Check: Task incomplete - Can't load source session tree for <date and time>, error -2249 (could not find session)

robc_yk
Enthusiast

We too are having many issues with this software.

We are currently using a "local disk" on the VDR VM that is actually a 1TB LUN from an HP EVA 8100 for our store. We initially tried connecting to a Windows share but seemed to have a boatload of issues with that. Re-creating everything with the EVA and local disk was our hope of 'cleaning' up some of the performance / reliability issues. Apparently it has not helped.

It is currently sitting at 48% of "Checking integrity of deduplication store" with no indication that it is actually doing something. This "Recalculating" started about 11 hours ago.

The big issue is that while this runs, no restores can be done. We currently have 99 VMs and templates in our VMware environment; about 75 of them are in this VDM marked for backup.

Is there a way to tell if the VDM is actually doing something other than just sitting there? How can we be confident that if / when this finishes, our restore points will appear again? Currently it shows "There are no restore points".

Would upgrading the VDM VM to use the latest VMware Tools help? If so, how is this done? What about the VM Version, can it be upgraded to 7?

I find it odd that the software that came on the vSphere DVD is out of date compared with the rest of the software on the disk.

millennia
Enthusiast

I too started getting "Task incomplete - Can't load source session tree for <date and time>, error -2249 (could not find session)" messages after a backup failure, and now the integrity checks fail every day. Does anybody know how this can be reset, and how to reseed a backup set on VDR so you can start afresh?

I'm a little worried reading comments that seem to show this is very much a v1.0 product with a fair number of bugs still in it, when it is a vital part of the system. Being locked out of restores because of some integrity checking system that runs for hours or days is ludicrous.

admin
Immortal

All,

We just released VDR v1.0.1 - see my posting in this forum. There are three integrity check changes/enhancements that we made in this release that should address some of the issues brought up in the forum:

Backups Can Be Completed While Integrity Checks Are Running - Data Recovery can complete backup operations at the same time that an integrity check is running. In the past, when an integrity check was running, backups could not be completed.

Improved Integrity Check Backup Speed - Integrity check has been optimized for faster performance. In the past, comparable integrity checks took longer to complete.

Integrity Check Optimized to Run During Idle Times - Before running regularly scheduled integrity checks, the Backup Appliance determines if the current time is during a backup window. If the current time is not during a backup window, the integrity check runs. If the current time is during a backup window, the backup appliance checks the backup schedule to determine if there will be a time in the next 24 hours that will not be during a backup window. If there is a time in the next 24 hours that is not during a backup window, the Backup Appliance waits for that time. If there is no time that is not during a backup window in the next 24 hours, the Backup Appliance completes the integrity check.
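
The scheduling rule described above can be sketched roughly as follows. This only illustrates the described behaviour and is not the appliance's actual code; the function names, the `in_backup_window` predicate, the example nightly window and the 15-minute search step are all assumptions.

```python
from datetime import datetime, timedelta


def next_integrity_check_time(now, in_backup_window, step=timedelta(minutes=15)):
    """Return when a scheduled integrity check should start.

    in_backup_window: hypothetical callable(datetime) -> bool that reports
    whether a given time falls inside a backup window.
    """
    # Not in a backup window: run the check right away.
    if not in_backup_window(now):
        return now
    # In a backup window: look for the first idle time in the next 24 hours.
    horizon = now + timedelta(hours=24)
    t = now + step
    while t <= horizon:
        if not in_backup_window(t):
            return t
        t += step
    # No idle time in the next 24 hours: run the check anyway.
    return now


def nightly_window(t):
    """Example backup window (assumed): 22:00 to 06:00 every day."""
    return t.hour >= 22 or t.hour < 6


if __name__ == "__main__":
    start = datetime(2009, 9, 1, 23, 0)
    print(next_integrity_check_time(start, nightly_window))
```

With the example window, a check that comes due at 23:00 is deferred until 06:00 the next morning, while one that comes due at noon starts immediately.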

millennia
Enthusiast

I decided to reinstall this as a fresh install as I only have a few machines being backed up as a test, and I couldn't see an upgrade option anyway. This appears to have been a mistake as I can't seem to connect to the device through vCenter anymore - it just keeps asking for credentials and then reporting "not connected". The network is set up OK so I don't see what the problem is, especially as this is a complete reinstall. Perhaps some settings are kept in vCenter that need to be cleared?

Paul11
Hot Shot

Have you reinstalled the plugin too? After doing that, I could connect after a while.

Paul

millennia
Enthusiast

Hi Paul, I had reinstalled the plugin but I noted the "after a while" you posted so I went back to it and indeed I can now connect and configure the backups.

Now I just have to figure out why the recovery appliance is using 100% CPU even though not backing up anything, it never did that before...

John

robc_yk
Enthusiast

Hey Folks,

I have removed the tools on the VMs that were having difficulties backing up, then re-installed them. The backups seem to be going well now. I have also upgraded the VDR server to the one mentioned here in the forum; that process was pretty painless and straightforward.

I do have another question that perhaps the experts can answer here.

Since the backups are set to expire once they are out of scheduled retention, is it possible to back up these expiring backups to tape, then if need be restore them from tape and use VDR to restore a functioning VM? I guess the backup process would be like:

Back up through VDR, back up the VDR results to tape, let VDR scheduling expire old VDR files.

Possible?

horace_ng
Contributor

I got the same problem after I marked one session for deletion following a backup incomplete error. Now I get an error -2249 integrity check problem every day. Does anyone know how to solve this?

aWhat_
Contributor

I have been able to keep VM Data Recovery working, but not without constant intervention. I have upgraded to release 1.0.1.362 but still have problems. I often have to delete corrupt restore points, restart the VDR appliance and do manual integrity checks, which take 12 - 20 hours. I currently have an error -1115 disk full, when it is not. Another time I got some failing VM backups to complete again by migrating them to a different datastore. I have about given up on VM Data Recovery and have purchased Veeam Backup and Replication 3.1 instead. I would like to see VMware fix DR though, as it has potential if it would work. I have logged support calls with very poor response.

bmckerritnu
Contributor

How does one delete a corrupt restore point?

How does one manually run integrity checks?

I am experiencing similar issues where my integrity checks fail and it tells me that:

"backup set X will be locked until the restore point with errors is deleted and integrity checks succeed"

aWhat_
Contributor

Yes, it took me a while to find it too. Go to the Restore tab and enter a number of restore points, say 20, so that you see them all. Then expand each server until you find all corrupt restore points, which are indicated with a red cross. Select the corrupt restore points, then choose 'Mark for Delete' at the top right of the screen. Then run a manual integrity check; it will take a long time. You would think we should be able to find that in the documentation somewhere!

aWhat_
Contributor

Forgot to mention - you can find exactly which restore points are corrupt by looking in the log under the Configuration tab.

bmckerritnu
Contributor

Great thanks,

I had just found that, and I also found that you run the integrity checks on the 'destinations' - duh!

Is it just me, or is this whole product just all over the place?

KBuchanan
Enthusiast

I suggest you call support and open an SR. It is the only way they are going to understand that the product (VDR) isn't working for everyone! I'm on a crusade to tell people: test VDR and DO NOT rely on it as your sole backup solution. Come on people...this is a v1.0 product!!! It is foolish to just accept it as a completely reliable backup solution without the due diligence of testing it!!

I'm not saying anything that support hasn't already told me. When I asked if I could rely on VDR, I was told I should test it until I feel confident with it - OR, I was never even given an answer.

VDR has a LONG way to go...it has lots of promise - but what's the point if it doesn't work?!? My opinion is that anyone that uses VDR as their backup solution is taking a major risk!

Bottom line...keep opening SRs. At least they can collect statistical information, log files, etc. on what is breaking and, hopefully, use this to help fix the problems.
