VMware Cloud Community
jarsenea
Contributor

VDR Issue with large VMs

Hey all,

I currently have a few webservers, each occupying between 250 and 500GB, that are being backed up with VDR on a nightly basis. The problem, however, is that after the first backup (which I completely understand will take a while), subsequent incremental backups also take an exceedingly long time (hours, not minutes). This is the case even if only 10-50MB of files have changed from one night to another.

Has anyone else experienced this? VDR 1.2 seems to be working better than any earlier version, but I'm hoping I can speed up these incremental backups, because I just won't have enough time in a day to back up all my VMs.

Best,

34 Replies
RParker
Immortal

This is the case even if only 10-50MB of files have changed from one night to another.

VDR is much improved, but it's still a .1 product, meaning it's not ready for prime time. It still has a LONG way to go. It doesn't do changed block tracking or white space skipping inside a VM, so when it scans, it takes an inordinate amount of time, because it's doing the same work that it would do for a FULL backup.

Therefore it takes a long time.

If you want to speed up the process, I would get Vizioncore vRanger. It has more advanced methods, and I can do a FULL backup of a 36GB VM in under 10 minutes on Fiber. By comparison, the same VM takes almost 40 minutes in VDR.

It can do this because of Active Block scanning, which skips the white space, plus it can do compression.

VDR can't do any of this. It's a VERY basic, rudimentary product that is still very much a WORK IN PROGRESS.

jarsenea
Contributor

I'm quite aware that VDR lacks the maturity of vRanger, Veeam or even esXpress; however, I assumed it would at the very least do changed block tracking. Oh well.

I know Veeam doesn't do this, but does vRanger dedupe across the entire infrastructure or target folder for its backups? I know Veeam only does it on a per-backup-job basis, so if you have 50 Windows 2003 VMs, you basically have to throw them all into 1 backup job if you want dedupe. VDR is nice in that you can have 2-3 jobs with the same target and it dedupes the target.

I guess if worse comes to worst, I could use LessFS on a Linux machine as my backup store to do deduplication at the filesystem level instead of in the backup software.

Best,

RParker
Immortal

I know Veeam doesn't do this, but does vRanger dedupe across the entire infrastructure or target folder for its backups?

Not sure; I wondered about this myself.

However, vRanger is a Windows product. If you install Windows Storage Server 2008, that has SIS (Single Instance Storage). So at night or during idle periods, Windows can de-dupe the entire volume / disk, and it doesn't matter whether you have Veeam, vRanger or some other Windows-based backup.

It's possible to de-dupe with Windows itself now...

jarsenea
Contributor

I'm not a fan of SIS. It's not true deduplication and, in my opinion, more of a hack on Microsoft's part than a true deduplication solution.

I don't believe the benefits of doing any type of deduplication are worth it if you're not using a solution that can do block-level deduplication (as VDR does). My VDR store after the first backup was around 2TB of source data; deduplicated, it was approximately 350GB. There's no way you can hit that kind of ratio using SIS.
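To make the file-level versus block-level distinction concrete, here is a minimal toy sketch. It is not how VDR or SIS are actually implemented; the 4-byte block size and the two sample "images" are made up purely for illustration. It only shows why two nearly identical disk images collapse under block-level dedupe but not under whole-file dedupe, and it works out the ratio from the figures quoted above.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, hypothetical block size so the example is easy to follow


def file_level_size(files):
    """Bytes stored when only byte-identical files are collapsed."""
    unique = {hashlib.sha256(data).hexdigest(): len(data) for data in files}
    return sum(unique.values())


def block_level_size(files):
    """Bytes stored when each distinct fixed-size block is kept once."""
    unique = {}
    for data in files:
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


# Two hypothetical "disk images" that differ only in their last block.
image_a = b"AAAABBBBCCCCDDDD"
image_b = b"AAAABBBBCCCCEEEE"
files = [image_a, image_b]

raw_size = sum(len(f) for f in files)
print(raw_size, file_level_size(files), block_level_size(files))  # 32 32 20

# The ratio quoted in the post: roughly 2TB before, 350GB after.
ratio = 2000 / 350
print(f"{ratio:.1f}:1")  # 5.7:1
```

The whole-file approach saves nothing here because the two images are not byte-identical, while the block-level approach stores the three shared blocks only once.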

RParker
Immortal

I'm not a fan of SIS. It's not true deduplication and, in my opinion, more of a hack on Microsoft's part than a true deduplication solution.

That's funny, because SIS is used on some of the biggest SANs in the world, including EMC and NetApp. That's how de-duplication is done on those SANs. It's not designed by Microsoft AT ALL; they licensed the product, and that's why it's a good idea.

And it was a solution for vRanger / Veeam to de-dupe the ENTIRE store. VDR might have good ratios, but would you prefer to wait on large files during incrementals?

It's a trade-off. SIS has good ratios when the data is on the same volume, maybe not 4:1, but then WE control the de-dupe rather than the appliance doing it.

admin
Immortal

Is your VM at hardware version 7? HWv7 is required to take advantage of changed block tracking (CBT).

jarsenea
Contributor

Yes, they are. We only deployed VMware as of vSphere 4, and all VMs that were created and/or migrated (P2V'ed) were set to HW7.

For example, tonight, my main webserver, which occupies about 270GB of space, took over 8 hours to do an incremental backup. Only about 150MB had changed. I'm not sure exactly how long it took, but when I left the office it had gotten to 18% after about 2 hours of "Copying..."

Pitterling
Contributor

Is CTK enabled? Check the VM's directory for *-ctk.vmdk files.

If not, you have to enable it manually (although VDR should actually do this for you).

http://itknowledgeexchange.techtarget.com/virtualization-pro/what-is-changed-block-tracking-in-vsphe...

The VM needs to be powered off for this change.
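The check for *-ctk.vmdk files can be scripted if you have many VMs. The following is a rough sketch, assuming the VM's datastore folder is readable as an ordinary directory from wherever the script runs. The function name and the sample file names are hypothetical, and a present -ctk.vmdk file is only a hint that tracking is on, not proof.

```python
import tempfile
from pathlib import Path


def disks_missing_ctk(vm_dir):
    """Return descriptor .vmdk names in vm_dir that have no matching -ctk.vmdk file."""
    names = {p.name for p in Path(vm_dir).glob("*.vmdk")}
    missing = []
    for name in sorted(names):
        # Skip the data, tracking and snapshot delta files themselves.
        if name.endswith(("-flat.vmdk", "-ctk.vmdk", "-delta.vmdk")):
            continue
        ctk_name = name[:-len(".vmdk")] + "-ctk.vmdk"
        if ctk_name not in names:
            missing.append(name)
    return missing


# Demo against a throwaway directory with made-up file names:
# the first disk has a tracking file, the second does not.
with tempfile.TemporaryDirectory() as demo_dir:
    for fname in ("INTERMED.vmdk", "INTERMED-flat.vmdk", "INTERMED-ctk.vmdk",
                  "INTERMED_1.vmdk", "INTERMED_1-flat.vmdk"):
        (Path(demo_dir) / fname).touch()
    demo_result = disks_missing_ctk(demo_dir)

print(demo_result)  # ['INTERMED_1.vmdk']
```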

jarsenea
Contributor

Yes, ctk is enabled on all my VMs. I had to do this when I was evaluating Veeam. One of the reasons I didn't go with Veeam is that I'm a little worried about having a single massive file that could be 20-30TB in size. That and I couldn't separate my backup jobs while maintaining deduplication within the datastore.

Pitterling
Contributor

Then I would check the permissions of the user who connects to the vCenter Server.

Please check the VDR 1.2 admin guide.

In my case, rights were missing, so the HotAddCopy feature didn't work and the changed data was transferred over the network. My console network speed is limited to 100Mbit, so it takes a while to transfer the data. Now that I have fixed this, data is transferred from disk to disk and backup times are dramatically reduced.

The VDR 1.2 guide also describes how to check the detailed logs. You need to hold the SHIFT key when clicking the Log entry and also when clicking the Refresh link.
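For a feel of what a 100Mbit link means here, a back-of-the-envelope calculation helps. This is only a sketch: it ignores protocol overhead, dedupe processing and disk speed, it assumes decimal units, and the 270GB / 150MB figures are borrowed from the webserver described earlier in the thread, whose actual link speed is not known.

```python
def transfer_hours(size_gb, link_mbit_per_s):
    """Hours to move size_gb (decimal GB) over a link, ignoring all overhead."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 3600


full_disk = transfer_hours(270, 100)      # reading the whole 270GB disk
changed_only = transfer_hours(0.15, 100)  # only the ~150MB that changed

print(f"full disk:    {full_disk:.1f} h")            # full disk:    6.0 h
print(f"changed only: {changed_only * 3600:.0f} s")  # changed only: 12 s
```

If an "incremental" over the network takes hours rather than seconds, one possible explanation is that far more than the changed data is being read or transferred, but the numbers alone don't prove that.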

jarsenea
Contributor

Thanks for the suggestion.

I was afraid of this a while ago, so I created a dedicated account that has complete control over the vSphere environment (it is part of the Administrator group). The Active Directory account is also a member of Domain Admins, to make sure it has full access to any resources it needs. It may be overkill, but I wanted to rule that out. Unfortunately, it has been running like that since 1.2 was installed (and it was installed from scratch, with no previous data brought in), and I'm still experiencing the issue.

RParker
Immortal

Then I would check the permissions of the user who connects to the vCenter Server.

It's not a permissions problem, it's VDR. People have been complaining since the product came out. VDR has very limited functionality, and it wasn't tested well enough before it was introduced into vSphere.

He is also talking about 250-500GB VMs. We have much smaller ones that do the same thing, so it's not out of the realm of possibility that he has discovered yet ANOTHER in a long list of flaws in VDR.

VDR is slow, it's missing features, and it has many bugs. SOME of them were fixed recently; MOST are still causing problems. Permissions only affect whether VDR can make changes to itself to hot-add SCSI drives, not how VDR works.

Besides, I find that network copies are almost as fast as Hot Add, depending on the configuration, so the fact that the drives were not added to the VDR appliance during backup wouldn't make that much of a difference.

Pitterling
Contributor

Then check the logs.

jarsenea
Contributor

Glad I didn't attempt to back up my file servers. We're talking anywhere from 1-4TB per server ;)

Pitterling
Contributor

In my case, it reduced backup times dramatically, from 2 hours to 20 minutes.

Furthermore, I would check network and disk transfer rates, service times, etc., as well as the VDR VM's CPU and memory usage.

It might be that other solutions perform better, but that doesn't explain why it performs so slowly in JRArseneau's environment.

admin
Immortal

I assume this is a HW7 VM? CBT only works on HW7 VMs. One way to check is to look in the VDR logs. If the log says that it is always performing a full backup on this VM, then CBT is not being leveraged. If CBT is being leveraged, the log will state that VDR is performing an incremental backup. It should also state whether Hot Add or Network is the storage transport being used. This has an impact on overall backup performance (in terms of the speed of storing the data in the dedupe store).
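As a rough mental model of why CBT matters, consider the toy sketch below. It is not the vSphere implementation; the class and its methods are invented for illustration. It only shows that with tracking, a backup pass touches the blocks written since the previous pass, while without it the whole disk is scanned again.

```python
class TrackedDisk:
    """Toy disk that remembers which blocks were written since the last backup."""

    def __init__(self, block_count):
        self.blocks = [b"\x00"] * block_count
        self.changed = set(range(block_count))  # everything is "new" at first

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)

    def backup(self, use_tracking=True):
        """Return the block indexes a backup pass has to read."""
        if use_tracking:
            to_read = sorted(self.changed)
        else:
            to_read = list(range(len(self.blocks)))  # scan the whole disk
        self.changed.clear()
        return to_read


disk = TrackedDisk(block_count=1000)
first = disk.backup()                        # first pass reads all 1000 blocks
disk.write(42, b"\x01")
disk.write(43, b"\x01")
tracked = disk.backup()                      # reads only blocks 42 and 43
disk.write(7, b"\x01")
untracked = disk.backup(use_tracking=False)  # reads all 1000 blocks again

print(len(first), tracked, len(untracked))  # 1000 [42, 43] 1000
```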

jarsenea
Contributor

I indicated in post #7 that I am indeed using HW7.

Interesting... Your comment about the Network/HotAdd got me thinking.. Of all my VMs, there are 1-2 (out of 15 currently being backed up) that are using "network" as the method of backing up instead of HotAdd.

Any reason for this?

PS: Again, ctk is enabled for all VMs.

RParker
Immortal

that are using "network" as the method of backing up instead of HotAdd.

The ESX host that VDR lives on MUST have access to ALL the LUNs it will back up. Not only that, but ALL of the VMs it needs to back up must be in the SAME datacenter.

Otherwise it will use network backup, not LAN-free backup.

jarsenea
Contributor

I should add that I have two backup stores; both are RDMs attached directly to the VDR appliance and connected to a backup fibre storage system.

Here's one of the logs:

7/7/2010 1:09:08 PM: Normal backup using Default Policy: Web/Application Servers (9p-6a)

7/7/2010 1:09:27 PM: Copying INTERMED

7/7/2010 1:09:49 PM: Performing incremental back up of disk "[VMFS_DATA_1] INTERMED/INTERMED-flat.vmdk" using "Network"

7/7/2010 2:02:00 PM: Performing incremental back up of disk "[VMFS_DATA_1] INTERMED/INTERMED_1-flat.vmdk" using "Network"

7/7/2010 11:04:10 PM: Task completed successfully
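To see where the time goes, the log above can be broken down per disk. The sketch below parses exactly the lines shown, under two assumptions that are mine rather than anything documented: that each disk step ends when the next log line is written, and that the timestamps are month/day/year with a 12-hour clock. The function and variable names are hypothetical.

```python
import re
from datetime import datetime

LOG = """\
7/7/2010 1:09:08 PM: Normal backup using Default Policy: Web/Application Servers (9p-6a)
7/7/2010 1:09:27 PM: Copying INTERMED
7/7/2010 1:09:49 PM: Performing incremental back up of disk "[VMFS_DATA_1] INTERMED/INTERMED-flat.vmdk" using "Network"
7/7/2010 2:02:00 PM: Performing incremental back up of disk "[VMFS_DATA_1] INTERMED/INTERMED_1-flat.vmdk" using "Network"
7/7/2010 11:04:10 PM: Task completed successfully
"""

LINE = re.compile(r"^(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M): (.*)$")
DISK = re.compile(r'Performing (\w+) back up of disk "([^"]+)" using "([^"]+)"')


def summarize(log_text):
    """Return (disk, kind, transport, duration) for each disk step in the log."""
    events = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            stamp = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
            events.append((stamp, m.group(2)))
    rows = []
    # Pair every entry with the one that follows it to get a duration.
    for (start, message), (end, _next) in zip(events, events[1:]):
        d = DISK.search(message)
        if d:
            kind, disk, transport = d.groups()
            rows.append((disk, kind, transport, end - start))
    return rows


rows = summarize(LOG)
for disk, kind, transport, duration in rows:
    print(f"{disk}: {kind} via {transport}, {duration}")
# [VMFS_DATA_1] INTERMED/INTERMED-flat.vmdk: incremental via Network, 0:52:11
# [VMFS_DATA_1] INTERMED/INTERMED_1-flat.vmdk: incremental via Network, 9:02:10
```

Read this way, both disks were logged as incremental over the "Network" transport, and the second disk accounts for roughly nine of the ten hours.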
