5 Replies Latest reply on May 1, 2011 11:38 AM by ahoogerhuis

    long-running backups vs DRS

    MillardJK Enthusiast
    vExpert

      Because I'm running a backup against 100+ VMs to fairly slow (i.e., cheap) disk, I've run into the following timing problem with ghettoVCBg2:

      1. Guest X is running on Host B
      2. ghettoVCBg2 completes the backup of all selected guests on Host A and moves to processing guests on Host B
      3. DRS moves Guest X to Host A before ghettoVCBg2 is able to begin backing it up.
      4. ghettoVCBg2 completes the backup of all selected, "reachable" guests on Host B
      5. Guest X is not backed up because it "cannot be found"

       

      I know there are efficiencies in logging into a given host and processing all of its selected guests, but I'd accept some inefficiency if it meant my guests got backed up and I didn't have to disable DRS while the backup runs.

        • 1. Re: long-running backups vs DRS
          vmbru Enthusiast

          We don't have DRS, but are the "moves" to save power? Are you shutting down the ESX(i) hosts at night, or dynamically shuffling your VMs around for better performance? You may have to dumb DRS down in order to use this script. Maybe DRS can peek at the vMA appliance and check for activity before it starts dancing?

           

          Could you schedule the "moves" at a later time or pause during your backup window?  You are doing this after-hours, maybe do the "moves" first then the backups?

           

          This "poor man's" script pretty much does a quick scan of each host and then performs the backups, and it isn't really meant to work with vCenter DRS or Storage vMotion. You may have to look at an app like VDR, which is DRS-aware and can "track" which host a VM is on.

           

           


          • 2. Re: long-running backups vs DRS
            MillardJK Enthusiast
            vExpert

            In my case, DRS isn't moving guests because of a power-saving rule for the cluster; it's only doing resource load balancing--and it's not doing it very aggressively at that. It's quite possible the actions of the script (copying VMDKs wholesale) are changing the host's load profile for DRS, by sequentially "hitting" each selected guest on a given host.

             

            I've been into the guts of the script; it should be possible to revise the algorithm to "go looking" for a VM on the hosts in a cluster. Alternatively, it might also be possible to temporarily disable DRS for the cluster during the backup window.
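A minimal sketch of the "go looking" idea, written as a wrapper rather than a change to ghettoVCBg2 itself. It assumes you can supply some command that prints the VM names registered on a given host, one per line; that lister is hypothetical here and is passed in as the first argument, so nothing about the real toolkit is implied.

```bash
#!/usr/bin/env bash
# find_vm_host LISTER VM HOST...
#
# LISTER is a HYPOTHETICAL command or shell function (not part of
# ghettoVCBg2) that takes a host name and prints the names of the VMs
# currently registered on it, one per line.
# Prints the first host on which VM is found; returns 1 if none has it.
find_vm_host() {
    local lister=$1 vm=$2 host
    shift 2
    for host in "$@"; do
        # -F: literal string, -x: match the whole line, -q: no output
        if "$lister" "$host" | grep -Fxq -- "$vm"; then
            printf '%s\n' "$host"
            return 0
        fi
    done
    return 1
}
```

Calling this immediately before each guest's backup, instead of once per host at the start of the run, narrows the window in which DRS can move the guest unnoticed. It does not close the window entirely.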

             

            I'm not doing these backups frequently, as our data recovery process is already taking care of the high-velocity changes through other means. The purpose of the script is to take monthly backups of all the VMs for DR purposes, regardless of the data state in the other backup system. The few hours I've spent (and will spend) fine-tuning this for that goal (in addition to the cheap storage I'm using) will save the company (tens of?) thousands of dollars over the cost of a vertical enterprise backup solution like CommVault or another VMware-compatible product.

            • 3. Re: long-running backups vs DRS
              vmbru Enthusiast

              Well...

               

              1.  Have you looked at http://www.trilead.com/Editions/ ? It supports CBT and costs under $700. I'm not sure whether it plays well with DRS, but they have a demo.

               

              2.  If you have DRS, you probably have rights to VDR, I'd think. I'm not recommending VDR, as we have had fits with it, but VDR 1.2.1 just came out in March 2011.

               

              NOTE:  We beta tested VDR 2.0.0.1381. It has better features but is still a bit buggy. I'd wait for VDR 2.0.1 to come out.

              • 4. Re: long-running backups vs DRS
                ahoogerhuis Enthusiast

                One option, to reuse the answer from another recent post in here on how to back up only a single VM:

                 

                ./ghettoVCBg2.pl --vmlist <( echo yourvmname )

                 

                wrap a nice loop around it like so:

                 

                # Read names line by line on fd 3 so VM names with spaces survive
                while IFS= read -r n <&3; do
                     ./ghettoVCBg2.pl --vmlist <( echo "$n" )
                done 3< vmlist.txt

                 

                Then acquisition of each VM only happens at the moment its backup starts.

                 

                It's a workaround, but until the script is fixed to defer acquisition of each VM until it's ready to back it up, it works.
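Building on the per-VM loop, here is a hedged sketch that adds one retry pass for guests whose first attempt failed, for example because DRS moved them at the wrong moment. It assumes the backup command returns a non-zero exit status on failure, which has not been verified for ghettoVCBg2.pl. The function name `backup_with_retry` and the `backup_one` wrapper in the comment are illustrative, not part of the script.

```bash
#!/usr/bin/env bash
# backup_with_retry BACKUP_CMD LIST_FILE
#
# Runs BACKUP_CMD once per VM name in LIST_FILE, then retries each name
# whose first attempt returned non-zero. Names that fail both attempts
# are printed to stdout, and the function then returns 1.
#
# ASSUMPTION: BACKUP_CMD signals failure through its exit status.
# A possible wrapper around the command from this thread:
#   backup_one() { ./ghettoVCBg2.pl --vmlist <( echo "$1" ); }
backup_with_retry() {
    local backup_cmd=$1 list_file=$2 vm failed="" still_failed=0

    # First pass: read names on fd 3 so the backup command keeps its stdin.
    while IFS= read -r vm <&3 || [ -n "$vm" ]; do
        [ -n "$vm" ] || continue
        "$backup_cmd" "$vm" || failed="${failed}${vm}"$'\n'
    done 3< "$list_file"

    # Second pass: one more attempt for everything that failed above.
    while IFS= read -r vm <&3; do
        [ -n "$vm" ] || continue
        if ! "$backup_cmd" "$vm"; then
            printf '%s\n' "$vm"
            still_failed=1
        fi
    done 3<<< "$failed"

    return "$still_failed"
}
```

With the wrapper defined, `backup_with_retry backup_one vmlist.txt` would run the whole list. The retry only helps with the "cannot be found" case. It does nothing for a guest that migrates while its backup is already in progress.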

                 

                -A

                • 5. Re: long-running backups vs DRS
                  ahoogerhuis Enthusiast

                  And to reply to my own reply, there is more to this. Another failure mode is that the VM itself can migrate off the host, out from under the backup, while the backup is running. The backup seems to complete, and then the script gets confused.

                   

                  Some quick digging in the VI Perl Toolkit seems to indicate there is (on 4.x at least) access to the DRS settings for both the cluster and the VMs, and thus a way of controlling DRS for the VM while the backup is ongoing. Specifically, ClusterDrsVmConfigInfo has the description "DRS configuration for a single virtual machine. This makes it possible to override the default behavior for an individual virtual machine.", so it should be possible to tell DRS to get its fat fingers off the VM while the script does its deed.

                   

                  I think it's time to dig up some old Perl skills and see if I can butcher the script to do the right thing. I'll post a patch as soon as I have one.

                   

                  -A