ghettoVCB.sh - Free alternative for backing up VMs for ESX 3.5+ and ESXi

VERSION 26 Published

Created on: Nov 17, 2008 7:04 PM by lamw - Last Modified:  Jan 7, 2009 12:27 PM by lamw

This script performs backups of virtual machines residing on ESX 3.5+ and ESXi servers using a methodology similar to VMware's VCB tool. The script takes a snapshot of each live virtual machine, backs up the master VMDKs and then, upon completion, deletes the snapshot until the next backup. The only caveat is that it uses resources available to the Service Console of the ESX server running the backups, as opposed to the traditional method of offloading virtual machine backups through a VCB proxy.

This script has been tested on ESX 3.5u3 and ESXi u3 and supports the following backup media: LOCAL STORAGE, SAN and NFS. The script is non-interactive and can be set up to run via crontab. Currently, the script accepts a text file that lists the display names of the Virtual Machine(s) to be backed up. The destination path is specified by editing the top portion of the script itself. It is important to note that the backup destination specified in the VM_BACKUP_DIR variable must be the full path to storage (LOCAL, NFS or SAN) that is presented to the ESX(i) server as a datastore (i.e. datastores located within /vmfs/volumes/).
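As a rough illustration of the VM_BACKUP_DIR requirement, the check below rejects any destination that is not under /vmfs/volumes/. This is only a sketch: the helper name is_datastore_path and the example path are made up for illustration and are not taken from ghettoVCB.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of ghettoVCB.sh): succeed only if the
# given path sits below /vmfs/volumes/<something>.
is_datastore_path() {
    case "$1" in
        /vmfs/volumes/?*) return 0 ;;
        *)                return 1 ;;
    esac
}

# Example value only; substitute the datastore used for backups.
VM_BACKUP_DIR="/vmfs/volumes/backup-datastore/vm-backups"

if is_datastore_path "$VM_BACKUP_DIR"; then
    echo "OK: $VM_BACKUP_DIR is on a datastore"
else
    echo "ERROR: $VM_BACKUP_DIR is not under /vmfs/volumes/" >&2
fi
```

The check only looks at the path prefix; it does not confirm that the datastore is actually mounted.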

Additionally, for ESX environments that don't have persistent NFS datastores designated for backups, the script can automatically connect the ESX server to an NFS exported folder and then, upon backup completion, disconnect it from the ESX server. The connection is established by creating an NFS datastore link, which enables monolithic (or thick) VMDK backups, as opposed to the usual *nix mount command, which necessitates breaking VMDK files into the 2gbsparse format for backup. Enabling this mode is self-explanatory once you open the script for editing (Note: the VM_BACKUP_DIR variable is ignored if ENABLE_NON_PERSISTENT_NFS=1).
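The relationship between VM_BACKUP_DIR and the non-persistent NFS mode can be sketched as follows. This is a dry run under stated assumptions, not the script's actual code: NFS_SERVER, NFS_EXPORT and NFS_LABEL are hypothetical names and values, and the esxcfg-nas lines are only printed, on the assumption that an NFS datastore is added with "esxcfg-nas -a -o <host> -s <share> <label>" and removed with "esxcfg-nas -d <label>".

```shell
#!/bin/sh
# Only ENABLE_NON_PERSISTENT_NFS, VM_BACKUP_DIR and NFS_VM_BACKUP_DIR are
# named in this document; every other name and value is illustrative.
ENABLE_NON_PERSISTENT_NFS=1
VM_BACKUP_DIR="/vmfs/volumes/local-backups"
NFS_SERVER="nfs.example.com"        # hypothetical
NFS_EXPORT="/exports/vm-backups"    # hypothetical
NFS_LABEL="nfs-backup"              # hypothetical datastore label
NFS_VM_BACKUP_DIR="ghettoVCB"

# Print the directory that backups would be written to. When the
# non-persistent NFS mode is on, VM_BACKUP_DIR is ignored.
resolve_backup_dir() {
    if [ "$ENABLE_NON_PERSISTENT_NFS" = "1" ]; then
        echo "/vmfs/volumes/$NFS_LABEL/$NFS_VM_BACKUP_DIR"
    else
        echo "$VM_BACKUP_DIR"
    fi
}

# Dry run: print the connect/disconnect commands instead of running them.
echo "esxcfg-nas -a -o $NFS_SERVER -s $NFS_EXPORT $NFS_LABEL"
echo "backup target: $(resolve_backup_dir)"
echo "esxcfg-nas -d $NFS_LABEL"
```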

In its current configuration, the script will overwrite the last backup of the Virtual Machine if it exists in the target backup location; this, however, can be modified to fit your procedures if need be. Please be diligent in running the script in a test or staging environment before using it on live production Virtual Machines; this script functions well within our environment, but there is a chance that it may not fit well into other environments.


NEW UPDATES

UPDATE (01/07/2009): Small bug found, thanks to shechtl. The VMDK array was not being reset, so the previous VMDK disks were appended and the script tried to back up the previous VMDKs again. This has been fixed and an updated ghettoVCB-enhance.sh has been uploaded. Sorry about that, folks.

UPDATE (01/06/2009): User NoSa discovered a corner case that could cause a disk_lib() warning message when executing a backup. The issue arises when a user removes a VM disk using the VIC with either option, "Remove from virtual machine" or "Remove from virtual machine and delete file from disk", which could cause some unexpected behavior.

Case "Remove from virtual machine": If you remove the VM disk, the VMDK continues to exist on the filesystem and the .vmx file still contains an entry for it with the present flag set to false, so the backup script would still back up the VMDK.

Case "Remove from virtual machine and delete file from disk": If you remove the VM disk, the VMDK is deleted from the filesystem but a reference still exists in the .vmx file, so the script would try to back up the deleted VMDK, causing the warning message.

This is now fixed: the script looks only for disks that are validly presented to the VM and backs up only those VMDK(s).
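The idea behind the fix can be sketched like this. It is not the code used in ghettoVCB-enhance.sh; it is a minimal illustration that assumes the .vmx file holds entries of the form scsi0:0.present = "true" and scsi0:0.fileName = "disk.vmdk". The function names are hypothetical.

```shell
#!/bin/sh
# Print the fileName of every scsi/ide disk whose "present" flag is true.
list_present_vmdks() {
    awk -F' *= *' '
        /^(scsi|ide)[0-9]+:[0-9]+\.present/ {
            dev = $1; sub(/\.present$/, "", dev)
            val = tolower($2); gsub(/"/, "", val)
            present[dev] = (val == "true")
        }
        /^(scsi|ide)[0-9]+:[0-9]+\.fileName/ {
            dev = $1; sub(/\.fileName$/, "", dev)
            val = $2; gsub(/"/, "", val)
            # Ignore non-disk images such as .iso files.
            if (val ~ /\.vmdk$/) { file[dev] = val; order[++n] = dev }
        }
        END {
            for (i = 1; i <= n; i++)
                if (present[order[i]]) print file[order[i]]
        }
    ' "$1"
}

# Keep only the present disks whose VMDK still exists on the filesystem.
# Relative file names are resolved against the directory of the .vmx file.
list_backup_vmdks() {
    vmx=$1
    vmdir=$(dirname "$vmx")
    list_present_vmdks "$vmx" | while IFS= read -r vmdk; do
        case "$vmdk" in
            /*) path=$vmdk ;;
            *)  path=$vmdir/$vmdk ;;
        esac
        if [ -f "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

Calling list_backup_vmdks with the path to a .vmx file prints one VMDK path per line. A disk with present set to false is skipped (the first case above), and so is a disk whose file has been deleted (the second case).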

Additional features:

  • Added a new flag at the top of the script, "POWER_VM_DOWN_BEFORE_BACKUP". If set to 1, the script will try to power down the guestOS prior to taking a backup; no snapshot is taken in this case, as the primary VMDK will not be locked. Once the backup has finished, the script powers the VM back on and continues on to the next VM. By default this option is disabled.

  • Added a new flag at the top of the script, "ENABLE_HARD_POWER_OFF", which allows you to force a hard power off if the guestOS has not shut down within the time interval specified by the variable "ITER_TO_WAIT_SHUTDOWN". This variable sets the number of 3 second iterations to wait for a guestOS shutdown; if the guestOS does not shut down within that interval, a forced hard power off will occur. For example, the default value is set to "4" in the script, which means it will check every 3 seconds and wait up to 12 seconds in total before the hard power off. This functionality is optional; by default this option is disabled, along with powering off VMs prior to backups.
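The wait-then-force logic described above might look roughly like the following. This is a sketch, not the script's code: SLEEP_SECS and the three vm_* functions are hypothetical stand-ins, and on a real host they would wrap the platform's own power commands. It also assumes that, with ENABLE_HARD_POWER_OFF disabled, the loop simply keeps waiting until the VM is off.

```shell
#!/bin/sh
ENABLE_HARD_POWER_OFF=1      # flag described above
ITER_TO_WAIT_SHUTDOWN=4      # default value quoted above
SLEEP_SECS=3                 # hypothetical variable for the 3 second check interval

# Hypothetical stand-ins that simulate a guest ignoring the shutdown request.
VM_STATE=on
vm_is_on()         { [ "$VM_STATE" = "on" ]; }
vm_soft_shutdown() { :; }
vm_hard_off()      { VM_STATE=off; }

wait_for_shutdown() {
    vm=$1
    echo "Powering off initiated for $vm, backup will not begin until VM is off..."
    vm_soft_shutdown "$vm"
    i=0
    while vm_is_on "$vm"; do
        # Give up on the soft shutdown once all iterations are used.
        if [ "$ENABLE_HARD_POWER_OFF" = "1" ] && [ "$i" -ge "$ITER_TO_WAIT_SHUTDOWN" ]; then
            vm_hard_off "$vm"
            echo "Hard power off occurred for $vm, waited for $((i * SLEEP_SECS)) seconds"
            break
        fi
        i=$((i + 1))
        echo "VM is still on - Iteration: $i - waiting ${SLEEP_SECS}secs"
        sleep "$SLEEP_SECS"
    done
    echo "VM is off"
}
```

Running wait_for_shutdown with a VM name and the defaults above performs four 3 second checks (12 seconds in total) before the simulated hard power off.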

If you have any questions, feel free to ask. You may also comment out any unnecessary output.

Lastly, since this new script introduces some new features, I encourage you to test this thoroughly prior to executing on your production environment. I've gone ahead and renamed the script to "ghettoVCB-enhance.sh" for this release as a side update and I'll keep the original script for download as well.

Here is a sample run on ESX 3.5u3:

[root@himalaya scripts]# ./ghettoVCB-enhance.sh example_virtual_machine_backup_list
Powering off initiated for Quentin, backup will not begin until VM is off...
VM is still on - Iteration: 1 - waiting 3secs
VM is still on - Iteration: 2 - waiting 3secs
VM is off
################## Starting backup for Quentin ... #####################
Destination disk format: VMFS thick
Cloning disk '/vmfs/volumes/himalaya-local-SAS.VMStorage/Quentin/Quentin_1.vmdk'...
Clone: 100% done.
Destination disk format: VMFS thick
Cloning disk '/vmfs/volumes/himalaya-local-SAS.VMStorage/Quentin/Quentin.vmdk'...
Clone: 100% done.
Powering back on Quentin
#################### Completed backup for Quentin! ####################


Start time: Tue Jan  6 16:09:35 PST 2009
End   time: Tue Jan  6 16:12:12 PST 2009
Duration  : 2.62 Minutes

Completed backing up specified Virtual Machines!

Here is a sample run on ESXi 3.5u3 (I recommend raising the ITER_TO_WAIT_SHUTDOWN value, as ESXi may for some reason take longer to shut a guest down):

~ # ./ghettoVCB-enhance.sh backup
Powering off initiated for UCSB-ENGINEERING, backup will not begin until VM is off...
VM is still on - Iteration: 1 - waiting 3secs
VM is still on - Iteration: 2 - waiting 3secs
VM is still on - Iteration: 3 - waiting 3secs
VM is still on - Iteration: 4 - waiting 3secs
Hard power off occured for UCSB-ENGINEERING, waited for 12 seconds
VM is off
################## Starting backup for UCSB-ENGINEERING ... #####################
Destination disk format: VMFS thick
Cloning disk '/vmfs/volumes/dlgCore-FC-LUN200.Templates/UCSB-ENGINEERING/UCSB-ENGINEERING.vmdk'...
Clone: 100% done.
Powering back on UCSB-ENGINEERING
#################### Completed backup for UCSB-ENGINEERING! ####################


Start time: Sat Jan 10 00:57:37 UTC 2009
End   time: Sat Jan 10 01:01:55 UTC 2009
Duration  : 4.30 Minutes

Completed backing up specified Virtual Machines!



UPDATE (12/04/2008): Small update. In the NON-PERSISTENT NFS section, the check for whether an NFS datastore has been mounted was testing for the wrong exit status value. It should be "0" and not "1".

UPDATE (11/29/2008): Updated a small bit of code with the suggestions from JoSte and aremmes to ensure maximum efficiency. The script name has also been changed from ghettoVCBni.sh to just ghettoVCB.sh

UPDATE (11/26/2008): A new fix has been implemented to allow the names of Virtual Machines provided via the input file to contain spaces. Spaces are not a recommended practice in naming conventions in general; underscores and dashes should be used to add separation in your naming schemes.

INFO (11/24/2008): The text input file should contain the displayName of each Virtual Machine, one per line, and should look similar to the following (the file can be named anything; make sure it is created on ESX or ESXi, or else you'll have issues with the Windows "^M" character):

vm1
vm2
vm3
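If the list was created on Windows anyway, the "^M" (carriage return) characters can be stripped while reading it. The sketch below is illustrative and not taken from the script; read_vm_list is a hypothetical name. It also keeps display names that contain spaces intact.

```shell
#!/bin/sh
# Print each VM display name from the list file, one per line, with
# carriage returns removed and blank lines skipped. The extra test on
# "$vm_name" keeps a final line that has no trailing newline.
read_vm_list() {
    tr -d '\r' < "$1" | while IFS= read -r vm_name || [ -n "$vm_name" ]; do
        [ -n "$vm_name" ] || continue
        printf '%s\n' "$vm_name"
    done
}
```

For example, running read_vm_list against the backup list prints the cleaned names, which can then be fed to a loop that processes one VM at a time.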

UPDATE (11/22/2008): The script has been updated with the following fixes and additional features:

Fixes:


  • An issue that involved snapshot commits during virtual machine backups has been addressed. Performance degradation may occur if snapshot remove operations persist and accumulate in number during the backup process. The script now ensures that the snapshot removal process completes on the current virtual machine prior to continuing onto the next virtual machine specified in the backup list.

Additional features:

  • The script now detects virtual machines that contain existing snapshots or RDMs (raw device mappings). If either is present, the script will skip the virtual machine in question and continue on to the next virtual machine specified in the backup list.

  • Backup rotation is now possible. To specify the number of backups to keep, please modify the VM_BACKUP_ROTATION_COUNT variable at the top of the script. For example, a value of 3 will tell the script to retain three prior backups of the specified virtual machine. Old virtual machine backups that fall beyond the third prior backup will be deleted. If this value is left blank (NULL), the script will default to 1, that is, every new backup will overwrite the previous backup if it exists. The default format of the rotated backup directory is: <VMDISPLAYNAME>-YYYY-MM-DD. The suffix, YYYY-MM-DD, can be modified via the VM_BACKUP_DIR_NAMING_CONVENTION variable at the top of the script to fit your backup procedures.

  • The script now provides the option to output VMDK files in THICK (default behavior) or 2GB SPARSE format on the backup datastore. A self-explanatory variable at the top of the script can be modified to enable 2GB SPARSE format output.

  • The backup directory for the non-persistent NFS backup feature can now be specified. Please modify the NFS_VM_BACKUP_DIR variable to specify the backup directory name. This directory will be created if it does not exist. This variable MUST be defined or else the script will exit.
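The backup rotation described in the list above can be sketched as follows. This is one possible reading, not the script's implementation: it assumes the default <VMDISPLAYNAME>-YYYY-MM-DD directory naming and treats the count as the number of dated directories to keep, newest first. rotate_backups is a hypothetical name.

```shell
#!/bin/sh
# Keep the newest $3 dated backup directories of VM $2 under $1 and
# delete the rest. Relies on YYYY-MM-DD sorting correctly as text.
rotate_backups() {
    backup_dir=$1
    vm_name=$2
    keep=${3:-1}          # blank count falls back to 1, as described above
    for dir in "$backup_dir/$vm_name"-????-??-??; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done | sort -r | awk -v keep="$keep" 'NR > keep' |
    while IFS= read -r old; do
        rm -rf "$old"
    done
}
```

With four dated directories for a VM and a count of 3, only the oldest directory is removed; directories belonging to other VMs are left alone.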

UPDATE (11/17/2008): Thanks to JoSte, who found a small duplication of code that caused multiple snapshots to be taken if a Virtual Machine had more than 1 VMDK. The code has been updated and optimized to snapshot the Virtual Machine only once and then back up all VMDK(s). Please download the latest script, which has been uploaded with this fix.

Dec 17, 2008 10:16 AM Shaveht  says:

How can I edit it to run on VIMA?

Dec 19, 2008 8:59 AM lamw  says:

Forgot to mention, you can also find more scripts/resources located at:

http://engineering.ucsb.edu/~duonglt/vmware/

Dec 19, 2008 9:08 AM lamw  says: in response to: Shaveht

Hi Shaveht,

This script cannot be executed locally on VMware VIMA, but it can be kicked off remotely from VIMA or any other UNIX/Linux host, or even from a Windows host using plink.

The process is as follows:

You store the "ghettoVCB.sh" on shared storage (FC/iSCSI SAN / NFS), including the list of Virtual Machines you would like to backup. You can then create a cron entry that will kick off the backup process say every Friday at 12am. To ensure this does not require a user to enter a password, you'll need to utilize public/private SSH Key Authentication. This would then ssh into the ESX/ESXi host(s) and execute this script passing in the list which is also stored on the shared storage and the backup process would be kicked off.

This allows you to centralize your backup process without having to create separate cron entries on each of the ESX and ESXi hosts and storing a copy on each system. This does add an additional system to the process and if you know this VIMA host will always be accessible to the ESX/ESXi cluster, you should be fine.

The command in the cron entry would look like the following:

ssh root@A.B.C.D "/vmfs/volumes/SOME_DATASTORE/ghettoVCB.sh /vmfs/volumes/SOME_ADMIN_DATASTORE/virtual_machine_backup_list"


Again, you would just create a cron entry for each of the ESX/ESXi hosts and ensure that the datastore containing the backup script and backup list is presented and visible to all hosts. The datastore used to back up all the VMs must also be presented and visible to all hosts.
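To illustrate, the cron lines for a set of hosts could be generated with something like the sketch below. The helper name and the second host address are made up; "0 0 * * 5" is standard crontab syntax for 00:00 every Friday, and the SSH key authentication described earlier is assumed to be in place already.

```shell
#!/bin/sh
# Hypothetical helper: print one crontab line per ESX/ESXi host.
# Arguments: schedule, path to script, path to VM list, then the hosts.
make_cron_lines() {
    schedule=$1
    script=$2
    list=$3
    shift 3
    for host in "$@"; do
        printf '%s ssh root@%s "%s %s"\n' "$schedule" "$host" "$script" "$list"
    done
}

make_cron_lines "0 0 * * 5" \
    /vmfs/volumes/SOME_DATASTORE/ghettoVCB.sh \
    /vmfs/volumes/SOME_ADMIN_DATASTORE/virtual_machine_backup_list \
    A.B.C.D E.F.G.H
```

The printed lines can be reviewed and then added to the crontab on the VIMA (or other UNIX/Linux) host.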

Hopefully this clears up some of your questions. Good luck!

Dec 29, 2008 12:35 AM Shaveht  says: in response to: lamw

Hi lamw,
Thanks for the explanation.
I was indeed confused about how to use it.

Thanks again.