Hello,
I've created a hot backup script for ESX servers for our own purposes. Features are:
\- Autodetects the VMs to be backed up: you can specify a single VM, all powered-on VMs, or simply all VMs on one ESX host
\- Sends an email to an administrative account stating whether each backup succeeded or failed, and sends another mail with a summary per ESX host
\- Backups older than a specified period are deleted automatically
\- Uses an ESX storage to transfer the backup files (can be managed via VI Client)
\- Skips VMs whose names are stored in an external file
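The automatic purge can be pictured roughly as follows. This is only a minimal sketch, not the script's exact code: `purge_old_backups` is a hypothetical helper name, and it assumes a `find` that supports `-mindepth` and `-maxdepth` (GNU find does).

```shell
#!/bin/bash
# Hypothetical helper: remove backup folders and log files directly below
# a VM's backup folder that were last modified more than $2 days ago.
purge_old_backups() {
    local vmfolder=$1 days=$2
    # -mindepth 1 protects the VM folder itself; -maxdepth 1 stops find
    # from descending into the backup folders it is about to delete.
    find "$vmfolder" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf {} \;
}
```

Called as `purge_old_backups /vmfs/volumes/ESXBackup/VMNAME 7`, it would keep roughly the last week of backups for that VM.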
Configuration:
\- store file ESXBackup in /usr/bin and chmod it executable
\- store file sendmail.pl in /usr/bin and chmod it executable
\- open port 25 in ESX firewall if you intend to send mails
\- create storage in ESX GUI (we use NFS storage, which works well)
\- create a local account ESXBackup on each ESX host where you want to run the backup. This account has to be an ESX administrator
\- Modify the configuration section in the ESXBackup script to match your environment. The config settings should be self-explanatory
\- Create a file on your backup storage with the names of VMs which should not be backed up automatically. You can later specify this file at backup time; the named VMs will be skipped
Usage: Log onto the console of the ESX host and run "ESXBackup" without parameters. Some usage information will be displayed.
To backup a single VM use this command: "ESXBackup -v VMNAME"
To backup all powered on VMs use that: "ESXBackup -v poweredon"
You may want to run ESXBackup from cron; be sure to modify root's crontab ("crontab -e -u root") ...
Known bugs and issues:
\- Sometimes the backup ends with error code 141. This error indicates that vcbMounter has created a snapshot but not deleted it. Delete this snapshot manually, or the next backup won't run either.
\- In our environment, the NFS server is sometimes not fast enough to handle the transfer rate, so the NFS client (our ESX server) receives an error and the transfer stops. If this is the case, use a sync export for your NFS share instead of an async one (but this slows down the backup dramatically), or mount the NFS share via the ESX console and not as an ESX storage.
\- VMs should not be moved from one ESX host to another during backup.
\- Logfiles are stored in /var/log - if you don't need them anymore, delete them manually
\- Be aware that VM names are CASE-sensitive. If you use the exclude list, write one VM name per line without spaces and tabs.
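To illustrate the exclude-list format, here is a small sketch. `is_excluded` and the sample VM names are hypothetical; the script itself uses `grep -i -x` followed by a string comparison, while this variation uses `grep -F -x` so the name is treated as a literal, case-sensitive, whole-line match.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the VM name appears in the exclude file
# as a complete line. -F treats the name as a literal string, -x demands a
# whole-line match, so the comparison is exact and case-sensitive.
is_excluded() {
    local name=$1 file=$2
    [ -f "$file" ] && grep -F -x -q -- "$name" "$file"
}

# Example exclude file: one VM name per line, no leading or trailing blanks.
exclude_file=$(mktemp)
printf '%s\n' 'TestVM01' 'Sandbox' > "$exclude_file"

is_excluded 'TestVM01' "$exclude_file" && echo "TestVM01 would be skipped"
is_excluded 'testvm01' "$exclude_file" || echo "testvm01 would be backed up"
```

A stray trailing space in the file would make the whole-line match fail, which is why the list must not contain spaces or tabs.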
Planned enhancements:
\- Detect and delete "lost" \_VCB-SNAPSHOT_ snapshots
\- Retry backup if one VM fails
Note: This script is provided "as is" - no warranty or support. Feature requests are welcome. This was my first "more-than-a-three-liner" shell script, so many improvements can be made.
If you find this script useful (or not) please let me know ...
Have fun,
Alex
#!/bin/bash
#
\# ESX Backup script
#
\# 18. Dezember 2006
\# Version 0.1c
#
\# Alexander Storf
\# eMail: vmware\[_at_]vmpro.de
#
\### functions ###############################################################
function LogOut() {
echo "$(date '+%b %d %H:%M:%S'): $2" | tee -a $1
}
\### definitions #############################################################
ESX_backupdir=/vmfs/volumes/ESXBackup
ESX_logfile=/var/log/ESXBackup.log
ESX_user=ESXBackup
ESX_pass=Password
ESX_host=$HOSTNAME
ESX_mailfrom=ESX-Admins\@domain.org
ESX_sendmail=/usr/bin/sendmail.pl
ESX_mailhost=mail.domain.de
\### code ####################################################################
echo "ESXBackup V0.1c - ESX host backup script"
echo ""
\#### if no parameters are specified, print usage hints and exit
if \[ -z "$1" ]; then
cat $ESX_summaryfile
ESX_failed=0
ESX_success=0
ESX_skipped=0
for ESX_VMS in `/usr/sbin/vcbVmName -h $ESX_host -u $ESX_user -p $ESX_pass -s $ESX_search | grep -i name: | awk -F: '\{print $2}'`
do
\# skip server if its name is in the exclude list
RetVal=""
if \[ $ESX_exclude ]; then
if \[ -e $ESX_exclude ]; then
RetVal=`grep -i -x $ESX_VMS $ESX_exclude`
fi
fi
if \[ "$RetVal" = "$ESX_VMS" ]; then
LogOut $ESX_logfile " Skipped $ESX_VMS."
echo " SKIPPED virtual machine $ESX_VMS." >> $ESX_summaryfile
ESX_skipped=$(($ESX_skipped+1))
else
LogOut $ESX_logfile " Init backup of $ESX_VMS"
ESX_vmfolder="$ESX_backupdir/$ESX_VMS"
ESX_date=$(date '+%Y%m%d-%H%M')
ESX_backupfolder="$ESX_backupdir/$ESX_VMS/$ESX_date"
\# create folder for backup
LogOut $ESX_logfile " Creating folder $ESX_vmfolder"
mkdir -p $ESX_vmfolder
RetVal=$(expr $?)
if \[ $RetVal != 0 ]; then
LogOut $ESX_logfile " Error creating folder: $RetVal"
fi
\# start backup - destination folder should not exist
LogOut $ESX_logfile " Backing up to folder $ESX_backupfolder"
START=$(date +%s);
ESX_vcb=`/usr/sbin/vcbMounter -h $ESX_host -u $ESX_user -p $ESX_pass -a name:$ESX_VMS -r $ESX_backupfolder -t fullvm`
RetVal=$(expr $?)
END=$(date +%s);
DIFF=$(($END - $START));
echo "$ESX_vcb" >> $ESX_vmfolder/$ESX_date.log
echo "" >> $ESX_vmfolder/$ESX_date.log
\# test whether backup was successful
if \[ $RetVal != 0 ]; then
LogOut $ESX_logfile " Backup error: $RetVal"
echo "Backup time needed: $DIFF seconds." >> $ESX_vmfolder/$ESX_date.log
echo "*** ERROR backing up $ESX_VMS. Error code was $RetVal" >> $ESX_summaryfile
echo " Please refer to the logfile stored on $ESX_vmfolder/$ESX_date.log" >> $ESX_summaryfile
ESX_failed=$(($ESX_failed+1))
if \[ $ESX_mailto ]; then
perl $ESX_sendmail $ESX_mailhost $ESX_mailfrom $ESX_mailto Backup\ failed:\ $ESX_VMS $ESX_vmfolder/$ESX_date.log
fi
else
ESX_hddused=`du -sb $ESX_backupfolder | awk '\{print $1}'`
ESX_throughput=$((($ESX_hddused/(1024*1024))/$DIFF))
LogOut $ESX_logfile " Backup of $ESX_VMS successful. Saved $ESX_hddused bytes."
echo "Backed up $ESX_hddused bytes within $DIFF seconds. Throughput was $ESX_throughput MB/second." >> $ESX_vmfolder/$ESX_date.log
echo " SUCCESS backing up $ESX_VMS. It took $DIFF seconds to back up $ESX_hddused bytes ($ESX_throughput MB/s)." >> $ESX_summaryfile
ESX_success=$(($ESX_success+1))
if \[ $ESX_purge ]; then
LogOut $ESX_logfile " Purging content older than $ESX_purge days."
find $ESX_vmfolder/* -maxdepth 0 -type d -mtime +$ESX_purge -exec rm -rfdv \{} \;
find $ESX_vmfolder/* -maxdepth 0 -type f -mtime +$ESX_purge -exec rm -fv \{} \;
fi
if \[ $ESX_mailto ]; then
perl $ESX_sendmail $ESX_mailhost $ESX_mailfrom $ESX_mailto Backup\ succeeded:\ $ESX_VMS $ESX_vmfolder/$ESX_date.log
fi
fi
LogOut $ESX_logfile " Done backup of $ESX_VMS"
fi
done
if \[ $ESX_mailto ]; then
if \[ $ESX_VMS ]; then
perl $ESX_sendmail $ESX_mailhost $ESX_mailfrom $ESX_mailto Backup\ summary\ $ESX_host:\ $ESX_failed\ failed,\ $ESX_success\ successful,\ $ESX_skipped\ skipped. $ESX_summaryfile
fi
rm $ESX_summaryfile
fi
\#### no host was detected by the given name
if \[ ! $ESX_VMS ]; then
LogOut $ESX_logfile " Unable to detect any host to be backed up."
fi
LogOut $ESX_logfile "Script ESXBackup stopped."
use Net::SMTP;
die ("Usage: sendmail.pl quit;
Hmm... Alex, am I missing something, or is the location of your script not stated in your post? I'm really interested in it!
regards,
Antonello
Message was edited by:
Nementis
Oops... posting lag....
Hi,
it took some time to format the scripts.
Unfortunately the script is still a little bit "mis-formatted", but soon I'll maintain it on my own website - maybe I'll have enough time during the Christmas holidays \*g*
regards,
Alex
Hi, I'm really interested in this script. What is the URL of your website?
Hi,
my web site will take a little more time to come online, and in addition I'll update my script to include the following features:
\- A locking mechanism so that only one backup runs at a time, even across several servers, to reduce the traffic load on the target server.
\- Extra error handling for incomplete vcbMounter activities (such as removing leftover snapshots after a backup has failed).
The job should be done next week; then I'll post the new version here (now I know how to post code \*g*).
regards,
Alex
Hi there,
this is a new version of my backup script. Installation is as easy as it was with the older script. It has been tested with ESX 3.0.1.
Improvements:
\- "lost" \_VCB-BACKUP_ snapshots are deleted if backup fails.
\- possibility to ensure that only one backup runs at a time - even if the script starts on several ESX hosts at the same time
\- exclude VMs from being backed up (create a file on a shared storage listing the EXACT names of the VMs that shouldn't be backed up, one name per line)
\- retry failed backups
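The locking idea could look roughly like the sketch below. It is not the script's own code: the script waits while a lock file exists and then creates it with `touch`, which leaves a small window in which two hosts can both see no lock. The sketch uses `mkdir` instead, which either creates the directory or fails if it already exists. `acquire_lock`, `release_lock` and the lock path are hypothetical names, and whether `mkdir` behaves atomically on your shared storage (NFS in particular) is an assumption worth testing.

```shell
#!/bin/bash
# Hypothetical lock helpers built on mkdir, which creates the directory
# or fails if it already exists, in a single step.
acquire_lock() {
    local lockdir=$1 owner=$2 wait_secs=${3:-60}
    until mkdir "$lockdir" 2>/dev/null; do
        echo "Lock held by: $(cat "$lockdir/owner" 2>/dev/null). Waiting ${wait_secs}s ..."
        sleep "$wait_secs"
    done
    # Record who holds the lock so that waiting hosts can log it.
    echo "$owner" > "$lockdir/owner"
}

release_lock() {
    rm -rf "$1"
}
```

A host would call `acquire_lock /vmfs/volumes/ESXBackup/.lock "VMNAME on $HOSTNAME"` before starting a backup and `release_lock` on the same path afterwards.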
I'm interested in whether anybody uses my script, so please share your experiences and/or simply write me a private message. If nobody uses this script, there's no need to publish it ...
And once again: I provide no warranty at all ... you know what I mean. But it's tested - it should do the job.
Here it goes:
#!/bin/bash
#
\# ESX Backup script
#
\# 11. Jan 2007
\# Version 0.1f
#
\# Alexander Storf
#
\# changelog:
\# From Version 0.1d
\# Jan 08 2007: Delete \_VCB-BACKUP_ snapshot after backup failed
\# Jan 08 2007: Implemented switch to use locking for simultaneous backups
\# From Version 0.1e
\# Jan 10 2007: Implemented optional retry if backup fails
#############################################################################
\### function for logging output (screen and file)
\### Parameters: times. Be aware that
backup time could significantly raise using this
parameter.
EOF
exit 0
fi
\### get command line parameters
while getopts v:t:m:x:r:l Optionen; do
case $Optionen in
v) ESX_VM=$OPTARG;;
t) ESX_purge=$OPTARG;;
m) ESX_mailto=$OPTARG;;
x) ESX_exclude=$OPTARG;;
l) ESX_lock=TRUE;;
r) ESX_retry=$OPTARG;;
esac
done
\### test parameters
if \[ ! $ESX_VM ]; then
echo "You have to define at least a VM to be backed up. Exiting."
echo ""
exit 2
fi
if \[ "$ESX_VM" = "any" ]; then
ESX_search="any:"
elif \[ "$ESX_VM" = "poweredon" ]; then
ESX_search="powerstate:on"
else
ESX_search="name:$ESX_VM"
ESX_singlevm=TRUE
fi
\### test if purge parameter is positive numeric
if \[ ! -z $ESX_purge ]; then
echo $ESX_purge | egrep -q '\[^0-9]'
RetVal=$(expr $?)
if \[ $RetVal == 0 ]; then
echo "Invalid value for purge (-t) parameter."
exit 3
fi
ESX_purge=$(expr $ESX_purge)
if \[ $ESX_purge -le 0 ]; then
echo "Purge parameter must be positive numeric."
exit 3
fi
fi
\### test if retry parameter is positive numeric
if \[ ! -z $ESX_retry ]; then
echo $ESX_retry | egrep -q '\[^0-9]'
RetVal=$(expr $?)
if \[ $RetVal == 0 ]; then
echo "Invalid value for retry (-r) parameter."
exit 3
fi
ESX_retry=$(expr $ESX_retry)
if \[ $ESX_retry -lt 0 ]; then
echo "Retry parameter must be positive numeric."
exit 3
fi
else
\# if it's not defined on the command line, assume 0 retries
ESX_retry=0
fi
\### temporary file to store backup messages for the complete ESX host
ESX_summaryfile="/var/tmp/$(date '+%Y%m%d-%H%M')"
LogOut $ESX_logfile "Script ESXBackup started."
echo "Backup summary for ESX host $ESX_host:" > $ESX_summaryfile
echo "" >> $ESX_summaryfile
ESX_failed=0
ESX_success=0
ESX_skipped=0
\### loop through all VMs which fit within the search criteria
for ESX_VMS in `/usr/sbin/vcbVmName -h $ESX_host -u $ESX_user -p $ESX_pass -s $ESX_search | grep -i name: | awk -F: '\{print $2}'`
do
\# skip server if its name is in the exclude list
RetVal=""
if \[ $ESX_exclude ]; then
if \[ -e $ESX_exclude ]; then
RetVal=`grep -i -x $ESX_VMS $ESX_exclude`
fi
fi
if \[ "$RetVal" = "$ESX_VMS" ]; then
LogOut $ESX_logfile " Skipped $ESX_VMS."
echo " SKIPPED virtual machine $ESX_VMS." >> $ESX_summaryfile
ESX_skipped=$(($ESX_skipped+1))
else
LogOut $ESX_logfile " Init backup of $ESX_VMS"
\# Handle locking mechanism
if \[ $ESX_lock ]; then
while \[ -e $ESX_lockfile ]
do
ESX_vmlocked=`cat $ESX_lockfile`
LogOut $ESX_logfile " Backup locking active ($ESX_vmlocked). Waiting 60s ..."
sleep 60
done
\# create locking file and write VM and hostname into it
touch $ESX_lockfile
echo "$ESX_VMS on server $ESX_host" > $ESX_lockfile
LogOut $ESX_logfile " Setting lock on $ESX_host for $ESX_VMS"
fi
ESX_vmfolder="$ESX_backupdir/$ESX_VMS"
ESX_date=$(date '+%Y%m%d-%H%M')
ESX_backupfolder="$ESX_backupdir/$ESX_VMS/$ESX_date"
\# create folder for backup
LogOut $ESX_logfile " Creating folder $ESX_vmfolder"
mkdir -p $ESX_vmfolder
RetVal=$(expr $?)
if \[ $RetVal != 0 ]; then
LogOut $ESX_logfile " Error creating folder: $RetVal"
fi
\# start backup - destination folder should not exist
LogOut $ESX_logfile " Backing up to folder $ESX_backupfolder"
START=$(date +%s);
RetryCount=0
Success=FALSE
ToRetry=TRUE
while \[ $ToRetry == TRUE ];
do
if \[ $RetryCount -gt 0 ]; then
LogOut $ESX_logfile " Now retrying backup (loop $RetryCount out of $ESX_retry) ..."
echo "Now retrying backup (loop $RetryCount out of $ESX_retry) ..." >> $ESX_vmfolder/$ESX_date.log
fi
ESX_vcb=`/usr/sbin/vcbMounter -h $ESX_host -u $ESX_user -p $ESX_pass -a name:$ESX_VMS -r $ESX_backupfolder -t fullvm`
RetVal=$(expr $?)
echo "$ESX_vcb" >> $ESX_vmfolder/$ESX_date.log
echo "" >> $ESX_vmfolder/$ESX_date.log
\# test whether backup was successful
if \[ $RetVal != 0 ]; then
\# Backup was not successful
LogOut $ESX_logfile " Backup error: $RetVal"
echo " ERROR backing up $ESX_VMS. Error code was $RetVal" >> $ESX_summaryfile
\# Try to rename folder of unsuccessful backup
\# If we retry the backup, the folder must not exist
if \[ -d $ESX_backupfolder ]; then
LogOut $ESX_logfile " Renaming Backupfolder to $ESX_backupfolder.$RetryCount"
mv $ESX_backupfolder $ESX_backupfolder.$RetryCount
else
LogOut $ESX_logfile " $ESX_backupfolder does not exist, so no renaming needed"
fi
\# Sometimes the snapshot remains after a failed backup - delete it now:
DelSnapshot $ESX_VMS
RetVal=$(expr $?)
if \[ $RetVal == 0 ]; then
LogOut $ESX_logfile " Deleted \_VCB-BACKUP_ snapshot."
echo " Remaining \_VCB-BACKUP_ snapshot successfully deleted." >> $ESX_summaryfile
echo "Remaining \_VCB-BACKUP_ snapshot successfully deleted." >> $ESX_vmfolder/$ESX_date.log
fi
else
\# Backup has been successful
END=$(date +%s);
DIFF=$(($END - $START));
\# Determine space on backup folder used and calculate the throughput
ESX_hddused=`du -sb $ESX_backupfolder | awk '\{print $1}'`
ESX_throughput=$((($ESX_hddused/(1024*1024))/$DIFF))
LogOut $ESX_logfile " Backup of $ESX_VMS successful. Saved $ESX_hddused bytes."
echo "Backed up $ESX_hddused bytes within $DIFF seconds. Throughput was $ESX_throughput MB/second." >> $ESX_vmfolder/$ESX_date.log
echo " SUCCESS backing up $ESX_VMS. It took $DIFF seconds to back up $ESX_hddused bytes ($ESX_throughput MB/s)." >> $ESX_summaryfile
ESX_success=$(($ESX_success+1))
\# Delete old backups if purging was specified
if \[ $ESX_purge ]; then
LogOut $ESX_logfile " Purging content older than $ESX_purge days."
find $ESX_vmfolder/* -maxdepth 0 -type d -mtime +$ESX_purge -exec rm -rfdv \{} \;
find $ESX_vmfolder/* -maxdepth 0 -type f -mtime +$ESX_purge -exec rm -fv \{} \;
fi
\# Send success email
if \[ $ESX_mailto ]; then
perl $ESX_sendmail $ESX_smtphost $ESX_mailfrom $ESX_mailto Backup\ succeeded:\ $ESX_VMS\ on\ $ESX_host $ESX_vmfolder/$ESX_date.log
fi
\# Leave the while loop
Success=TRUE
ToRetry=FALSE
fi
\# Increment the number of retries and determine if a retry should be made
RetryCount=$(($RetryCount+1))
if \[ $RetryCount -gt $ESX_retry ]; then
ToRetry=FALSE
fi
done
\# If backup unsuccessful, increment the failed counter and send an email
if \[ $Success == FALSE ]; then
LogOut $ESX_logfile " *** Backup of $ESX_VMS finally failed!"
echo "*** Backup of $ESX_VMS finally failed!" >> $ESX_summaryfile
ESX_failed=$(($ESX_failed+1))
if \[ $ESX_mailto ]; then
perl $ESX_sendmail $ESX_smtphost $ESX_mailfrom $ESX_mailto Backup\ failed:\ $ESX_VMS\ on\ $ESX_host $ESX_vmfolder/$ESX_date.log
fi
fi
\# Backup is done for this VM, so release the lockfile
if \[ $ESX_lock ]; then
\# Release lockfile
rm $ESX_lockfile
LogOut $ESX_logfile " Locking released on $ESX_host for $ESX_VMS"
fi
LogOut $ESX_logfile " Done backup of $ESX_VMS"
\# Sleep some time if not only a single vm has to be backed up to give other
\# ESX hosts the chance to start their backup in the 30 second pause
if \[ ! $ESX_singlevm ]; then
sleep 30
fi
fi
done
\### At last send the summary mail of the complete ESX host
if \[ $ESX_mailto ]; then
if \[ $ESX_VMS ]; then
perl $ESX_sendmail $ESX_smtphost $ESX_mailfrom $ESX_mailto Backup\ summary\ $ESX_host:\ $ESX_failed\ failed,\ $ESX_success\ successful,\ $ESX_skipped\ skipped. $ESX_summaryfile
fi
fi
\### remove the summary file
if \[ -e $ESX_summaryfile ]; then
rm $ESX_summaryfile
fi
\### no host was detected by the given name
if \[ ! $ESX_VMS ]; then
LogOut $ESX_logfile " Unable to detect any host to be backed up."
fi
LogOut $ESX_logfile "Script ESXBackup stopped."
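The retry handling in the script above boils down to "run once, then up to N more times until it succeeds". A condensed sketch of that pattern follows; `retry` is a hypothetical helper, not part of the script, and it leaves out the folder renaming and snapshot cleanup the script does between attempts.

```shell
#!/bin/bash
# Hypothetical helper: run a command once and, if it fails, up to $1 more
# times. Returns 0 on the first success, otherwise the exit status of the
# last attempt.
retry() {
    local retries=$1 attempt=0 status=0
    shift
    while :; do
        "$@" && return 0
        status=$?
        attempt=$((attempt + 1))
        # Same stop condition as the script: give up once the retry
        # counter exceeds the configured number of retries.
        [ "$attempt" -gt "$retries" ] && return "$status"
        echo "Now retrying (loop $attempt out of $retries) ..." >&2
    done
}
```

With `retry 2 some_command`, the command runs at most three times, which matches `-r 2` in the script.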
Hi kapplah,
I tested your script today and it works fine, but I found a bug.
If the name of a VM has a space in it, the script crashes.
Can you fix it?
Regards
Alex
Hallo Alex,
(wow - it's funny to speak to myself, as my first name is Alex, too)
yes, this was untested, as we intentionally don't use spaces in our virtual machine names.
Tomorrow I'll test and begin to modify the script.
Thanks for your reply,
Alex
Hi,
unfortunately it's not as easy as I thought, because I use the VM's name under several conditions:
\- creating a search string for vcbVmName
\- creating a directory (ok, I can simply 'sed' it out)
\- sending a mail
We always recommend using printable 7bit ascii characters for the VM name. Is it possible to simply rename the VM?
regards,
Alex
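For reference, a name with spaces breaks the `for ... in` loop with backticks because the shell splits the command output on whitespace. The sketch below reads one name per line instead. `list_vm_names` is a stand-in for the real vcbVmName call, and its sample output (including the other fields) is made up; only the `name:` line format is taken from the grep/awk pipeline in the script.

```shell
#!/bin/bash
# Stand-in for the vcbVmName call. The sample lines are invented; only the
# "name:<VM name>" format is assumed from the script's grep/awk pipeline.
list_vm_names() {
    printf '%s\n' 'moref:vm-1' 'name:web01' 'moref:vm-2' 'name:My Test VM: 01'
}

# Read one name per line instead of letting the shell split on whitespace.
# sed strips only the leading "name:" so colons inside the name survive.
backup_all() {
    list_vm_names | grep -i '^name:' | sed 's/^[Nn][Aa][Mm][Ee]://' |
    while IFS= read -r vm; do
        echo "Would back up: [$vm]"
    done
}

backup_all
```

Every later use of the name (directory, vcbMounter search string, mail subject) would still need quoting, and counters updated inside a piped `while` loop are not visible after the loop, so this only covers the first of the problem areas listed above.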
Hi Alex!
Great script. Thank you!
Probably i'm the only one who encountered this problem...
\### loop through all VMs which fit within the search criteria
for ESX_VMS in `/usr/sbin/vcbVmName -h $ESX_host -u $ESX_user -p $ESX_pass -s $ESX_search | grep -i name: | awk -F: '\{print $2}'`
After searching for VMs, the script always reported that it was unable to find any.
I then did the following:
\### loop through all VMs which fit within the search criteria
/usr/sbin/vcbVmName -h $ESX_host -u $ESX_user -p $ESX_pass -s $ESX_search | grep -i name: | awk -F: '\{print $2}' > /tmp/.ESX_VMs
for ESX_VMS in `cat /tmp/.ESX_VMs`
I know - it's really not a big difference...
But like this the script works like a charm!
Roland
Hi Roland,
please post the command line with which you called the script. What was the content of your temporary file?
Can you give me the output of vcbVmName when called with your host settings, username, password and search string? The search string itself may be 'any:' or 'powerstate:on' or 'name:VMNAME' (where VMNAME is the name of the single VM you wanted to back up), depending on your command line call.
cu,
Alex
Hi Alex
I did the following:
ESXBackup -v poweredon
After a bit debugging, I figured out that the variable $ESX_VMS was always empty (with your original version).
I made the changes I posted above; then my temporary file had the correct content - all names of VMs which were powered on at the moment.
If I called the vcbVmName (copy / paste the command from the script and changed the necessary variables) the command worked fine.
Another issue - the sendmail.pl script gives me the following error:
\[root@vi3h02 /]# perl /usr/bin/sendmail.pl relay.host.com backup\@address.com recipient@address.com test /var/log/ESXBackup.log
Can't call method "mail" on an undefined value at /usr/bin/sendmail.pl line 12.
Do you have an idea what's going wrong there?
CU, Roland
Ooops, my mistake....
Thx, regards
Roland
So far I am very impressed with your script; there is just one thing missing. Any chance you can have the snapshot deleted if a VMotion occurs during a VM backup? I have had several VMs fail backups because DRS VMotions them while they are being backed up.
Hi there, I am not a genius with linux but I have a quick questions:
I have managed to edit the backup script so that it now looks up VMs by UUID instead of by VM name (my machines all have names like "IT: Test (ast) - test", so I have spaces, special characters and everything). Once I modified this, it works fine. I have also added a file called host.lst, which defines a friendly name for every UUID; the friendly name is used to create the directory. If a VM is not found in host.lst, the script will not back it up (it works almost like the exclusion list).
But anyway, here is my question:
I need to send mail from this script, but I need to be able to specify the port and the authentication. I have tried looking on the net, but I can't seem to modify sendmail.pl to use another port (2525) and authenticate to the mail server.
Thanks for all your help. Btw the script is awesome.
Have a great day!
Message was edited by:
nface
Any ideas? I know what this error means... but how do you install the perl package? Is there an rpm?
\[root@dali bin]# ./sendmail.pl
Can't locate Net/DNS.pm in @INC (@INC contains: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at ./sendmail.pl line 7.
BEGIN failed--compilation aborted at ./sendmail.pl line 7.
Hi Steve,
the script doesn't detect whether a vmotion occurs. If the backup fails, the script SHOULD delete the created \_VCB-BACKUP_ snapshot.
But there is a little work to do for me to get this procedure REALLY fully functional (under some conditions when backup fails, the following calls for vcbVmName and vcbSnapshot don't return results ...).
Alex