VMware Cloud Community
kapplah
Enthusiast

ESX Backup Script

Hello,

I've created a hot backup script for ESX servers for our own purposes. Its features are:

\- Autodetects all VMs to be backed up: you can specify a single VM, save all powered-on VMs, or simply save all VMs on one ESX host

\- Sends an email to an administrative account reporting whether the backup was successful, and another mail with a summary per ESX host

\- Backups older than a specified period are deleted automatically

\- Uses an ESX storage to transfer the backup files (can be managed via VI Client)

\- Skips VMs whose names are stored in an external file

Configuration:

\- store file ESXBackup in /usr/bin and chmod it executable

\- store file sendmail.pl in /usr/bin and chmod it executable

\- open port 25 in ESX firewall if you intend to send mails

\- create storage in ESX GUI (we use NFS storage, which works well)

\- create a local account ESXBackup on each ESX host where you want to run the backup. This account has to be an ESX administrator

\- Modify the configuration section in the ESXBackup script to match your environment. The config settings should be self-explanatory

\- Create a file on your backup storage with the names of VMs which should not be backed up automatically. You can later specify this file at backup time, and the named VMs will be skipped

Usage: Log on to the console of the ESX host and run "ESXBackup" without parameters. Some usage information will be displayed.

To back up a single VM use this command: "ESXBackup -v VMNAME"

To back up all powered-on VMs use this: "ESXBackup -v poweredon"

You may want to run ESXBackup from cron; be sure to modify root's crontab ("crontab -e -u root") ...
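For the crontab route, here is a minimal sketch. The nightly 01:00 schedule, the log path and the helper name `add_cron_entry` are assumptions for illustration only and are not part of the script. The helper appends an entry to a crontab file only if the identical line is not already there, so it can be rerun safely.

```shell
#!/bin/bash
# Hypothetical schedule (every night at 01:00) and log path; adjust as needed.
CRON_ENTRY='0 1 * * * /usr/bin/ESXBackup -v poweredon >> /var/log/ESXBackup-cron.log 2>&1'

# Append an entry to a crontab file unless the exact line already exists.
add_cron_entry() {
    local file=$1 entry=$2
    touch "$file"
    # -x: match whole lines only, -F: fixed string (the entry contains '*')
    grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Example (run as root):
#   crontab -l -u root > /tmp/root.cron 2>/dev/null
#   add_cron_entry /tmp/root.cron "$CRON_ENTRY"
#   crontab -u root /tmp/root.cron
```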

Known bugs and issues:

\- Sometimes the backup ends with error code 141. This error indicates that vcbMounter has created a snapshot but not deleted it. Delete this snapshot manually, or the next backup won't run.

\- In our environment, the NFS server sometimes seems not to be fast enough to handle the transfer rate, so the NFS client (our ESX server) receives an error and the transfer stops. If this is the case, use a sync export of your NFS share instead of an async one (but this slows down the backup dramatically), or mount the NFS share via the ESX console rather than as an ESX storage.

\- VMs should not be moved from one ESX host to another during backup.

\- Log files are stored in /var/log. If you don't need them anymore, delete them manually.

\- Be aware that VM names are case-sensitive. If you use the exclude list, write one VM name per line without spaces or tabs.
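A side note on the error code 141 mentioned above: by shell convention, an exit status above 128 means the process was ended by a signal, and the signal number is the status minus 128. Read that way, 141 would be signal 13 (SIGPIPE), though whether that is what actually happens to vcbMounter here has not been confirmed. A minimal sketch for decoding a status in a wrapper script (the function name `describe_exit_status` is made up for illustration):

```shell
#!/bin/bash
# Turn a numeric exit status into a short human-readable description.
describe_exit_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        # Statuses above 128 conventionally encode "128 + signal number".
        local sig=$((status - 128))
        echo "terminated by signal $sig ($(kill -l "$sig"))"
    else
        echo "failed with exit code $status"
    fi
}

# Example:
#   ESXBackup -v VMNAME
#   describe_exit_status $?
```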

Planned enhancements:

\- Detect and delete "lost" \_VCB-SNAPSHOT_ snapshots

\- Retry backup if one VM fails
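The planned retry could look roughly like the sketch below. This is not taken from the script; the function name `retry` and its arguments (maximum number of tries, delay in seconds) are assumptions. Keep in mind the note above about error 141: if a leftover snapshot blocks the next run, retrying alone will not help until that snapshot is deleted.

```shell
#!/bin/bash
# Run a command up to max_tries times, sleeping between failed attempts.
# Returns 0 on the first success, otherwise the exit code of the last attempt.
retry() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1 rc
    while true; do
        "$@"
        rc=$?
        [ "$rc" -eq 0 ] && return 0
        if [ "$attempt" -ge "$max_tries" ]; then
            return "$rc"
        fi
        echo "Attempt $attempt failed with exit code $rc; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example: try a single-VM backup up to 3 times, 60 seconds apart
#   retry 3 60 ESXBackup -v VMNAME
```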

Note: This script is provided "as is", with no warranty. Feature requests are welcome. This was my first "more-than-a-three-liner" shell script, so many improvements can be made.

If you find this script useful (or not) please let me know ...

Have fun,

Alex

66 Replies
eksy
Contributor

Dobyme,

I haven't used it myself, but VMSnapper (http://www.virtualizeplanet.com/content/index.php?option=com_content&task=view&id=26&Itemid=44) sounds like it'll do what you want. Not as a script, but at least as a single operation.

jonhutchings
Hot Shot

Of course you could also try the V.I.S.B.U script written by mittell of these forums - you can get that one from here

http://www.xtravirt.com/index.php?option=com_remository&Itemid=75&func=fileinfo&id=7

and read the many many messages here

http://www.vmware.com/community/thread.jspa?messageID=653056&tstart=0

it's actively developed and we use it in house with great success

nicolas_ruiz
Contributor

Hi all,

I just got this script running, but only with the root account. If I test with another account, I get an error. That seems normal, as this user probably doesn't have enough permissions to do so.

My question is a general Linux one: what kind of access do I have to give my backup account? I would like to give it the strict minimum it needs to work ...

I have another question, about the restore:

I didn't understand the last post on this. Is the snapshot I copied to my NFS share sufficient to restore from, or do I first have to copy my .vmdk (with the virtual machine powered off) and then copy the snapshot?

Thanks a lot for your response.

kapplah
Enthusiast

The Backup Account has to be created on each ESX host locally and must have Administrative Privileges on each host via VC GUI.

cu,

Alex

beaunewcomb
Contributor

Can you shed some more light on creating the ESXBackup account and granting it admin privileges?

When you say "locally", I assume you mean creating a new user "ESXBackup" via Linux on the host and then giving it permissions via VC?

I don't see where you can change local users via VC, only add domain users...

Can you just step through this process?

Thanks in advance

ted_byrne
Contributor

an0malist,

You can create a local user on each ESX server by pointing the VI client at the individual host rather than the VC server. (You'll probably have to connect as root.)

Click on the Users & Groups tab in the right-hand pane, then right-click and select "Add..." from the context menu that pops up.

HTH,

Ted

beaunewcomb
Contributor

Hey, thanks for the quick reply!! That was easier than what I was thinking... I was about to add the user in the Linux environment. I didn't even know you could connect to individual servers... nice!

I'm running into a problem though... here's what I get when I run the script:

[esxbackup@VMESX01 bin]# ./ESXBackup -v JMHOptio01

ESXBackup V0.1c - ESX host backup script

Sep 14 18:16:59: Script ESXBackup started.

./ESXBackup: line 153: syntax error near unexpected token `newline'

./ESXBackup: line 153: `echo "*** ERROR backing up $ESX_VMS. Error code was $RetVal" >> '

[esxbackup@VMESX01 bin]#

any ideas?

kapplah
Enthusiast

Hi there,

it's me again - I've attached the latest version of my Backup Script. There are a lot of improvements (configurable parallel backup threads for more than one backup at the same time on different hosts).

Check it out and give me some kind of reply if you find it useful.

sendmail.pl is still the same ...

regards

Alex

azn2kew
Champion

Nice script you've made, Alex. I'll have to compare it with other people's scripts and tools to see which one is more flexible. Thanks again.

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

iGeek Systems LLC.

VMware, Citrix, Microsoft Consultant

shane_presley
Enthusiast

Thanks for the nice script.

I've installed it, and it does a backup, but it fails with this error:

/usr/local/bin/ESXBackup: line 270: (/(10241024))/139: syntax error: operand expected (error token is "/(10241024))/139")

Mar 24 13:04:23: Script ESXBackup stopped.

It looks like the backup got produced, so I'm not sure what the calculation was supposed to be doing.

-Shane

kapplah
Enthusiast

Looks like the first operand of the throughput calculation is missing. It is determined by a 'du' command in the preceding statement.

The error is completely non-critical - it's only for speed-calculation and display.
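To illustrate how such a calculation can be guarded against an empty operand, here is a sketch. It assumes the garbled expression in the error message was meant to divide a byte count by 1024*1024 and then by the elapsed seconds; the function name `throughput_mb_per_s` is hypothetical, and this is not the code from the script.

```shell
#!/bin/bash
# Print throughput in whole MB/s, or "n/a" when the byte count or the
# elapsed time is missing, non-numeric, or (for the time) zero.
throughput_mb_per_s() {
    local bytes=$1 seconds=$2
    case $bytes in
        ''|*[!0-9]*) echo "n/a"; return 0 ;;
    esac
    case $seconds in
        ''|*[!0-9]*|0) echo "n/a"; return 0 ;;
    esac
    echo $(( bytes / (1024 * 1024) / seconds ))
}

# Example, with a hypothetical backup folder and elapsed time:
#   size=$(du -sb /vmfs/volumes/backup/VMNAME | cut -f1)
#   echo "Throughput: $(throughput_mb_per_s "$size" 139) MB/s"
```

With this guard, an empty 'du' result only yields "n/a" in the summary instead of a syntax error.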

When I'm in the office on Wednesday, I'll take a closer look at the script.

regards,

Alex

shane_presley
Enthusiast

Thanks Alex,

I upgraded to the latest version of your script, and it fixed the problem. So no need to investigate.

I'm wondering if there's an easy way to turn the exclude list, into an include list? I'd like to specify a list (from a file) of VMs to backup. We have roughly 100 VMs, and I'd like to back up about 15 or so. Rather than build an exclude list, I'd like to have a file that lists the VMs to backup.

Shane

kapplah
Enthusiast

Hi shane,

it shouldn't be much work to "invert" the exclude list into an "include" list. There's a grep statement inside the script which does the whole job. Let's wait until tomorrow. When I'm back in the office again, I'll try to make a short hack to make your dreams come true ;)
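As a rough idea of what such an inversion could look like (a sketch under assumptions, not the grep statement from the script): the hypothetical function `filter_vms` below reads VM names from standard input and either drops or keeps the names found in a list file with one name per line.

```shell
#!/bin/bash
# Filter VM names read from stdin against a list file (one name per line).
#   mode "exclude": drop names that appear in the list
#   mode "include": keep only names that appear in the list
filter_vms() {
    local mode=$1 list_file=$2
    case $mode in
        # -x: whole-line match, -F: fixed strings, -f: patterns from file
        exclude) grep -vxFf "$list_file" ;;
        include) grep -xFf "$list_file" ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
}

# Example with a hypothetical list file:
#   printf 'vm1\nvm2\nvm3\n' | filter_vms include /vmfs/volumes/backup/include.txt
```

The match is case-sensitive and covers the whole line, so names have to be written exactly as the VMs are named. Note that grep returns exit code 1 when no name passes the filter.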

Alex

shane_presley
Enthusiast

Sounds good, thanks!!

Bonduelle
Contributor

Hello,

I am using this script (version 0.1d) on ESX 3.0.1.

However, the backup often does not complete for all machines. I run it from a script in cron.weekly, in which I have defined the VMs to be backed up (by calling "ESXBackup -v <VM-Name> -t 13").

The ESX log mostly shows "Purging content older than 13 days" as its last entry and nothing more. This can happen on the first, second, ... machine: sometimes everything works up to the fourth machine and the problem occurs on the fifth, sometimes it occurs on the first machine, sometimes on the third. So it is not tied to a specific VM; it can occur for any of my machines.

To analyze the problem in more detail, I modified the script as follows to find out why it stops (or at least why nothing more is logged).

# Delete old backups if purging was specified

# NOTE: the test condition was lost when this was posted; a check for a non-empty $ESX_purge is assumed here
if [ -n "$ESX_purge" ]; then

    LogOut $ESX_logfile " Purging content older than $ESX_purge days."

    # Remove top-level directories older than $ESX_purge days
    find $ESX_vmfolder/* -maxdepth 0 -type d -mtime +$ESX_purge -exec rm -rfdv {} \;

    if [ "$ESX_ErrorTL" = "1" ]; then

        LogOut $ESX_logfile " Find-1 for Purging $ESX_vmfolder has been done."

    fi

    # Remove top-level files older than $ESX_purge days
    find $ESX_vmfolder/* -maxdepth 0 -type f -mtime +$ESX_purge -exec rm -fv {} \;

    if [ "$ESX_ErrorTL" = "1" ]; then

        LogOut $ESX_logfile " Find-2 for Purging $ESX_vmfolder has been done."

    fi

fi

($ESX_ErrorTL is a specific variable which I can set to do more detailed logging).

So I can say, the problem is caused by the first find-command "find $ESX....", because in the logfile the entry "Find-1 for Purging $ESX_vmfolder has been done" is missing.

Does anyone have an idea why the script stops executing?
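For comparison while debugging, here is a sketch of a purge step that avoids the shell glob and has a dry-run switch. It is only an illustration and is not known to fix the problem described above; the function name `purge_old` and the "dry" argument are made up, and GNU find is assumed.

```shell
#!/bin/bash
# Delete top-level entries (files and directories) of a folder that are
# older than the given number of days. With "dry" as third argument the
# matching entries are only printed, not removed.
purge_old() {
    local folder=$1 days=$2 mode=${3:-delete}
    [ -d "$folder" ] || { echo "no such folder: $folder" >&2; return 1; }
    if [ "$mode" = "dry" ]; then
        find "$folder" -mindepth 1 -maxdepth 1 -mtime +"$days" -print
    else
        # -maxdepth 1 keeps find from descending into directories that
        # rm -rf has just removed
        find "$folder" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf {} \;
    fi
}

# Example: first look, then delete
#   purge_old "$ESX_vmfolder" 13 dry
#   purge_old "$ESX_vmfolder" 13
```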

Thanks a lot

kapplah
Enthusiast

Hi Bonduelle,

version 0.1d is rather old, the latest version is 0.1h which has some improvements.

If you want to use the old 0.1d version, you should write the complete command to the logfile and run it manually (perhaps remove the part after -exec to be sure nothing goes wrong), e.g.:

LogOut $ESX_logfile "find $ESX_vmfolder/* -maxdepth 0 -type d -mtime +$ESX_purge -exec rm -rfdv {} \;"
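The same idea can be wrapped in a small helper that records the command, its output and its exit code in one place. This is a sketch with a made-up function name (`run_logged`) and a plain log file path; it does not use the script's own LogOut function.

```shell
#!/bin/bash
# Write the exact command line to a log file, run the command with its
# output appended to the same file, then log and return its exit code.
run_logged() {
    local logfile=$1
    shift
    printf '%s: running: %s\n' "$(date '+%b %d %H:%M:%S')" "$*" >> "$logfile"
    "$@" >> "$logfile" 2>&1
    local rc=$?
    printf '%s: exit code %s\n' "$(date '+%b %d %H:%M:%S')" "$rc" >> "$logfile"
    return "$rc"
}

# Example, listing the purge candidates without deleting anything
# (the log path is hypothetical):
#   run_logged /tmp/purge-debug.log find $ESX_vmfolder/* -maxdepth 0 -type d -mtime +$ESX_purge
```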

cu,

Alex

espi3030
Expert

kapplah,

Awesome script!! I used it on one ESX host and it worked great, but when I try it on a different host that is set up very similarly, I get "Unable to detect any host to be backed up". Both hosts have at least one VM on local storage. The only things I modified in the script are the copy-to location and the backup username and password. Can you give me some pointers on where to troubleshoot this problem? Thank you.

kapplah
Enthusiast

Hi,

try this at command prompt:

/usr/sbin/vcbVmName -h this_host -u user -p password -s powerstate:on

Replace this_host with your hostname, and user and password with the user account and password you created to run the backup with (the user must have the administrative role in the ESX permission settings). The command should give you a list of all running VMs on the host, or you may receive an error message (permissions or something else), which you can post here.

regards,

Alex

espi3030
Expert

Alex,

Thank you for your reply. I ran the command you suggested (changing to my ESX hostname and specifying the backup user credentials). Here is the output:

Current working directory: /tmp/ESXBackup

HOSTINFO: Seeing AMD CPU, numCoresPerCPU 2 numThreadsPerCore 1.

HOSTINFO: hyperthreading disabled, setting number of threads per core to 1.

HOSTINFO: This machine has 2 physical CPUS, 4 total cores, and 4 logical CPUs.

HOSTINFO: Seeing AMD CPU, numCoresPerCPU 2 numThreadsPerCore 1.

HOSTINFO: hyperthreading disabled, setting number of threads per core to 1.

HOSTINFO: This machine has 2 physical CPUS, 4 total cores, and 4 logical CPUs.

System libcrypto.so.0.9.7 library is older than our library (90701F < 90709F)

Unsetting unknown path: /vmomi/

Error: Invalid user name or password

What is odd about this output, is I can do the same thing on the ESX host that does execute the script correctly and I get this same output as above, right down to the "Error:Invalid user name or password". I even changed the user name and password in the script to root and rootpassword. I can successfully log directly into the ESX server via VIC and SSH.

Thank you,

Alfred

espi3030
Expert

Alex! I got it! Your suggestion got my hamster running: the passwords in the script were too complicated (they have to be that way), so I enclosed the password in single quotes: 'W3lc0M3to3$X'. That is not my actual password of course, but you get the idea. The script is running now. Once again, thank you for an awesome script; you have automated my long working nights.
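The effect is easy to reproduce with the example password from above. In double quotes the shell treats `$X` as a variable and expands it, while in single quotes every character is kept literally. A small demonstration (the variable names `double` and `single` are only for illustration):

```shell
#!/bin/bash
X=""   # stands in for a variable that is empty or not set

double="W3lc0M3to3$X"    # double quotes: $X is expanded (to nothing here)
single='W3lc0M3to3$X'    # single quotes: the text is kept exactly as typed

echo "double-quoted: $double"   # prints: double-quoted: W3lc0M3to3
echo "single-quoted: $single"   # prints: single-quoted: W3lc0M3to3$X
```

So a password containing `$` that is set without quotes or in double quotes can silently lose part of its value, which would fit the "Invalid user name or password" error.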

Alfred
