This is a VMware community group for the ghettoVCBg2 script, not VMware's VCB utility. You may want to post your inquiry on the main VMTN forum or a sub-forum related to your topic - http://communities.vmware.com/community/vmtn
Sorry for not replying to this sooner, but I believe that it is related to the ghetto script.
In older versions of ESX you had to tweak the ESX configuration when using NFS as the datastore so that you did not get timeouts when removing snapshots. I had to do this when I originally started using ghetto on my 3.5 servers. It appears to have been fixed in 4.1. On my 4.1 test server I can manually create and remove snapshots with only a single ping lost for each, which I can live with.
The problem that I am having is when the ghetto script completes I get 8 ping timeouts. So I would like to know if there is something I am doing wrong in the script, or if there is something else I need to do to overcome this.
I'm not including memory in the snapshot, nor am I quiescing the VM.
The NFS store is persistent on the VMware host.
Thanks in advance,
This really depends on the VM and how busy it is; there have been some posts on VMTN regarding this. It's outside of the ghettoVCB scripts, as they just leverage VMware's snapshot creation/removal.
Here's a post that you may want to follow-up on as well - http://communities.vmware.com/thread/274479
What you can do is manually create the snapshot using the vSphere Client (that's all the script is doing via the APIs) and see how many pings you're losing within the guest.
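If you want to run that same manual test from the command line instead of the vSphere Client, the create/remove cycle can be sketched with vim-cmd on ESXi (classic ESX's service console uses vmware-cmd instead). The VM ID (42 below) is a placeholder you'd look up first, and the 0 0 arguments match the poster's settings (no memory, no quiesce):

```shell
# Find the VM's numeric ID (first column of the output):
vim-cmd vmsvc/getallvms

# Create a snapshot named "pingtest": empty description, includeMemory=0, quiesce=0
vim-cmd vmsvc/snapshot.create 42 pingtest "" 0 0

# ...watch the continuous ping for a while, then remove all snapshots on the VM:
vim-cmd vmsvc/snapshot.removeall 42
```

These commands only run on the ESX(i) host itself, so treat this as a sketch of the test procedure rather than something to paste blindly.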
Thanks for the link, but unfortunately this does not help.
The VMware host is only running 1 VM (the one I am testing)
No activity while the backup is happening
A manual snapshot create/delete results in 2 ping losses (one when the snapshot is taken, one when it is removed)
What I have not done is wait the 20 minutes that it takes for the ghetto script to complete before removing the manually generated snapshot. However, I don't think this will make a difference, as nothing is accessing the VM.
I just want to make sure: are you using the ghettoVCB or the ghettoVCBg2 script? I assume the latter since you posted in the ghettoVCBg2 forum, but I want to confirm, as it may have an impact on what you're seeing.
I downloaded the latest one this morning and had the same results. When the backup completes and deletes the snapshot, I lose 8 pings.
The only things that I have changed from the defaults are:
1) The backup destination (to a Linux hosted NFS location)
2) The rotation count to 1
3) The DISK_BACKUP_FORMAT is thin
I am using "-f" to specify the VM(s) to back up, which is just a single Windows 7 VM right now.
Both source and destinations are NFS.
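For reference, the three non-default settings above map to these variables in the script's global config (values here are illustrative; the datastore path is a placeholder for your own NFS mount):

```shell
# ghettoVCB global configuration excerpt -- adjust paths to your environment
VM_BACKUP_VOLUME=/vmfs/volumes/nfs-backup/backups   # 1) Linux-hosted NFS destination
VM_BACKUP_ROTATION_COUNT=1                          # 2) keep a single rotation
DISK_BACKUP_FORMAT=thin                             # 3) thin-provisioned output VMDKs

# Then run against a file listing the VMs to back up (one name per line):
# ./ghettoVCB.sh -f vms_to_backup
```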
I also tested by taking a manual snapshot, then waited 20 minutes (the same as it took for the Ghetto script to complete), then deleted the snapshot. I lost only one ping.
It could be a combination of the script/appliance and VMware. Overall, the snapshot creation and removal timeout is a VMware process. Using the appliance may introduce more lag, but not much unless it is on a bad network link. We really need a retry feature in this script; we get bad snapshots using VDR 1.2 too, BUT it retries failed backups 3 times every 30 minutes. I'd recommend a failed-backup file created by this script that we end users could then run another backup against later (or via cron). I'm not sure we'll ever get 100% good backups on the first try with any solution (quiesce issues). A retry feature is needed.
We usually have 3 VMs that fail from time to time, usually 1 every 4-5 days. You could create a snappy .sh script that parses through the log file for the names of the failed VM backups and creates a file for backup retry.
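A rough sketch of that idea below. The "ERROR: Unable to backup" log pattern is an assumption for illustration; match the sed expression to whatever your log actually prints on a failed backup. The sample log stands in for the real ghettoVCB log file:

```shell
#!/bin/sh
# Sketch: pull failed-VM names out of a ghettoVCB log so a later run
# (e.g. from cron) can retry just those VMs.
LOG=/tmp/ghettoVCB-sample.log
RETRY_LIST=/tmp/ghettoVCB-retry.txt

# Sample log for illustration -- point LOG at your real log file instead
cat > "$LOG" <<'EOF'
2011-02-01 info: Backup of vm-web completed
2011-02-01 ERROR: Unable to backup vm-db!
2011-02-01 ERROR: Unable to backup vm-mail!
EOF

# Extract the VM name from each failure line, de-duplicated
sed -n 's/.*ERROR: Unable to backup \(.*\)!.*/\1/p' "$LOG" | sort -u > "$RETRY_LIST"
cat "$RETRY_LIST"

# A later cron job could then re-run: ./ghettoVCB.sh -f "$RETRY_LIST"
```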
With the older script we see errors from time to time as well, usually related to a snapshot not getting removed properly.
However in this case, the backup is running fine. The only problem being that the snapshot removal interferes with the VM availability.
Thanks for info, I just wanted to double check we were talking about the same script as ghettoVCB and ghettoVCBg2 are two different scripts.
Thanks for also running the test manually. I did have one additional question: you mentioned this VM is pretty much idle? This means that no matter when you run the ghettoVCBg2 backup script, you're seeing 8 pings lost? Are you manually pinging from vMA or from another host, or from within the VM pinging to a gateway of some sort? I'm also assuming that you have the latest VMware Tools installed in this VM?
Thanks for the feedback, I'll definitely take "retry" into consideration for a future release. It's definitely something I can look into, though there may be others who feel that if a user specified a particular backup window, it should only execute during that period. I'll have to weigh the pros/cons and figure out how best to incorporate such a feature.
OK, I am officially a knucklehead. I’m using ghettoVCB not ghettoVCBg2. My apologies.
ok, that's fine.
Can you also provide answers to these questions:
You mentioned this VM is pretty much idle? This means that no matter when you run the ghettoVCB backup script, you're seeing 8 pings lost? Are you manually pinging from vMA or from another host, or from within the VM pinging to a gateway of some sort? I'm also assuming that you have the latest VMware Tools installed in this VM?
Yes, I've run the script multiple times over the course of a couple of days. The VMware host is only running this one VM, and it always loses 8 pings when the snapshot is removed.
I'm running a continuous ping from my desktop to the VM.
Yes, I verified the VMware tools install this morning.
You can try taking a look at your hostd and vmware logs to see if there's anything odd. Is this running on ESX or ESXi? You also said that if you manually created this snapshot and removed it, you _always_ just lose a single ping on each operation?
The only other difference is that the script runs in the Service Console/Busybox rather than going through the APIs, which is what the vSphere Client does. This consumes some amount of resources, but it shouldn't be anything that would impact a single VM unless your ESX(i) host is constrained on resources.
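For anyone following along, a quick way to scan the logs mentioned above (log paths vary between ESX and ESXi and between versions; these are typical 4.x locations, and the datastore/VM names are placeholders):

```shell
# Host agent log -- snapshot task entries show up here:
grep -i snapshot /var/log/vmware/hostd.log | tail -20

# Per-VM log lives next to the .vmx file on the datastore:
grep -i snapshot /vmfs/volumes/<datastore>/<vmname>/vmware.log | tail -20
```

These only run on the ESX(i) host itself, so take them as a sketch of where to look rather than exact paths for every version.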