sabya1232003
Enthusiast

vSphere Snapshot causing disk issues

Dear All,

We have a vSphere 4.0 environment with 2 ESX hosts in a cluster and 40-50 VMs (mostly production). Our datastore design is per VM: we allocate one datastore to each VM, and multiple datastores to VMs with multiple disks (i.e. each vmdk comes from a separate datastore, and each datastore is a separate LUN on iSCSI storage).

Current challenge: we are facing trouble with snapshots. When we take a snapshot and then revert to it, all the vmdks from the different datastores end up stored on the 1st datastore, which causes space issues because each datastore has a size limit.

We also verified that the old vmdk files on the other datastores are idle and no longer in use, while the 1st datastore keeps growing.

We have managed to survive by increasing the size of the 1st datastore to hold all the growing vmdk files, and we are fine with that.

Now we want to remove the datastores and snapshots that are no longer in use by the VM. What is the best way to do it? I assume the server needs to be shut down for this.

Please suggest ...

Regards,

Sabasachi


9 Replies
AWo
Immortal

A datastore for each VM? Why?

Did I understand that correctly: all snapshots are stored on one LUN, regardless of which guest they belong to and which LUN that guest is stored on? A snapshot should be created where the corresponding parent disk is stored.

However, you might have a look here: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1003412

AWo


vExpert 2009/10/11 [:o]===[o:] [: ]o=o[ :] = Save forests! rent firewood! =
SimonStrutt
Enthusiast

Now we want to remove the Datastores and also snapshots which are not in use for the VM ...what is the best way to do it ..I am sure Server needs to be Shut down for it.

To remove the snapshot files you'll need to delete the snapshots, which will merge the snapshot data into the base VMDKs. This can normally be done while your VMs are up, but if the snapshots have grown very large (say over 20GB) this isn't always successful, and you may notice some brief interruptions to your VMs whilst it happens. If you run into problems trying to remove snapshots, see this excellent page...

http://geosub.es/vmutils/Troubleshooting.Virtual.Machine.snapshot.problems/Troubleshooting.Virtual.M...

Similarly, downtime isn't required to remove datastores, but obviously you'll need to ensure nothing is still using the datastore first for this to be successful.
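
As a rough illustration of the "nothing is still using it" check, the sketch below takes a list of disk paths in the `[datastore] folder/file.vmdk` notation and reports which datastores none of them refer to. The datastore names and the inventory list are made up for the example; in practice you would collect the paths from the VI Client or from each VM's configuration. The sketch only looks at virtual disks, so ISOs, templates, swap files and the like would still need checking separately.

```python
import re

# Matches the "[datastore] path/to/file" notation used for datastore paths.
DS_PATH = re.compile(r"^\[(?P<ds>[^\]]+)\]\s*(?P<path>.+)$")

def datastores_in_use(disk_paths):
    """Return the set of datastore names referenced by the given disk paths."""
    used = set()
    for p in disk_paths:
        m = DS_PATH.match(p.strip())
        if m is None:
            raise ValueError(f"not a datastore path: {p!r}")
        used.add(m.group("ds"))
    return used

def removable_datastores(all_datastores, disk_paths):
    """Datastores that none of the listed disks refer to, sorted by name."""
    return sorted(set(all_datastores) - datastores_in_use(disk_paths))

# Hypothetical inventory: both disks of the VM currently live on DS1.
example_disks = [
    "[DS1] VM/vm-000001.vmdk",
    "[DS1] VM/vm_1-000001.vmdk",
]
print(removable_datastores(["DS1", "DS2"], example_disks))  # ['DS2']
```

A datastore only becomes a removal candidate once it drops out of the "in use" set for every VM, not just the one being cleaned up.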

"The greatest challenge to any thinker is stating the problem in a way that will allow a solution." - Bertrand Russell
sabya1232003
Enthusiast

The challenge is when reverting back to the old snapshot: all the vmdks get dumped on the 1st datastore.

Maybe it is a design error to have each vmdk on a dedicated datastore, but we are fine in all cases other than snapshotting.

sabya1232003
Enthusiast

Thanks Simon for the nice doc on snapshots.

I have been testing this for 2-3 days on test machines with multiple datastores, and I found the same issue: reverting to the snapshot dumps all the vmdks onto the parent datastore. Removing the snapshot then removes those vmdk files. This sounds fine if it is done soon after creating the snapshot.

But in our case it has been 3 months and the vmdks have grown much larger. I suspect there will be issues if we remove the snapshot and the grown vmdks are deleted as well. We are fine if they are moved back to their original location.

SimonStrutt
Enthusiast

When a snapshot is deleted, ESX creates another helper snapshot to maintain the VM's disk access whilst it's working with the original snapshot and the parent VMDK. It sounds like you mean that the temporary helper snapshot is being written to the parent datastore (???).

The size of the helper snapshot is driven by the amount of disk write activity on the VM during the snapshot deletion/removal. Obviously if your VMs are shut down then this is not a problem, and that is probably going to be your only reliable way forward given that your snapshots are also very large.
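
To put a rough number on that, here is a back-of-envelope sketch. It assumes the commit proceeds at a steady rate and the VM writes at a steady rate for the whole time, which is a simplification of my own and not how ESX actually schedules the work; the figures in the example are invented.

```python
def helper_snapshot_estimate_gb(snapshot_gb, commit_rate_mb_s, write_rate_mb_s):
    """Very rough estimate of helper snapshot growth while a snapshot commits.

    Assumes (simplistically) a constant commit rate and a constant guest
    write rate for the whole duration of the commit.
    """
    if commit_rate_mb_s <= 0:
        raise ValueError("commit rate must be positive")
    commit_seconds = snapshot_gb * 1024 / commit_rate_mb_s
    return write_rate_mb_s * commit_seconds / 1024

# Invented figures: 60 GB snapshot, 50 MB/s commit rate, 2 MB/s guest writes.
print(round(helper_snapshot_estimate_gb(60, 50, 2), 1))  # 2.4

# Powered-off VM: no guest writes, so the helper does not grow.
print(helper_snapshot_estimate_gb(60, 50, 0))  # 0.0
```

The point is only that the free space you need scales with both the snapshot size and the write load, which is why powering the VM off first takes the helper snapshot out of the equation.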

sabya1232003
Enthusiast

Dear Simon,

I meant that when we had to revert to a previous snapshot, it dumped the vmdks from the other 2 datastores onto the 1st datastore, where the snapshot files and .vms files are stored. I have already tested this in a test environment.

We are now running on a snapshot and our data has already grown a lot. So if I delete all the snapshots for the VM (online or offline), is it going to affect the current state of the VM? Will there be any data loss on the additional vmdks?

SimonStrutt
Enthusiast

I've not come across a situation where you lose data during a snapshot deletion (obviously you do if you revert/jump back to a snapshot), even if it goes wrong, the deletion fails/aborts and your VM crashes (which is very rare). But since your VMs are in a less than ideal state, the unexpected should be expected, so it's worth checking that your backups are good and recent.

To be clear on what you see happening (this assumes you're using the VI Client Datastore Browser to view the files; you'll also see -flat.vmdk and -delta.vmdk files if you SSH to an ESX host)...

VM - with two disks on two different datastores

- disk1: \DS1\VM\vm.vmdk

- disk2: \DS2\VM\vm_1.vmdk

you take a snapshot, so you now have two extra vmdks...

- disk1: \DS1\VM\vm-000001.vmdk

- disk2: \DS1\VM\vm-000002.vmdk

you delete the snapshot, during which the following extra files appear...

- disk1: \DS1\VM\vm-000003.vmdk

- disk2: \DS1\VM\vm-000004.vmdk

once the deletion has finished, you end up with...

- disk1: \DS1\VM\vm.vmdk

- disk2: \DS1\VM\vm_1.vmdk        (2nd disk has been moved to DS1 ?!)
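
If you want to compare a before/after listing like the one above without eyeballing it, something along these lines would flag any disk whose datastore changed. This is just a sketch: it assumes the same `\DS\VM\file.vmdk` notation I used in the example, and the two dictionaries are typed in by hand from the Datastore Browser rather than read from ESX.

```python
def datastore_of(path):
    """Return the datastore name, i.e. the first component of \\DS\\VM\\file."""
    parts = [p for p in path.split("\\") if p]
    if len(parts) < 2:
        raise ValueError(f"unexpected path: {path!r}")
    return parts[0]

def moved_disks(before, after):
    """Return {disk: (old_ds, new_ds)} for disks whose datastore changed."""
    moved = {}
    for disk, old_path in before.items():
        new_path = after.get(disk)
        if new_path is None:
            continue  # disk not present in the second listing
        old_ds, new_ds = datastore_of(old_path), datastore_of(new_path)
        if old_ds != new_ds:
            moved[disk] = (old_ds, new_ds)
    return moved

# Hand-entered listings matching the example above.
before = {"disk1": r"\DS1\VM\vm.vmdk", "disk2": r"\DS2\VM\vm_1.vmdk"}
after = {"disk1": r"\DS1\VM\vm.vmdk", "disk2": r"\DS1\VM\vm_1.vmdk"}
print(moved_disks(before, after))  # {'disk2': ('DS2', 'DS1')}
```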

sabya1232003
Enthusiast

Thanks for the detailed explanation. I am expecting similar output and am planning the activity on the production server.

Rather than trusting the backup system, I would prefer to clone the VM before making this change.

Does this sound good?

SimonStrutt
Enthusiast

The last step in my example, where disk2's base vmdk gets moved from one datastore to another, is not expected behaviour. If this is happening, then there is something wrong with the way your setup is working, and I'd suggest opening a VMware support case so that it can be investigated properly. They can start a WebEx session and actually see for themselves what's happening.

Taking a clone of a machine with large snapshots may itself be problematic. If you can take a backup then that would be preferred, but if you can't then obviously a clone would have to do.
