VMware Cloud Community
grittyminder
Contributor
Contributor

Snapshots with Oracle = Safe?

We recently completed virtualization of a few Oracle 10.x Windows servers. This is a huge deal--a few people did not want this project to be carried out.

Here is where things went wrong. We are (were?) backing up or VMs using scripted snapshots and disk exports (we have no SAN at this point). We didn't realize that snapshots do not play nicely with Oracle DBs (we did not notice anything unusual during the testing phase). A few days after deploying the backup scripts into production our main Oracle DB crashed while being snapshotted. Now there is a huge snapshot 'backlash'--snapshots are forbidden to be used with Oracle DBs.

At this point, I feel I need to give a bit of background information. One of the objectives with virtualization for us was to reduce the manual/attended portion of the backup job. The previous non-virtualized system backups went something like this: backup Oracle DB to tape, run DB batch job, backup Oracle DB to tape again. This entire process would be completed at midnight every night. With the introduction of virtualization we were hoping to eliminate the first Oracle tape backup and use snapshoting instead (if something goes wrong with the DB batch job we simply roll back the snapshot). However, thanks to the 'backlash' we are now carrying out a complete DB export prior to the DB batch job (the tape backup system was eliminated as part of the refresh), which takes much longer time-wise. 1AM to be exact.

Now for the question. Is there a way to safely use snapshots with an Oracle database that guarantees that data integrity will be maintained always? Is there a best practices document I can read? Also, if I am a risk-adverse manager, can you explain to me in simple and rational language why snapshotting would be a more desirable alternative to database exporting (the present mentaility: snapshots=fast, yet unfamiliar and risky. DB exports=slow, yet familiar and safe)?

Reply
0 Kudos
10 Replies
philvirt
Hot Shot
Hot Shot

The answer is no.

Unfortunately, the database needs to be put in an altered state A.K.A. backup mode. Oracle needs to pause transactions while your doing your backups and you will need to send an SQL statement in order to do so. Oracle "Clients" send the necessary statements before executing the backup.

With the new version of Oracle 11g you can use VSS and create snapshots with some sort of SAN replication solution, however with VMware's snapshot as of now your out of luck. Sorry. I've been waiting for this myself, since I deal with mostly Oracle databases.

Regards,

Fil

Thanks, phIL
grittyminder
Contributor
Contributor

Thanks for the great information. I can't help but feel a little down--you've painted quite a bleak picture for me! Smiley Wink

I can't help but feel that there's got to be a way to be able do this. It seems that a lot of people who are using VCB are using the pre-freeze and post-thaw scripts to stop the oracle service outright prior to taking the snapshot. (NOTE: we're not using VCB, but nonetheless the same pre-freeze/post-thaw functions could be added to a simple backup script). This seems... crude. But if VSS isn't an option, what other alternatives are there? Could I conceivably write a script that has an Oracle client tell the DB to pause transactions while the snapshot is being taken? I haven't yet seen any forum posts from folks who have tried going this route...

Reply
0 Kudos
philvirt
Hot Shot
Hot Shot

You could use a script to alter the database and put it in backup mode and then release it from backup mode after your snapshot. The hard part is the timing.

How would you automate the verification of your snapshot completing and then execute the script to remove the database from backup mode?

Unless you did this with some sort of scheduled process, but risky since the snapshot mechanism can't talk to your scripts...

Regards,

Fil

Thanks, phIL
Reply
0 Kudos
Gilmore
Contributor
Contributor

Hi,

We are having the same problem. We have some VM with Oracle 10g and W2k3, and the problem is the backup. We are now using EsXpress and it seems to be really fantastic: we haven't found any problem when starting the restored databases, but we don't dare to put it in production because we haven't been able to run a "begin backup" command in the windows system before running the esxpress backup command and a "end backup" script when the backup have finished.

Do you know any way of running commands from the ESX server to the VM? Is it possible to do it through the vmware tools?

Of course there is an easy way to run a backup: Consider the VM as a PM and install a backup agent, like TSM or DP, and backup the database with these backup tools.

Thanks and regards,

G.

Reply
0 Kudos
grittyminder
Contributor
Contributor

It is theoretically possible to put Oracle into backup mode, take the snapshot, and export the disks all via the command line. Here are the things you proabably need to do (I don't claim to be an Oracle or VMware expert, so input/feedback is welcome):

Oracle Server:

1) Create a "pre-freeze" bat file that contains the necessary commands to put the Oracle VM into backup mode. This bat file will be executed prior to the snapshot.

2) Create a "post-thaw" bat file that contains the necessary commands to put the Oracle VM back into regular mode. This bat file will be executed after the snapshot.

3) Enable some way to access the Oracle VM "remotely" via the command line. (i.e. enable the Windows telnet service, or install Cygwin and OpenSSH, etc.)

4) Create a user (locally or with some authentication service like Active Directory) to be used with the telnet/SSH account. If possible, limit the system access permissions of the account. Ideally only the pre-freeze and post-thaw scripts should be able to be executed from the account.

5) Poke a hole in the VM software firewall if necessary.

ESX Server.

Create a shell/perl script that

1) Connects to the Oracle VM using telnet/SSH without prompting for a password. Passwordless authentication can be set up for SSH, and I think that telnet allows you to initially specify a password on the command line (so that you will not be prompted for it later).

2) Executes the "pre-freeze" bat then terminates the telnet/SSH session.

3) Takes a snapshot.

4) Exports the disk files.

5) Removes the snapshot.

6 Connects to the Oracle VM using telnet/SSH again and executes the "post-thaw" bat.

It should be noted that ESX Server steps 3-5 can be done using pre-existing backup scripts (e.g. VISBU) so there is no need to re-invent the wheel here.

I would be willing to test all these steps; however, I'm not sure what one should put in the "pre-freeze" and "post-thaw" bat files. If somebody could post that information I would gladly post the results from my tests and try to help anybody else wanting to accomplish the same thing...

Note: the above steps will work if the ESX server is connected to a network on which the Oracle server is also connected, or, alternatively, if the ESX and Oracle servers are separated via a firewall with the appropriate firewall rules in place for telnet/SSH connections.

Reply
0 Kudos
Tom_Daytona
Contributor
Contributor

We are having the same problems with oracle. I was suggested to do this.

In Device Manager, enable view hidden devices Disable Sync Driver under Non Plug and Play drivers. Reboot.

I have to wait til the morning to reboot our oracle server. but I am told it works

Reply
0 Kudos
Rob_Lisi
Contributor
Contributor

Tom did this work? Were you able to restore from a snapshot and Oracle was ok by disabling the sync driver?

Reply
0 Kudos
proden20
Hot Shot
Hot Shot

I've been performing VCB backups on Oracle for years.

I run a "pre" scheduled task on Oracle:

10:50PM: Quiesce the database

11:00PM: VCB comes in and backs up the VM

12:00AM: Unquiesce the database

Works like a charm but you need to coordinate with the DBAs and watch your windows. I try to follow this model with any DB VM. The first time I tried without quiescing the database corrupted. Luckily we're in dev.

Dennis.

If you found this question to be correct or helpful, please remember to award points.

Dennis Procopio - VCP

Reply
0 Kudos
proden20
Hot Shot
Hot Shot

Sorry, jumped the gun and didn't notice you didn't have shared storage or VCB. What about adding storage and cloning to it via script? See the attached document, module 6: "Importing and Exporting.." and take a look at vmkfstools.

VCB is the prescribed way to back up with snapshots. In training it was explained that snapshots were originally developed to allow VCB functionality. Any way you look at it, it could be convoluted or you could purchase a product.

Dennis Procopio - VCP

Reply
0 Kudos
mbuk
Contributor
Contributor

Hi, the answers are all good. The way i will do it is to take Oracle offline with a pre freeze scripe then backit up and the bring Oracle back online with a post freeze script.

The script will need to stop oracle

net stop DatabaseServiceName

echo.

echo -


echo SHUTTING DOWN THE ORACLE DATABASE INSTANCE ...

echo.

oradim -shutdown -sid YourSid -shutmode immediate

echo Database shutdown

echo.

net stop Oracleora102HTTPServer

net stop Oracleora920TNSListener

Then to start

@echo off

echo.

echo -


echo STARTING HTTP + LISTENER SERVICES ...

echo.

net start Oracleora102HTTPServer

net start Oracleora920TNSListener

echo.

echo -


echo STARTING ALL DATABASES + SERVICES

oradim -startup -sid YourSid

echo.

echo -


echo STARTING THE GPI SERVICES ...

echo.

net start DatabaseServiceName

echo.

Please consider awarding point if this helps you.

Reply
0 Kudos