Oracle on VMware - One Man's battle against a titan.

Is it just me, or is there still a lot of taboo and misinformation in our industry about VMware, what it brings to the table, and what it's capable of? As big as keywords like 'virtualization' and 'consolidation' have become, I still get an eerie sense of doubt from a lot of peers about just that. The minute you say, "Hey, why don't we just VM it!" a hush falls over the room, and then everyone starts throwing out random reasons why they think it wouldn't work. It really comes down to a lack of education (adoption might be a better word..."the norm," so to speak) throughout the industry. You sales folks have GOT to stop feeding us high-level PowerPoint slides with monotone readings over the top of them. Put the slides away and just TALK SHOP WITH US!

As much as VMware and other big players have done for the industry, especially in the last few years, the sales force is still the same as it was 10-20 years ago. You're still just using keywords to get our attention, but not really telling us how to use your product to our advantage. Even when we've had vendors and resellers come INTO our office to give 'demonstrations,' it's still nothing more than them plugging a laptop into our big TV and showing us the same PowerPoint they've shown to 100 other prospects. Eventually, if we haven't nodded off to the lullaby of the sales rep reading the slides verbatim, the techies get to take over: the lonely PSE they let out of the closet for the day finally gets to talk with the company's IT team, and we get down to business.

But that's not what I'm writing about here. This isn't a rant. It's the prelude to a battle I didn't know I was getting myself into.

A little backstory...

Our current environment involves a couple of Sun 18U refrigerators with a fiber-channel Hitachi brick. Pretty cookie-cutter for a high-end, mission-critical Oracle database. Layer on a bunch of Oracle database, application, and DR software as well as all the manual processes that go along with them, and you've got a rough idea of what we're starting with.

Before this upgrade process started, we had finally gone down the consolidated storage path, joined the NetApp country club (as my boss likes to call it), and were using it mainly for shared network drives, user data, and VMware. The NetApp came into play here because we also wanted to move the Oracle storage off of the standalone brick and consolidate it onto the NetApp.

A few scenarios were floated in the beginning. First were the new Niagara-based Sun T2000s, which were actually RECOMMENDED to us, though I'm still convinced to this day that it was 100% our fault for not doing the homework we should have done pre-purchase. In case you're not familiar, the Niagara chipset was never really designed to run databases. The T2000s turned out to be a huge flop in our performance testing, and we didn't understand why until the right engineer/Sun guru came along and told us the whole Niagara story.

It was also during this phase that we started parting the seas of IT between the DBAs and the Operations crowd. I had been doing some serious reading on NFS, and I'd had a lot of success using NFS + VMware for my datastores. We actually started out using iSCSI + VMware, but when I added a new large datastore via NFS, the performance was actually better. The allure of resizing entire datastores on the fly was enough to make me jump in with both feet. The DBAs, on the other hand, would never conceive of running anything but fiber-channel to high-end storage, so our initial failed phase of testing with the T2000s was largely blamed on NFS, because it was the likely scapegoat. It wasn't until we hooked the T2000s up via fiber-channel to the same NetApp configuration that we noticed the performance was still terribad, and only marginally better than our current 5+ year old Sun refrigerators.

After some serious digging, NetApp and Sun engineer involvement, and theorycrafting (more like "ok now what the hell are we supposed to do?"), I started planting the VMware bug.

"No way."

"There's no way VMware can handle the workload."

"Are you kidding me?"

etc.

So, I sat quietly as we went on to the next Sun solution. This time we did some serious homework and looked at the Sun M4000. Different architecture, built to run high-end databases, etc.

At the same time, in another corner of the datacenter, I took it upon myself to build a Linux VM. Working with one of the DBAs, we installed a copy of 11gR1 and migrated a copy of our production database to it. We divvied up some volumes on the NetApp to host the data, and mounted those via NFS directly to the VM. We also configured Oracle's new Direct NFS (D-NFS) client, which lets the database talk directly to the storage, bypassing the kernel's NFS client.

The results... well, they were nothing short of shocking. The Linux VM running 11g completely outworked the M4000 with a 1:1 hardware configuration. Mind you, this was only on a single-path 1GbE connection to a NetApp filer that was already heavily taxed hosting tons of CIFS shares and NFS ops to VMware. (Testing was conducted using SwingBench and Real Application Testing.)

So, where do we go from here? We had a meeting. Everyone was excited/shocked/appalled by the results ("How the hell did a little VM run circles around the boxes that launch the space shuttle?!"). We all threw our hands into the middle of the table, said "GO VMWARE!" and off we went, with one purpose in mind: virtualize Oracle.

We are currently in the process of building out a production-level environment, and once I've been cleared to discuss it further, look for a follow-up post on the final architecture.

We are definitely excited. Not so much for the consolidation VMware brings to the table, but for the agility and easy resilience that things like vMotion, HA, and eventually FT offer, especially for big tier 1 apps such as Oracle OLTP databases.

Stay tuned.

-Nick

Comments

Hi Nick,

I am in an almost identical situation and would love to prove VMware to my Oracle-hardened colleagues.

Mind sharing your VM config and any other tidbits for when I set up my Goliath-killing VM?

Cheers

Louw

Thanks Nick, good stuff. Looking forward to the follow-up. Also curious how you'll decide to configure resiliency features, and how you'll manage your backups, whether through VM-level tools or just your old agents or what-not.

Louw,

Can I say, "It depends?" :)

We did our testing with a 1:1 config vHardware vs pHardware. VM had 4 vCPUs, and 16GB RAM. We split up 12GB of this between SGA and buffer cache (and the other various settings within Oracle) and left 4GB for the OS.
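For anyone who wants to play with the numbers, the split above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 16GB total and 4GB OS reserve come from the config above, but the function name `split_vm_memory` and the 80/20 ratio between SGA and everything else are assumptions made up for the example, not the values from the test VM.

```python
def split_vm_memory(total_gb, os_reserve_gb, sga_fraction=0.8):
    """Split a VM's RAM into an OS reserve and an Oracle share.

    sga_fraction is a hypothetical ratio chosen for illustration only;
    it is not a tuning recommendation.
    """
    if os_reserve_gb >= total_gb:
        raise ValueError("OS reserve must be smaller than total memory")
    if not 0 < sga_fraction < 1:
        raise ValueError("sga_fraction must be between 0 and 1")
    oracle_gb = total_gb - os_reserve_gb      # what is left for Oracle
    sga_gb = oracle_gb * sga_fraction         # assumed SGA share
    return {
        "os_gb": os_reserve_gb,
        "oracle_gb": oracle_gb,
        "sga_gb": sga_gb,
        "other_gb": oracle_gb - sga_gb,       # remaining Oracle memory settings
    }


# 16GB VM with 4GB left for the OS, as in the test config above.
plan = split_vm_memory(total_gb=16, os_reserve_gb=4)
print(plan)
```

The point of writing it down this way is simply that the OS reserve comes off the top first, and whatever remains is what gets tuned inside Oracle.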

On the host side, we're using (2) HP DL380 G6 boxes in their own cluster.

Now, with all of this said, we're doing a cutover to a 10GbE storage backbone next weekend, so all of this is going to change, and we're going to pick up even more performance between MPIO/NMP + 10GbE + D-NFS.

We also decided to spend $120 and get Oracle's Unbreakable Linux. I have since redubbed it "Unusable Linux" because the default install leaves out a few things you actually NEED (NFS client, SSH server). In hindsight, I've come to like this, and they've recently released a "minimalist install" which I'm planning on test driving.

I'll have a hard layout documented and will post here once we've officially put it in production.

Thanks for reading!

-Nick

Go virtual, or go home.

don,

As far as transaction-consistent backups go, we're a NetApp shop, and plan on using SnapManager for Oracle. It has built-in functionality to put Oracle in hot backup mode, take a snapshot, and release it once the snapshot is done. This still needs to be tested fully. OS-level backups are up in the air right now, because I'm testing about 4 different products for VM-level backups. Also, when I was building the Linux VM, I got it to the point where we were ready to run the Oracle installer and took a snap. Once Oracle was installed bare, I took another snap. Both were saved as templates.
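The "hot backup mode, snap, release" sequence boils down to an ordering guarantee, which can be sketched roughly as below. To be clear, this is not how SnapManager is implemented; it is a minimal illustration of the pattern. `run_sql` and `take_snapshot` are hypothetical callables standing in for a real database session and a real storage snapshot call, and the `ALTER DATABASE BEGIN BACKUP` / `END BACKUP` statements assume Oracle 10g or later.

```python
from contextlib import contextmanager


@contextmanager
def hot_backup_mode(run_sql):
    """Hold the database in hot backup mode for the duration of the block."""
    run_sql("ALTER DATABASE BEGIN BACKUP")
    try:
        yield
    finally:
        # Always release backup mode, even if the snapshot step fails.
        run_sql("ALTER DATABASE END BACKUP")


def snapshot_with_hot_backup(run_sql, take_snapshot):
    """Take a storage snapshot while the database is in hot backup mode."""
    with hot_backup_mode(run_sql):
        take_snapshot()


# Demo with stand-ins that only record the order of operations.
steps = []
snapshot_with_hot_backup(steps.append, lambda: steps.append("SNAPSHOT"))
print(steps)
# ['ALTER DATABASE BEGIN BACKUP', 'SNAPSHOT', 'ALTER DATABASE END BACKUP']
```

The `try`/`finally` is the part that matters: a database left in backup mode after a failed snapshot is the failure case worth testing for.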

I'm currently entertaining the idea of FT, but it only supports 1 vCPU at the moment, which is a tough sell. What the DBAs don't know is that I've already chopped the VM down to 1 vCPU, and I want to see if they notice. I'm closely monitoring performance for now before enabling FT. We're going 10GbE next weekend, and then I'll be able to fully test FT on the environment.

-Nick

Hi Nick,

I keep running into the same challenges over and over again with Oracle. We run Oracle on VMware 3.5 and up for several customers, and it outperforms physical by a mile. However, you do need to know how to tune the Oracle machine (memory, PGA, and that sort of stuff), the same as you would when tuning physical machines running Oracle.

One of the battles I had recently with Oracle is described here: http://www.vmguru.nl/wordpress/2009/07/want-to-play-truth-or-dare-with-the-oracle-sales-force

Read and shiver.

We have the principle "Everything virtual, unless...": unless it's not possible to run virtual, in which case we go physical. (It's been a while since we went physical, though.)

Good luck on your battle and keep us posted.

Regards,

Edwin Weijdema

Last update: 09-09-2009 03:56 PM