VMware Cloud Community
stanj
Enthusiast

RedHat and Oracle (dies)

ESX Server 3.5 U4

VCenter 2.5 U4

Dell 2950 with 16 GB of RAM.

Installed RedHat Linux on a VM.

Installed Oracle 10g R2 (10.2.0.1).

Created two instances plus the demo DB.

On the VM, we are seeing Oracle being killed by the Linux OOM killer because Linux runs out of RAM.
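
To confirm that the OOM killer (and not something else) is stopping Oracle, it can help to pull the kill events out of the guest's kernel log. The following is a minimal sketch, not a definitive tool: the exact wording of the kernel message differs between kernel versions, so the pattern is an assumption and is kept deliberately loose, and the sample log lines and host name are made up for illustration.

```python
import re

# Assumed message format: wording varies by kernel version, so the
# pattern is case-insensitive and accepts both "Kill" and "Killed".
OOM_KILL = re.compile(
    r"out of memory: kill(?:ed)? process (\d+) \(([^)]+)\)",
    re.IGNORECASE,
)


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for each OOM kill line."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Hypothetical sample lines, not taken from the poster's system.
sample = """\
Jun  1 03:12:44 dbhost kernel: Out of Memory: Killed process 4321 (oracle).
Jun  1 03:12:45 dbhost kernel: eth0: link up
Jun  1 09:40:02 dbhost kernel: Out of memory: Kill process 5678 (oracle) score 912 or sacrifice child
"""

for pid, name in find_oom_kills(sample):
    print(pid, name)
```

On a real guest the text would typically come from the system log, for example `/var/log/messages`, rather than from a sample string.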

This happens at different times. After a reboot, the instances ran for 5-6 hrs, then died.

Another time, the instances were restarted and ran for about 10 hours.

We have 8 GB of free RAM on the ESX server and have increased the VM's RAM, but we still see the Oracle processes being killed by the OOM killer.

Has anyone seen this problem on a Linux VM running Oracle?

Any ideas?

Thanks

12 Replies
mcowger
Immortal

Is your host overcommitted? This has been seen with the balloon driver before, but only if your host is already under memory pressure or you have funky resource pool limits.

--Matt

VCP, vExpert, Unix Geek

--Matt VCDX #52 blog.cowger.us
mehul96
Enthusiast

Linux OOM issues are a nightmare to troubleshoot even on physical servers; on a VM it could be worse!

I am guessing you allocated additional RAM to the VM, but did you also try increasing the reservation? That way, you can rule out the balloon driver, vmkernel swap, etc. in troubleshooting.

Since you have 8 GB available, a suggestion would be to start with as much RAM as possible and lower it in small increments to see at which point OOM kicks in.

Mehul

PS: if you find responses correct or helpful, please consider awarding points!

petedr
Virtuoso

That sounds like a good idea. Interesting; I did see that issue in my Oracle environments running under Red Hat.

www.phdvirtual.com, makers of esXpress

www.thevirtualheadline.com www.liquidwarelabs.com
dxb
Enthusiast

What version of Red Hat are you using? I've had trouble in the past particularly with RHEL4. Make sure you use the latest kernel too.

stanj
Enthusiast

Interesting..

We are running RedHat 4.5 but the VM is not in a resource pool, so no reservations are set.

We do not have clusters set up (no SAN) so we are limited to the resources on the ESX Server.

There is a mix of 20 other VMs running on the same ESX server, but they are not doing a lot of work (minimal RAM usage).

I did find this post from 2007.

Any suggestions like shutting down some of the other VMs or creating a resource pool?

dxb
Enthusiast

Before doing that, have a look at whether there's memory ballooning happening, since if there isn't then the problem is more likely to be within the VM itself.

stanj
Enthusiast

Ok,

Attached are two screen shots from vCenter.

Looks like no ballooning at this point..

But, this morning, the majority of the VMs were shut down.

If the Oracle processes on the Red Hat VM do not stop in the next 8-12 hrs, can we assume the problem was related to the other VMs using RAM?

Should we then start up the other VMs and see whether the chart's Memory Balloon average column shows some numbers?

dxb
Enthusiast

That seems like a reasonable test.

mehul96
Enthusiast

It is not necessary to have a cluster or resource pool in order to set reservations for a VM. With a single ESX server, the host itself acts as the logical resource pool. To satisfy a particular VM's CPU and RAM requirements, you can still reserve CPU and RAM so that as you bring up other VMs, they get resources from the unreserved pool. This way, the VM in question gets a guaranteed minimum regardless of the other VMs.

Even for the unreserved portion, you can set a higher share value for a critical VM so that when there is contention for resources, the critical VM gets a larger share.
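
The arithmetic behind reservations and shares can be sketched roughly as follows. This is a simplified illustration only, not how ESX actually schedules memory: it ignores hypervisor overhead, per-VM configured sizes, and limits. The 16 GB host size comes from the original post; the VM names, the 4096 MB reservation, and the share values are hypothetical.

```python
def unreserved_mb(host_mb, reservations_mb):
    """Memory left for everyone else after all reservations are honoured."""
    reserved = sum(reservations_mb.values())
    if reserved > host_mb:
        raise ValueError("reservations exceed host memory")
    return host_mb - reserved


def share_fractions(shares):
    """Relative weight of each VM when the unreserved memory is contended."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical example values (only the 16 GB host size is from the thread).
host_mb = 16 * 1024
reservations = {"oracle-vm": 4096}
shares = {"oracle-vm": 2000, "vm-a": 1000, "vm-b": 1000}

print(unreserved_mb(host_mb, reservations))
print(share_fractions(shares))
```

In this model the reserved VM keeps its guaranteed minimum no matter how many other VMs start, and shares only matter for whatever is left over once the host is under pressure.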

Hope that helps

Mehul


IlyaE
Contributor

We run multiple Oracle instances on various VMs with no issues. One thing to watch is dedicating a large enough swap partition to the OS - in our case we had to go 2x RAM for things to remain stable. Not sure why so much swap is needed but this helped alleviate most of our issues.
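
A quick way to check how a guest's swap compares with its RAM is to read `/proc/meminfo`. The sketch below is a minimal example under the assumption that the file uses the usual `Key:  value kB` layout; the sample text is made up so the snippet runs anywhere, and the 2x figure is simply the ratio mentioned above, not a general rule.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Key: <number> ..." lines.
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_to_ram_ratio(text):
    """Return SwapTotal divided by MemTotal."""
    info = parse_meminfo(text)
    return info["SwapTotal"] / info["MemTotal"]


# Hypothetical sample: a 4 GB guest with 8 GB of swap.
sample = """\
MemTotal:      4194304 kB
MemFree:        102400 kB
SwapTotal:     8388608 kB
SwapFree:      8388608 kB
"""

print(swap_to_ram_ratio(sample))
```

On a real guest, replace `sample` with the contents of `/proc/meminfo`, for example `open("/proc/meminfo").read()`.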

Cheers,

Ilya

stanj
Enthusiast

Ok

Thanks,

We will take a look at the reservation and then possibly at upping the swap partition.

The latest test shows that the RH VM Oracle processes stayed up for 24 hrs with 4 other VMs running on the same ESX server.

Ballooning kicked in on the VM at some point; the average showed 112341.2 KB.

After restarting the VM, ballooning spiked to 53552.87 KB, then dropped back to 0 and is currently staying there.

I will check back later..

stanj
Enthusiast

Setting the individual VM reservation equal to the VM RAM (4096 MB) seems to have solved the problem.
