VMware Cloud Community
big_vern
Enthusiast

ESX memory ballooning - help!

Hi,

We have a 4-node VMware cluster running ESX 3.5 U2 (each node is an HP blade server with 32GB RAM).

We are suffering memory ballooning on VMs, which shows up on the performance tab of ESX. On the cluster we have 91GB of "memory reservation used" and 26GB "memory unreserved".

Also, the summary tab of each ESX box shows that there is plenty of CPU and memory headroom (about 40% free memory).

I have checked each individual VM and ensured that on the 'resources' tab the memory limit is set to 'unlimited', so as far as I know there is no reason any of the VMs should balloon?

These are production VMs. I have a support call in with VMware but we aren't much further forward. Any help greatly appreciated.

0 Kudos
1 Solution

Accepted Solutions
Wozzer
VMware Employee

I deliver some of the VMware courses, and resource management is the most difficult to teach as well as the most difficult to learn. Here is my guess about why you have ballooning and swapping.

Bear in mind that the summary is for the cluster. Each VM is running explicitly on one host.

I think you originally stated that 91GB is reserved (i.e. it is guaranteed DIMM space for the VMs with reservations set, and it won't be allocated to other VMs), leaving 26GB unreserved. And now you have suggested some VMs have no reservations set. The VMs with no reservation can only utilise that 26GB unreserved. And if much of that unreserved 26GB is on the one 'good' host, then the other hosts' kernels are probably short of unreserved RAM and are having to balloon (and later swap) to accommodate their guest VMs' memory requests.

Ian Worrall


0 Kudos
18 Replies
LarsLiljeroth
Expert

Hi

Is it the same for all VMs or just a couple? If you move one of the VMs to another host, is it still the same?



Best regards

Lars Liljeroth

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!
0 Kudos
big_vern
Enthusiast

It is happening on about 20 VMs. We have one 'good' ESX host.

If I VMotion a ballooning VM to that particular 'good' host then the ballooning stops on that VM, but I think this is because that ESX host is hardly stressed at all.

Unfortunately I'm not in a position to move all the ballooning VMs as we haven't the capacity, and my understanding is that they definitely should not be ballooning in any case?

0 Kudos
dconvery
Champion

Vern -

Also make sure that the total amount of reserved memory does not exceed the total amount of available RAM on the physical machine. So there should not be more than 117GB RAM reserved. Check any resource pools and VM settings for memory reservations. The ballooning should not occur unless you are oversubscribed on memory. Also, what is the total amount of vRAM allocated to all of the VMs in the cluster?

Dave Convery - VMware vExpert 2009

************************

Accomplishing the impossible means only that the boss will add it to your regular duties.

Doug Larson

Dave Convery, VCDX-DCV #20 ** http://www.tech-tap.com ** http://twitter.com/dconvery ** "Careful. We don't want to learn from this." -Bill Watterson, "Calvin and Hobbes"
0 Kudos
Erik_Zandboer
Expert

Hi,

First of all, why are you using memory reservations? Secondly, if you see sustained ballooning, it is almost certain you have a limit somewhere, so that a VM or a group of VMs think they have more memory assigned to them than the limit that was set. VMware "solves" that problem by introducing ballooning. The limit could be on a VM level or on a resource pool level. For example, if five VMs inside a resource pool each have 1GB assigned to them (guest OS memory settings), and the resource pool is limited at 3GB, you would see a total of 2GB of ballooning on these VMs...
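
The arithmetic in that example can be sketched roughly as follows. This is only an illustration: the function name is made up, and it assumes the worst case where every guest eventually touches all of its configured memory. It is not how ESX itself computes balloon targets.

```python
def expected_reclaim_gb(configured_gb, pool_limit_gb):
    """Memory (GB) that would have to be reclaimed by ballooning or swapping
    if every VM in the pool used all of its configured memory."""
    demand = sum(configured_gb)
    # Nothing needs reclaiming while total demand fits under the limit.
    return max(0.0, demand - pool_limit_gb)


# Five VMs configured with 1GB each, resource pool limited at 3GB.
vms_gb = [1.0] * 5
print(expected_reclaim_gb(vms_gb, 3.0))
```

With the figures from the example this gives 2GB to be reclaimed across the five VMs.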

I wrote a blog on resources and its use which you might find useful:

Visit my blog at

Visit my blog at http://www.vmdamentals.com
0 Kudos
kjb007
Immortal

Are you using resource pools, and do you have reservations set on them?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
0 Kudos
dconvery
Champion

Nice post Erik. I see issues with VM performance frequently and many times it comes down to someone changing resource settings in resource pools or VMs.

Dave Convery - VMware vExpert 2009

0 Kudos
Erik_Zandboer
Expert

Thanks. I wrote that post just because so many people have resource settings they don't even know about (VMware is partly responsible for that themselves), and some are misinterpreting the meaning of the various options and breaking stuff that way :)

Visit my blog at http://erikzandboer.wordpress.com

Visit my blog at http://www.vmdamentals.com
0 Kudos
big_vern
Enthusiast

Hi Erik,

Wow - I was expecting some guidance/advice on my problem; it seems some people haven't even read the question and assumed I am a numpty.

First - I am using memory reservations on critical VMs to ensure performance should we become overcommitted on the ESX boxes.

Second - If you read my first post it makes it clear that I have checked the VMs' memory limit - maybe it's not clear enough.

We don't have resource pools.

Having taken all that in, is there any reason for the VMs to balloon that anyone has come across?

0 Kudos
Wozzer
VMware Employee

Vern

The summary tab might not be the best indicator of bottlenecks (if you watch those status bars, the info is static for quite a while before it gets refreshed). But you are correct to concentrate on the Host resources.

Ballooning occurs when the ESX server is short of RAM, so it claims some of its guest VMs' configured memory back for itself (forcing the guest OS to page stuff out to accomplish this). As the others have suggested, the ESX server could have sufficient RAM, but the VMs are within a Resource Pool which has not got a high enough allocation of RAM to 'feed' the guest VMs. Answer the previously posted questions about whether you have Resource Pools (and if so, what Reservations and Limits are configured) and we can get further. I guess VMware haven't come up with any answers?

Ian Worrall
0 Kudos
big_vern
Enthusiast

wozzer,

Why on earth is no one reading the info I post? I have answered the question about resource pools: we don't have any. I think I'll give this post up as a bad job.

0 Kudos
kjb007
Immortal

In short, no. If there is no overcommitment, taking into account your memory reservations, you will not balloon on your VMs. You should go back through and double-check your resource settings to make sure they aren't the culprit here.

-KjB

0 Kudos
big_vern
Enthusiast

Hi kjb,

Yes, that's what I thought. I have double-checked the reservations, once with VMware tech via WebEx, and I'm still waiting for them to get back to me after looking at the logs.

I will check the reservations again and will be happy to spot my mistake; I was just wondering if there was any other possible explanation bar the ones that have been mentioned.

0 Kudos
Wozzer
VMware Employee

Sorry Vern

I wrote my previous post and didn't read your last entry before I posted it. I can sense the frustration here.

The Reservations value on VMs can be ignored now: they each have at least that much allocated to them. It also seems from your original post that there is RAM left over and available at the Cluster level, and at least from the Summary tab each Host also has spare capacity of RAM.

On the face of it there should not need to be any ballooning. However, as there is, it is likely that one or more VMs on each Host that suffers this is requesting a lot of memory. Perhaps it is worth trying to set Limits on your VMs and weed out the culprits? If a VM which is currently using a lot of Host memory has a Limit set on it, then it can't exceed that, and the VMs will cease to be competing with each other for more and causing the Host to balloon.

Ian Worrall
0 Kudos
big_vern
Enthusiast

Sorry all, it appears I am a numpty after all and have misunderstood the role of memory, this is the reply back I got from VMware tech support. I do have a question after you have read it, then I'm out for dinner to eat humble pie..

"We've been analysing your logs here and have come to the following conclusions:

Your server memory is significantly over-committed on both servers for which you have provided logs (esx03 & esx05). In the case of esx03 you have allocated 43GB to running VMs, and on esx05 you have allocated 50GB to running VMs. However, each server has 32GB of physical RAM. This essentially means that ESX has a choice between swapping or memory ballooning, or both. Swapping introduces a higher performance penalty than ballooning, so ESX will try to use ballooning first. In your case, because you are so heavily over-committed, ESX is using both swapping and memory ballooning.

However, active memory in many of your VMs is well below what you have allocated for them, which means that memory over-commitment of this scale is most probably not necessary, as you should be able to decrease the amount of memory assigned to any VM not making full use of its allocation.

The alternative would be to increase the physical memory in your servers from the current 32GB per server up to 48GB or even 64GB.

At this point, there's nothing further that I can add. I will keep the service request open until end of shift today though, in case you have any queries relating to this issue."
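
To put numbers on the over-commitment described in that reply, here is a small sketch using only the figures quoted above (43GB and 50GB allocated, 32GB physical per host). The function name is made up for illustration, and the calculation ignores hypervisor overhead, so the real shortfall would be somewhat larger.

```python
PHYSICAL_GB = 32  # physical RAM per host, as quoted in the support reply

# Memory allocated to running VMs on each host, as quoted in the support reply.
allocated_gb = {"esx03": 43, "esx05": 50}


def overcommit(allocated, physical):
    """Return (GB allocated beyond physical RAM, allocated/physical ratio)."""
    excess = max(0, allocated - physical)
    ratio = allocated / physical
    return excess, ratio


for host, alloc in allocated_gb.items():
    excess, ratio = overcommit(alloc, PHYSICAL_GB)
    print(f"{host}: {alloc}GB allocated on {PHYSICAL_GB}GB, "
          f"{excess}GB over ({ratio:.2f}x)")
```

On these figures esx03 is 11GB over and esx05 is 18GB over its physical RAM.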

OK - here goes: so ballooning will occur on an ESX host if you have allocated more RAM to the VMs than is on the physical host, even if those VMs have zero RAM reservation set?

The reason I ask is that the VMs were ballooning/swapping on the particular hosts, yet the summary tab was only showing 15GB of physical memory in use, indicating to me that there is plenty of RAM left. Why use ballooning, which is crippling the VMs, when there is loads of physical RAM left?

I don't deserve it, I know, but any replies appreciated... :D

0 Kudos
dconvery
Champion

Big Vern -

I stayed away after the "scolding", but felt compelled to reply. No need for your humble pie. We all feel frustration in the heat of battle. And things going to the crapper is considered battle in our line of work. Speaking for myself, I usually assume nothing about the person needing help and sometimes ask obvious or basic questions just in case. When gathering information, all of the facts, including the trivial, add up to the entire equation. That being said, I will try to answer your question. I am sure Erik will key in as well. BTW: if you have not already, take a read of his post; it may help you with some understanding.

If you have a host with 32GB of RAM with 10 guests each at 4GB, you are overcommitted by about 10GB (giving a little leeway to the hypervisor: 800MB kernel and 1.2GB for hypervisor). Even though each machine may only use 2GB on average, there will be peaks, like backups and virus scanning. ESX will allocate 4GB to each VM as it is turned on, even though it only needs 2. As time goes on, the guest may eventually take up the 4GB. As you get close to saturation, ESX kicks in the balloon driver. This will flush the unused RAM. It may be a bad analogy, but the normal RAM usage is almost like a hard drive. As things are deleted, the space is still used until it is needed again. The dirty pages are left behind until the space is needed. The ballooning is determined by shares allocated to each VM. If you have upgraded ESX through the 3.0.x days, then the shares may have gotten out of whack as well, which will cause some VMs to balloon before others.
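
The "about 10GB" figure can be checked with a quick sketch. The overhead values are the 800MB and 1.2GB mentioned above; the function name is just for illustration, and the 2GB-per-guest case is the average usage suggested in the post, not measured data.

```python
HOST_RAM_GB = 32.0
OVERHEAD_GB = 0.8 + 1.2  # overhead figures quoted in the post above


def headroom_gb(guest_count, per_guest_gb):
    """RAM left on the host after overhead and guest memory.
    A negative result means the host is overcommitted by that many GB."""
    usable = HOST_RAM_GB - OVERHEAD_GB
    return usable - guest_count * per_guest_gb


# 10 guests at their configured 4GB each: overcommitted.
print(headroom_gb(10, 4.0))
# The same 10 guests at the 2GB average they might actually use: fits.
print(headroom_gb(10, 2.0))
```

The gap between the two results is why a host can look comfortable on average usage and still be pushed into ballooning once guests grow towards their configured size.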

You can use the version of capacity planner that comes with vCenter to analyze the RAM usage of the VMs over time to get a feel for peaks and averages. Then you can adjust VM memory accordingly. Or you can buy more RAM.

Erik, please add your two cents to this.....

Dave Convery - VMware vExpert 2009

Wozzer
VMware Employee

I deliver some of the VMware courses, and resource management is the most difficult to teach as well as the most difficult to learn. Here is my guess about why you have ballooning and swapping.

Bear in mind that the summary is for the cluster. Each VM is running explicitly on one host.

I think you originally stated that 91GB is reserved (i.e. it is guaranteed DIMM space for the VMs with reservations set, and it won't be allocated to other VMs), leaving 26GB unreserved. And now you have suggested some VMs have no reservations set. The VMs with no reservation can only utilise that 26GB unreserved. And if much of that unreserved 26GB is on the one 'good' host, then the other hosts' kernels are probably short of unreserved RAM and are having to balloon (and later swap) to accommodate their guest VMs' memory requests.
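
To illustrate why a healthy cluster total can hide a shortage on individual hosts, here is a sketch following the reasoning above. Only the 26GB cluster total comes from this thread; the host names, the per-host split and the demand figures are entirely hypothetical.

```python
# Hypothetical split of the cluster's 26GB unreserved memory across 4 hosts.
unreserved_gb = {"host_a": 2, "host_b": 3, "host_c": 1, "good_host": 20}

# Hypothetical demand (GB) from VMs without reservations on each host.
unreserved_demand_gb = {"host_a": 6, "host_b": 5, "host_c": 4, "good_host": 3}


def hosts_under_pressure(unreserved, demand):
    """Hosts where unreserved demand exceeds that host's own unreserved memory."""
    return sorted(h for h in unreserved if demand.get(h, 0) > unreserved[h])


print("cluster unreserved:", sum(unreserved_gb.values()), "GB")
print("cluster demand:", sum(unreserved_demand_gb.values()), "GB")
print("hosts short of unreserved RAM:",
      hosts_under_pressure(unreserved_gb, unreserved_demand_gb))
```

In this made-up split the cluster as a whole has more unreserved memory than is being asked for, yet three of the four hosts are individually short, which is the situation the post describes.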

Ian Worrall
0 Kudos
big_vern
Enthusiast

Thanks for replying, guys. I think the penny has almost dropped...!!

0 Kudos
cchesley
Enthusiast

One of the most useful statistics that I have used to look at memory is Memory Consumed (Average). It gives me a much better indication of how much memory is being consumed and used by the VMs and host than memory active or memory used. I can then correlate the number I get from Memory Consumed (Average) with what I have allocated to my VM or limited it to use. Both memory used and memory active seem to show only memory that is actually being read and written, not all of the memory that is holding data for the VM.

Chris

http://www.vkernel.com
0 Kudos