VMware Cloud Community
jengl
Enthusiast

Which metric to use for memory capacity planning? Active or Consumed (Demand or Usage in vROPS)

Hey all,

I was once again wondering which metric to use for memory capacity planning. Information from different sources (e.g. "Your ESXi Host needs more RAM" from Iwan Rahabok, or "VMworld 2012: Session VSP1729 - Understanding Virtualized Memory Performance Management - Eric Sloof..." from Kit Colbert) all points to Active (Demand in vROPS), so I was expecting that high memory consumption / usage in vCenter would not lead to ballooning / compression / swapping.

Also, the active/demand metric apparently shouldn't be used for all workloads (e.g. Java / DB). Why is that? I couldn't find the reason anywhere, only the statement itself.

Other sources, like Mark Achtemichuk, suggest using consumed/usage, which is the opposite (Understanding vSphere Active Memory - VMware vSphere Blog - VMware Blogs).

So I thought I would go with active/demand as the metric and ignore the warnings in vCenter.

But the last time we patched half of our cluster, consumed/usage memory was high and the vCenter started to balloon, compress and finally swap. Not much, but I wasn't expecting it, because active memory was only a small percentage of consumed.

Can someone explain why this happened, and who is right on these questions?

Kind regards,

jengl

14 Replies
jengl
Enthusiast

Thanks Iwan for responding to my IM!

But I still don't fully get it. In this article you recommend looking inside the VMs, but that doesn't help me with capacity planning, right?

I also don't understand why the vCenter began to balloon, compress etc.

And last but not least, why is it a problem to use the active/demand metric for Java/DB VMs?

Another thing: why does vROPS use Active for right-sizing if it is too aggressive?

It would be nice if you could elaborate a little more.

Thanks!

jengl

Iwan_Rahabok
VMware Employee

But I still don't fully get it. In this article you recommend looking inside the VMs, but that doesn't help me with capacity planning, right?

[e1: Are you doing capacity planning for each VM? If yes, then you should have access to the VM. My customers do not have access to the VMs since they do not own them, so they only do capacity planning for IaaS (ESXi).]

I also don't understand why the vCenter began to balloon, compress etc.

[e1: Not sure what you mean here. Do you mean the vCenter VM began to balloon? vCenter does not initiate ballooning; ESXi does. Kindly look at the ESXi memory management whitepaper, as it explains this.]

And last but not least, why is it a problem to use the active/demand metric for Java/DB VMs?

Another thing: why does vROPS use Active for right-sizing if it is too aggressive?

[e1: I hope this next sentence is not seen as promoting my own book: I answer a lot of questions like this in the book.]

e1
jengl
Enthusiast

No, I am also doing capacity planning at the ESXi level. But I am wondering which metric ESXi uses to decide when to start ballooning: active or consumed?

Sorry for not being clear: I meant that ESXi began to balloon, not the vCenter. In your blog article (Your ESXi Host needs more RAM), you had high memory usage but no ballooning, so no contention.

We had the same high usage in our environment, but in our case the hosts began to balloon etc. What is the difference?

Thanks for taking the time to help me!

I reread parts of your book and found some answers there, but not everything. I am looking forward to the updated version and can only recommend it to every VMware administrator.

Greetings,

jengl

Edit:

Here is an example:

Low contention and workload, but high usage:

[Image: pastedImage_0.png]

But still ballooning and swapping:

[Image: pastedImage_2.png]

I hope this shows what I don't understand.

Regards,

jengl

Iwan_Rahabok
VMware Employee

I've updated the blog article.

It answers your question on "you had high memory usage but no ballooning, so no contention." The answer is simply that the host did not have high Memory Consumed. That is the 2nd example, not the 1st. The one with high memory usage is the 1st one.

It also answers your main question.

If it does not, let me know. You need to tag me, as I don't get notifications.

Have a great day travelling down the memory lane 🙂


e1
jengl
Enthusiast

Hey Iwan,

Sorry for taking so long. I was sick and am now reading your new book 🙂. Very good content so far! I am still travelling... 😉

But I have another question for you: I thought that in vCenter, Memory Consumed is mapped to Usage, so your first example should also show high Memory Consumed values. What am I missing?

Kind regards,

jengl

Iwan_Rahabok
VMware Employee

Sure, I hope this answers your question. I copied from page 405 of the book.

For utilization, vCenter provides Active (KB), Active Write (KB), Consumed (KB), Granted (KB), and Usage (%). Because you will have VMs with different vRAM sizes, it is easier to use the Usage (%) counter.

Pay special attention to the Usage counter. It has different formulas depending on the object:

• For VMs, it is mapped to Active

• For ESXi, it is mapped to Consumed

• For clusters, it is mapped to Consumed and Memory overhead

The effect of this formula is that you will see ESXi usage as much higher than your VM usage. For example, if all the VMs have the same size of RAM and their usage is about the same, you will notice your ESXi usage is higher than VM usage.

Technically speaking, mapping usage to active for VM and consumed for ESXi makes sense, due to the two-level memory hierarchy in virtualization. Operationally, this can create some confusion, as it is not a consistent mapping.

At the VM level, we use active as it shows what the VM is actually consuming (related to performance). At the host and cluster levels, we use consumed because it is related to what the Guest OS has claimed (related to capacity management).
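To make the mapping in the excerpt concrete, here is a minimal sketch of how the Usage (%) counter could be derived per object type. The `MemoryCounters` class, the `usage_percent` function and the choice of denominator (configured vRAM for a VM, physical RAM for a host or cluster) are assumptions for illustration; this is not vCenter's actual code or API.

```python
from dataclasses import dataclass

KB_PER_GB = 1024 * 1024


@dataclass
class MemoryCounters:
    """Hypothetical container for the counters named in the excerpt (all in KB)."""
    active_kb: float
    consumed_kb: float
    overhead_kb: float
    capacity_kb: float  # assumption: configured vRAM (VM) or physical RAM (host/cluster)


def usage_percent(kind: str, counters: MemoryCounters) -> float:
    """Return Usage (%) using the per-object mapping described in the excerpt."""
    if kind == "vm":
        numerator = counters.active_kb                             # VM: mapped to Active
    elif kind == "host":
        numerator = counters.consumed_kb                           # ESXi: mapped to Consumed
    elif kind == "cluster":
        numerator = counters.consumed_kb + counters.overhead_kb   # cluster: Consumed + overhead
    else:
        raise ValueError(f"unknown object type: {kind}")
    return 100.0 * numerator / counters.capacity_kb


# Illustrative numbers only: an 8 GB VM that has touched 7 GB but actively uses 1 GB,
# and a 64 GB host whose VMs have consumed 56 GB in total.
vm = MemoryCounters(1 * KB_PER_GB, 7 * KB_PER_GB, 0, 8 * KB_PER_GB)
host = MemoryCounters(8 * KB_PER_GB, 56 * KB_PER_GB, 0, 64 * KB_PER_GB)
print(usage_percent("vm", vm))      # 12.5
print(usage_percent("host", host))  # 87.5
```

With these made-up numbers the VM reports 12.5% usage while the host reports 87.5%, which is the gap between VM usage and ESXi usage that the excerpt describes.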

To tag me, I think it's a matter of the @ sign. Something like @jengl.

As you type the @ sign, pause and let the list of people drop down. I tried with @iwan and it listed me as the first result.

e1
jengl
Enthusiast

Iwan Rahabok: Maybe I have asked the wrong question.

In the first example on your blog (Your ESXi Host needs more RAM) you described a situation where vCenter reported high memory utilization but vCOPS did not:

[Image: Picture3.png]

[Image: Picture6.png]

But in this example, if you deploy more VMs with around 3 GB of RAM each and they consume almost all of it after boot (as most OSes do), the host will leave the high memory state and begin to balloon, compress and swap. Is that correct?

So to be able to run more VMs on this host without over-committing, you have to downsize the existing VMs to free up allocatable RAM, right?

I think in our case the active RAM consumption of the VMs was higher, so we had the ballooning, compression and swapping issues. Could that be the case?

So the right question would be: how do you right-size the VMs for RAM if you don't want to over-commit? That is described in your other blog post (How to monitor Windows RAM usage with vRealize Operations 6.1), which you posted directly after my question 🙂.

What I don't understand is why, in your first blog post (Your ESXi Host needs more RAM), you use the Demand metric (derived from Active) to show that the VMs don't use most of their RAM, while in your other post you say that one shouldn't use Active but the counters from inside the OS. Isn't that contradictory?

So to sum it up:

- ESXi uses consumed RAM to decide when to start memory reclaiming techniques

- To rightsize VMs for RAM you should use the counter inside the OS, not Active

- If you don't want to over-commit RAM and have a lot of ballooning you should right-size your VMs or extend the RAM / buy additional servers

Please correct me if I am wrong.

The question remaining for me is: How much of the cache (Standby in Windows) inside the VM is really needed for the VM to have the best performance without wasting resources? Maybe this question can only be answered by testing, or do you know another option?

BTW: Are you also planning to write a blog post like your Windows RAM right-sizing one, but for Linux? That would be great! 🙂

Kind regards,

jengl

Iwan_Rahabok
VMware Employee

jengl, I'm not sure if you asked the wrong question or I gave the wrong answer 🙂

It's always good to attach screenshots, some background to the question, and what exactly you're trying to solve.

You wrote:

So to sum it up:

- ESXi uses consumed RAM to decide when to start memory reclaiming techniques

- To rightsize VMs for RAM you should use the counter inside the OS, not Active

- If you don't want to over-commit RAM and have a lot of ballooning you should right-size your VMs or extend the RAM / buy additional servers

Please correct me if I am wrong.

My reply: you are right. If you do not have access to the Guest OS, then VM Active and VM Consumed are your best guess. Better than nothing 🙂

Sizing ESXi and sizing VMs are 2 different things (2 different use cases). Please don't mix them up, as it becomes confusing. Don't try to resize the VMs when you are sizing the ESXi host, unless you have full control over them.

You wrote:

The question remaining for me is: How much of the cache (Standby in Windows) inside the VM is really needed for the VM to have the best performance without wasting resources. Maybe this question can only be solved by testing or do you know another option?

My reply: this is a different topic, brother 🙂 It's a different realm, as this is inside the Guest OS (Windows in this case) and no longer at the ESXi level. I suggest you post it as a separate question, possibly under Windows or vSphere too 🙂

e1
jengl
Enthusiast

Iwan Rahabok: Could you please answer my first question, which was:

But in this example, if you deploy more VMs with around 3 GB of RAM each and they consume almost all of it after boot (as most OSes do), the host will leave the high memory state and begin to balloon, compress and swap. Is that correct?

So to be able to run more VMs on this host without over-committing, you have to downsize the existing VMs to free up allocatable RAM, right?

I think in our case the active RAM consumption of the VMs was higher, so we had the ballooning, compression and swapping issues. Could that be the case?

I really want to understand the difference between your example and our situation.

Okay, I understand why you used VM Active 🙂, but if we don't want to overcommit memory and have high consumption, we have to downsize the VMs (if possible) to be able to deploy more VMs, right?

Regarding the cache question: yeah, that's a different topic, but I thought you might have all the magical answers 😉.

Thanks for all your time; hopefully I will understand it completely soon 🙂.

Regards,

jengl

Iwan_Rahabok
VMware Employee

But in this example, if you deploy more VMs with around 3 GB of RAM each and they consume almost all of it after boot (as most OSes do), the host will leave the high memory state and begin to balloon, compress and swap. Is that correct?

[e1: Windows writes zeros on boot; Linux does not. Also, ESXi is smart enough to handle this, as the pages are all zeros, although the benefit has reduced with Windows randomising its address space.]

[e1: When ESXi has many pages mapped, it will trigger ballooning. But the threshold is a high number, probably near 98%.]

[e1: More details are in the 2nd edition of the book. Apologies, I cannot recall which page...]


So to have the possibility to run more VMs on this host without over-committing you have to downsize the existing VMs to free allocatable RAM, right?

[e1: Yes. It's a supply and demand model. Either you reduce demand, or you increase supply.]
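As a rough illustration of the supply and demand model, the sketch below counts how many additional VMs fit on a host if configured vRAM must never exceed physical RAM (no over-commitment). The function name, the `reserve_gb` parameter and all numbers are hypothetical and only serve the example.

```python
def additional_vms_without_overcommit(host_ram_gb, configured_vm_ram_gb, new_vm_ram_gb, reserve_gb=0):
    """Number of new VMs that fit if the sum of configured vRAM must stay within host RAM.

    reserve_gb is a hypothetical allowance for hypervisor and per-VM overhead.
    """
    allocated = sum(configured_vm_ram_gb)               # demand: vRAM already handed out
    headroom = host_ram_gb - reserve_gb - allocated     # supply that is still unallocated
    if headroom <= 0:
        return 0
    return int(headroom // new_vm_ram_gb)


# Made-up example: a 128 GB host running 30 VMs configured with 4 GB each.
print(additional_vms_without_overcommit(128, [4] * 30, new_vm_ram_gb=3))  # 2

# Reducing demand: the same 30 VMs downsized to 3 GB each.
print(additional_vms_without_overcommit(128, [3] * 30, new_vm_ram_gb=3))  # 12

# Increasing supply: the original VMs on a host upgraded to 192 GB.
print(additional_vms_without_overcommit(192, [4] * 30, new_vm_ram_gb=3))  # 24
```

Either lever, downsizing the existing VMs or adding RAM, increases the headroom; which one is practical depends on whether you control the VMs.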


I think in our case the active RAM consumption of the VMs was higher, so we had the ballooning, compression and swapping issues. Could that be the case?

[e1: No. I don't think it's driven by active. It's Consumed that drives ballooning. I could be wrong here.]
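The point that Consumed rather than Active drives reclamation can be sketched as a simple check. This is a deliberately simplified model: the 0.98 default only echoes the "probably near 98%" estimate given earlier in this reply and is not an official threshold, the names are hypothetical, and ESXi's real decision logic is more involved than a single percentage.

```python
from dataclasses import dataclass


@dataclass
class HostMemory:
    """Hypothetical snapshot of host-level memory figures, in GB."""
    physical_gb: float
    consumed_gb: float
    active_gb: float


def reclamation_expected(host: HostMemory, threshold: float = 0.98) -> bool:
    """Return True if consumed memory has reached the assumed threshold.

    Active memory is deliberately ignored, following the reply's view that
    Consumed is what drives ballooning.
    """
    return host.consumed_gb / host.physical_gb >= threshold


# Made-up numbers: active is low in both cases, only consumed differs.
relaxed = HostMemory(physical_gb=256, consumed_gb=200, active_gb=30)
squeezed = HostMemory(physical_gb=256, consumed_gb=253, active_gb=30)
print(reclamation_expected(relaxed))   # False
print(reclamation_expected(squeezed))  # True
```

Under this model a host can balloon while active memory is a small fraction of consumed, which matches the situation described in the original question.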


Regarding the cache question: yeah, that's a different topic, but I thought you might have all the magical answers.

[e1: I'm kinda lost in the length of the 1st question already 🙂 Hence I suggested you break it up so you get your answer. A long thread spanning a long time ends up as TL;DR for many.]

e1
jengl
Enthusiast

Iwan Rahabok: Yes, I am sorry if I am mixing things up and didn't explain my intentions well. But one question leads to another and I want to know all the answers 😉.

I have checked some of our Linux VMs and they use most of their RAM for caching, so it shows up in vCenter as consumed. What do you mean by smart enough to recognize 0s? Does this vRAM not count as consumed for the VM?

Regarding the issues in our environment, I also think they are driven by consumed. What I meant is that in our case there apparently wasn't enough RAM that could be freed through ballooning, so we got compression and swapping. Or the mass vMotion made the VMs use up RAM so fast that there wasn't enough time to free enough of it through ballooning.
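A back-of-the-envelope sketch of the patching scenario: when half of the hosts are evacuated, the consumed memory of all VMs has to fit on the remaining hosts. The function and the numbers are hypothetical, and the sketch assumes that consumed memory simply follows the VMs during vMotion, which is a simplification.

```python
def consumed_ratio_during_maintenance(host_ram_gb, consumed_per_host_gb, hosts_total, hosts_in_maintenance):
    """Ratio of cluster-wide consumed memory to the physical RAM of the hosts still running VMs.

    Assumes identical hosts and an even spread of consumed memory (illustration only).
    """
    remaining = hosts_total - hosts_in_maintenance
    if remaining <= 0:
        raise ValueError("at least one host must stay out of maintenance mode")
    total_consumed = consumed_per_host_gb * hosts_total
    return total_consumed / (remaining * host_ram_gb)


# Made-up cluster: 8 hosts with 256 GB each, 140 GB consumed per host.
print(consumed_ratio_during_maintenance(256, 140, 8, 0))  # 0.546875
print(consumed_ratio_during_maintenance(256, 140, 8, 4))  # 1.09375
```

A ratio above 1.0 means the remaining hosts cannot hold everything that is currently consumed, so some of it has to be reclaimed by ballooning, compression or swapping, however small the active share is.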

I will open another thread in another forum regarding the "how much of the cache inside the VM is needed" question, but I am wondering which forum is the right one. Any suggestions? Maybe a completely different website, specifically for Windows/Linux?

Kind regards,

jengl

Iwan_Rahabok
VMware Employee

I have checked some of our Linux VMs and they use most of their RAM for caching, so it shows up in vCenter as consumed. What do you mean by smart enough to recognize 0s? Does this vRAM not count as consumed for the VM?

[e1: I meant ESXi would not duplicate the pages. TPS would kick in unless there is randomisation by the Guest OS. The shared page counter counts this.]

I will open another thread in another forum regarding the "how much of the cache inside the VM is needed" question, but I am wondering which forum is the right one. Any suggestions? Maybe a completely different website, specifically for Windows/Linux?

[e1: Yes, I'd suggest a site that focuses on the question. Apologies, I'm not sure which Windows or Linux web site is good for this.]

e1
jengl
Enthusiast

Thanks Iwan Rahabok for your time and detailed explanation.

Now I am one step further in my knowledge of vSphere 🙂.

Kind regards,

jengl
