txarls_89
Contributor

vRops 6.7 and memory usage

Hi everybody!

We are trying out the new version of VMware vROps, 6.7, and some of the metrics are very confusing compared with the previous version. One of them is "memory usage".

In vROps 6.6 the "memory usage" metric is the "guest memory", which is fine. Now read what the release notes say about 6.7:

"

The Memory|Usage (%) metric of Virtual Machines considers memory usage from Guest OS perspective and not from hypervisor perspective.

In previous releases, the Memory|Usage (%) metric of virtual machines referred to the amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages.

This was different from what you would see inside the Guest OS as a memory usage.

The formula of metric Memory|Usage (%) is now changed to (Memory|Utilization (KB) / Memory|Total Capacity (KB)) * 100. Here the newly introduced Memory|Utilization (KB) metric depends on the Guest OS metric, which is provided through VMware Tools, and is available since vCenter Server 6.0 Update 1, ESXi 6.0 Update

"

Here is an example comparing vROps 6.6 and 6.7 in my environment.

I have a VM, VM1, with a configured memory size of 16384 MB.

VM1 --> vROps 6.6

Memory usage --> 21.79 %

Memory recommendation size --> 7413 MB RAM

Active guest memory --> 3768 RAM

VM1 --> vROps 6.7

Memory usage --> 87.54 %

Memory recommendation size --> 16777 MB RAM

Active guest memory --> 3768 RAM
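As a rough cross-check of the figures above, the two Memory usage percentages can be converted back into megabytes. This is only a sketch: the helper name usage_mb is made up for illustration, and it assumes each percentage is taken against the 16384 MB configured size.

```python
# Figures reported for VM1 in this post.
configured_mb = 16384
usage_pct_66 = 21.79  # Memory usage shown by vROps 6.6
usage_pct_67 = 87.54  # Memory usage shown by vROps 6.7


def usage_mb(configured_mb: float, usage_pct: float) -> float:
    """Convert a Memory usage percentage into megabytes (hypothetical helper)."""
    return configured_mb * usage_pct / 100.0


used_66 = usage_mb(configured_mb, usage_pct_66)
used_67 = usage_mb(configured_mb, usage_pct_67)

print(f"6.6: {used_66:.0f} MB, 6.7: {used_67:.0f} MB")
```

Under that assumption, the 6.6 value is in the same range as the active guest memory figure, while the 6.7 value covers most of the configured size, which would be consistent with the jump in the recommendation.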

Is this change the cause? There is a big difference between 7.4 GB and 16 GB in terms of recommendations. I would have to purchase new hardware...

Best regards

1 Solution

Accepted Solutions
sunnydua2011101
VMware Employee

Thanks to vExpert Chip Zoller for adding me to this thread.

I have gone through the comments from all the community members and I appreciate all the feedback, both positive and not so positive (read: negative), on this change.

I do want to echo the thoughts of txarls_89. Being technical, we all understand that Active Memory at the hypervisor level is not the correct indicator of capacity when it comes to understanding application behavior.

To run applications efficiently, one has to understand what is going on inside the guest operating system and, if possible, even better, what is happening at the application layer. The intention behind introducing a guest OS level metric is to ensure that you are able to get insight into the guest OS without having to install agents or open security holes in your app, and hence we leverage VMware Tools for this integration.

For the technical folks on the thread, this new metric is calculated using the following formula:

Guest Metrics = GuestOS Needed Memory + (GuestOS Page In Rate * GuestOS Page Size) + VM Memory Overhead

where VM Memory Overhead = VM Configured Memory - GuestOS Physically Usable Memory
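The formula above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are made up, the units (KB, with the page-in rate expressed in pages) are assumptions, and the sample values are invented rather than taken from a real VM.

```python
def vm_memory_overhead_kb(configured_kb: float, physically_usable_kb: float) -> float:
    """VM Memory Overhead = VM Configured Memory - GuestOS Physically Usable Memory."""
    return configured_kb - physically_usable_kb


def guest_memory_kb(
    needed_kb: float,
    page_in_rate: float,
    page_size_kb: float,
    configured_kb: float,
    physically_usable_kb: float,
) -> float:
    """Needed Memory + (Page In Rate * Page Size) + VM Memory Overhead."""
    paging_kb = page_in_rate * page_size_kb
    overhead_kb = vm_memory_overhead_kb(configured_kb, physically_usable_kb)
    return needed_kb + paging_kb + overhead_kb


# Invented sample values, purely for illustration (units assumed to be KB).
example_kb = guest_memory_kb(
    needed_kb=3_000_000,
    page_in_rate=10,
    page_size_kb=4,
    configured_kb=16_777_216,
    physically_usable_kb=16_600_000,
)
print(example_kb)
```

Note that the result tracks the guest's Needed Memory value, so anything the guest OS counts as needed flows directly into the metric.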

The Memory|Usage % and Memory|Workload % metrics are configured to automatically use the guest OS memory metrics, which we collect from VMware Tools.

Basically, we are leveraging the following four metrics from the guest OS, which we get through our VMware Tools integration:

Guest|Needed Memory

Guest|Page In Rate

Guest|Page Size

Guest|Physically Usable Memory

Now, in cases where you do not have VMware Tools, this metric will automatically fall back to Memory Consumed %, which is a high-water mark, so as to be rather conservative instead of being overly aggressive with Active Memory. It is essential to be on vSphere 6.0u1 (vCenter and ESXi) with the latest version of VMware Tools to benefit from the guest memory data. This integration has been in place since vROps 6.3, and we are now leveraging these metrics inside the product based on high customer demand to be closer to the applications when assessing memory, since memory at the hypervisor layer seems too aggressive to the app owners, who are the ultimate consumers of IaaS.

We have also found a few cases where VMware Tools is unable to supply these metrics due to some identified bugs. We are triaging the bugs and looking at possible solutions to fix them. I will keep everyone on this thread posted on the outcome.

Also, for people who still want to leverage Active Memory for alerting, dashboards, reports, etc., we do have the metric available in the policy. It can be enabled easily and reused in any content by a simple modification of that content.

Here is the extract from the release notes which explains the above mentioned behavior - vRealize Operations Manager 6.7 Release Notes

The Memory|Usage (%) metric of Virtual Machines considers memory usage from Guest OS perspective and not from hypervisor perspective.

In previous releases, the Memory|Usage (%) metric of virtual machines referred to the amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages. This was different from what you would see inside the Guest OS as a memory usage.
The formula of metric Memory|Usage (%) is now changed to (Memory|Utilization (KB) / Memory|Total Capacity (KB)) * 100.  Here the newly introduced Memory|Utilization (KB) metric depends on the Guest OS metric, which is provided through VMware Tools, and is available since vCenter Server 6.0 Update 1, ESXi 6.0 Update1, and VMware Tools - 9.10.5.  If these versions of  vCenter Server, ESXi, and VMware Tools are not met, the Memory|Utilization (KB) metric will fallback to "Memory|Consumed (KB)" which shows usage from hypervisor perspective and not Guest OS perspective.

Note: If interested in the older Memory|Usage (%) metric of virtual machines, which was based on active memory, use the Memory|Guest Active Memory (%) replacement metric. This out of the box metric is disabled and first needs to be enabled in the corresponding policy of a virtual machine.
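The formula and fallback rule in the release-notes extract can be sketched as follows. This is a minimal illustration, not product code: the function names are made up, and a value of None is used here to stand for "guest OS metric not available" (for example, when the vCenter Server, ESXi, or VMware Tools version requirements are not met).

```python
from typing import Optional


def memory_utilization_kb(guest_utilization_kb: Optional[float], consumed_kb: float) -> float:
    """Use the guest-based value when present, otherwise fall back to Memory|Consumed (KB)."""
    if guest_utilization_kb is None:
        return consumed_kb
    return guest_utilization_kb


def memory_usage_pct(utilization_kb: float, total_capacity_kb: float) -> float:
    """(Memory|Utilization (KB) / Memory|Total Capacity (KB)) * 100."""
    if total_capacity_kb <= 0:
        raise ValueError("total capacity must be positive")
    return utilization_kb / total_capacity_kb * 100.0


# Invented sample values: a 16 GB VM with and without guest OS data.
with_guest = memory_usage_pct(memory_utilization_kb(4_194_304, 8_388_608), 16_777_216)
without_guest = memory_usage_pct(memory_utilization_kb(None, 8_388_608), 16_777_216)
print(with_guest, without_guest)
```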

Feel free to reach out to me with questions on this topic; I would be happy to connect: duas@vmware.com

Regards

Sunny Dua

Product Manager, vRealize Operations

Regards Sunny


31 Replies
daphnissov
Immortal

Yes, and the reason is that until you look inside the guest OS to see how it's using memory, it's just a guess. So in 6.7 this change means more accurate memory usage numbers, because it is looking inside the guest as opposed to just looking from the hypervisor's perspective (which was not very useful).

kdeitermann
Contributor

Unfortunately, I think this change is a really bad one. Today's operating systems use RAM for file cache without limit, so all VMs in our environment now show 100% RAM usage. As a result, we can no longer show the difference between the RAM that is really active and the RAM shown in Task Manager. In the past, this let us point application administrators to errors in applications that did not use the RAM available to them. As far as performance analysis is concerned, vROps has unfortunately become a completely useless program, especially because I can no longer query the active RAM in the metrics.

OsburnM
Hot Shot

My experience with this so far (with Windows VMs at least) is that they missed the mark here. What I am seeing is that they are looking at the "Free" memory metric as seen in Windows Task Manager and not the "Available" memory metric.

So a Windows VM that has 4GB RAM shows as having 1.1GB Available but only 1MB Free. vROps now reports this VM as having no memory free.

I was 100% on board with this idea, but seeing it in action using Free and not Available is a big problem.
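To put numbers on the example above, here is a small sketch of how far apart the two views are. The helper name used_pct is made up, and the calculation is simply (total - unused) / total; it is not a claim about how vROps derives its value internally.

```python
def used_pct(total_mb: float, unused_mb: float) -> float:
    """Percentage of memory in use, given whichever 'unused' figure is chosen."""
    return (total_mb - unused_mb) / total_mb * 100.0


total_mb = 4 * 1024       # 4GB VM
free_mb = 1               # "Free" in Task Manager
available_mb = 1.1 * 1024  # "Available" in Task Manager

by_free = used_pct(total_mb, free_mb)
by_available = used_pct(total_mb, available_mb)

print(f"Using Free: {by_free:.1f}% used, using Available: {by_available:.1f}% used")
```

Based on Free the VM looks essentially full, while based on Available it has more than a quarter of its memory to spare.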

nmanm0305
Enthusiast

After upgrading from 6.6 to 6.7, we found that the Memory|Workload (%) metric was reporting abnormally high values for almost every VM in our environment. After checking a few individual VMs, we knew the metric was "wrong".

We replaced Memory|Workload (%) with Memory|Guest Active Memory (%) as a test. The measurement from Memory|Guest Active Memory (%) reported "correctly" (as Memory|Workload (%) used to).

txarls_89
Contributor

Hi daphnissov

I have always heard that the most important memory metric is "active memory", because each OS has a different algorithm for measuring active memory and, as some people have said, "the OS consumes memory in cache", and so on.

I will wait for a future release, because this is confusing for us.

sxnxr
Commander

There are two trains of thought on this and it is good or bad depending on where you are looking at this from.

Capacity

Most people will use active memory when looking to reclaim memory, because it makes the story to management believable and the conversation easy to have. We all know it is hard to get more hardware, so being able to run a report that shows your bosses you can reclaim x TB of memory from VMs because they were never right-sized in the first place is the easy conversation.

Performance

If you are concerned about the performance of a VM and don't care about capacity, then the in-guest stats are king. If you look at a VM purely as a container using resources, you don't care about performance.

In reality you need to care about both, and you need to know how applications work not just from a capacity perspective but also from a performance perspective.

Take SQL for example. Suppose you have a DBA who keeps asking for more memory because the VM is running out of memory and crashing, and he shows you a perfmon chart with it running at 100% to prove that he needs more. He is not a very good DBA. For a start, he should limit the memory SQL can consume so that it does not use all the memory and starve Windows.

Because SQL and other applications cache to RAM, you are now relying on the DBA to know what he is doing and to know when he needs more memory based on performance, not on Windows perfmon. From a vROps perspective I want it to look at the in-guest stats. If SQL is not using all the memory for cache then it should be reclaimed, but if it is then it should not.

If you look at a file server, then the active and in-guest figures should be closer.

Do you want vROps to be more accurate and protect you better from resizing a VM too small, because it now looks in-guest rather than at active memory? Or would you rather it be less accurate but give you better-looking numbers to take back to your bosses, at the risk of causing performance problems?

bjp106
Contributor

Can someone please advise how to enable the Memory|Guest Active Memory (%) metric? I do not see it even as an available metric to choose from. Thanks!

nmanm0305
Enthusiast

You may have to enable it in your policies.

pastedImage_0.png

If it's already enabled but you're not seeing it, try selecting a specific VM to force it to show.

pastedImage_1.png

pastedImage_2.png

Once you select a specific VM, the metric should be available.

pastedImage_3.png

bjp106
Contributor

That did the trick for me. Thank you

jengl
Enthusiast

Hi @ll,

I also think the big problem here is that cached memory is seen as used memory. There should be two different counters (used and cached) and maybe one summed up (used+cached).

The calculations should definitely be derived from used memory, without cached memory!

Iwan Rahabok: Maybe this is something you can prioritize internally with the vROps dev team.

Thanks!

jengl

paurgie
Contributor

I'm with the "this is broken" crowd.   Maybe it works ok in "not windows" but since we have a high proportion of Windows guests, it breaks many metrics.

- All of the canned sizing metrics are useless now

- Memory monitoring is basically non-functional.   We noticed that all of our SAP app servers are now complaining about high memory utilization and their heatmap is shades of crimson.  The date chart is all red now where you used to be able to see exactly when most people got into the office based on the application server's graph, for example.

- Troubleshooting basically has to ignore memory now, because it's always high.

It's all fine and dandy if they change the metric calculation, but it shouldn't break a good portion of the "out of the box" functionality.  At this point, it's a bug, whether it's an implementation bug or a design bug, but the service is not working like it's described "on the box."

PaulFreedman
Enthusiast

Has anyone raised this with VMware?

Is there a way to change the Memory Usage metric to use the Guest Active Memory % metric so that all of the dashboards/alerts become relevant again?

nmanm0305
Enthusiast

Manually switching to the "Guest Active Memory %" metric on an ad hoc basis was the workaround recommended to me by a VMware consultant (pending an actual patch). I haven't seen an official KB article for it.

The canned alerts are bothersome. We're simply living with it for now, having informed all consumers of the alerts.

TacoSauce
Enthusiast

This is also the feedback I got from GSS:

Would you mind following the recommendation from the vROps 6.7 release notes and seeing how it goes, please?

https://docs.vmware.com/en/vRealize-Operations-Manager/6.7/rn/vRealize-Operations-Manager-67.html

Note: If interested in the older Memory|Usage (%) metric of virtual machines, which was based on active memory, use the Memory|Guest Active Memory (%) replacement metric. This out of the box metric is disabled and first needs to be enabled in the corresponding policy of a virtual machine.

silus
Enthusiast

The problem with this change is that it is not looking only at active memory from a guest perspective; it is also counting cached memory.

Most modern OSes will cache things in memory and only empty the cache when something requires the space.

So the hypervisor-level 'active' memory was a useless metric, and this metric is also useless, because it should not include cached memory. With this metric, most modern servers look like they have no free memory.

JimKnopf99
Commander

After upgrading to 6.7, all of our VMs show 100% memory usage. Not only some, ALL.

The memory metrics are useless.

And now I have to open a case...

Frank

If you find this information useful, please award points for "correct" or "helpful".
silus
Enthusiast

Is anyone from VMware going to comment on this car crash of a release?

I can't believe it made it out the door in this state.

daphnissov
Immortal

(cc: sunnydua201110141​)
