I'm constantly running into situations where vC Ops tells me that, from an active memory perspective, a VM is happy, but if you look at the actual VM it's not happy. Example below:
In this case, VMware suggests we remove 3.2 GB. I understand the recommendation is not always gospel, but I need someone to help explain where I should go with these types of situations, how to better interpret the true usage, and how to tell whether more memory is required.
What is the configured memory for this VM? Demand will always be lower than usage unless there is contention. Typically, waste recommendations take your configured memory, subtract the memory demand, smooth/filter that data using the oversized VM criteria, and then produce a memory reclamation recommendation. Obviously that is simplified (reservations, etc. change the logic a little), but that's the general idea.
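To make that general idea concrete, here is a minimal sketch. It is not the actual vC Ops algorithm: the function name, the use of peak demand as a stand-in for the smoothing/filtering step, the `headroom_fraction` parameter, and the sample numbers are all assumptions for illustration, and reservations are ignored.

```python
def reclaimable_memory_gb(configured_gb, demand_samples_gb, headroom_fraction=0.0):
    """Rough estimate: configured memory minus (peak demand plus headroom).

    Hypothetical helper. Peak demand stands in for the real
    smoothing/filtering against the oversized VM criteria.
    """
    if not demand_samples_gb:
        raise ValueError("need at least one demand sample")
    peak_demand = max(demand_samples_gb)
    needed = peak_demand * (1.0 + headroom_fraction)
    # Never recommend reclaiming a negative amount.
    return max(0.0, configured_gb - needed)


# Made-up example: 8 GB configured, demand samples in GB.
samples = [3.0, 4.8, 4.1]
print(round(reclaimable_memory_gb(8.0, samples), 2))
```

The point of the sketch is that the recommendation is driven by demand as the hypervisor sees it, not by what the guest OS reports as used.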
I am facing something similar. We are checking our VMs with Nagios, and there I have a memory warning for a VM. But vC Ops tells me to reclaim 1 GB of memory.
I am not very happy with the reclaimable waste calculation.
Maybe I am doing something completely wrong. How do I check in vC Ops whether a VM needs more memory? I guess checking the demand for the machine should be the right way.
How do I explain to an admin, when Nagios tells him the machine has memory issues, that vC Ops doesn't see the same thing? Whom should I believe? It's baffling to me.
As an example: my VM has 4 GB of memory configured. It's a Linux machine and has a maximum demand of 3 GB. But my admins want more memory because Nagios told them that the machine does not have enough memory.
I can't predict what perf counters Nagios is looking at, so you'd have to tell me that end of it before I could have some insight into the counters' meaning. I'd suggest looking at the process level to see what is consuming your memory. It is possible that you've got processes with reserved / heap memory pools, which reduce the memory otherwise available to your system. This reserved/pool memory can be inactive, which means it is going to show up as inactive in vSphere/vCenter and thus be considered "waste" by vC Ops. There's not much you can do about this today in the way of the native vC Ops calculations. To get a more app-specific calculation of free memory on a virtual machine, you'd need to analyse the Guest-OS/App-level memory instead of the virtual hardware memory. Today, the vSphere UI only supports vCenter metrics. In the future that limitation won't apply, but we need to look at things in "today" terms. If you want to analyse the Guest-OS/App memory (from a 3rd-party data source in vC Ops), you'd need to do this in the Custom UI using things like Top-N and whatnot.
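As a rough illustration of what a guest-level view looks like on a Linux VM, the sketch below parses `/proc/meminfo`-style text and separates memory the kernel holds as buffers/cache from the rest. The field names (`MemTotal`, `MemFree`, `Buffers`, `Cached`) are standard `/proc/meminfo` fields; the function names and the sample numbers are made up for the example, and this says nothing about which counters your Nagios check actually reads.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def guest_memory_summary(meminfo):
    """Split guest memory into free, buffers/cache, and the remainder (kB)."""
    total = meminfo["MemTotal"]
    free = meminfo["MemFree"]
    cache = meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "buffers_cache_kb": cache,
        "used_excluding_cache_kb": total - free - cache,
    }


# Hypothetical sample for a 4 GB guest; on a real VM, read /proc/meminfo.
SAMPLE = """MemTotal:        4194304 kB
MemFree:          204800 kB
Buffers:          102400 kB
Cached:          1945600 kB
"""
print(guest_memory_summary(parse_meminfo(SAMPLE)))
```

A monitoring check that alerts on a low `MemFree` alone can report a problem even when a large share of the memory is cache, which is one way the guest-level and hypervisor-level views can disagree.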
Nagios is only showing the SNMP information that it gets from the OS. So there is nothing special in what Nagios is showing.
I tried to dig deeper into memory management and vC Ops. Therefore I installed the Hyperic agent on one of our Linux servers. With the Hyperic agent, I am able to see more memory values than with vC Ops.
It shows me a better view of my swap usage as well as the memory usage itself. With that information, I was able to change the swappiness setting of my Linux machine, so that it starts swapping only after using 90% of the allocated memory.
After that, the swap usage went down significantly.
With vC Ops, I am not able to see the swap usage. The swap in and swap out rates were always 0 KB. But I am not sure if those values are comparable.
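For anyone who wants to check the guest-side numbers without an agent, here is a small sketch that computes swap usage from `/proc/meminfo`-style text and reads the current swappiness value. `SwapTotal`, `SwapFree` and `/proc/sys/vm/swappiness` are standard on Linux; the function names and sample values are only illustrative.

```python
def swap_usage_percent(meminfo_text):
    """Return guest swap usage as a percentage, or 0.0 if no swap is configured."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    total = values.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    used = total - values.get("SwapFree", 0)
    return 100.0 * used / total


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the current vm.swappiness value (Linux only; not called here)."""
    with open(path) as handle:
        return int(handle.read().strip())


# Hypothetical sample; on a real guest, pass the contents of /proc/meminfo.
SWAP_SAMPLE = """SwapTotal:       2097152 kB
SwapFree:        1572864 kB
"""
print(swap_usage_percent(SWAP_SAMPLE))
```

These are values measured inside the guest OS, so they are not necessarily the same thing as the swap in and swap out rates reported at the vCenter level.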
It's a pity that I could not see those metrics in vC Ops.