I'm seeing some strange DRS activity in our lab cluster.
The DRS automation level is set to Fully Automated.
Two hosts are over 94% memory consumption while the remaining hosts are under 80%, yet the VMs on the busy hosts do not vMotion to the other hosts.
What's stranger is that DRS is still migrating VMs from the low-utilization hosts to the high-utilization ones.
As the screenshot below shows, two hosts (10.50.169.13/14) are over 90% memory usage, yet DRS keeps migrating VMs from the others to 10.50.169.13.
Is there a setting that would let me configure a trigger like "migrate VMs to other hosts whenever the current host is over 90% memory used"?
Can you please check the "Active" memory metric for this host on the Performance tab and let me know?
Click on the host >> Performance tab (in the VI Client) >> Advanced >> select Memory from the drop-down.
Do you have any DRS affinity rules active that your cluster needs to satisfy?
Also, please note that the goal of DRS is not to balance resources evenly across the cluster; its goal is to ensure that all VMs get the resources they need.
So if your VMs are happy and getting their configured resources on a host at 97% RAM usage, DRS can be perfectly fine with that.
When it comes to DRS cluster resources, I much prefer the overview in the attached screenshot. You can find it in the Web Client on the cluster's Summary tab.
Thanks for your comments.
I did not set any rules or affinity on this DRS cluster.
I am just concerned about the RAM over-utilization on a single host. Is there a way to balance memory across all hosts in a cluster?
Here is the summary of the cluster from the Web Client.
Your ESXi hosts are set up with 128 GB RAM each, correct? However, on the performance chart you provided I see ~167 GB of memory granted on the ESXi host - is over-committing the host intentional? Also, the high-memory VMs that were migrated could be CPU-intensive, and DRS may have been aiming for the best CPU ready time inside the cluster.
Can you share a few more details on your VMs and the cluster setup? A screenshot of the "Resource Distribution Chart" for the cluster would also help us clarify the issue further.
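For what it's worth, the over-commit implied by those two numbers is easy to quantify; a trivial Python check (167 GB and 128 GB are the figures read off the posted chart):

```python
# Rough memory over-commit ratio from the chart above:
# ~167 GB granted on a host with 128 GB of physical RAM.
granted_gb = 167
physical_gb = 128
print(f"over-commit ratio ~ {granted_gb / physical_gb:.2f}x")  # ~1.30x
```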
Thanks a lot for your reply.
Yes, indeed we over-commit memory on all of our hosts; this is because we never have all VMs running together.
Normally we start one group and shut down another group.
As for the screenshot of the "Resource Distribution Chart" - where can I find it?
In vCenter 5.5, the memory utilization metric used by DRS is as follows:
X% * IdleConsumedMemory + ActiveMemory
By default X is 25 (the value can be modified via an advanced option).
ConsumedMemory = ActiveMemory + IdleConsumedMemory
In your case, consumed memory is approx. 130 GB and active memory is just 7.8 GB, so idle consumed memory is 130 - 7.8 = 122 GB (approx.).
The memory demand DRS considers is therefore 25% * 122 + 7.8 = 30.5 + 7.8 = 38.3 GB. That means DRS thinks only 38.3 GB of memory is being used on the host, and the remainder is available when there is a need to balance the load.
Based on the above calculation, the moves DRS recommended to those particular hosts are valid.
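That calculation is small enough to sanity-check in a few lines of Python (the helper name is mine; the 130 GB / 7.8 GB / X=25 figures come from this thread):

```python
def drs_memory_demand_gb(consumed_gb, active_gb, x_percent=25):
    """Memory demand DRS attributes to a host:
    X% of idle consumed memory plus active memory."""
    idle_consumed_gb = consumed_gb - active_gb
    return x_percent / 100 * idle_consumed_gb + active_gb

# Thread figures: ~130 GB consumed, ~7.8 GB active, default X=25.
demand = drs_memory_demand_gb(130, 7.8)
print(f"DRS-perceived demand: about {round(demand)} GB")  # roughly 38 GB
```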
I am still digging into the memory % shown in the first screenshot you attached.
This is interesting.
I just learned about this kind of calculation.
May I assume DRS will keep migrating VMs to that specific host until the memory utilization reaches 100%? What will happen then?
Is it possible for me to change X to 70 or 80 in my case? All of our VMs are memory-sensitive.
Yes, you can change X to any value from 0 to 100.
To adjust this setting, edit the cluster's DRS settings and set the advanced option for this parameter (in vSphere 5.5 it is called PercentIdleMBInMemDemand, if I remember correctly) to a value from 0 to 100, as per your need.
Before you make this change, please make sure you are clear on what this option does.
I am still digging into the memory utilization %; I will come back to you on that.
Since I see you use the Web Client, you can find it in the following section (we don't use the Web Client in our environment ... yet).
When memory utilization reaches 100%, the VMkernel will have engaged all of its reclamation techniques and will be swapping VM memory out to their .vswp files on the hard drives, depending on the VMs' shares and reservations. You should keep free memory above ~1925 MB (for 128 GB RAM) to avoid triggering the reclamation chain: TPS -> (1231 MB free) Ballooning -> (616 MB free) Compression -> (308 MB free) Swapping.
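Those thresholds come from ESXi's sliding-scale "minFree" calculation, which a short Python sketch can reproduce. The tier percentages (6/4/2/1%) and the state fractions (64/32/16% of minFree) are my recollection of VMware's ESXi 5.x memory-management documentation, not figures stated in this thread, so treat them as an assumption:

```python
def min_free_mb(host_ram_mb: int) -> float:
    """Free-memory floor the VMkernel tries to maintain (sliding scale)."""
    tiers = [
        (4 * 1024, 0.06),     # 6% of the first 4 GB
        (8 * 1024, 0.04),     # 4% of the next 8 GB (4-12 GB)
        (16 * 1024, 0.02),    # 2% of the next 16 GB (12-28 GB)
        (float("inf"), 0.01), # 1% of everything above 28 GB
    ]
    remaining, total = host_ram_mb, 0.0
    for size, pct in tiers:
        chunk = min(remaining, size)
        total += chunk * pct
        remaining -= chunk
        if remaining <= 0:
            break
    return total

ram_mb = 128 * 1024  # the 128 GB host from this thread
mf = min_free_mb(ram_mb)
print(f"minFree            ~ {mf:.0f} MB")        # ~1925 MB
print(f"soft (ballooning)  ~ {mf * 0.64:.0f} MB") # ~1232 MB
print(f"hard (compression) ~ {mf * 0.32:.0f} MB") # ~616 MB
print(f"low  (swapping)    ~ {mf * 0.16:.0f} MB") # ~308 MB
```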
Oh I see. Actually I use the VI Client as well; I only log in to the Web Client occasionally.
Swapping to hard disk is an unacceptable case in our environment, as we don't have any SSDs for cache.
Maybe it's time for me to review each VM to see if any of them are over-committed on RAM.
Is it possible for me to decrease the granted RAM for each VM based on its average active memory over the past weeks/months? That should fit the actual need.
Or maybe I should request budget to add more RAM to all hosts.
I think changing X from 25 to 80 or 100 will meet my requirement to migrate VMs to other hosts based on memory utilization %, especially when it runs over 90%.
Am I right about that?
From the chart you can clearly see that some hosts run many "low-memory" VMs while others run a few "large-memory" VMs - so the CPU load is offset by how much memory the VMs take on each cluster node.
My best bet is to go for a RAM upgrade if tuning the memory parameter doesn't help. If you want to "optimize" VM RAM usage, you would have to inspect each and every VM and compare its memory used at peak times against the consumed memory reported by the vSphere Client - pretty tedious given how many VMs you have.
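If you do attempt it, that per-VM comparison could at least be scripted rather than done by hand. A minimal sketch of the idea in Python, assuming you have already exported peak active and consumed memory per VM; the VM names, numbers, and the 25% headroom factor are invented for illustration, not from this thread:

```python
# Hypothetical exported stats: VM name -> (peak active MB, consumed MB, granted MB).
# All of these numbers are made up for illustration only.
vm_stats = {
    "web-01":  (1800, 7900, 8192),
    "db-01":   (14500, 15800, 16384),
    "test-07": (600, 3900, 4096),
}

HEADROOM = 1.25  # keep 25% above observed peak active memory (assumed policy)

for name, (peak_active, consumed, granted) in sorted(vm_stats.items()):
    suggested = int(peak_active * HEADROOM)
    if suggested < granted:
        print(f"{name}: granted {granted} MB, peak active {peak_active} MB "
              f"-> could likely shrink to ~{suggested} MB")
```

A VM whose suggested size is at or above its grant (like db-01 here) is left alone; the rest are shrink candidates to review manually.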
No - this setting applies to the entire cluster, and setting it to 100 makes DRS very conservative.
The X% term helps keep enough memory available on the host to avoid ballooning and swapping by the existing VMs on that host.
Refer to slides 13-16 of this presentation: VMworld 2013 - Performance and Capacity Management of DRS Clusters.
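To see why a high X makes DRS conservative, here's a quick back-of-the-envelope sweep in Python over a few example X values, using the figures quoted earlier in this thread (~130 GB consumed, ~7.8 GB active) and the formula given above:

```python
# DRS memory demand per the formula discussed earlier:
#   demand = X% * IdleConsumedMemory + ActiveMemory
consumed_gb = 130.0
active_gb = 7.8
idle_consumed_gb = consumed_gb - active_gb  # roughly 122 GB

for x in (25, 50, 85, 100):
    demand = x / 100 * idle_consumed_gb + active_gb
    print(f"X={x:3d}: DRS sees about {demand:.0f} GB of demand on this host")
```

At X=100 the demand equals the full consumed memory, so DRS treats the host as nearly full and has little room to place anything - the conservative behavior described above.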
Thanks for your advice.
It's now critical for us to consider adding RAM to the hosts.
We have around 400 VMs running, so it would be impractical for me to go through each one to adjust its memory allocation.
I've changed X from 25 to 85 to increase the weight of idle consumed memory.
I assume this will let DRS trigger migrations when any host's memory % runs high in the Cluster - Hosts view.
In any case, it must be time for us to get more physical memory for the hosts.
Again, I would like to say thank you for all your help and advice.
Good luck getting a sufficient budget for all those new memory sticks, and have fun upgrading the hardware!