Recently some of our application owners have started asking for servers with 8 vCPU and 32+ GB of RAM. I am trying to establish a policy on the maximum vCPU and RAM we will support, and I am looking for some justification for those limits. Without that, app owners will just keep asking for more and more.
For quite a while I have understood that there is more overhead to SMP VMs than to single-vCPU VMs. vSphere 5 was supposed to make some substantial improvements there, but you still have the issue of the scheduler having to concurrently find open slices on a bunch of cores for these SMP boxes to execute their instructions. Obviously, the more VMs (and the more SMP VMs) you run, the harder it is going to be for the scheduler to find slices of the physical cores available, and you will start seeing increased latency. Many blogs address this in detail, so I don't think I need to spend much more time on it.
We have two clusters right now. Our test/dev cluster is four hosts, each with two 4-core sockets (8 cores total) and 40 GB of RAM. Our production cluster is six hosts, each with two 6-core sockets (12 cores total) and 144 GB of RAM.
My first thought is that vCPU count should be less than the number of cores in one socket. Otherwise, I assume, each SMP instruction would need to utilize a full socket just for that one VM. I recall reading something about there being issues if a VM has more vCPUs than cores in a socket, because of multi-threading across physical sockets.
So I would suggest a max of 2 vCPUs in our test/dev cluster and 4 vCPUs in our production cluster. In addition, I would think there should be some justification required for even 4 vCPUs. Keep in mind these are generic rules and there will always be exceptions, for instance a main line-of-business application that you want to virtualize and plan to basically dedicate a host to. My guidelines, though, are for the average application owner whose department or business unit needs some new application, an upgrade, etc.
On the RAM side, I have not found any guidelines yet. My biggest concern is that if you allocate too much RAM to one VM, DRS will only allow a handful of VMs on that host, which puts more CPU pressure on the other hosts. On the other hand, if the application actually needs that RAM and you give it too little, you will cause unnecessary swapping. My initial gut reaction is that max RAM should be no larger than 1/2, or maybe 1/3, of the host's RAM, but there is no data to support this; it is just a gut feel.
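To make the gut feel concrete, here is a minimal sketch of what such a cap could look like as arithmetic. The 1/3 fraction and the hypervisor overhead figure are my assumptions, not VMware guidance:

```python
# Hypothetical sizing rule: cap any single VM's RAM at a fraction of the
# host's RAM so DRS can still place several VMs per host. The 1/3 cap and
# the 6 GB hypervisor/overhead reservation are assumed values, not vendor
# recommendations.

def max_vm_ram_gb(host_ram_gb, cap_fraction=1/3, overhead_gb=6):
    """Largest single-VM RAM allocation this rule would allow on one host."""
    usable = host_ram_gb - overhead_gb
    return usable * cap_fraction

# Applied to the two clusters described above:
print(max_vm_ram_gb(40))    # test/dev host (40 GB)
print(max_vm_ram_gb(144))   # production host (144 GB)
```

With these assumptions a production host would cap a single VM at roughly 46 GB, which would still admit the 32 GB requests, while the 40 GB test/dev hosts clearly could not host such a VM comfortably.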
So, I am looking for opinions. Do you have policies around max vCPU and/or RAM? Does anyone see any drawbacks or problems with setting limits?
That is a nice document, but it really doesn't answer any of the questions I posed.
First, it does not address sizing vCPU against pCPU. I think this matters, but maybe I am wrong. There are NUMA performance issues if a VM's vCPU count is greater than the number of physical cores in a socket. Additionally, I think the scheduler is going to have a harder time scheduling a 4 vCPU VM on a two-socket, 4-core (8 cores total) host than it would scheduling an 8 vCPU VM on a two-socket, 10-core (20 cores total) host, assuming both hosts have the same vCPU oversubscription, say 4:1.
So I think there should be some mathematical basis in how the scheduler works, showing how much harder it becomes to find free cores as the ratio of vCPUs to physical cores per socket climbs.
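As a very rough illustration of that math, here is a toy model of my own. It assumes strict co-scheduling (all vCPUs must find free cores at the same instant) and independently busy cores, neither of which is how the ESXi scheduler actually behaves, but it shows the direction of the effect:

```python
# Toy model (my own simplification, NOT how the ESXi scheduler really works):
# each physical core is independently busy with probability `busy_prob`, and
# a strict co-scheduler needs all `vcpus` vCPUs to land on free cores at once.
from math import comb

def prob_can_schedule(total_cores, busy_prob, vcpus):
    """P(at least `vcpus` of `total_cores` cores are simultaneously free)."""
    p_free = 1 - busy_prob
    return sum(
        comb(total_cores, k) * p_free**k * busy_prob**(total_cores - k)
        for k in range(vcpus, total_cores + 1)
    )

# 4 vCPU VM on an 8-core host vs. 8 vCPU VM on a 20-core host,
# both at an assumed 70% core utilization:
print(prob_can_schedule(8, 0.7, 4))
print(prob_can_schedule(20, 0.7, 8))
```

Under these assumptions the 8 vCPU VM on the 20-core host actually has a slightly better chance of being scheduled than the 4 vCPU VM on the 8-core host, which matches the intuition above: what hurts is the vCPU-to-core ratio, not the absolute vCPU count.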
I don't think that part of the conversation should be dismissed so quickly. Based on the linked PDF, I would assume it is theoretically possible to go through the exercise, determine that a 4 vCPU VM is running too hot, assign it a 5th or 6th vCPU on a 4-core socket, and then start hitting NUMA penalties so the VM actually gets slower, not faster.
My conversation here is not about right-sizing a VM; it is about determining the maximum limits acceptable before you degrade the performance of your farm, or of the VM itself, by over-allocating a VM (that may actually need those resources).
Another way of looking at this topic: if you determine a VM actually needs 8 vCPUs, then the real question is how many cores you need in your physical sockets to support that VM within a shared, over-subscribed VMware host.
Or am I just making too much of the limits of physical cores, and there is more black magic? Relaxed co-scheduling can only do so much. I have not read about it in depth, but I don't think relaxed co-scheduling can simply skip instructions for an idle vCPU. It may be able to schedule them less concurrently, but they still need to run their idle instructions. Regardless, if you are right-sizing anyway and you really do need an 8 vCPU box, relaxed co-scheduling is not going to help much. So you are back to the question: assuming you are right-sizing your VMs, and assuming you are over-provisioning your CPU at a fairly common 4-6:1 ratio or so, how many physical cores do you need to support that 8 vCPU VM?
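For what it's worth, here is the back-of-envelope arithmetic I would start from. The "VM fits in one NUMA node" rule and the 4:1 default ratio are my assumptions, not an official formula:

```python
# Back-of-envelope host sizing for a large VM (my own arithmetic, not an
# official VMware formula). Assumptions: the big VM should fit inside one
# NUMA node (so cores per socket >= its vCPU count), and the host's total
# schedulable vCPU budget is physical cores times the oversubscription ratio.

def host_sizing(big_vm_vcpus, sockets=2, oversub_ratio=4):
    cores_per_socket = big_vm_vcpus           # keep the VM on one NUMA node
    total_cores = sockets * cores_per_socket
    vcpu_budget = total_cores * oversub_ratio
    return cores_per_socket, total_cores, vcpu_budget

cps, total, budget = host_sizing(8)
print(f"{cps}+ cores/socket, {total} cores/host, ~{budget} vCPUs at 4:1")
```

By this reasoning, an 8 vCPU VM on a two-socket host wants at least 8-core sockets, i.e. a 16-core host, which at 4:1 would nominally carry around 64 allocated vCPUs, the 8 vCPU VM among them.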
Sharing my thoughts... I agree with you on NUMA penalties: max vCPUs per VM should be limited to the cores within a socket. "How many physical cores do you need to support that 8 vCPU VM?" To answer your question, first determine whether the VM really needs them. If we allocate 8 vCPUs to a VM and any one core is not available for scheduling, the remaining 7 vCPUs have to wait for that core to become available. This also degrades VM performance. More vCPUs, more risk. Below is a URL from Intel which shows vCPUs/core: http://www.intel.in/content/dam/www/public/us/en/documents/white-papers/virtualization-xeon-core-cou... To calculate the average vCPUs/core in your current environment: http://enterpriseadmins.org/blog/scripting/general-vsphere-cluster-countsaverages/
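The linked blog uses a PowerCLI script to pull those counts from vCenter; if you just want the ratio itself from numbers you already have, it is a one-liner. The allocated-vCPU figure below is a made-up placeholder, not data from either cluster:

```python
# The same ratio the linked script computes, from manual counts:
# total allocated vCPUs divided by total physical cores in the cluster.

def vcpu_per_core(vcpus_allocated, hosts, cores_per_host):
    return vcpus_allocated / (hosts * cores_per_host)

# e.g. the six-host, 12-core production cluster with a hypothetical
# 250 vCPUs allocated across all VMs:
print(round(vcpu_per_core(250, 6, 12), 2))
```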