TL;DR: Don't worry about the RP structure. The actual allocation targets will flow to the VMs based on their location and their relative priority within the DRS group. It is up to you which VMs to place in which site\VM group. If you distribute the VMs in an unbalanced manner, you will end up with contention, simply because DRS cannot move the VMs to a host with enough resources.
Interesting question, and fascinating to see your observation that RPs are not site-aware. Luckily for you, this lack of site-awareness will not impact your design. The reason is that resources are distributed on a consumer-based mapping (it's all in the cluster deep dive book, btw).
So let's start from the beginning. Shares only matter when contention occurs. If there is no contention, resources are distributed to the VMs that request them, i.e. resource allocation = demand.
Now when contention occurs, that's where the resource pools claim their stake of the cluster resources and distribute it across their children (the VMs inside the RP). Andreas already pointed out the scalable shares deep dive.
To answer your question, we need to take a look at how a cluster resource pool structure works inside a distributed system. A cluster accumulates the resources of all the hosts in the cluster and pools them at the root level (the cluster). This becomes the cluster capacity. When it divvies up the resources, it does this based on entitlement. Entitlement is a combination of the resource allocation settings (reservation, limit, shares), the VM's activity, and the activity in the cluster. From these, it calculates a relative priority. We introduced relative priority because we understand that not every workload is 100% active all of the time. This sets us apart from every other scheduler out there. It is what makes it so cool, but also difficult to understand.
To give an example: when I have 3 VMs, each identically configured, each is entitled to 33% of the resources. But this calculation is only true if all 3 VMs are active at the same time, on the same resources, at the same rate. We call this a worst-case scenario allocation. It helps to explain how shares work, but in reality it won't happen often. In a more likely scenario, one VM is 80% active, one is 40% active, and one is idling. That means there are two VMs competing for resources. If there are enough resources to divvy up between the two VMs, one VM gets resources based on its 80% activity and the other based on its 40% activity. If there aren't enough resources, they have to be divided. How? The resources are shared between the two competing VMs based on their active shares, which means they are now competing at a 50% rate. Each is entitled to 50% of the resources, but because one VM is only 40% active, it takes the resources it needs and the remainder can be consumed by the VM that is 80% active. The actual allocation of resources is done by the host-local CPU and memory schedulers. Based on the entitlement of the VM, time is allocated for the VM to consume those resources.
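To make the 80% / 40% / idle example concrete, here is a minimal Python sketch of share-proportional division with demand capping. It is only an illustration of the reasoning above, not how DRS or the host-local schedulers are implemented. The function name `divvy`, the share value of 1000, and the capacity of 100 units are assumptions made up for the example.

```python
def divvy(capacity, vms):
    """Split capacity across VMs by shares, never giving a VM more than it demands.

    vms maps a VM name to (shares, demand). Hypothetical model for illustration only.
    """
    alloc = {name: 0.0 for name in vms}
    # Idle VMs (demand 0) do not compete, so their shares are not counted.
    active = {name for name, (_, demand) in vms.items() if demand > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_shares = sum(vms[name][0] for name in active)
        # Offer every competing VM its share-proportional slice of what is left.
        offers = {name: remaining * vms[name][0] / total_shares for name in active}
        satisfied = {name for name in active
                     if vms[name][1] - alloc[name] <= offers[name] + 1e-9}
        if not satisfied:
            # Everyone wants more than its slice: pure share-based split.
            for name in active:
                alloc[name] += offers[name]
            remaining = 0.0
            break
        # VMs that need less than their slice take what they need;
        # the leftover is redistributed in the next round.
        for name in satisfied:
            need = vms[name][1] - alloc[name]
            alloc[name] += need
            remaining -= need
        active -= satisfied
    return alloc


# Three identically configured VMs on a host with 100 units of capacity.
result = divvy(100, {"vm1": (1000, 80), "vm2": (1000, 40), "vm3": (1000, 0)})
print(result)
```

With these assumed numbers, vm2 takes the 40 units it needs, vm1 consumes the remaining 60, and the idle vm3 gets nothing. If all three VMs demanded more than a third of the capacity, each would end up with an equal third, which is the worst-case scenario allocation described above.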
And this is the key to mapping a distributed hierarchy (resource pools) across hosts within the cluster. When a resource pool structure is created, a tree structure is created and the VMs are the leaves. When a VM is placed on a host, that leaf has to hang from something, so only that part of the tree is replicated to that host. Say you have 2 VMs, each in its own resource pool attached to the root. If each VM is placed on a separate host, one host will have one part of the tree replicated to it, and the other host will have the other part. Once this is done, DRS calculates the entitlement of the VMs and sends the information over to the hosts. In this story, you can replace these two hosts with your sites.
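The partial replication of the tree can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the actual DRS data structure: the tree is a plain parent map, and the names `host_local_trees`, `rp-a`, `rp-b`, `host1`, and `host2` are hypothetical.

```python
def host_local_trees(parent, placement):
    """Return, per host, the part of the resource pool tree that host needs.

    parent maps each node to its parent (the root maps to None).
    placement maps each VM (a leaf) to the host it runs on.
    """
    trees = {}
    for vm, host in placement.items():
        nodes = trees.setdefault(host, set())
        node = vm
        # Walk from the leaf up to the root, copying only that branch.
        while node is not None:
            nodes.add(node)
            node = parent[node]
    return trees


# Two VMs, each in its own resource pool attached to the cluster root.
parent = {
    "cluster": None,
    "rp-a": "cluster",
    "rp-b": "cluster",
    "vm1": "rp-a",
    "vm2": "rp-b",
}
placement = {"vm1": "host1", "vm2": "host2"}

for host, nodes in sorted(host_local_trees(parent, placement).items()):
    print(host, sorted(nodes))
```

In this sketch, host1 only ever sees the cluster / rp-a / vm1 branch and host2 only the cluster / rp-b / vm2 branch, which is why the resource pools themselves do not need to know anything about sites.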
The trick to understanding this is the relative entitlement and the possibilities DRS has to actually find a host where the VM can allocate the resources it is entitled to. And that's where you play a role, because you have introduced a constraint. If a VM is entitled to X, the host-local scheduler attempts to provide those resources, but if contention occurs, it might not be able to. DRS then attempts to find another distribution of VMs so that the resources flow to the VMs that are entitled to them. With the VM groups aligned to the site structure, you've instructed DRS to only move the VMs around the four hosts in each site, so there is a limited number of moves to make and a restricted set of possible placement combinations. The resource pool structure and the entitlement calculation are based on the actual placement of the VMs, and thus it is up to you which VMs to place in which site\VM group. If you distribute the VMs in an unbalanced manner, you will end up with contention, simply because DRS cannot move the VMs to a host with enough resources.
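A rough way to see the effect of the site constraint is to add up demand per site and compare it with that site's capacity. The Python sketch below does only that; it ignores shares, reservations, and per-host placement, and all names and numbers in it (`site_shortfall`, 400 units per site, the VM demands) are invented for illustration.

```python
def site_shortfall(site_capacity, vm_demand, vm_site):
    """Return, per site, how much demand exceeds the capacity DRS can use there."""
    demand = {site: 0 for site in site_capacity}
    for vm, units in vm_demand.items():
        # The VM group pins each VM to the hosts of one site.
        demand[vm_site[vm]] += units
    return {site: max(0, demand[site] - capacity)
            for site, capacity in site_capacity.items()}


# Two sites of four hosts each, assumed to provide 400 units per site.
site_capacity = {"site-a": 400, "site-b": 400}
vm_demand = {"vm1": 200, "vm2": 200, "vm3": 100, "vm4": 100}
unbalanced = {"vm1": "site-a", "vm2": "site-a", "vm3": "site-a", "vm4": "site-b"}

print(site_shortfall(site_capacity, vm_demand, unbalanced))
```

The total demand of 600 units fits easily in the 800 units the cluster has, yet the unbalanced grouping leaves site-a short while site-b sits half empty. DRS cannot fix that, because the VM groups do not allow it to move VMs across sites.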
In short, don't worry about the RP structure. The actual allocation targets will flow to the VMs based on their location and their relative priority within the DRS group.