VMware Cloud Community
TheVMinator
Expert

vCAC and HA/DRS Cluster resource pools

If compute cluster resource pools are used on HA/DRS clusters, how do I maintain those resource pools properly once vCAC is implemented?

Previously, when VMware admins provisioned all the VMs manually, they could always keep track of what VMs were going into what resource pools on a compute cluster.

The more VMs you add to a resource pool with 5000 shares, the fewer shares are available to each VM.  As an admin provisioning VMs directly without self-service tools in the picture, you can keep track of the shares, ratios and resulting processor and memory resources guaranteed to VMs during contention.
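The dilution is easy to see in numbers. A minimal sketch using the 5000-share pool from the example (nothing vSphere-specific, just the share arithmetic):

```python
def shares_per_vm(pool_shares: int, vm_count: int) -> float:
    """Each VM's relative entitlement during contention, assuming
    equal per-VM shares within the pool."""
    return pool_shares / vm_count

# The same 5000-share pool dilutes as VMs are added:
print(shares_per_vm(5000, 5))   # 1000.0
print(shares_per_vm(5000, 50))  # 100.0
```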

Now enter vCAC. Users can request their own VMs via self-service. The VMware admin comes in and 50 new VMs have shown up overnight. vCAC knows how much storage, processor, and memory were available, so everything is fine from that perspective. But unless I'm mistaken, vCAC has no way of tracking and maintaining the relative processor and memory share ratios between different VMs on the same compute cluster. An admin still has to maintain that manually. Worse yet, he now has to figure out what applications arrived in the new VMs from last night's vCAC provisioning and make sure the shares they are assigned during contention are proportionate to the shares assigned to VMs in other HA/DRS resource pools. I'm sure someone has a solution to this problem.

The VCAT specifies that:

"it is not considered a cloud if there are manual procedures that must be performed by the cloud administrator or service provider to provision cloud resources following a consumer request"

http://download3.vmware.com/vcat/vcat31_documentation_center/index.html#page/Introduction/1%2520Intr...

The idea here is that we shouldn't have self-service provisioning tied to manual processes that VMware admins have to do on their compute clusters after a VM is provisioned.

How do we both implement self-service for VM provisioning and NOT have to manually maintain compute cluster resource pool share values and ratios on HA/DRS clusters?

11 Replies
depping
Leadership

I guess the question starts with, why do you want to use resource pools in the first place? Are you expecting to overcommit your cluster? Do you need to provide resource guarantees (reservations) to a set of virtual machines? Or do you somehow need to limit a group of VMs?

abhilashhb
VMware Employee

I feel the whole point of moving to cloud makes it mandatory for an organization to have enough resources. If you are worried about shares on VMs and resource usage, it gets complicated to design; not that it's impossible, but it involves a lot of workarounds and custom tuning.

As far as performance is concerned, vCAC gets all the pools or clusters from vCloud Director, so you would have chosen the type of model you want and you will know the amount of resources available on the VDC. That's why vCAC gives a cloud admin the liberty to decide the number of VMs that can be created in a given cluster. You can set a maximum number of VMs created and also a quota per user. And if you are deploying VMs directly on vSphere, you will have to set the quotas accordingly.

It all depends on whether you want to overcommit your resources, or have enough resources and make sure VMs run without having to contend for them. It's all in the way you design it. I might not be an expert, but this is my personal take.

Abhilash B
LinkedIn : https://www.linkedin.com/in/abhilashhb/

TheVMinator
Expert

Duncan:

I think where we are at now is that we want to make sure we have planned for a contention scenario. Unfortunately we have not done enough current-state analysis to know if and when contention for resources would occur. We want critical production VMs to be guaranteed resources even during contention, whereas development VMs don't need that.

Admittedly the cart is before the horse, in that we don't know how close to overcommitment we will be; we need more capacity planning and current-state analysis, and that decision I can't control personally. Given that scenario, do you think we should simply not use resource pools on clusters at all?

Abhilash:

Thanks for input. 

TheVMinator
Expert

Also, to clarify: in this scenario vCloud Director is not being used. Only vCAC is being used.

TheVMinator
Expert

Also, more about the reasoning for resource pools. I think the reasoning was that, if we did get into a scenario where VMs were competing for processor or memory resources, the pools would ensure that critical VMs (an important SQL Server, say) are guaranteed resources and stay up. At this point we haven't done enough analysis to know if and when that would happen. Currently our clusters are not experiencing contention for RAM or CPU; however, many more VMs need to be added, and they could be eventually.

With vCAC, we can control how much of a cluster's resources are assigned to VMs during provisioning. That helps vCAC decide whether another VM can be provisioned on a cluster: if there is not enough RAM, it won't provision it. However, after vCAC provisions the VMs, it has no control over what happens if there is contention afterward. Once vCAC makes its provisioning decision, it is not monitoring memory usage in real time. If, after vCAC did its provisioning, someone adds VMs outside of vCAC, or VMs on the cluster start contending for memory for any reason, vCAC can't manage the contention.

Are we saying that if we are using vCAC, the best strategy is to keep cluster utilization low enough that there will never be contention for RAM or CPU? In theory at least, it seems that if you needed to overcommit your cluster it would be hard to use vCAC and resource pools together, because you would have to manually adjust your compute cluster resource pools based on what happened in vCAC. Cluster-based resource pools and vCAC are both great technologies, but if the goal is automation and maximizing consolidation ratios, do they work well together?

In my case, perhaps it is best not to use compute cluster resource pools? If so, that is important to understand, but I still wonder whether there is a plan, in general, to make vCAC merge the automation it accomplishes with the control that resource pools give during contention, without losing the value of the principle the vCAT talks about:

"it is not considered a cloud if there are manual procedures that must be performed by the cloud administrator or service provider to provision cloud resources following a consumer request" [like going in and adjusting resource pool share values on the compute cluster]

It seems the ideal solution (if it doesn't already exist) would be for the provisioning tool and the tool for managing CPU and RAM contention afterward (compute cluster resource pools) to work together: self-provisioning, plus an automated way of maintaining share-value ratios for provisioned VMs and controlling what happens during contention in the compute cluster. Is this currently possible, or a good idea for the future, or is my thinking off? Thanks for your input.

(Also, in case it wasn't clear: when I say "compute cluster resource pools" I mean the resource pools you configure on the HA/DRS cluster in vCenter, not the resource pool objects that appear in vCAC. Both are called "resource pools" but have different meanings.)

depping
Leadership

TheVMinator wrote:

Also, more about the reasoning for resource pools. I think the reasoning was that, if we did get into a scenario where VMs were competing for processor or memory resources, the pools would ensure that critical VMs (an important SQL Server, say) are guaranteed resources and stay up. At this point we haven't done enough analysis to know if and when that would happen.

I can understand where you are coming from, but this also means that if you carve up your cluster into pools, and those pools will fight for resources among each other, you will need to make sure you configure the shares correctly. Simply using High / Medium / Low is not going to work when the number of VMs isn't equally balanced, which it typically isn't.

So yes, you can use vCAC to deploy your VMs. Yes, you can use resource pools if you expect to be overcommitting or can't afford to take the risk. If you do:

Write a script which configures shares on your resource pools based on the number of VMs in each pool and its relative priority. An example can be found here:

http://www.yellow-bricks.com/2010/02/24/custom-shares-on-a-resource-pools-scripted/
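The linked example is a PowerCLI script; the core idea can be sketched in a few lines of Python (the pool names and the 2:1 priority weights here are hypothetical, not taken from the script):

```python
# Set each pool's shares proportional to (VM count x priority weight),
# so per-VM entitlement stays in the intended ratio no matter how
# unevenly the pools fill up.
PRIORITY_WEIGHT = {"high": 2, "normal": 1}  # hypothetical 2:1 ratio

def rebalance(pools):
    """pools: {pool_name: (priority, vm_count)} -> {pool_name: shares}"""
    return {name: PRIORITY_WEIGHT[prio] * count * 1000
            for name, (prio, count) in pools.items()}

shares = rebalance({"Prod": ("high", 35), "Dev": ("normal", 5)})
print(shares)  # {'Prod': 70000, 'Dev': 5000} -> 2000/VM vs 1000/VM
```

Run on a schedule (or after provisioning), this keeps each high-priority VM at exactly twice a normal VM's entitlement, regardless of VM counts.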

TheVMinator
Expert

OK, thanks again. This script looks like a great way to automate the maintenance of the share ratios between resource pools. However, I still see a problem, and a possible solution.

Problem:

I define what I want my ratios to look like between resource pools. For example, I have a high-priority resource pool with 10000 shares and a low-priority resource pool with 5000 shares.

Both resource pools start with 5 vms each.

Then another team uses vCAC to provision 30 VMs into the "high" resource pool overnight. I run my "respoolshares.ps1" script once a week; however, this is Monday and I don't run my script until Friday. In the meantime, I have this scenario in vSphere:

35 VMs in the "high" resource pool getting 10000 shares / 35 VMs, or about 285 shares per VM

5 VMs in the "low" resource pool getting 5000 shares / 5 VMs, or 1000 shares per VM

Then on Tuesday contention occurs. My "high" VMs end up getting about 28% of the resources of my "low" VMs, and go down. My resource pool design was defeated before I could run my weekly script on Friday.
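The arithmetic behind this inversion, using the same numbers:

```python
high_per_vm = 10000 / 35  # ~285.7 shares per "high" VM
low_per_vm = 5000 / 5     # 1000 shares per "low" VM

# Each "high" VM is entitled to only ~0.2857 (about 28%) of what a
# "low" VM gets -- the opposite of the intended priority:
print(high_per_vm / low_per_vm)
```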

Once I turn my environment over to self-provisioning, I can't possibly keep track of everything that goes on through vCAC anymore. The bigger it is, and the more teams that can self-provision, the harder it is to know when I have to run my script to keep my share values from being skewed.

vCAC has its own version of "resource pools" that do help make sure a VM isn't provisioned on a cluster without enough disk space, CPU, or RAM, but it doesn't maintain share values between compute-cluster resource pools or adjust them after VMs are provisioned.

Solution:

Instead of managing resource pools entirely within vCenter Server, what if I could tell vCAC to reach into vCenter Server, determine the share ratios between resource pools, and dynamically adjust them during provisioning, the same way my "respoolshares.ps1" script does?
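Conceptually, that just means running the rebalance as a post-provisioning step instead of on a schedule. A rough sketch (the hook name and the flat 1000-shares-per-VM weighting are hypothetical; in practice this would be a provisioning-workflow stub driving PowerCLI):

```python
# Hypothetical hook: after every provisioning request, bump the pool's
# VM count and recompute all pools' shares before contention can occur.
def on_vm_provisioned(vm_counts, pool_name):
    vm_counts[pool_name] = vm_counts.get(pool_name, 0) + 1
    # shares proportional to VM count (priority weights omitted)
    return {name: count * 1000 for name, count in vm_counts.items()}

counts = {"high": 5, "low": 5}
print(on_vm_provisioned(counts, "high"))  # {'high': 6000, 'low': 5000}
```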

It seems that in the long-term evolution of cloud and self-provisioning there has to be a way to decouple resource pool management from vCenter and allow the provisioning tool to keep the share values in proportion. Once my self-provisioning jobs became frequent enough (5, 10, 20 a day), it would be impossible to always run a script against vCenter Server to maintain my compute cluster resource pools. If a solution doesn't eventually develop to allow self-provisioning without manual maintenance of resource pools, resource pools would cease to be usable in the context of mass self-provisioning.

Even if it were possible to be diligent enough with running scripts to constantly keep every set of resource pools on every cluster in its proper share ratios, then if I am trying to move to a true cloud, it still seems the principle of cloud computing is being violated, which says:

"it is not considered a cloud if there are manual procedures that must be performed by the cloud administrator or service provider to provision cloud resources following a consumer request" [in this case, running scripts to adjust resource pool share values on the compute cluster after self-provisioning takes place]

Am I thinking correctly that VCAC should eventually work more smoothly with compute-cluster resource pools, or is there a different way of thinking about this?

Thanks for input


ShibbyB
Enthusiast

If you would like to use resource pools and leverage the script Duncan provided, you could designate the resource pools within each of the virtual reservations you create within vCAC. If the rebalance script is not so intensive that you wouldn't want it running regularly, you could have it called using the Workflow Designer; DailyHypervisor has an example of calling a PowerCLI script to set cores instead of physical CPUs, which could probably just as easily call the rebalance script when a new system is provisioned. You could also simply schedule it to run on a regular basis if that is easier.

I think vCAC is a semi-vendor-agnostic workflow engine that works on the assumption that there is a finite set of compute power you are allocating to your business units through reservations. If you create reservations in such a way that users consuming what has been allocated to them will cause resource contention, then the hypervisor of choice will use its native mechanisms for dealing with that contention.

When I think of offering a cloud-type service and leveraging a tool like vCAC, there are some implied performance standards. If you are planning for resource contention and building a solution around that, it seems like a difficult pitch to get people to use the environment. I would start by looking at capacity management and what it looks like in a cloud solution: how do we trend the usage of our resources and ensure we can meet expected growth?

That being said, I do think there is an opportunity for vCAC to handle reservations with some different options. The static memory/storage reservation might be nicer if you could specify something like a percentage of a given cluster's resources, so that if you extended a datastore or added a host, you wouldn't have to go in and allocate that new space on each reservation (if you chose the percentage-type reservation).

TheVMinator
Expert

OK, thanks for the input. So it seems the general opinion is that if I'm moving to cloud, using vCAC, and doing self-provisioning, the normal design would be not to overcommit my cluster enough to require compute cluster resource pools. Is that the consensus? It does seem that tying the script run to each provisioning operation adds some clunkiness and an extra moving part, and is itself hard to sell to management as a required part of the design.

Is this what most people with vCAC, self-provisioning, and cloud are doing - moving away from compute-cluster resource pools?


depping
Leadership

TheVMinator wrote:

OK, thanks for the input. So it seems the general opinion is that if I'm moving to cloud, using vCAC, and doing self-provisioning, the normal design would be not to overcommit my cluster enough to require compute cluster resource pools. Is that the consensus? It does seem that tying the script run to each provisioning operation adds some clunkiness and an extra moving part, and is itself hard to sell to management as a required part of the design.

Is this what most people with vCAC, self-provisioning, and cloud are doing - moving away from compute-cluster resource pools?

I have also filed a feature request to support these scenarios in the future, meaning DRS would adjust shares for you based on the number of VMs and the priority you selected.

TheVMinator
Expert

OK, it is good to know that this might be on the roadmap; we'll keep our eyes open. In the short term, we might forgo resource pools and rely on capacity planning to prevent overcommitment. If this feature becomes available, though, we could configure our compute cluster resource pools to at least handle the possibility of overcommitment if it happens.
