Hello,
Before I applied DRS to my lab, I could power on about 80-100 VMs per host before the host's SC would go unreachable. Why it goes unreachable, I haven't the slightest clue. I increased its memory to 800 MB, but the problem persists. Anyway, with DRS now enabled, I can only power on around 70-80 VMs per cluster.
Are VMs handled differently with DRS enabled than without? I didn't think DRS would look at the resources a VM requires in order to power on. I would rather it power on and contend for resources from the resource pool even if the resources aren't there.
Since VMotion does require some resources on the hosts, I wonder whether that is taken into account when DRS is enabled, thus reducing the resources available to start VMs. Also, when you increased the SC memory, I hope you increased the swap space as well, if it was not already at 1.6 GB.
My resource pools are configured to use 85% of the available virtual machine resources, which leaves 15% for VMotion and all of those goodies. The thing is, when I power on all of the VMs in blocks of 5-10 at a time, the service console becomes unavailable and ESX core dumps out the rear.
Oh yes, the swap file is 2 GB on each ESX host.
Would it matter that I'm using the software initiator for iSCSI? Before DRS was enabled, I ran the VMs on the hosts' local disks rather than the SAN. I don't know; it's an option and a possibility I'm looking into.
Edit: I have had this issue on both a fresh install of 3.5 and on one of my test hosts running 3.5.
Did you enable HA? If so, that could explain it, because HA uses a "slot size" to determine how many VMs to allow to start.
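To make the slot-size idea concrete, here is a toy sketch (illustrative only, not VMware's actual implementation; all numbers are hypothetical) of how a single large reservation can shrink the number of VMs a host is allowed to start:

```python
# Toy model of HA "slot size" admission control. The slot is sized by the
# largest CPU and RAM reservations across the powered-on VMs, so one VM
# with a big reservation caps how many slots every host can offer.

def slot_size(vms):
    """Slot = largest CPU and largest RAM reservation across all VMs."""
    cpu_slot = max(vm["cpu_res_mhz"] for vm in vms)
    mem_slot = max(vm["mem_res_mb"] for vm in vms)
    return cpu_slot, mem_slot

def host_slots(host, cpu_slot, mem_slot):
    """Slots a host can hold, limited by the scarcer resource."""
    return min(host["cpu_mhz"] // cpu_slot, host["mem_mb"] // mem_slot)

vms = [
    {"cpu_res_mhz": 256, "mem_res_mb": 512},
    {"cpu_res_mhz": 500, "mem_res_mb": 256},  # largest CPU reservation
]
cpu_slot, mem_slot = slot_size(vms)           # (500, 512)
host = {"cpu_mhz": 8000, "mem_mb": 16384}
print(host_slots(host, cpu_slot, mem_slot))   # prints 16
```

The point of the sketch: 16 slots on an 8 GHz host, even though 32 VMs would fit by memory alone, because the CPU slot is the scarcer resource.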
Without HA, the basic admission control policy is something like this:
- Are all named resources available (e.g. port groups, .vmdk files, RDM volumes, etc.)?
- Are all "VM local" devices available (e.g. CD-ROM, floppy, serial/parallel ports, etc.)?
- Is there a Reservation (CPU or RAM) assigned to the VM?
  - If yes: can the Reservation be satisfied in the target Resource Pool? If not, is Expandable Reservation enabled? If ER is on, can the Reservation be satisfied somewhere in the RP ancestry?
- Can vmkernel swap space be allocated to satisfy (RAM Allocation) - (RAM Reservation)?
If any of those conditions (that's all I can think of off the top of my head...) cannot be satisfied on a single host, the VM isn't allowed to start.
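The checklist above can be sketched roughly as follows (a hedged illustration with a made-up data model, not the actual ESX code; the dictionary keys and the `can_power_on` helper are my own invention):

```python
# Hypothetical sketch of the power-on admission checks listed above.

def can_power_on(vm, host, pool):
    # 1. Named resources (port groups, .vmdk files, RDM volumes) reachable?
    if not all(r in host["resources"] for r in vm["named_resources"]):
        return False
    # 2. VM-local devices (CD-ROM, floppy, serial/parallel) available?
    if not all(d in host["devices"] for d in vm["local_devices"]):
        return False
    # 3/4. Reservation satisfiable in the target pool, walking up the
    #      ancestry only while Expandable Reservation is enabled.
    need = vm["reservation_mb"]
    p = pool
    while p is not None:
        if p["unreserved_mb"] >= need:
            break
        p = p["parent"] if p["expandable"] else None
    else:
        return False  # no ancestor could satisfy the reservation
    # 5. Can vmkernel swap cover (RAM Allocation) - (RAM Reservation)?
    swap_needed = vm["allocation_mb"] - vm["reservation_mb"]
    return host["free_swap_mb"] >= swap_needed

root = {"unreserved_mb": 4096, "expandable": False, "parent": None}
child = {"unreserved_mb": 0, "expandable": True, "parent": root}
host = {"resources": {"pgA"}, "devices": {"cdrom"}, "free_swap_mb": 2048}
vm = {"named_resources": ["pgA"], "local_devices": ["cdrom"],
      "reservation_mb": 1024, "allocation_mb": 2048}
print(can_power_on(vm, host, child))  # prints True
```

In the usage example, the child pool has nothing unreserved, but because Expandable Reservation is on, the check walks up to the root pool and succeeds there.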
Ken Cline
Technical Director, Virtualization
VMware Communities User Moderator
There are no reservations; there is, however, a resource pool limiting the allocated resources. A virtual machine shouldn't have a reservation by default. It's based on share percentages and resource entitlement, I thought.
I've run into a new issue at the moment where DRS registers the virtual machines on the 2 beefy servers in the cluster, well over 130 VMs each. That's enough to completely exhaust the vCPU limit in ESX.
There are no reservations; there is, however, a resource pool limiting the allocated resources. A virtual machine shouldn't have a reservation by default. It's based on share percentages and resource entitlement, I thought.
If you have no reservations at the VM level, then Shares will come into play during periods of contention and will give priority access to VMs with higher share allotments.
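As a toy illustration of that proportional behavior (illustrative arithmetic only, with made-up numbers; not how the scheduler is implemented), each VM's entitlement under contention scales with its fraction of the total shares:

```python
# Proportional-share arithmetic under contention: each VM is entitled to
# capacity * (its shares / total shares) when all VMs demand more than that.

def entitlements(capacity_mhz, shares):
    total = sum(shares.values())
    return {vm: capacity_mhz * s / total for vm, s in shares.items()}

print(entitlements(6000, {"vmA": 2000, "vmB": 1000}))
# prints {'vmA': 4000.0, 'vmB': 2000.0}
```

So vmA, with twice the shares of vmB, gets twice the CPU when the host is contended; when there is no contention, shares don't constrain anyone.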
I've run into a new issue at the moment where DRS registers the virtual machines on the 2 beefy servers in the cluster, well over 130 VMs each. That's enough to completely exhaust the vCPU limit in ESX.
I'd suggest opening a support ticket on this one. DRS should not violate one of the basic system constraints by assigning a VM to a host with no available vCPUs!
Ken Cline
Technical Director, Virtualization
VMware Communities User Moderator
Hi FLeiXuis, Can you confirm that DRS is trying to start a VM on a host that has more than the maximum number of vCPUs already allocated? If that's the case, I'll follow up with product management. Thanks, Robert
Robert Dell'Immagine, Director of VMware Communities
I can confirm that this is exactly what is happening. Is there anything you need from me to validate this as well? It's a real pain to see over 130 VMs registered to a single server in my cluster while the others have around 30-40.
EDIT:
I'm not at work right now, but I will most definitely take a lot of screenshots for you tomorrow.
I pinged product management. Thanks for reporting this. - Robert
Robert Dell'Immagine, Director of VMware Communities
It's a real pain to see over 130 VMs registered to a single server in my cluster while the others have around 30-40.
Just an FYI... DRS is not in the business of balancing workloads. Its job is to eliminate or minimize resource contention. If there's no contention for resources on the hosts that have lots of VMs, then DRS has nothing to do. What you're seeing is a case where DPM could possibly do you some good by consolidating the workloads onto as few hosts as possible, allowing you to put a couple of hosts to sleep.
Ken Cline
Technical Director, Virtualization
VMware Communities User Moderator
This might be a bug. Please enter a support request and reference PR# 317033.
Thanks,
Ulana
DRS Product Manager
The big problem is that DRS is allowing VMs to power on a host that doesn't have enough vCPUs. Looking at the cluster, I have 3 older Dell 6850s and 2 newer Dell 6850s. The 2 newer ones are much faster, of course, so DRS loads them up with a lot of VMs at power-on and through migrations. The other 3 hosts have 20-30 VMs at any given time while those 2 have 130+.
That's what I thought you were saying... Robert will address this with the product management team, because that should not be happening. It would probably be helpful if you could open a support request and include a link to this thread in it. Neither VMotion nor DRS should be violating a design constraint of the hypervisor, and that's what it sounds like they are doing.
Ken Cline
Technical Director, Virtualization
VMware Communities User Moderator