VMware Cloud Community
Texiwill
Leadership

Update 2 VMware HA Oddness... Not enough resources

Hello,

I have an issue that is quite bizarre. My cluster consists of 2 DL380 G5s, each with 16 GB of memory and 2 quad-core pCPUs. I have VMware HA enabled, and I was trying to determine why failover did not occur when I lost one node. VMware HA is configured to allow up to 1 host failure, and VMs are allowed to start even if availability constraints are violated. Everything else is at the default settings.

I can start several VMs after a reboot of the single node, but after a bit of time I get 'Not Enough Resources'. This message persists until I bring the 2nd node back into the cluster, which seems very odd to me. This is exactly the situation I have VMware HA for, so why is it telling me I do not have enough resources when I clearly have plenty? I am currently using about 1 GHz out of 8 x 2.333 GHz and 5.33 GB out of 16 GB of memory.

I thought perhaps it was an issue with the HA cluster, so I rebuilt it per the KB Article (i.e. deleted the old cluster and created a new one, adding the host back in). I was able to boot VMs for a short while, then once more got into this state. I have also gone through and verified that the resources on all powered-on and powered-off VMs are set to the default and 'Normal' settings. One other item: I have 6 Resource Pools, all 'Normal & Expandable', plus 2 Resource Pools under one of them; all others have no child resource pools. It makes no difference in which resource pool I attempt to boot a VM; even the parent pool of the cluster gives the same error, 'Insufficient Resources'.

I just fixed the other node and brought it back in, and all is now fine, but this is not proper. I should still be able to boot VMs at any time, and I was able to do so until I upgraded to Update 2. This is indeed a puzzler for me: everything looks just fine and works for a short period after boot, which is in itself peculiar.

I can reproduce this by placing the 2nd node in Maintenance mode as well.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill

Accepted Solutions
jasonboche
Immortal

VirtualCenter 2.5 Update 3 was designed to resolve the issues. There are no new features released in this build - it's all bug fixes.






Jason Boche

VMware Communities User Moderator (http://www.vmware.com/communities/content/community_terms/)

Minneapolis Area VMware User Group Leader (http://communities.vmware.com/community/vmug/us-central/minneapolis)

VCDX3 #34, VCDX4, VCDX5, VCAP4-DCA #14, VCAP4-DCD #35, VCAP5-DCD, VCPx4, vEXPERTx4, MCSEx3, MCSAx2, MCP, CCAx2, A+

23 Replies
ThompsG
Virtuoso

Hi,

I believe they (VMware) have tightened the thumbscrews around HA in Update 2. If you read the Known Issues in the VMware Infrastructure 3 Release Notes, you can see that other issues have arisen around this.

Kind regards,

Glen

Texiwill
Leadership

Hello,

None of those issues address my concern. The 2nd host has no VMs, so the Enter Maintenance Mode issue is not a problem. There is only one SC network in use, so that is not an issue either. In fact, this is just not a known issue. The problem is that the system is reporting no resources when there clearly are plenty of resources with one node down, and the option to allow VMs to power on even if they violate availability constraints is also checked. In essence, I believe I should NEVER see this message.


Best regards,

Edward L. Haletky

weinstein5
Immortal

Unless I did not see it - do you have any reservations set either on the Resource Pools or VMs?

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
Texiwill
Leadership

Hello,

All reservations are whatever the defaults are at creation time, which implies that they are 'Normal' and Expandable. However, if I move the VM to the 'cluster', which is the top-level resource pool, the problem also appears. As stated, I have also gone through and made sure all the VMs have no reservations and everything is set to normal.


Best regards,

Edward L. Haletky

ThompsG
Virtuoso

Hi,

As mentioned, the link was not directly related to your problem; it was highlighting that there is an issue around HA, Update 2 and failover. To me, it appears that VMware has changed the parameters around HA, so that power-on can be refused even if you (believe you) have enough resources to run all VMs on one host. Update 2 seems to have lowered the threshold for resources, which is highly annoying.

We have experienced the same problem you have, but thankfully most of our clusters have more than 2 hosts, so it doesn't affect us as much as other customers.

Best regards,

Glen

Texiwill
Leadership

Hello,

Either way, I still cannot see a reason why this works sometimes and fails after a little bit of time. For example, if I restart hostd, it may allow one system to boot if I hit the proper time window. Likewise, if I join the system to a new cluster, it will work for a short period of time.


Best regards,

Edward L. Haletky

ThompsG
Virtuoso

Hi,

Not sure if you are still chasing this one or not, but I found this while browsing the forums tonight. It seems to give a better explanation as to what is happening here:

Partway down that post, it explains how VMware is now calculating failover.

I trust this helps rather than making it worse.

Kind regards,

Glen

Texiwill
Leadership

Hello,

Helpful information, but it does not really address the question. If I only have one node, and nothing is unlimited, I cannot boot a VM, which implies HA fails.


Best regards,

Edward L. Haletky

admin
Immortal

This is a known issue in U2 which will be fixed in a future patch/update release. The root cause of the issue in this thread and the one described in the other thread you mentioned is the same. The basic issue is that in an HA-DRS cluster, even when HA admission control is turned off (i.e. you're allowing VMs to power on even if it violates the HA constraint), DRS will still reserve some failover capacity for HA. This prevents VMs from being automatically evacuated off a host entering maintenance mode (though you should be able to migrate them manually) and, in some cases, prevents VMs from powering on.
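
To make the explanation above concrete, here is a minimal Python sketch of a toy admission check. It is not VMware's actual algorithm: the `Host` class, the `can_power_on` function, and the rule that one whole host's capacity is held back per tolerated failure are all assumptions made for illustration. The host sizes and usage figures come from the first post (2 hosts, 8 x 2.333 GHz and 16 GB each, about 1 GHz and 5.33 GB in use); the 256 MHz / 256 MB VM is a hypothetical example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_mhz: int
    mem_mb: int
    up: bool = True


def can_power_on(hosts, used_cpu_mhz, used_mem_mb, vm_cpu_mhz, vm_mem_mb,
                 failures_to_tolerate, reserve_failover=True):
    """Toy model: hold back the capacity of the largest N running hosts
    as failover capacity, then check whether the new VM still fits."""
    up = sorted((h for h in hosts if h.up),
                key=lambda h: (h.cpu_mhz, h.mem_mb), reverse=True)
    # Assumption: the reservation is taken even with admission control off.
    reserved = up[:failures_to_tolerate] if reserve_failover else []
    usable_cpu = sum(h.cpu_mhz for h in up) - sum(h.cpu_mhz for h in reserved)
    usable_mem = sum(h.mem_mb for h in up) - sum(h.mem_mb for h in reserved)
    return (used_cpu_mhz + vm_cpu_mhz <= usable_cpu
            and used_mem_mb + vm_mem_mb <= usable_mem)


# Figures from the first post: 8 x 2333 MHz and 16 GB per host.
hosts_both_up = [Host("esx1", 8 * 2333, 16 * 1024),
                 Host("esx2", 8 * 2333, 16 * 1024)]
hosts_one_down = [Host("esx1", 8 * 2333, 16 * 1024),
                  Host("esx2", 8 * 2333, 16 * 1024, up=False)]

# About 1 GHz and 5.33 GB in use; try to start a hypothetical 256 MHz / 256 MB VM.
print(can_power_on(hosts_both_up, 1000, 5458, 256, 256, 1))    # True
print(can_power_on(hosts_one_down, 1000, 5458, 256, 256, 1))   # False
print(can_power_on(hosts_one_down, 1000, 5458, 256, 256, 1,
                   reserve_failover=False))                    # True
```

Under these assumptions, a 2-node cluster with one node down has its only remaining host fully set aside as failover capacity, so every power-on is refused no matter how lightly loaded that host is. This matches the symptom reported in this thread, but the real calculation may differ.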

Texiwill
Leadership

Hello,

Any idea when such a patch should be forthcoming? This issue severely limits my functionality and is keeping many of my SMB customers from upgrading to U2 at the moment. Most of them want the per-VM failover....


Best regards,

Edward L. Haletky

zemotard
Hot Shot

I had the same error when my host had a hardware problem.

I hope a fix for this issue will be released soon ...

This Update 2 is really not perfect ...

Best Regards If this information is useful for you, please consider awarding points for "Correct" or "Helpful".
Randy_B
Enthusiast

Has there been any word from VMware on when a patch for this issue will be released?

Texiwill
Leadership

Hello,

None yet. Also I tried setting das.vmMinCpuMHz and das.vmMinMemoryMB to 0 and 1 with no change in behavior.


Best regards,

Edward L. Haletky

admin
Immortal

Hello,

VMware is aggressively working to release a fix ASAP, however we are unable to provide a timeline at the moment. Please do contact the VMware support team if you have further questions. Rest assured, this fix will be available soon!

Thanks,

The VMware Team

Texiwill
Leadership

Hello,

Here is a new wrinkle.... There are no problems reported with HA (i.e. nothing in the logs, nothing within VIC/VC), and both systems have been part of an HA cluster and running this way for weeks now. I go to power on a VM and I get Insufficient Resources....

There are not insufficient resources. I have systems that are barely doing anything and I get this. If I go into maintenance mode on one server I get the same thing. I restarted HA and now things seem to work. However, HA on Update 2 is basically toast. When will a fix be forthcoming? Can VMware not just undo whatever was changed from Update 1 to Update 2 so that HA actually works and is beneficial?


Best regards,

Edward L. Haletky

admin
Immortal

Can you enable verbose logging on the VC server and try powering on the VM again? Then check the VC server logs (vpxd-*.log) for "Slot info"; that will help diagnose the issue.
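
As a convenience, the log check above could be scripted. The following is only a sketch: the `find_slot_info` helper and the log directory passed to it are hypothetical, and only the `vpxd-*.log` file pattern and the "Slot info" search string come from this post. It makes no assumption about the format of the matching lines and simply prints them as found.

```python
from pathlib import Path


def find_slot_info(log_dir, pattern="vpxd-*.log", needle="Slot info"):
    """Return (file name, line number, line text) for every log line
    in log_dir that contains the search string."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has odd bytes.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if needle in line:
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits


# Example call; replace "." with the directory holding your VC server logs.
for name, lineno, text in find_slot_info("."):
    print(f"{name}:{lineno}: {text}")
```

Run it after reproducing the failed power-on with verbose logging enabled, so that the relevant entries are present in the most recent log file.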

Texiwill
Leadership

Hello,

The cause of this latest issue was that ftPerl was in a defunct state, which in effect isolated the node. With that resolved, I am now back to the original complaint.... If I can provide any assistance in furthering the patch, please let me know.

I am not sure why ftPerl went defunct. esxcfg-vswif also was in a defunct state and quite a few other things.....

Update 2 has some serious issues.......


Best regards,

Edward L. Haletky

Texiwill
Leadership

Hello,

I just tested this against the 9/17/2008 patches and this is not fixed yet. When will we see a solution?


Best regards,

Edward L. Haletky

jasonboche
Immortal

I also have a thread open here: http://communities.vmware.com/thread/168518 regarding VMs that will not evacuate a host that is placed into maintenance mode. This issue is very annoying and impacts us when we need to perform planned maintenance (e.g. applying the 9/17 VMware patches). I am unhappy that the fix was not rolled out in the 9/17 patches.

Jas





