VMware Cloud Community
emmar
Hot Shot

Maintenance Mode is not moving VMs off ESX host, so failing

We have an ESX 3.5 and VC 2.5 install.

VMotion is functioning fine and when DRS is set to Auto the VMs are migrated off when put into Maintenance mode.

But when we set DRS to partially automated, manual, or disabled, maintenance mode does not attempt to VMotion the VMs off and gets stuck at 2% until we cancel it or manually VMotion the VMs off.

We even get a message along the lines of: "There are virtual machines on this host; it will not be put into maintenance mode until they are migrated off or shut down."

So because of this, things like Update Manager are not working for us at the moment, as it can't migrate the VMs off.

Can anyone think of any reasons?

Thanks,

E

Reply
0 Kudos
41 Replies
Erik_Zandboer
Expert

Hi Edouard,

It depends solely on the cluster settings. To see whether any rules have been applied, right-click your cluster and choose "Edit Settings", then click "Rules" under DRS. If nothing is listed, no rules have been applied.
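For anyone who prefers to check from the VI Toolkit / PowerCLI, a minimal untested sketch of the same check (the VirtualCenter and cluster names are placeholders):

# Untested sketch: list any DRS affinity/anti-affinity rules defined on the cluster.
# An empty result means no rules have been applied.
Connect-VIServer -Server "vcserver"      # placeholder VirtualCenter name
Get-Cluster "Cluster01" | Get-DrsRule    # placeholder cluster name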

Visit my blog at http://www.vmdamentals.com
Reply
0 Kudos
dina_mark
Contributor

We see this behavior when the VMs are connected to client-based CD/DVD drives (rather than to ISOs in the datastore). Unfortunately, it has been our experience that once the Enter Maintenance Mode task is initiated, Edit Settings is not available on the offending VM. The last time it happened we had to gracefully shut down the VM, which was then evacuated. Just our $0.02. Cheers!

Reply
0 Kudos
sbeaver
Leadership

You know you can cancel a maintenance mode task? I have been caught by this also and had to cancel the task to clear up the problem. I have not tested this, but I bet that with PowerShell and the scripts available it might work to disconnect the CD drives that way.
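Something along these lines might do it with the VI Toolkit / PowerCLI (an untested sketch; the VirtualCenter and host names are placeholders):

# Untested sketch: disconnect any connected CD/DVD drives from every VM on the
# host, so maintenance mode is not blocked by client-attached devices.
Connect-VIServer -Server "vcserver"                 # placeholder VirtualCenter name
Get-VMHost "esx01.example.com" | Get-VM |
    Get-CDDrive |
    Where-Object { $_.ConnectionState.Connected } |
    Set-CDDrive -NoMedia -Confirm:$false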

Steve Beaver
VMware Communities User Moderator
VMware vExpert 2009 - 2020
VMware NSX vExpert - 2019 - 2020
====
Co-Author of "VMware ESX Essentials in the Virtual Data Center"
(ISBN:1420070274) from Auerbach
Come check out my blog: http://www.virtualizationpractice.com/blog/
Come follow me on twitter http://www.twitter.com/sbeaver

**The Cloud is a journey, not a project.**
Reply
0 Kudos
AntonVZhbankov
Immortal

I had the same problem with ESX 3.5, but ESX 3.5 Update 1 can now VMotion all the live VMs; the only thing you need is DRS.

Even in manual mode, DRS generates suggestions to migrate VMs off the host.

EMCCAe, HPE ASE, MCITP: SA+VA, VCP 3/4/5, VMware vExpert XO (14 stars)
VMUG Russia Leader
http://t.me/beerpanda
Reply
0 Kudos
ibewhoiam
Contributor

Found the solution for me:

- It was all about the HA 'admission control' setting. Previously I used the 'do not power on virtual machines if they violate availability constraints' setting, which blocked maintenance-mode migrations. I had to change it to 'allow virtual machines to be powered on even if they violate availability constraints'.

Found this in a different thread:

http://communities.vmware.com/thread/117505

Hope this helps everyone :):D
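If you'd rather flip that setting from the VI Toolkit / PowerCLI than from the client, a minimal untested sketch (the VirtualCenter and cluster names are placeholders):

# Untested sketch: disable strict HA admission control, i.e. allow VMs to power
# on even if they violate availability constraints.
Connect-VIServer -Server "vcserver"      # placeholder VirtualCenter name
Get-Cluster "ProdCluster" | Set-Cluster -HAAdmissionControlEnabled:$false -Confirm:$false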

Reply
0 Kudos
BorisKul
Contributor

The solution doesn't work after installing Update 2.

Reply
0 Kudos
Erik_Zandboer
Expert

Hi BorisKul,

Could you specify what your problem is exactly, and which solution is not working for you?

Visit my blog at http://www.vmdamentals.com
Reply
0 Kudos
BorisKul
Contributor

Virtual machines do not migrate to the other host when there are only two hosts in the cluster, even after changing the HA settings, now that Update 2 is installed. Switching off HA altogether does help.

Reply
0 Kudos
fede_pg
Contributor

Hello, I've got the same issue described in the first message.

I want to point out that migrating VMs manually between the hosts works like a charm.

My scenario is:

- two ESX server 3.5 u2, fresh install

- A VC Server, 2.5 u2

- Cluster with HA + DRS enabled

- Shared Storage (EVA 4400 SAN)

- Same network configuration

- DRS SET TO FULLY AUTOMATED

- NO DRS RULES SET

- ADMISSION CONTROL SET TO "ALLOW VMs TO BE POWERED ON EVEN IF THEY VIOLATE AVAILABILITY CONSTRAINTS"

- VMs do not have any local CD-ROM, floppy, network ...

- NAME RESOLUTION IS OK

Any suggestions? I can't find a valid reason for this behavior ... it's the first time I've seen a cluster refuse to automatically VMotion VMs!

Trying to put a host in Maintenance Mode times out with the errors you see in the attachment.

Thanks in advance.

Reply
0 Kudos
BorisKul
Contributor

Virtual machines do not migrate to the other host when there are only two hosts in the cluster, even after changing the HA settings, now that Update 2 is installed. Switching off HA altogether does help.

Sorry for my broken English.

Boris.

Reply
0 Kudos
fede_pg
Contributor

Confirmed: turning off HA in the cluster fixes the problem, but I hope VMware will quickly provide us with a proper solution, because I'd like to have BOTH HA and DRS enabled. :)

Reply
0 Kudos
BenConrad
Expert

I saw this in the 3.5 U2 release notes and thought it was interesting:

Virtual Machine Migrations Are Not Recommended When the ESX Server Host Is Entering the Maintenance or Standby Mode

No virtual machine migrations will be recommended (or performed, in fully automated mode) off of a host entering maintenance or standby mode, if the VMware HA failover level would be violated after the host enters the requested mode. This restriction applies whether strict HA admission control is enabled or not.

Adding to this complexity is how VC 2.5.x figures out HA resources:

http://virtualgeek.typepad.com/virtual_geek/2008/06/so-how-exactly.html

Ben

Reply
0 Kudos
Erik_Zandboer
Expert
Expert

Indeed, it all makes sense now. If you have a two-node cluster, putting one host into maintenance mode will always violate the failover level of one host. So I think that disabling HA before you go into maintenance mode is the only way to get a two-node cluster to migrate VMs on a maintenance mode request...
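If anyone wants to script that workaround, a rough VI Toolkit / PowerCLI sketch (untested; the server, cluster, and host names are placeholders):

# Untested sketch of the two-node workaround: disable HA, put the host into
# maintenance mode (DRS evacuates the running VMs), then re-enable HA afterwards.
Connect-VIServer -Server "vcserver"                          # placeholder VirtualCenter name
Get-Cluster "TwoNodeCluster" | Set-Cluster -HAEnabled:$false -Confirm:$false
Get-VMHost "esx01.example.com" | Set-VMHost -State Maintenance -Evacuate
# ... patch the host, exit maintenance mode ...
Get-Cluster "TwoNodeCluster" | Set-Cluster -HAEnabled:$true -Confirm:$false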

Visit my blog at http://www.vmdamentals.com
Reply
0 Kudos
minerat
Enthusiast

Is anyone else ticked off by the maintenance mode change? You'd think that if I don't care about strict admission control, it should just work. Having this option is especially important given the completely over-the-top conservative HA calculations in 3.x. I removed two modest reservations (1 GHz CPU, 1 GB RAM) from two VMs and my failover capacity went from 0 hosts to 2 hosts in a 4-host cluster.

I knew pre-Update 2 that it was overly conservative (maintenance mode never pushed resource usage above 80% when I dropped a host for updates), but after reading that article I'm flabbergasted. How is setting the slot size as large as the MAX CPU reservation * vCPUs + MAX memory reservation a decent estimate of the worst case? One VM with a reservation can throw off the HA calculations for a multi-host cluster, dropping the number of slots in half. There's no way in hell that a single reservation has that kind of impact on the true load capacity of the cluster.

Why not determine the minimum (most likely an unreserved VM) and then calculate the slot size for reserved VMs as multiples of that minimum? Sure, it adds a little complication to ensure that enough contiguous slots (slots on the same host) are available for your "4-slot" VM, but isn't that far more indicative of the worst case than assuming all of your VMs need the same amount of resources you've specified for what may be a single reservation on a single VM?
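To put rough, purely made-up numbers on that: suppose each host in the cluster has 16 GHz of CPU and 32 GB of RAM and no VM has a reservation, so the slot size stays at the small defaults and each host advertises dozens of slots. Now give a single 2-vCPU VM a 1 GHz CPU reservation and a 4 GB memory reservation. By the formula described above, the slot grows to 2 GHz / 4 GB, and a host suddenly advertises only min(16 GHz / 2 GHz, 32 GB / 4 GB) = 8 slots. One reservation on one VM has slashed the reported capacity of every host, and therefore of the whole cluster, which is exactly the behavior being complained about here.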

Reply
0 Kudos
fabian_bader
Enthusiast

Thank you for your message.

I am currently out of the office.

I will be reachable again from Tuesday, 02.09.2008.

Your mail will not be forwarded!

In urgent cases, please contact my colleagues (mailto:rz@deutschebkk.de).

Kind regards

Fabian Bader

IT und Datenmanagement

ITD Rechenzentrum

Deutsche BKK Stuttgart

Deutsche BKK

38439 Wolfsburg

Phone: (0711) 8913 - 616

Fax: (0711) 135358 - 616

http://www.deutschebkk.de

mailto:fabian.bader@deutschebkk.de

+++ Attractive offers for your health and well-being at www.gesundheitswelt-direkt.de +++

Reply
0 Kudos
abaum
Hot Shot

I am not sure that was the answer. I've had the same problem for months now and I am running U1; I can't even do a regular VMotion. What I have found is that if I select "low priority" when doing a VMotion, it works fine. Some folks found a mention in the VMware KB that certain Broadcom chipsets have issues; based on this, some moved from Broadcom NICs to Intel NICs and the problem went away.

adam

Reply
0 Kudos
allencrawford
Enthusiast

I ran into this problem today, but the reason my hosts would not go into maintenance mode was that the NX bit was enabled on one of the two hosts and disabled on the other. Of course, if you've tried a manual VMotion already, you'd see the same error I did:

Unable to migrate from hostA to hostB: Host CPU is incompatible with the virtual machine's requirements at CPUID level 0x80000001 register 'edx'.

blah blah

blah blah

Mismatch detected for these features:

  • NX/XD (data execution prevention). If the virtual machine...

Unfortunately, we only have two hosts in this particular cluster, so we'll need to incur some VM downtime to reboot the host and change the setting in the BIOS to our standard.

Reply
0 Kudos
rgv75
Enthusiast

This is a known bug in U2, and it was fixed in VC 2.5 U3 and ESX 3.5 U3, released on 11/6/2008. I haven't tried installing U3, but am in the process of downloading it now.

Reply
0 Kudos
dslatkin
Contributor

Has anyone been able to confirm that updating to U3 for ESX and VC will resolve this issue?

Reply
0 Kudos
AndyMcM
Enthusiast

Just to reply to this issue that lots of people were having: I'm currently setting up a 2-node cluster based on U3.

I can confirm that the issue is now fixed.

By default it will still happen; you have to change the HA cluster option to "Allow VMs to be powered on even if they violate availability constraints," after which things just work great.

A.

Reply
0 Kudos