VMware Cloud Community
Anders_Gregerse
Hot Shot

Maintenance mode will not do vmotion (HA enabled cluster)

Hi

I have a cluster that consists of 6 hosts where manual VMotion works fine, but when I put a host into maintenance mode, it just doesn't VMotion the VMs to the other hosts in the cluster. This problem started after we had some major network problems, and every host has been rebooted after the network problem was solved. I've done vmkping and looked in different logs without finding any clues. Any idea on how to fix this (so that I can update to Update 2 without putting too much manual work into it)?

Anders

30 Replies
Anders_Gregerse
Hot Shot

My support guy has forwarded it to engineering, but currently no published solution exists. He agreed, as did other VMware engineers, that the capacity calculation does not seem to be correct in my case (this doesn't mean that VMware acknowledges that it is a bug).

dmgenesys
Contributor

Same problem here, and it only started with the 3.5 U2 update. About those calculations, though: I have 3-node clusters. Allowed failover is 1 node, and VC reports capacity as 2 nodes before maintenance. Then I took a look at the logs on VC: right before maintenance mode, it reports twice as many VM slots available as are required. So how can anyone say this is not a bug? :)

P.S. Update 2 definitely has some fixes and new features that we all longed for. But in the past week I have also seen some bugs that, had I known about them before, would definitely have stopped me from upgrading.

Anders_Gregerse
Hot Shot

I got inspired by what you wrote and checked my reservations. I had some important servers with 3 GB of memory reserved each (they are using 100% of it). If that reservation is removed, I suddenly have a failover capacity of 2 hosts. There is for sure something wrong with the HA calculation.
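
For anyone wondering why a few reservations can swing the reported failover capacity this much, here is a rough Python sketch of a slot-style calculation. It is only an illustration of the general idea (slot size driven by the largest reservation, one slot per powered-on VM), not VirtualCenter's actual code: it ignores CPU and memory overhead, the 256 MB default slot is an assumption, and the host sizes and VM counts are made-up numbers rather than anyone's real cluster.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    mem_mb: int  # memory usable by VMs on this host (hypothetical figure)


def slot_size_mb(reservations_mb: List[int], default_mb: int = 256) -> int:
    """Slot size is driven by the largest single reservation.

    default_mb is an assumed floor used when no VM has a reservation.
    """
    return max([default_mb] + list(reservations_mb))


def failover_capacity(hosts: List[Host], reservations_mb: List[int],
                      default_mb: int = 256) -> int:
    """Number of host failures tolerated while every VM still gets a slot."""
    slot = slot_size_mb(reservations_mb, default_mb)
    # Slots each host can hold, largest hosts first (worst case: they fail first).
    slots_per_host = sorted((h.mem_mb // slot for h in hosts), reverse=True)
    needed = len(reservations_mb)  # one slot per powered-on VM
    capacity = 0
    for failed in range(1, len(hosts)):
        remaining = sum(slots_per_host[failed:])
        if remaining < needed:
            break
        capacity = failed
    return capacity


# Made-up example: 6 hosts with 16 GB each, 40 VMs, 4 of them reserving 3 GB.
hosts = [Host(f"esx{i}", 16384) for i in range(1, 7)]
with_reservations = [3072] * 4 + [0] * 36
without_reservations = [0] * 40

print(failover_capacity(hosts, with_reservations))     # 0
print(failover_capacity(hosts, without_reservations))  # 5
```

In this toy model a handful of 3 GB reservations inflates the slot size for every VM, so the cluster looks full even though most VMs reserve nothing. That is at least consistent with the capacity jumping once the reservations are removed.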

bretti
Expert

This is really annoying.

Another option for a workaround is to select all the VMs on the host, right-click, and select Migrate. Instead of choosing a specific host, choose your cluster; DRS will figure out which hosts to put them on and the VMotions will start. The only issue is if you are using resource pools in your cluster: then it gets messy trying to match each VM to its pool during a VMotion.

tomaddox
Enthusiast

An update on my experience with this issue: I have not upgraded to ESX 3.5 U2, but I have upgraded Virtual Center 2.5 to U2, and now I'm having the problem. Apparently, the issue is in the Virtual Center code, not the ESX code.

Anders_Gregerse
Hot Shot

The problem is without question in VirtualCenter, where the calculation of available resources is unfavourable if you're using reservations and might result in zero failover capacity. I guess all we can do is wait for a solution from VMware and avoid reservations. Those who experience the problem should open a service request with VMware so that the scope of the problem is known to them.

tomaddox
Enthusiast

The annoying thing is that the calculation problem has been very well-known and has existed since 3.0 and gone unresolved. Since it was easy to work around for HA, I didn't worry about it. Now, some bright mind at VMware has made the erroneous calculation a dependency for VMotion, which is bad enough, and in particular a dependency for maintenance mode, which just seems stupid. If I’m putting a host into maintenance mode, I expect guests to be moved unconditionally, barring some real technical problem. This kind of cluelessness/carelessness seems unlike VMware, but it does seem like EMC, so I hope it doesn’t forecast the future.

Done ranting.

dmgenesys
Contributor

An interesting thing: after an upgrade to U2 and an installation of the express patch, I decided to reinstall U2 from scratch as a clean install. Once I finished the lab install (the same lab installation that exhibited the behaviour in the subject), I could VMotion with no problem, and there were no issues with the HA calculations. I haven't changed the number of running VMs or the reservations. The difference between the express patch and the clean install is in the build numbers: express is 3.5.0.110181, clean is 3.5.0.110268. Go figure...

tomaddox
Enthusiast

Is that the build # for ESX Server or for VC? Either way, your build numbers are significantly different from mine.

dmgenesys
Contributor

Those are the ESX builds. After the U2 upgrade to VC, no further patches were applied; the build there is 2.5.0.104215.

Jonesie
Contributor

I'm getting this in the latest build of vCenter 2.5 (Build 174768). Was there ever a fix for this?

Thanks - David
