Replying to:
karlg100
Contributor

I'm having the same issue.  Look at vpxd.log and grep for drmLogger; if you see faults, that could give you a clue as to why DRS says it can't put a host into maintenance mode.
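A quick way to do that check from the VCSA shell (on a 6.x appliance the log is typically /var/log/vmware/vpxd/vpxd.log — the sketch below runs against a sample file, and the drmLogger fault line in it is hypothetical, so match on whatever format your own log uses):

```shell
# On a real VCSA, point LOG at /var/log/vmware/vpxd/vpxd.log instead.
# The drmLogger line below is a hypothetical sample, not a real vpxd entry.
LOG=/tmp/vpxd-sample.log
cat > "$LOG" <<'EOF'
2018-01-17T18:45:05.274Z info vpxd[02100] [Originator@6876 sub=drmLogger] Fault on host esx01: InsufficientResources
2018-01-17T18:45:05.275Z info vpxd[02100] [Originator@6876 sub=vpxProfiler] unrelated entry
EOF

# Pull out the drmLogger lines, then narrow to faults:
grep -i 'drmLogger' "$LOG" | grep -i 'fault'
```

If this prints nothing, DRS isn't logging a reason at all, which matches the behavior described below.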

Some of my VMs I know have issues (stale backup flags), but even after clearing those out, there are sometimes "host affinity" issues.  Even after I clear those, hosts are still not entering maintenance mode, and there are no errors in vpxd.log explaining why DRS is unhappy with any of the hosts.

But my issue is that even after I clear all the faults, the DRS API still tells VUM that there are no hosts that can enter maintenance mode.

Here's a workaround that works well, especially if you have vSAN enabled:

-schedule a patch job for all the hosts you want to remediate

-if you're lucky, some of your systems might patch

-allow the job to get stuck (watch /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-log4cpp.log for errors like the ones below)

[2018-01-17 18:45:05:274 'VciTaskBase.VciClusterJobDispatcherTask{1111}' 140371731937024 INFO]  [vciClusterJobSchedulerTask, 1160] No of hosts recommended by DRS API : 0

[2018-01-17 18:45:05:274 'VciTaskBase.VciClusterJobDispatcherTask{1111}' 140371731937024 INFO]  [vciClusterJobSchedulerTask, 440] DRS API did not return any hosts

-go to a host that hasn't been patched (try the one with the fewest VMs first).  Go to its VMs tab and manually vMotion all the VMs off to a patched or other host

-after this, you should see in vmware-vum-server-log4cpp.log that DRS reports the host as ready to be patched, which will trigger VUM to put the host into maintenance mode, patch it, reboot it, and take it back out of maintenance mode

-when the host is back up, repeat for all remaining hosts

-if you don't have vSAN, you could do this to multiple hosts at a time, up to the maximum number of concurrent hosts you configured in your VUM job
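The two VUM log lines above are the signal for whether the job is stuck or DRS has released a host.  A minimal sketch of watching for that (run here against a sample file seeded with the exact lines from my log; on a real VCSA point LOG at the vmware-vum-server-log4cpp.log path given above):

```shell
# On a real VCSA:
#   LOG=/var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-log4cpp.log
# Using a sample file here, seeded with the lines quoted in this post.
LOG=/tmp/vum-sample.log
cat > "$LOG" <<'EOF'
[2018-01-17 18:45:05:274 'VciTaskBase.VciClusterJobDispatcherTask{1111}' 140371731937024 INFO]  [vciClusterJobSchedulerTask, 1160] No of hosts recommended by DRS API : 0
[2018-01-17 18:45:05:274 'VciTaskBase.VciClusterJobDispatcherTask{1111}' 140371731937024 INFO]  [vciClusterJobSchedulerTask, 440] DRS API did not return any hosts
EOF

# Grab the most recent host count DRS handed to VUM; 0 means the job is stuck.
count=$(grep -o 'recommended by DRS API : [0-9]*' "$LOG" | awk '{print $NF}' | tail -1)
if [ "$count" = "0" ]; then
  echo "stuck: DRS returned no hosts - evacuate a host manually"
else
  echo "DRS released $count host(s); VUM should proceed"
fi
```

Once it reports stuck, evacuate a host by hand per the steps above and keep watching: when the count goes non-zero, VUM will pick that host up.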

I have opened a ticket and they are looking into my issue.

Open a ticket with support as well so we get some priority behind fixing this.
