VMware Cloud Community
danraleigh
Contributor

Timeout for putting a host into maintenance mode

I have a reboot script that I found online and modified for my purposes. The problem I am running into is that some hosts in our environment have VMs that will not migrate. In some cases it's due to a connected CD drive; in others it's due to lingering issues in our environment that we are still trying to resolve.

Anyway, here's the deal: the PowerCLI script does a rolling reboot of the hosts in a defined cluster.

Issue: it gets to a host that can't enter maintenance mode, for whatever reason.

Solution: here is where I need some help. Is there a way to set a timeout that waits XX minutes (say 60 or 120) for a host to enter maintenance mode and, if the host fails to enter maintenance mode in that time, cancels the current host and moves on to the next host in the cluster?

Here is a snippet of what I am working with. If possible, I would like to put some type of monitoring before the reboot to do what I mentioned above.

Function RebootESXiServer ($CurrentServer) {
    # Get server name
    $ServerName = $CurrentServer.Name

    # Put server in maintenance mode
    Write-Host "#### Rebooting $ServerName ####"
    Write-Host "Entering Maintenance Mode"
    Set-VMHost $CurrentServer -State Maintenance -Evacuate | Out-Null

    # Reboot server
    Write-Host "Rebooting"
    Restart-VMHost $CurrentServer -Confirm:$false | Out-Null

    # (remainder of the function not included in this snippet)
}


Accepted Solutions
LucD
Leadership

As a matter of fact, there is, just not via the Set-VMHost cmdlet.
If you look at the EnterMaintenanceMode method, you'll notice that it takes a timeout as one of the parameters you pass to the method.

It could look something like this

$esx = Get-VMHost -Name MyEsx
$spec = New-Object VMware.Vim.HostMaintenanceSpec

# Only needed when the node is part of a VSAN cluster
$spec.VsanMode = New-Object VMware.Vim.VsanHostDecommissionMode
$spec.VsanMode.ObjectAction = [VMware.Vim.VsanHostDecommissionModeObjectAction]::evacuateAllData
# End VSAN

$timeout = 60       # Timeout in seconds
$evacuatePoweredOffVMs = $false
$esx.ExtensionData.EnterMaintenanceMode($timeout,$evacuatePoweredOffVMs,$spec)
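
The question asked for a limit of 60 or 120 minutes, while the timeout argument is given in seconds. A minimal sketch of the conversion follows; the variable name $timeoutMinutes is only illustrative and does not come from the original script.

```powershell
# Hypothetical variable name: the desired limit in minutes (60 or 120 in the question).
$timeoutMinutes = 60

# EnterMaintenanceMode expects seconds, so convert before passing the value on.
$timeout = [int](New-TimeSpan -Minutes $timeoutMinutes).TotalSeconds
```

The resulting $timeout can be passed as the first argument of the call above. In the vSphere API, a timeout of zero or less means the call waits without a limit, so the value should stay positive if a limit is wanted.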


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

daphnissov
Immortal

This is PowerShell so best to move to PowerCLI.

danraleigh
Contributor

Do you have a suggestion on how to do it via PowerCLI?

daphnissov
Immortal

Please move message to PowerCLI sub-forum.

sjesse
Leadership

If you edit your post, you can place it in the PowerCLI section. You'll get a better response there, probably from the best in the business.

danraleigh
Contributor

Thanks, I will try to modify the current script in the lab and see if this works.

danraleigh
Contributor

LucD,

Thanks, that works for the timeout. However, it hangs when it then tries to go to the reboot step, since the server is not in maintenance mode. How do I add some error handling to skip to the next server if the current host times out? Thanks in advance for your help on this.


Here is the error

#### Rebooting labesxi01.labdomain.net ####
Entering Maintenance Mode
Exception calling "EnterMaintenanceMode" with "3" argument(s): "Operation timed out."
At C:\PowerCli\reboot-vmcluster3.ps1:93 char:1
+ $esx.ExtensionData.EnterMaintenanceMode($timeout,$evacuatePoweredOffV ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : VimException

Rebooting
Restart-VMHost : 11/24/2019 12:37:12 PM Restart-VMHost          You cannot perform this operation in the current state. Use
Force parameter to force reboot operation.
At C:\PowerCli\reboot-vmcluster3.ps1:100 char:1
+ Restart-VMHost $CurrentServer -confirm:$false | Out-Null
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Restart-VMHost], ViError
    + FullyQualifiedErrorId : Client20_ComputeResourceServiceImpl_RestartVmHost_ForceNotSpecified,VMware.VimAutomation
   .ViCore.Cmdlets.Commands.RestartVMHost

Below is what I am working with

Function RebootESXiServer ($CurrentServer) {
    # Get server name
    $ServerName = $CurrentServer.Name

    # Put server in maintenance mode
    Write-Host "#### Rebooting $ServerName ####"
    Write-Host "Entering Maintenance Mode"
    $esx = Get-VMHost -Name $CurrentServer
    $spec = New-Object VMware.Vim.HostMaintenanceSpec
    $timeout = 60       # Timeout in seconds
    $evacuatePoweredOffVMs = $false
    $esx.ExtensionData.EnterMaintenanceMode($timeout,$evacuatePoweredOffVMs,$spec)

    # Reboot server
    Write-Host "Rebooting"
    Restart-VMHost $CurrentServer -Confirm:$false | Out-Null

    # Wait for server to show as down
    do {
        sleep 15
        $ServerState = (Get-VMHost $ServerName).ConnectionState
    }
    while ($ServerState -ne "NotResponding")
    Write-Host "$ServerName is Down"

    # Wait for server to reboot
    do {
        sleep 120
        $ServerState = (Get-VMHost $ServerName).ConnectionState
        Write-Host "Waiting for Reboot ..."
    }
    while ($ServerState -ne "Maintenance")
    Write-Host "$ServerName is back up"

    # Exit maintenance mode
    Write-Host "Exiting Maintenance mode"
    Set-VMHost $CurrentServer -State Connected | Out-Null
    Write-Host "#### Reboot Complete####"
    Write-Host ""
}

Stop-Transcript

##################
## MAIN
##################

foreach ($ESXiServer in $ESXiServers) {
    RebootESXiServer ($ESXiServer)
}

LucD
Leadership

You could, before doing the reboot, check if the node is actually in maintenance mode.

If you do that in a loop, the script could just wait for maintenance mode.
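
One way to sketch such a loop so that it cannot wait forever is a small polling helper with a deadline. This is only an illustration: the function name Wait-ForCondition and its parameters are hypothetical and are not part of PowerCLI.

```powershell
# Hypothetical helper: polls a condition until it is true or a deadline passes.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 60,
        [int] $PollSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        # Succeed as soon as the condition holds.
        if (& $Condition) { return $true }
        # Give up once the deadline has passed.
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $PollSeconds
    }
}

# Possible use with PowerCLI (commented out so the sketch loads without a vCenter connection):
# $ok = Wait-ForCondition -TimeoutSeconds 3600 -PollSeconds 15 -Condition {
#     (Get-VMHost -Name $ServerName).ConnectionState -eq 'Maintenance'
# }
# if (-not $ok) { Write-Host "$ServerName did not enter maintenance mode, skipping" }
```

The helper returns $true or $false rather than throwing, so the caller can decide whether to reboot the host or move on to the next one.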


danraleigh
Contributor

I don't see how that would help. I would be back where I was, with the host stuck going into maintenance mode.

The solution you provided earlier worked perfectly as a timeout when the host was unable to enter maintenance mode. But I need some error checking to skip the reboot step if an error (the timeout) occurred while putting the host into maintenance mode.

LucD
Leadership

If the call to EnterMaintenanceMode comes back, there are 2 possibilities.

  1. The host is in maintenance mode, so continue with the reboot.
  2. The host is not in maintenance mode, so it seems the timeout happened. Do not reboot.

If you check the actual mode of the node after the call, you can use that to decide what to do, reboot or not.
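
The two cases above could be sketched roughly as follows. The function name Enter-MaintenanceOrSkip and its parameters are hypothetical. The sketch also assumes that passing $null as the maintenance spec is acceptable for a host outside a VSAN cluster; a spec built as shown earlier in the thread can be passed instead.

```powershell
# Hypothetical helper: returns $true only if the host really is in maintenance mode.
function Enter-MaintenanceOrSkip {
    param(
        [Parameter(Mandatory)] $VMHost,
        [int] $TimeoutSeconds = 60,
        $Spec = $null    # assumption: no maintenance spec needed outside VSAN
    )
    try {
        # On a timeout the call throws ("Operation timed out." in the output posted above).
        $null = $VMHost.ExtensionData.EnterMaintenanceMode($TimeoutSeconds, $false, $Spec)
    }
    catch {
        Write-Host "EnterMaintenanceMode failed for $($VMHost.Name): $($_.Exception.Message)"
    }
    # Re-query the host, since the object passed in does not refresh its own state.
    $state = (Get-VMHost -Name $VMHost.Name).ConnectionState
    return ("$state" -eq 'Maintenance')
}

# Possible use in the main loop (commented out so the sketch loads without PowerCLI):
# foreach ($ESXiServer in $ESXiServers) {
#     if (-not (Enter-MaintenanceOrSkip -VMHost $ESXiServer -TimeoutSeconds 3600)) { continue }
#     Restart-VMHost $ESXiServer -Confirm:$false | Out-Null
# }
```

Deciding on the re-queried state rather than on the exception alone covers both cases with a single check.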


danraleigh
Contributor

I see what you mean now, thanks

danraleigh
Contributor

LucD,

I am not sure if there is a cleaner way of doing this, but it seems to be working now, with the assistance of one of my co-workers. Thanks for pointing me in the right direction.

Function RebootESXiServer ($CurrentServer) {
    # Get server name
    $ServerName = $CurrentServer.Name

    # Put server in maintenance mode
    Write-Host "#### Rebooting $ServerName ####"
    Write-Host "Entering Maintenance Mode"
    $esx = Get-VMHost -Name $ServerName
    $spec = New-Object VMware.Vim.HostMaintenanceSpec
    $timeout = 30       # Timeout in seconds
    $evacuatePoweredOffVMs = $false
    try {
        $esx.ExtensionData.EnterMaintenanceMode($timeout,$evacuatePoweredOffVMs,$spec)
    }
    catch {
        # The call throws "Operation timed out." when the timeout expires
        Write-Host "EnterMaintenanceMode failed: $($_.Exception.Message)"
    }

    # When the server entered maintenance mode correctly, it happened too quickly and the
    # reboot step was not being executed. The sleep gives the server time to enter
    # maintenance mode fully.
    Write-Host "$ServerName - wait 30 seconds, then check the host status"
    Start-Sleep 30

    # Re-query the host; $CurrentServer is a snapshot and its ConnectionState does not refresh
    $ServerState = (Get-VMHost -Name $ServerName).ConnectionState
    Write-Host "$ServerName host status: $ServerState"

    # Reboot server only if it is in maintenance mode
    if ($ServerState -eq "Maintenance") {
        Write-Host "The host, $ServerName, is in maintenance mode, restarting"
        Restart-VMHost $CurrentServer -Confirm:$false | Out-Null
    }
    else {
        Write-Host "The host, $ServerName, did not enter maintenance mode, continue to next host"
        # return leaves the function, so the calling foreach moves on to the next host
        return
    }

    # (wait-for-reboot and exit-maintenance-mode steps not included in this snippet)
}


LucD
Leadership

If it works then it is good.

I personally would wait for the EnteredMaintenanceMode event.
Start fetching events for the ESXi node with a Start time set to the time you launched the EnterMaintenanceMode method.

Do this in a loop with a sleep of for example 5 seconds.

Then get the events for the ESXi node again, keeping the same Start time so that no event is missed.

Something like this

$gaveCommand = Get-Date
$esx.ExtensionData.EnterMaintenanceMode($timeout,$evacuatePoweredOffVMs,$spec)

# Keep the start of the window at the moment the command was given
$start = $gaveCommand
while (-not (Get-VIEvent -Entity $esx -Start $start | where{$_ -is [VMware.Vim.EnteredMaintenanceModeEvent]})){
  sleep 5
}
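
Since the host in this thread sometimes never enters maintenance mode, the event loop may be safer with a deadline of its own. A rough sketch follows; the function name Wait-ForMaintenanceEvent and its parameters are hypothetical. The event is matched by type name rather than with -is, which is an assumption made only so the sketch loads without the PowerCLI types; in a real session the -is test from the post works as well.

```powershell
# Hypothetical helper: waits for an EnteredMaintenanceModeEvent, but only up to a deadline.
function Wait-ForMaintenanceEvent {
    param(
        [Parameter(Mandatory)] $VMHost,
        [Parameter(Mandatory)] [datetime] $Since,
        [int] $TimeoutSeconds = 60,
        [int] $PollSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        # Look at every event since the command was given, so none is missed.
        $hit = Get-VIEvent -Entity $VMHost -Start $Since |
            Where-Object { $_.PSObject.TypeNames -contains 'VMware.Vim.EnteredMaintenanceModeEvent' }
        if ($hit) { return $true }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $PollSeconds
    }
}

# Possible use (commented out so the sketch loads without a vCenter connection):
# $entered = Wait-ForMaintenanceEvent -VMHost $esx -Since $gaveCommand -TimeoutSeconds 300
```

A $false result would then mean that no such event was seen in time, and the reboot can be skipped for that host.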

