Re: Esxcli "Connection error"

StephenMoll · ‎03-13-2020

I have been running a script in "local.sh" on some of our hosts that does several things, amongst which is a task to bring a host out of maintenance mode.

We have been using this script for some years, starting at a time when we were using vSphere 6.5.

We are now on 6.7u2 and on one of our systems the script has started playing up.

Especially this bit:

MaintenanceModeStatus=$(esxcli system maintenanceMode get)
case $MaintenanceModeStatus in
    Enabled)
        logger -s "AUTO-START : Exiting Maintenance Mode"
        vim-cmd hostsvc/maintenance_mode_exit
        if [ $? -ne 0 ]
        then
            logger -s "AUTO-START : Maintenance Mode exit failed"
        fi
        ;;
    Disabled)
        logger -s "AUTO-START : Already out of Maintenance Mode"
        ;;
    *)
        logger -s "AUTO-START : Invalid MaintenanceMode status - $MaintenanceModeStatus"
esac

We have started seeing log entries for an "Invalid MaintenanceMode status" appearing for both hosts that use the script on one of the systems.

$MaintenanceModeStatus is coming back as "Connection error".

Putting a sleep delay before this section of the script seems to help, but we would like to understand why.

When "local.sh" is run, is it possibly the case that not all services in VCSA are fully ready, and that calls to get information may come back with invalid, empty or unexpected values?

NathanosBlightc · ‎03-13-2020

Hi

Try the localcli command instead of esxcli and report back the result.

Please mark my comment as the Correct Answer if this solution resolved your problem

StephenMoll · ‎03-14-2020

I might try that.

I am also going to propose that we put a test script in local.sh that makes lots of maintenance mode status requests in a tight loop and logs the results. That should let us baseline the readiness time of the host, and then either add a fixed delay or change the request to loop until the response is "Enabled" or "Disabled", as we expect. In the latter case, though, I am not sure whether it would be prudent to put a limit on the loop, i.e. would we need to cover off the possibility that we never get the expected responses?
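A minimal sketch of the bounded loop described above, written for the POSIX-style shell that local.sh runs under. The names MM_STATUS_CMD, MM_MAX_TRIES, MM_DELAY and wait_for_mm_status are hypothetical, and the default limits are placeholders rather than measured values.

```shell
#!/bin/sh
# Sketch only: poll the maintenance mode status until it reads "Enabled" or
# "Disabled", and give up after a fixed number of attempts.
# MM_STATUS_CMD, MM_MAX_TRIES and MM_DELAY are hypothetical names; the default
# values are guesses, not figures taken from a real host.
MM_STATUS_CMD=${MM_STATUS_CMD:-"esxcli system maintenanceMode get"}
MM_MAX_TRIES=${MM_MAX_TRIES:-30}
MM_DELAY=${MM_DELAY:-2}

wait_for_mm_status() {
    mm_try=1
    while [ "$mm_try" -le "$MM_MAX_TRIES" ]; do
        # The last response is left in MaintenanceModeStatus for the caller.
        MaintenanceModeStatus=$($MM_STATUS_CMD)
        case $MaintenanceModeStatus in
            Enabled|Disabled)
                return 0
                ;;
        esac
        logger -s "AUTO-START : attempt $mm_try returned '$MaintenanceModeStatus', retrying"
        mm_try=$((mm_try + 1))
        sleep "$MM_DELAY"
    done
    # The limit was reached without ever seeing an expected response.
    return 1
}
```

Calling wait_for_mm_status ahead of the existing case statement would leave the last response in $MaintenanceModeStatus, so the "*)" branch still reports the failure if the limit is hit. The logged attempt numbers, multiplied by the delay, also give a rough readiness baseline.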

I suppose in reality we could remove the check altogether, and simply request a maintenance mode exit whenever the script runs. That assumes that the request can't suffer the same "Connection error" of course.
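As a rough sketch of that alternative, the exit request itself could be retried a limited number of times. MM_EXIT_CMD, MM_EXIT_TRIES, MM_EXIT_DELAY and exit_maintenance_mode are hypothetical names, and the sketch assumes that vim-cmd returns a non-zero status when the request fails, as the original script already does. How vim-cmd behaves when the host is already out of maintenance mode is not established here and would need checking.

```shell
#!/bin/sh
# Sketch only: request a maintenance mode exit without checking the status
# first, retrying a limited number of times if the request fails.
# MM_EXIT_CMD, MM_EXIT_TRIES and MM_EXIT_DELAY are hypothetical names with
# placeholder defaults.
MM_EXIT_CMD=${MM_EXIT_CMD:-"vim-cmd hostsvc/maintenance_mode_exit"}
MM_EXIT_TRIES=${MM_EXIT_TRIES:-5}
MM_EXIT_DELAY=${MM_EXIT_DELAY:-5}

exit_maintenance_mode() {
    mm_exit_try=1
    while [ "$mm_exit_try" -le "$MM_EXIT_TRIES" ]; do
        if $MM_EXIT_CMD; then
            logger -s "AUTO-START : Maintenance Mode exit requested (attempt $mm_exit_try)"
            return 0
        fi
        logger -s "AUTO-START : Maintenance Mode exit failed (attempt $mm_exit_try)"
        mm_exit_try=$((mm_exit_try + 1))
        sleep "$MM_EXIT_DELAY"
    done
    return 1
}
```

The retry limit keeps local.sh from hanging forever if the request can hit the same "Connection error".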

NathanosBlightc · ‎03-14-2020

"would we need to cover off the possibility that we don't ever get the expected responses?"

Maybe somehow ...

Please test two ways:

1. Try this instead of esxcli: vimsh -n -e /hostsvc/maintenance_mode_exit (vim-cmd has only an enter option, not an exit one)

2. Run other esxcli / localcli system commands (instead of maintenanceMode) while the host is in maintenance mode and check whether they work
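One way to run the second test is a small helper that executes a command and logs its exit code and output, so esxcli and localcli results can be compared in the log. This is only a sketch: probe_cmd is a hypothetical name, and the commands to probe are left to whoever runs it.

```shell
#!/bin/sh
# Sketch only: run one command, capture its output and exit code, and log both.
# probe_cmd is a hypothetical helper name, not part of ESXi.
probe_cmd() {
    probe_out=$("$@" 2>&1)
    probe_rc=$?
    logger -s "AUTO-START : probe '$*' rc=$probe_rc output='$probe_out'"
    return "$probe_rc"
}
```

For example, "probe_cmd esxcli system maintenanceMode get" followed by "probe_cmd localcli system maintenanceMode get" would log both results side by side; any other esxcli / localcli system command can be passed the same way.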

StephenMoll · ‎03-14-2020

Not really keen to go the "localcli" route. But will try it if permitted to do so.

I suspect at the moment that the issue I am seeing is simply to do with timing. The script is trying to do things before the host is really ready. All this worked flawlessly under ESXi 6.5 for nearly two years. ESXi 6.7 definitely behaves differently in subtle ways.

Finally, vim-cmd most definitely does have a maintenance mode exit command; I use it quite often:

Copied from SSH session just now:

[root@MollESXi:~] vim-cmd hostsvc/
Commands available under hostsvc/:
advopt/              enable_ssh                 refresh_firewall
autostartmanager/    firewall_disable_ruleset   refresh_services
datastore/           firewall_enable_ruleset    reset_service
datastorebrowser/    get_service_status         runtimeinfo
firmware/            hostconfig                 set_hostid
net/                 hosthardware               standby_mode_enter
rsrc/                hostsummary                standby_mode_exit
storage/             login                      start_esx_shell
summary/             logout                     start_service
vmotion/             maintenance_mode_enter     start_ssh
connect              maintenance_mode_exit      stop_esx_shell
cpuinfo              pci_add                    stop_service
disable_esx_shell    pci_remove                 stop_ssh
disable_ssh          queryconnectioninfo        task_list
enable_esx_shell     querydisabledmethods       updateSSLThumbprintsInfo