VMware Cloud Community
bayupw
Leadership

ESXi 6.5a in a vSAN cluster cannot exit maintenance mode

Hello

I have a lab running vSphere 6.5 (ESXi and vCenter) on a 3-node vSAN cluster.

My problem is that I can't get one of the hosts to exit maintenance mode.

I've tried rebooting the host, but no luck.

The task gets stuck at 15% and then fails with the error: "A general system error occurred: HTTP error response: Method Not Allowed"

I tried exiting maintenance mode from vCenter, the Host Client, and the SSH console, but I get the same error each time.

Any idea where I should look for errors, or is this perhaps a known issue?

[Three screenshots attached: pastedImage_1.png, pastedImage_0.png, pastedImage_2.png]

Bayu Wibowo | VCIX6-DCV/NV
Author of VMware NSX Cookbook http://bit.ly/NSXCookbook
https://github.com/bayupw/PowerNSX-Scripts
https://nz.linkedin.com/in/bayupw | twitter @bayupw

Accepted Solutions
TheBobkin
Champion

Hello Bayu,

Have you checked the vSAN Maintenance Mode (MM) status of this host?

This is independent of the vCenter/host-level maintenance mode.

You should be able to see it using cmmds-tool, specifying DECOM_STATE against the hosts.

(Apologies, I am not at work right now and cannot access my notes for the specific cmmds-tool commands. If you see state 6, the host is in vSAN MM; state 4 means it is in the process of entering it, e.g. waiting on resync/vMotion.)

If this is a lab with no super-important data residing on it, and/or the components are healthy (state 7), then you could try leaving the vSAN cluster on this host, then seeing whether you can exit MM, and then joining it back. Dropping the host out of the vCenter cluster may also help.
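For reference, the leave / exit MM / rejoin sequence could be sketched roughly as below. This is only a sketch under assumptions: the function name build_rejoin_plan and the placeholder UUID are made up for illustration, and the esxcli commands are the ones I would expect to use on 6.5 (check each with --help on your build first). It only builds the list of commands and runs nothing. You would need to note the Sub-Cluster UUID from "esxcli vsan cluster get" before leaving the cluster.

```python
def build_rejoin_plan(cluster_uuid):
    """Return the ESXi shell commands, in order, as argv lists.

    Nothing is executed here; the list is meant to be reviewed first.
    cluster_uuid is the Sub-Cluster UUID noted before leaving the cluster.
    """
    if not cluster_uuid:
        raise ValueError("record the Sub-Cluster UUID before leaving the cluster")
    return [
        # 1. Take this host out of the vSAN cluster.
        ["esxcli", "vsan", "cluster", "leave"],
        # 2. Try to exit host-level maintenance mode.
        ["esxcli", "system", "maintenanceMode", "set", "--enable", "false"],
        # 3. Join the host back to the same vSAN cluster.
        ["esxcli", "vsan", "cluster", "join", "-u", cluster_uuid],
    ]


if __name__ == "__main__":
    # Placeholder UUID for illustration only; use the value from your cluster.
    for argv in build_rejoin_plan("00000000-0000-0000-0000-000000000000"):
        print(" ".join(argv))
```

Printing the plan first lets you sanity-check each step before typing it on the host.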

Let me know how this works out. I will update with the vSAN MM cmmds-tool commands tomorrow when I have access.

Bob

-o- If you found this comment useful or it answered your question, please select it as 'Answer' and/or click the 'Helpful' button. Please ask follow-up questions if you have any. -o-


4 Replies
bayupw
Leadership

Thanks Paul/Bob (not sure how I should address you).

I tried exiting maintenance mode both before vCenter was powered on and after it was on; I don't know if this makes any difference.

I'll check the vSAN MM state with cmmds-tool and probably remove the host from the cluster, if possible, and add it back.

TheBobkin
Champion

You can check the vSAN MM state of all hosts in a vSAN cluster by running this from any node in the cluster:

# cmmds-tool find -t NODE_DECOM_STATE -f json

As I said previously, 0 = not in vSAN MM, 4 = entering vSAN MM, and 6 = in vSAN MM.
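If the JSON output is long, a small script can summarise it per node. The sketch below is an assumption-laden illustration, not official tooling: it assumes the output has an "entries" list whose items carry "uuid" and a "content" object with a "decomState" number, and the SAMPLE data (including the node names) is made up. Only the three state meanings given in this thread are mapped; anything else is reported as unknown.

```python
import json

# State meanings as described in this thread; other values map to "unknown".
DECOM_STATES = {0: "not in vSAN MM", 4: "entering vSAN MM", 6: "in vSAN MM"}


def summarize_decom_states(raw_json):
    """Return {node_uuid: (state, label)} parsed from the JSON text.

    Assumes an "entries" list with "uuid" and "content.decomState" fields.
    """
    data = json.loads(raw_json)
    summary = {}
    for entry in data.get("entries", []):
        state = entry.get("content", {}).get("decomState")
        summary[entry.get("uuid")] = (state, DECOM_STATES.get(state, "unknown"))
    return summary


# Hypothetical sample, shaped like the assumed output; node names are made up.
SAMPLE = """
{"entries": [
  {"uuid": "host-a", "type": "NODE_DECOM_STATE", "content": {"decomState": 0}},
  {"uuid": "host-b", "type": "NODE_DECOM_STATE", "content": {"decomState": 6}}
]}
"""

if __name__ == "__main__":
    for uuid, (state, label) in summarize_decom_states(SAMPLE).items():
        print(uuid, state, label)
```

To use it against a real host, you would save the command's output to a file and pass the file contents to summarize_decom_states instead of SAMPLE.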

Looking at this again, though, I get the feeling the problem is not at the vSAN level.

Maybe check the vSAN MM state as mentioned above (if it is 0, this is not a mismatch case), then try leaving the cluster and attempting to exit MM. Between them, these two checks will likely rule out vSAN.

Any chance you could attach vmkernel.log, vobd.log and hostd.log from this host, covering a time when this was attempted, for more potential clues?
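To trim the logs down to the time of the attempt before attaching them, something along these lines may help. It is a minimal sketch that assumes each log line starts with an ISO 8601 UTC timestamp (e.g. 2017-02-20T10:00:05.000Z); the function name lines_in_window and the sample lines are hypothetical. Lines without a parsable leading timestamp (such as continuation lines) are skipped.

```python
from datetime import datetime


def lines_in_window(lines, start, end):
    """Yield log lines whose leading timestamp falls within [start, end]."""
    for line in lines:
        stamp = line[:19]  # "YYYY-MM-DDTHH:MM:SS", fractional seconds ignored
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            continue  # no timestamp at the start of this line
        if start <= when <= end:
            yield line


# Made-up sample lines for illustration only.
SAMPLE_LINES = [
    "2017-02-20T09:59:58.120Z sample message A",
    "2017-02-20T10:00:05.000Z sample message B",
    "   continuation line without a timestamp",
    "2017-02-20T10:05:00.500Z sample message C",
]

if __name__ == "__main__":
    window_start = datetime(2017, 2, 20, 10, 0, 0)
    window_end = datetime(2017, 2, 20, 10, 1, 0)
    for line in lines_in_window(SAMPLE_LINES, window_start, window_end):
        print(line)
```

For a real log you would pass an open file object instead of SAMPLE_LINES and set the window around the time the exit MM task was started.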

Bob


TheBobkin
Champion

Hello Bayu,

What's that Install Agent task that failed before the enter-MM task, after which the host failed to leave MM?

If none of the vSAN tests/options advised pan out, you could always re-install ESXi on this host (vSAN won't care provided you don't touch its disks XD).

After the re-install, it is just a case of adding a vmk interface, tagging it for vSAN traffic, and adding the host back to the cluster:

https://kb.vmware.com/kb/2059091

Bob
