VMware Cloud Community
aj800
Enthusiast

VM Web Console / Remote Console stopped working

We recently noticed that on a few of our hosts, the VM web and remote consoles stopped working.  In the web console that opens in the browser when you click the link on the VM's Summary page, it shows "The console has been disconnected".

Some additional context & details:

In this environment, we recently renewed the ESXi hosts' SSL/TLS certificates, and we are using custom certs issued to us by our CA.  When we set up this environment, and a matching/mirror one, we installed the first 3 hosts in both, then added several more later.  It's those first 3 hosts that are experiencing this issue, and it appears that the certs are valid and uniform across all of them (not just the 3 in each, but ALL hosts except for one not yet renewed) in this environment, so I can't tell if this is a certificate issue or some configuration issue we missed that has come back to haunt us.

In the mirrored environment, as noted, I made a mistake applying the new certificate to one host, so it hasn't been renewed yet while we wait for a new certificate to be issued; that host is currently using the self-signed VMware certificate it defaulted to.  However, console access DOES work on this host, which leads me to believe that this issue may be certificate-related, but I'm unsure.
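One way to check the "valid and uniform" impression from the outside is to compare the certificate each host actually presents on port 443. Below is a minimal Python sketch using only the standard library. The function names are made up for illustration, and SHA-256 is an arbitrary choice here (just a convenient way to tell two certificates apart), not something taken from this environment.

```python
import hashlib
import ssl


def fingerprint_from_pem(pem: str) -> str:
    """Return the SHA-256 hex digest of a PEM-encoded certificate."""
    # Convert the PEM text to raw DER bytes, then hash those bytes.
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


def fetch_fingerprint(host: str, port: int = 443) -> str:
    """Fetch the certificate a host presents and return its fingerprint.

    ssl.get_server_certificate() does not validate the certificate unless
    a CA file is passed, so this also works against self-signed hosts.
    """
    pem = ssl.get_server_certificate((host, port))
    return fingerprint_from_pem(pem)
```

Calling `fetch_fingerprint()` with each host's FQDN (for example a placeholder such as `esxi01.example.com`) and comparing the result with the certificate you intended to install shows whether a host is still serving an old or default certificate.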

Does anyone have any ideas what could be the issue and how to correct it?  Thanks.

Accepted Solutions
aj800
Enthusiast

So after some troubleshooting and tracking, it turned out to be a certificate-related issue between vSphere/ESXi (we're using 6.7 for both, by the way):

We built our environments starting with 3 hosts each.  We added certs to the hosts/vCenter after setting up the clusters in each environment, and before adding more hosts to them.  This was evident because the host certs for the original 3 in each environment expired before the others.  When we replaced them all at the same time to ensure they were all valid (the others were set to expire not long after the originals, but hadn't yet), console access (both web and remote) stopped working, but only on the first 3 hosts in each environment, where we had applied certs first.
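To spot this kind of staggered expiry ahead of time, the remaining lifetime of each host's certificate can be computed from its `notAfter` field. This is a rough sketch, assuming the hosts use certs from your own CA and that `cafile` points at that CA's certificate bundle; the function names and any paths or host names you pass in are hypothetical.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days between `now` (epoch seconds) and a cert's notAfter string.

    `not_after` uses the format returned by getpeercert(), for example
    "Jan  5 09:34:43 2018 GMT". A negative result means already expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def fetch_not_after(host, cafile, port=443):
    """Return the notAfter string of the certificate presented by `host`.

    The certificate is verified against `cafile`, so a host serving an
    expired or untrusted cert raises ssl.SSLCertVerificationError instead.
    """
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Running `days_until_expiry(fetch_not_after(host, cafile))` over all hosts and sorting the results would show which ones are due first.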

To fix this:

As an older post here suggested, we put the host into maintenance mode (migrating all VMs off to other hosts), then disconnected it (menu -> Connection -> Disconnect), then removed the host from vSphere (menu -> Remove from inventory).  Then we added the host back (Cluster menu -> Add hosts...) using its FQDN and root credentials.  After adding the host, you must also add it back to the virtual distributed switch it was connected to before it was removed (if your host was connected to a vDS).  Assign the correct uplinks to the appropriate physical NICs (vmnics), then exit maintenance mode to bring the host back online.
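The steps above were done through the UI. For anyone who would rather script them, here is a rough sketch of the same sequence, assuming pyVmomi objects: `host` is assumed to be a `vim.HostSystem`, `cluster` a `vim.ClusterComputeResource`, and `connect_spec` a `vim.host.ConnectSpec` carrying the FQDN and root credentials. `readd_host` and `wait_for_task` are hypothetical names; `wait_for_task` stands for whatever you use to block until a task finishes. It does not cover re-adding the host to the vDS or assigning uplinks.

```python
def readd_host(host, cluster, connect_spec, wait_for_task):
    """Remove a host from inventory and add it back to the same cluster.

    Mirrors the manual steps: maintenance mode, disconnect, remove from
    inventory, add again. Returns the add-host task so the caller can
    inspect its result.
    """
    # timeout=0 means no timeout; VMs still have to be migrated off
    # (by DRS or manually) before this task can complete.
    wait_for_task(host.EnterMaintenanceMode_Task(timeout=0))

    # Disconnect the host, then remove it from the inventory.
    wait_for_task(host.DisconnectHost_Task())
    wait_for_task(host.Destroy_Task())

    # Add the host back using the FQDN and root credentials in the spec.
    # The old `host` reference is no longer valid after Destroy_Task().
    task = cluster.AddHost_Task(spec=connect_spec, asConnected=True)
    wait_for_task(task)

    # Exiting maintenance mode is deliberately left out: the vDS
    # membership and uplinks need to be restored first.
    return task
```

Leaving the host in maintenance mode at the end is intentional, so the vDS and uplink assignment can be checked before any VMs move back.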

An important note in my case, however: we're using a LAG for the uplinks to 2 vmnics, and the link icons in the vDS for the host did not show as green (up), even after exiting maintenance mode.  I'm not sure why.  I've seen this before and I believe a reboot had fixed it then.  Also, none of the port groups were displayed on the host's vDS (host menu -> Configure -> Networking -> Virtual Switches) until it was out of maintenance mode, but I think this much is normal behavior.  VMs automatically migrated to the host successfully once it was rejoined to vSphere and out of maintenance mode, and were fully operational without interruption.  The consoles were accessible, as well, on all hosts.
