VMware Cloud Community
Mangosniper
Enthusiast

Second vCenter cluster items not visible in vSphere webinterface some weeks after update

Hello everyone,

We currently have two VMware vSphere Essentials Kits, and thus six ESXi servers and two vCenter Server Appliances. They ran version 6.0 until we upgraded to 6.7 last year. We have an external Platform Services Controller, to which both VCSAs are connected.

I just realized that, although it worked in the beginning and also before the upgrade, I can no longer see the inventory items of both vCenters when browsing the web interface of one of them. If I connect to vcsa01, I can see the VMs, folders, datacenter and hosts of that cluster, but not those of vcsa02. If I connect to vcsa02, it's the other way round. I always see the entry for the other vCenter, though. In the past I could expand its items just like I can for the one I am logged in to.

2021-01-12 11_53_08-Window.png

Unfortunately, I am not the person who originally set everything up; I have merely maintained it. As far as I know, we have a single SSO that both vCenters use (or should use). I don't know whether something like Enhanced Linked Mode is activated.

I didn't change anything regarding permissions during or after the upgrade. I only worked on the certificates. (Don't mind the certificate warning in the screenshot; I was connected from a device that is missing the root certificate.)

If you could point me to a troubleshooting guide or tell me what further information to gather (e.g. logs, screenshots of specific parts of the config), that would be very nice. I doubt that anyone can already tell what's wrong from the information I have provided so far, but you can of course surprise me if you want.

Greetings,

Niklas

10 Replies
msripada
Virtuoso

1. If you are using a vSphere Essentials license for vCenter, I believe ELM (Enhanced Linked Mode) is not supported.

https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.vcenterhost.doc/GUID-6ADB06EF-E342-...

 

2. If it was working fine until two or three days ago, check whether a permissions issue might have caused it to stop working. If you are logging in as the SSO admin and still face the same issue, raise an SR with GSS after validating the licenses.

thanks,

MS

Mangosniper
Enthusiast

Hey msripada,

thanks for the quick answer. Regarding 1): okay, that at least makes it easier to troubleshoot, as I can leave out everything regarding ELM. Regarding 2): I am logged in with the Administrator account of the local domain that the vCenter uses (I suppose):

image.png

But I had already used that account to check before my initial post here, so nothing has changed: I still can't see the items of the second vCenter. What might be odd is that I can't see any roles or global permissions. I have to admit I am not sure whether this is "normal".

image.png

image.png

Regarding Support: I think we are using unsupported hardware... afaik that disqualifies us for any kind of official support more or less immediately.

bryanvaneeden
Hot Shot

Hi @Mangosniper,

Regarding the support, AFAIK the vCenter doesn't have any hardware requirements (except for how many vCPU/RAM/Disk etc). So you should be able to raise a support case if you have the support on the license.

As for the missing vCenter Server in the Web Client, I am just going to list a few things I always check:

  • Are the vCenters already rebooted?
  • Nothing happened right and it worked before? How about any changes on the network? 
  • You should be able to login to vcsa01 and see the roles from vcsa01, and vice versa for the other vCenter Server.

Please issue the following commands and provide the output:

In the /usr/lib/vmware-vmdir/bin folder:
./vdcrepadmin -f showservers -h localhost -u administrator
./vdcrepadmin -f showpartnerstatus -h localhost -u administrator

In the /usr/lib/vmware-vmafd/bin folder:
./vmafd-cli get-ls-location --server-name localhost
./vmafd-cli get-site-name --server-name localhost
./vmafd-cli get-domain-name --server-name localhost

In the /opt/likewise/bin folder:
./domainjoin-cli query --stop --all

This should give us a good overview of your environment. Please do this on both vCenters.
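The commands above can be collected in one pass with a small wrapper. This is only a sketch: the function names `run_if_present` and `collect_sso_info` are made up for illustration, while the paths and arguments are copied verbatim from the list above. A tool that does not exist on the appliance you are logged in to is reported as skipped instead of producing an error.

```shell
#!/bin/bash
# Run a tool from a given directory if it exists there;
# otherwise print a note that it was skipped.
run_if_present() {
    local dir="$1" tool="$2"
    shift 2
    if [ -x "$dir/$tool" ]; then
        echo "### $dir/$tool $*"
        "$dir/$tool" "$@"
    else
        echo "### skipped: $dir/$tool not found"
    fi
}

# Hypothetical helper: runs the diagnostic commands requested above, in order.
collect_sso_info() {
    local vmdir=/usr/lib/vmware-vmdir/bin
    local vmafd=/usr/lib/vmware-vmafd/bin
    local likewise=/opt/likewise/bin

    run_if_present "$vmdir" vdcrepadmin -f showservers -h localhost -u administrator
    run_if_present "$vmdir" vdcrepadmin -f showpartnerstatus -h localhost -u administrator
    run_if_present "$vmafd" vmafd-cli get-ls-location --server-name localhost
    run_if_present "$vmafd" vmafd-cli get-site-name --server-name localhost
    run_if_present "$vmafd" vmafd-cli get-domain-name --server-name localhost
    run_if_present "$likewise" domainjoin-cli query --stop --all
}

# Usage, after pasting the functions into a root shell on the appliance:
#   collect_sso_info
```

The vdcrepadmin calls still prompt for the SSO administrator password, so this is meant to be run interactively.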

Visit my blog at https://vcloudvision.com!
Mangosniper
Enthusiast

Hello, @bryanvaneeden 

Regarding the support, AFAIK the vCenter doesn't have any hardware requirements (except for how many vCPU/RAM/Disk etc). So you should be able to raise a support case if you have the support on the license.

That's already very valuable information for me, thanks.

  • Are the vCenters already rebooted?

Yes. Just did it again to be sure. Issue is still present.

  • Nothing happened right and it worked before? How about any changes on the network? 

Well, I am not the guy who tends to say "I did nothing, nothing changed". Of course something changed. People were working with the system and creating VMs, and our operational IT decommissioned our core switch and replaced it with two new redundant switches. But none of that is anything I would, from my experience, correlate with the behavior we are seeing now.

  • You should be able to login to vcsa01 and see the roles from vcsa01, and vice versa for the other vCenter Server.

As you can see from the screenshots in my previous post, I don't seem to be able to see any roles. It's the same on both vCenters.

Here is the information you requested. The folder "/usr/lib/vmware-vmdir/bin" only exists on the PSC, so I could run those commands only on that device. I also obfuscated our domain name; probably not really necessary, but better safe than sorry. I can assure you, however, that our correct domain is always behind the blurred text.

PSC:

2021-01-13 09_24_19-mRemoteNG - confCons.xml - psc.png

2021-01-13 09_25_01-mRemoteNG - confCons.xml - psc.png

2021-01-13 09_25_56-mRemoteNG - confCons.xml - psc.png

vCenter 01

2021-01-13 09_28_59-mRemoteNG - confCons.xml - vcsa01.png

vCenter 02

2021-01-13 09_30_24-mRemoteNG - confCons.xml - vcsa02.png

 

 

bryanvaneeden
Hot Shot

Hi @Mangosniper,

You are absolutely correct: some commands only exist on the PSC because the PSC has those services and the VCSA doesn't. Looking at the output, everything seems to be correct on the vCenter side.

But on the PSC side, it looks like the ELM mode is no longer correctly working. With the "./vdcrepadmin -f showservers -h localhost -u administrator" command you should see both vCenter Servers like below:

root@vcsa02 [ /usr/lib/vmware-vmdir/bin ]# ./vdcrepadmin -f showservers -h localhost -u administrator
password:
cn=vcsa01,cn=Servers,cn=default-site,cn=Sites,cn=Configuration,dc=vsphere,dc=local
cn=vcsa02,cn=Servers,cn=default-site,cn=Sites,cn=Configuration,dc=vsphere,dc=local
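To compare the list between appliances without reading the full DNs, the output can be reduced to the short server names. A minimal sketch, assuming the output has the same shape as the sample above (one `cn=<name>,cn=Servers,...` line per entry); `list_sso_servers` is a made-up helper name, not part of any VMware tool.

```shell
# Read `vdcrepadmin -f showservers` output on stdin and print only the
# short name of each server entry (the first cn= component of the DN).
# Lines that are not server entries, such as the password prompt, are dropped.
list_sso_servers() {
    grep -i ',cn=Servers,' | sed -E 's/^cn=([^,]+),.*/\1/'
}

# Usage (run in /usr/lib/vmware-vmdir/bin):
#   ./vdcrepadmin -f showservers -h localhost -u administrator | list_sso_servers
```

For the sample output above this prints vcsa01 and vcsa02, one per line.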

It seems that we can break ELM, but some documentation describes this as irreversible (if you leave the SSO domain), which we obviously don't want. What you can do is the following:

This should give you enough information on ELM repointing and such. I hope this helps.

Mangosniper
Enthusiast

Hey @bryanvaneeden ,

thanks for your further help. 

First, when executing

 

root@vcsa02 [ /usr/lib/vmware-vmdir/bin ]# ./vdcrepadmin -f showservers -h localhost -u administrator

 

only the PSC shows up.

Mangosniper_0-1610700165947.png

I read your blog posts and tried to repoint accordingly. However, the following command does not work at all.

 

cmsso-util domain-repoint -m pre-check --src-emb-admin "administrator" --replication-partner-fqdn "vcsa01.mydomain.de" --replication-partner-admin administrator --dest-domain-name "vcsd.local"

 

I get a warning about wrong syntax / missing arguments.

 

usage:
       Repointing vCenter Server from one Platform Services Controller to
       another Platform Services Controller in a different domain. The
       repointing operation migrates Tags, Authorization & License data to
       another Platform Services Controller.

       For example, to repoint vCenter Server from Platform Services Controller
       in vpshere.local domain to a Platform Services Controller in nsx.local
       domain.

       cmsso-util domain-repoint
              --mode [pre-check|execute]
              --src-psc-admin <administrator>
              --dest-psc-fqdn <target-psc-fqdn>
              --dest-psc-admin <administrator>
              --dest-domain-name <corp.local>
              --dest-vc-fqdn <target-vc-fqdn>

 

So I followed the instructions to execute

 

cmsso-util domain-repoint -m pre-check --src-psc-admin "administrator" --dest-psc-fqdn "psc.mydomain.de" --dest-psc-admin administrator --dest-domain-name "vcsd.local"

 

And after entering all passwords I get

 

Source Platform Services Controller host name = psc.mydomain.de, Target Platform Services Controller host name = psc.mydomain.de Repointing to the same Platform Services Controller node is not supported.

 

on both vcsa01 and vcsa02.

Regarding the ports: I have not yet found any place to specify open/closed ports. In the management area of the VCSA there are no firewall rules:

image.png

The ESXi hosts that vcsa01, vcsa02 and the PSC run on are in the same subnet, with no physical or virtual firewalls in between. So the ports should be reachable, IMHO. But I will also test with netcat or something similar today if possible.
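For the reachability test, something like the following should do even where netcat is not installed. It is a sketch that relies on bash's built-in /dev/tcp redirection and the coreutils `timeout` command; `check_port` is a made-up helper name. The thread does not say which ports matter, so 443 (HTTPS) and 389 (LDAP) in the usage comment are my assumption, and VMware's port list should be consulted for the full set.

```shell
# Report whether a TCP port on a host accepts connections.
# Uses bash's /dev/tcp pseudo-device, so no netcat is required.
check_port() {
    local host="$1" port="$2"
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port open"
    else
        echo "$host:$port closed or filtered"
    fi
}

# Usage from vcsa01 or vcsa02 (hostname as used in this thread,
# ports are an assumption):
#   for p in 443 389; do check_port psc.mydomain.de "$p"; done
```

A refused connection and a timeout both end up as "closed or filtered", which is enough to tell whether the network is in the way at all.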

I will probably open a support call with VMware if we really are entitled. I don't want to take up more of your time; this was already a great help, and I learned a lot 😉

Greetings,

Mango

Mangosniper
Enthusiast

So, in the meantime it turned out that our support plan has expired... So no ticket for me.

I also saw that the two clusters are at least in some way still connected as I can see events from both

Mangosniper_0-1612436734161.png

 

I will continue searching for a solution and post it here if I find one.

bryanvaneeden
Hot Shot

Hi @Mangosniper ,

Sorry that I didn't get back to you any sooner on this. I've reviewed your previous posts and this is what I've come up with:

  • The cmsso-util command syntax in my blog posts was indeed a bit off for your case. That is because I used it to repoint vCenters that had embedded PSC nodes. You have an external PSC, which changes the syntax to the one you have already found. So that is the correct syntax.
  • Since the vCenters don't allow a repoint, it looks like everything is correctly configured and already pointing to the current PSC. As I said before, repointing or removing the connection appears to be irreversible, so I would not suggest doing this.
  • You mentioned that all the vCenters and the PSC are located on the same host. Would that also mean they are connected to the same port group and VLAN? If so, then yes, the ports should be accessible. The firewall settings in the vCenter VAMI are empty, so that shouldn't be an issue here either.
  • Looking at your last post, I can see that you get vCenter events from the other vCenter Server while logged in to the first one. That is good: it means that events are being replicated between them.

Another thing you could check is whether the log files get replicated over to the other appliance. You can do this by logging in to "vcsa01", performing some actions in the HTML5 UI for "vcsa02", and checking whether the corresponding log lines show up in the log files on "vcsa01". The log file to look at would be "/var/log/vmware/vsphere-ui/vsphere_client_virgo.log".
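A quick way to do that check is to filter the log for the name of the other vCenter right after clicking around in the UI. A minimal sketch: `recent_mentions` is a made-up helper name, and the log path in the usage comment is the one given above (if it is not there on your build, it may sit in a subdirectory of that folder).

```shell
# Print the most recent lines of a log file that mention a given string,
# e.g. the name of the other vCenter. The match is case-insensitive and
# literal (no regex). The third argument limits the number of lines
# shown and defaults to 20.
recent_mentions() {
    local logfile="$1" needle="$2" count="${3:-20}"
    grep -iF -- "$needle" "$logfile" | tail -n "$count"
}

# Usage on vcsa01:
#   recent_mentions /var/log/vmware/vsphere-ui/vsphere_client_virgo.log vcsa02
```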

Other than that, I am unsure what the issue could be at this point. I suggest updating vCenter to the latest available version. That might help.

Let me know if you find anything, or if there is anything else I can help you with.

Mangosniper
Enthusiast

Wow.. I found the issue. In the end, I simply hadn't imported the certificate of the CA that our IT used to create the machine certificates for the VCSAs. I finally found that by scouring the logs. The vAPI Endpoint was not healthy:

Mangosniper_0-1613054953850.png

I checked the logs via "cat /var/log/vmware/vapi/endpoint/endpoint.log" and found

Mangosniper_1-1613055013718.png

and

Mangosniper_3-1613055945151.png

That gave me the final hint.
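For anyone who wants to run the same search, here is a rough sketch of how the endpoint log could be scanned. The exact messages are only visible in the screenshots above, so the keywords in the pattern (certificat, ssl, handshake, pkix) are my guess at what a trust problem looks like in a log, not a quote from it. `cert_errors` is a made-up helper name.

```shell
# Print lines (with line numbers) that look certificate-related.
# The default path is the endpoint log mentioned above; the keyword
# list is an assumption and may need adjusting.
cert_errors() {
    local logfile="${1:-/var/log/vmware/vapi/endpoint/endpoint.log}"
    grep -inE 'certificat|ssl|handshake|pkix' "$logfile"
}

# Usage on the VCSA:
#   cert_errors | tail -n 20
```

If you have the CA and machine certificates as PEM files, `openssl verify -CAfile ca.pem machine.pem` (file names are placeholders) is a separate way to confirm that the machine certificate really chains to that CA.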

Thanks a lot @bryanvaneeden for your effort. And for everyone else who runs into the same issue, hope this helps.

Greetings,

Mango

bryanvaneeden
Hot Shot

Hi @Mangosniper ,

There you have it :). Certificates are a pain in the ass, but most of the time there is a reference somewhere in the log files that will help.

Nice that you've found it!
