VMware Cloud Community
baubau01
Contributor

SSO Design Question

Hi guys,

Our large enterprise needs to upgrade to 5.1 ASAP. The only thing holding us back is the SSO piece of the puzzle. We currently have around 20 vCenters across the globe, in America, Europe, and Asia. Since every VMware service will be affected by SSO, we need to design this properly.

We have about three options right now and are not sure which one to go with.

Option 1 (the one I personally think we should go with): Each region (America, Europe, Asia) will have a single SSO DB, and all the vCenters in that region will be in the same SSO HA setup. Each regional SSO will have a SQL cluster behind it. This would be a combination of SSO HA mode and SSO multi-site HA mode.

Option 2: Because the SSO DB is so small and the WAN traffic to the SQL cluster won't be much, just have one SSO DB and join every vCenter to it in HA mode. (If something does happen to the SSO DB, all 20 vCenters will be locked and nobody will be able to log in to them.)

Option 3: Each vCenter will have its own local SSO DB and be managed locally. If there are problems with one vCenter, the rest won't be affected at all. The problem with this setup: we want to have 10 vCenters linked to each other, but that's only possible if all of those vCenters point to the same SSO DB in HA mode.

What do you guys think? How should we proceed? Please correct me if I have misunderstood how any of this works.
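To make the trade-off between the three options concrete, here is a minimal sketch that counts how many vCenters would lose login if a single SSO DB failed under each option. The 8/7/5 regional split and all names are hypothetical (the post only says "around 20" vCenters), and the model ignores HA within each SSO deployment.

```python
from collections import Counter

# Hypothetical split of ~20 vCenters across regions (not stated in the post).
VCENTERS = (
    [("America", f"vc-am-{i}") for i in range(8)]
    + [("Europe", f"vc-eu-{i}") for i in range(7)]
    + [("Asia", f"vc-as-{i}") for i in range(5)]
)


def sso_assignment(option):
    """Map each vCenter name to the SSO DB it depends on under an option."""
    if option == 1:  # one SSO DB per region
        return {vc: f"sso-{region}" for region, vc in VCENTERS}
    if option == 2:  # one global SSO DB
        return {vc: "sso-global" for _, vc in VCENTERS}
    if option == 3:  # one local SSO DB per vCenter
        return {vc: f"sso-{vc}" for _, vc in VCENTERS}
    raise ValueError(f"unknown option: {option}")


def worst_case_lockout(option):
    """Largest number of vCenters sharing one SSO DB (lost if that DB fails)."""
    return max(Counter(sso_assignment(option).values()).values())


for opt in (1, 2, 3):
    print(f"Option {opt}: worst-case lockout = {worst_case_lockout(opt)} vCenters")
```

Going by the post's own reasoning that linked vCenters must share an SSO DB, the same number is also an upper bound on how many vCenters could be linked together under each option.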

32 Replies
WasimShaikh
Enthusiast

I sent you a PM with my email.

Cheride
Contributor

OK, here is my update as of this morning.

We had issues installing the Web Client and were getting a certificate mismatch error when pointing it at the SSO VIP URL.

Per VMware support's advice, we created a self-signed cert on the F5, and I was able to complete the Web Client installation successfully.

Next step:

Log in to the Web Client. When attempting to log in, we get the error "Failed to connect to VMware lookup service. SSL certificate verification failed."

Waiting for a call from the support engineer. Updates to follow.

WasimShaikh
Enthusiast

Have you installed SSL certificates on the SSO server?

Does your vCenter Server have SSL configured?

KBaillie08
Contributor

If you have created certs as per http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203501... copy the ca_certificates.crt file to the c:\programdata\vmware\ssl folder on the Web Client server.

WasimShaikh
Enthusiast

Any updates on the configuration part?

What changes did you make on the SSO nodes?

I completed all of the server setup, and all vCenter services were online.

But failover was not working as it should. 😞

Cheride
Contributor

No... we worked all day Friday, and we are now stuck on an SSL certificate validation issue.

We made some progress when we changed the .crt file to .pem. We also applied the same file on the F5.

WasimShaikh
Enthusiast

OK,

Did you try what KBaillie08 suggested?

Because that's the solution in most cases.
Copy the ca_certificates.crt file to c:\programdata\vmware\ssl. If the SSO server does not have the \vmware\ssl directory, create it, copy Root64.cer into it, and rename it to ca_certificates.crt.
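The copy-and-rename step above can be sketched as a small script. This is only an illustration: the function name and its parameters are made up for the example, and the actual location of Root64.cer depends on where you exported it.

```python
import shutil
from pathlib import Path


def install_ca_bundle(root_cert, ssl_dir):
    """Copy a root CA cert into ssl_dir under the name ca_certificates.crt.

    root_cert: path to the exported root certificate (e.g. Root64.cer).
    ssl_dir:   target folder, e.g. c:\\programdata\\vmware\\ssl on the server.
    Both arguments are assumptions of this sketch, not fixed VMware values.
    """
    root_cert = Path(root_cert)
    ssl_dir = Path(ssl_dir)
    if not root_cert.is_file():
        raise FileNotFoundError(root_cert)
    # Create the \vmware\ssl directory if it does not exist yet.
    ssl_dir.mkdir(parents=True, exist_ok=True)
    target = ssl_dir / "ca_certificates.crt"
    shutil.copyfile(root_cert, target)
    return target
```

On the server this would be called as something like install_ca_bundle(r"c:\certs\Root64.cer", r"c:\programdata\vmware\ssl"), where the first path is a placeholder.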

KBaillie08
Contributor

Spent some time with support over the weekend on this issue. When failing over to the secondary SSO node the following error was displayed on the web client:

Failed to communicate with the vCenter Single Sign On Server http://\webapps\sso-adminServer\WEB-INF\web.xml

Copying that web.xml file from the primary makes the secondary node the same as the primary; you then have to ensure that active/passive access to the SSO nodes is configured on the load balancer. This also means that if a change is made on the primary, the web.xml file needs to be manually copied to the secondary again. I haven't completed testing yet, but you may have to restart the SSO service on the secondary node once failover has occurred to allow authentication.
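Because web.xml has to be kept in step by hand, a quick drift check can help. The sketch below compares SHA-256 hashes of two copies of the file; the function names are invented for this example, and how you obtain the two copies (file share, manual export) is left open.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def web_xml_in_sync(primary_copy, secondary_copy):
    """True if the two web.xml copies are byte-for-byte identical."""
    return sha256_of(primary_copy) == sha256_of(secondary_copy)
```

If web_xml_in_sync returns False after a change on the primary, the file needs to be copied to the secondary again.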

Another useful insight from engineering was the following:

If the primary SSO node fails and vCenter (VC) is restarted while the primary node is down, then VC will be unable to authenticate user access, because it relies on the admin service running on the primary SSO node. In this scenario the secondary node is basically useless.

KBaillie08
Contributor

For anyone who is experiencing "Error 20010: Failed to update lookup service" when installing the secondary SSO node, there is a workaround that has been provided by VMware.

Note: this error occurred for me when I configured the primary node with custom SSL certs before installing the secondary node. (VMware recommends installing the primary and secondary before configuring custom certs.)

If anyone from VMware is reading this: your software should be flexible enough to add SSO nodes as the environment needs them!

1. Download ssojavalib.zip from here

https://ftpsite.vmware.com:443/download?domain=FTPSITE&id=aca05026b7e014fcf42df94d9b36e874

(If this file no longer exists you will need to contact VMware support)

2. On the secondary SSO node, launch the SSO installer and proceed until the Welcome screen.

3. Extract ssojavalib.zip and place the extracted folder in c:\temp.

4. Change directory to %temp%. Open directories (the directory will have a long ID such as {DEC5C346-414B-.....}) until you find a sub-directory named ssojavalib.

5. Replace the entire ssojavalib folder with the folder that was extracted to c:\temp.

6. Continue with the SSO install.
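Steps 4 and 5 can be sketched as a script, assuming the installer has already unpacked itself under %temp%. The function and parameter names are hypothetical, and the sketch refuses to act unless it finds exactly one ssojavalib folder, since guessing the wrong one would be worse than stopping.

```python
import shutil
from pathlib import Path


def replace_ssojavalib(installer_temp, patched_lib):
    """Replace the installer's unpacked ssojavalib folder with a patched one.

    installer_temp: the %temp% directory the SSO installer unpacked into.
    patched_lib:    the ssojavalib folder extracted from the zip (e.g. under c:\\temp).
    """
    installer_temp = Path(installer_temp)
    patched_lib = Path(patched_lib)
    if not patched_lib.is_dir():
        raise FileNotFoundError(patched_lib)
    # Search every sub-directory (including the long {GUID}-style one).
    matches = [p for p in installer_temp.rglob("ssojavalib") if p.is_dir()]
    if len(matches) != 1:
        raise RuntimeError(f"expected one ssojavalib folder, found {len(matches)}")
    target = matches[0]
    shutil.rmtree(target)                  # remove the original folder
    shutil.copytree(patched_lib, target)   # put the patched folder in its place
    return target
```

The installer has to stay open at the Welcome screen while this runs, as in step 2, so that the unpacked files exist.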

Hope this helps someone out there.

WasimShaikh
Enthusiast

Hi,

Finally, SSO is load balanced!

I came to update the post and read your reply.

Error 20010: man, I did approximately 9-10 reinstallations because of that, and then I figured out it's better to install both nodes without updating SSL on Node1.

Even updating the ssolscli.jar file did not work.

I updated ssolscli.jar on both nodes after installing SSO.

I think the server.xml file was the key to making it happen. I was making a silly mistake: I hadn't set jvmRoute="SrvName", e.g. jvmRoute="SSOA".

I made this change on both nodes; Node1 has SSOA and Node2 has SSOB.
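For reference, in a stock Tomcat configuration jvmRoute is an attribute of the Engine element in server.xml. The sketch below sets it programmatically, assuming the SSO server.xml follows the usual Server/Service/Engine layout (the post does not show the file, so that layout is an assumption).

```python
import xml.etree.ElementTree as ET


def set_jvm_route(server_xml_text, route):
    """Return server.xml text with jvmRoute set on the first Engine element.

    Assumes the standard Tomcat layout: <Server><Service><Engine .../>.
    """
    root = ET.fromstring(server_xml_text)
    engine = root.find("./Service/Engine")
    if engine is None:
        raise ValueError("no <Engine> element found under <Service>")
    engine.set("jvmRoute", route)  # e.g. "SSOA" on Node1, "SSOB" on Node2
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree drops XML comments and the XML declaration when re-serialising, so for a real server.xml a manual edit (with a backup) may be the safer route.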

After installing SSO, I updated root-trust.jks on both nodes,

and updated the endpoints to https://ssoa.domain.com:7444/lookupservice/sdk

I have also prepared a complete step-by-step guide and will upload the article once I have tested it thoroughly.

But for now, I am able to log in to the vCenter Web Client even if Node1 is offline / disconnected.

I can see on the LB that traffic for sso-adminserver is not going through, and yes, that's because the service is available on the primary node only.

And in my lab, I don't need to restart the service to make it work after the primary node has dropped out.

I will restart the vCenter Server while Node1 is offline and check what happens.

If what the engineer said is true, then there is no point in putting all of this together.

WasimShaikh
Enthusiast

Hey,

If you could ask tech support one more question on my behalf, that would be helpful.
After everything started working, I checked the STS Certificate tab (logged in as admin@system-domain).

Why is there only one chain, and why does it contain the Node1 cert with the root authority shown as self-signed,

even though both nodes have updated certs (root-trust.jks)?

Cheride
Contributor

Here is the workaround to address the issue you mentioned.

If the vCenter Server vpxd process is stopped for any reason, it won't be able to restart until the primary SSO node is available again. This is because restarting the vCenter services involves the SSO admin server, which cannot be reached through the secondary (the SSO admin service is not available on the secondary).

A workaround for this issue is as follows,

1. Copy the following from Primary to Secondary SSO server.

\webapps\sso-adminServer\WEB-INF\web.xml

2. You also need to redirect SSO admin traffic to the newly promoted primary on your load balancer.

3. Restart the SSO service.

You can add the file to the secondary now, or you can save it and add it to the secondary node if the primary ever goes down.

This will allow you to restart vCenter using the secondary node.
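Step 1 of the workaround could be scripted roughly as follows. The install roots are passed in as parameters because the post only gives the path relative to the SSO installation; the function name and the .bak backup convention are additions of this sketch, not part of the VMware procedure.

```python
import shutil
from pathlib import Path

# Relative path taken from the post; the prefix depends on the SSO install location.
WEB_XML = Path("webapps") / "sso-adminServer" / "WEB-INF" / "web.xml"


def stage_admin_web_xml(primary_root, secondary_root):
    """Copy the primary's sso-adminServer web.xml to the secondary.

    Any existing file on the secondary is kept as web.xml.bak first.
    """
    src = Path(primary_root) / WEB_XML
    dst = Path(secondary_root) / WEB_XML
    if not src.is_file():
        raise FileNotFoundError(src)
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        shutil.copyfile(dst, dst.with_name(dst.name + ".bak"))
    shutil.copyfile(src, dst)
    return dst
```

The load balancer redirect and the SSO service restart (steps 2 and 3) still have to be done separately.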

orthohin
Enthusiast

That's right.

Never trust a computer you can't throw out a window