jpotrzeba
Contributor

Issue in upgrading VCSA 6.7 -> 7.0 (Failed Step 2 Install - A problem occurred while - Starting VMware Security Token Service...)

Hello,

We are currently seeing the following error when upgrading from VCSA 6.7 to 7.0. The error occurs during step 2 of the new VCSA deployment.

Error

Encountered an internal error.
Traceback (most recent call last):
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1752, in main
    vmidentityFB.boot()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 367, in boot
    self.registerTokenServiceWithLookupService()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 654, in registerTokenServiceWithLookupService
    raise e
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 650, in registerTokenServiceWithLookupService
    dynVars=dynVars)
  File "/usr/lib/vmware-cm/bin/cloudvmcisreg.py", line 710, in cloudvm_sso_cm_register
    serviceId = do_lsauthz_operation(cisreg_opts_dict)
  File "/usr/lib/vmware/site-packages/cis/cisreglib.py", line 1058, in do_lsauthz_operation
    ls_obj = LookupServiceClient(ls_url, retry_count=60)
  File "/usr/lib/vmware/site-packages/cis/cisreglib.py", line 314, in __init__
    self._init_service_content()
  File "/usr/lib/vmware/site-packages/cis/cisreglib.py", line 294, in do_retry
    return req_method(self, *args, **kargs)
  File "/usr/lib/vmware/site-packages/cis/cisreglib.py", line 304, in _init_service_content
    self.service_content = si.RetrieveServiceContent()
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 556, in <lambda>
    self.f(*(self.args + (obj,) + args), **kwargs)
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 368, in _InvokeMethod
    return self._stub.InvokeMethod(self, info, args)
  File "/usr/lib/vmware/site-packages/pyVmomi/SoapAdapter.py", line 1448, in InvokeMethod
    conn.request('POST', self.path, req, headers)
  File "/usr/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/usr/lib/vmware/site-packages/pyVmomi/SoapAdapter.py", line 1085, in connect
    six.moves.http_client.HTTPSConnection.connect(self)
  File "/usr/lib/python3.7/http/client.py", line 1414, in connect
    super().connect()
  File "/usr/lib/python3.7/http/client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/usr/lib/python3.7/socket.py", line 727, in create_connection
    raise err
  File "/usr/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

Resolution

This is an unrecoverable error, please retry install. If you encounter this error again, please search for these symptoms in the VMware Knowledge Base for any known issues and possible resolutions. If none can be found, collect a support bundle and open a support request.

We are currently unsure why this error is occurring. This error appears as the Security Token Service (STS) is starting up during the install.
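
The traceback ends in ConnectionRefusedError raised from socket.create_connection, which means nothing accepted the TCP connection that the Lookup Service client tried to open. A minimal Python sketch for checking whether an endpoint accepts TCP connections is below. The helper name port_accepts_connections is hypothetical, and the host and port to test are assumptions, since the error text does not show the Lookup Service URL.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # Same standard-library call that fails at the bottom of the traceback.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, timeouts and name-resolution errors
        # are all subclasses of OSError.
        return False
```

For example, port_accepts_connections("vcsa.example.com", 443) would test HTTPS on a placeholder host name. A False result only says the port is unreachable from where the script runs. It does not say why the service is down.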

21 Replies
scott28tt
VMware Employee

Moderator: Thread moved to the vSphere: Upgrade & Install area.

Nawals
Expert

Hi,

It looks like you have a DNS issue. Here is the KB for the same issue: VMware Knowledge Base

NKS. Please mark Helpful/Correct if my answer resolves your query.
nirmalgnair
VMware Employee

Hi @jpotrzeba,

This issue can occur if the SSO Administrator account is not authorized to add a service to the Lookup Service. That can happen when the SSO Administrator is not a member of the Builtin Administrators group in VMDIRD.

To check this:

Connect to the source PSC using JXplorer.

As the screenshot shows, the SSO Administrator user is missing under Builtin - Administrators.

If the user is missing, go ahead and add it.

To add the user:

Go to Builtin - Administrators.

Go to Table Editor.

Right-click the member attribute and click Add another Value.

In the value field, enter: cn=Administrator,cn=Users,dc=vsphere,dc=local

If the SSO domain name is not vsphere.local, change the dc components accordingly.

Submit the changes and run the upgrade again. You cannot reuse the failed appliance, so delete it and start the upgrade from the beginning.
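
The member value follows a fixed pattern: each dot-separated label of the SSO domain name becomes one dc= component. A small Python sketch of that mapping is below, for illustration only. The helper sso_admin_dn is hypothetical and not part of any VMware tool, so the result should still be compared with what JXplorer shows for your directory.

```python
def sso_admin_dn(sso_domain: str, user: str = "Administrator") -> str:
    """Build the LDAP DN of an SSO user from the SSO domain name."""
    labels = [label for label in sso_domain.strip().split(".") if label]
    if not labels:
        raise ValueError("SSO domain name must not be empty")
    # vsphere.local -> dc=vsphere,dc=local
    dc_part = ",".join(f"dc={label}" for label in labels)
    return f"cn={user},cn=Users,{dc_part}"


print(sso_admin_dn("vsphere.local"))
# cn=Administrator,cn=Users,dc=vsphere,dc=local
```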

Regards,

Nirmal Nair

Install-Upgrade Specialist

scott28tt
VMware Employee

Your images are not visible:

Screenshot 2020-06-08 at 09.11.51.png

jpotrzeba
Contributor

Hello nirmalgnair,

Thanks for the response! I am unable to see the images you supplied. We did change the SSO domain from vsphere.local to our own domain name so that it matches. Could that be causing a conflict?

Thanks!

nirmalgnair
VMware Employee

Hi @jpotrzeba,

I am not sure why the screenshot did not attach properly.

I wanted to check whether the SSO user is a member of the Builtin - Administrators group. (I have attached the screenshot again.)

If it is missing in your case, we need to add it. If you still have the failed appliance, we can also investigate the failure further.

I suggest connecting to the appliance over SSH, going to /var/log/firstboot, and checking firstbootInfrastructure.log.

The last line will say which component failed. For example:

INFO firstbootInfrastructure [Failed] /usr/lib/vmidentity/firstboot/vmidentity-firstboot.py is complete

In this case, it failed at vmidentity firstboot.
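
The same check can be scripted. Below is a minimal Python sketch that scans the log text for entries marked [Failed]. It assumes failure entries have the same shape as the example line above, and failed_firstboot_scripts is a hypothetical helper name.

```python
import re
from typing import List

# Matches entries such as:
# INFO firstbootInfrastructure [Failed] /usr/lib/vmidentity/firstboot/vmidentity-firstboot.py is complete
FAILED_RE = re.compile(r"firstbootInfrastructure \[Failed\] (?P<script>\S+) is complete")


def failed_firstboot_scripts(log_text: str) -> List[str]:
    """Return the firstboot scripts marked [Failed], in log order."""
    return [match.group("script") for match in FAILED_RE.finditer(log_text)]
```

On the appliance, the text would come from reading /var/log/firstboot/firstbootInfrastructure.log. An empty result means no entry of this shape was found, which is not proof that firstboot succeeded.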

Likewise, if you can send us the stdout and stderr log files, we can investigate this issue further.

Regards,

Nirmal Nair

Install-Upgrade Specialist

jpotrzeba
Contributor

Hello nirmalgnair,

I will check with JXplorer.

In firstbootInfrastructure.log, I do not see any failed components, but I do see "No service dependency was injected..." warnings for vmdnsd, vmware-stsd and vmdird:

2019-06-22T00:36:37.201Z INFO firstbootInfrastructure [Finished] /usr/lib/vmware-perfcharts/firstboot/perfcharts_firstboot.py is complete

2019-06-22T00:36:37.203Z INFO firstbootInfrastructure Firstboot duration: 460 sec

2019-06-22T00:36:37.203Z INFO firstbootInfrastructure First boot is a success

2019-06-22T00:36:37.204Z INFO firstbootInfrastructure Begin processing all service dependencies for "embedded"

2019-06-22T00:36:37.205Z INFO firstbootInfrastructure Dependencies for vmware-vmon already processed internally

2019-06-22T00:36:37.205Z INFO firstbootInfrastructure Service hvc is integrated with vmware-vmon

2019-06-22T00:36:37.205Z INFO firstbootInfrastructure Service netdumper is integrated with vmware-vmon

2019-06-22T00:36:37.205Z INFO firstbootInfrastructure Injecting service dependencies for vmdns

2019-06-22T00:36:37.205Z INFO firstbootInfrastructure /etc/init.d/vmdnsd depends on vmdird

2019-06-22T00:36:37.206Z WARNING firstbootInfrastructure No service dependency was injected into /etc/rc.d/init.d/vmdnsd

2019-06-22T00:36:37.206Z INFO firstbootInfrastructure Service vsm is integrated with vmware-vmon

2019-06-22T00:36:37.206Z INFO firstbootInfrastructure Service vsan-health is integrated with vmware-vmon

2019-06-22T00:36:37.206Z INFO firstbootInfrastructure Service trustmanagement is integrated with vmware-vmon

2019-06-22T00:36:37.206Z INFO firstbootInfrastructure Injecting service dependencies for pod

2019-06-22T00:36:37.206Z INFO firstbootInfrastructure /etc/init.d/vmware-pod depends on vmafdd vmcad

2019-06-22T00:36:37.206Z ERROR firstbootInfrastructure /etc/init.d/vmware-pod not found!

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Service rbd is integrated with vmware-vmon

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Injecting service dependencies for vmafdd

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure /etc/init.d/vmafdd depends on lwsmd

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Changing dependencies in "/etc/rc.d/init.d/vmafdd", new deps: "{'lwsmd', '$network', '$remote_fs', 'lsassd'}", old deps: "['$network', '$remote_fs', 'lsassd']"

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Service topologysvc is integrated with vmware-vmon

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Service statsmonitor is integrated with vmware-vmon

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Service vsan-dps is integrated with vmware-vmon

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Service vpxd is integrated with vmware-vmon

2019-06-22T00:36:37.207Z INFO firstbootInfrastructure Injecting service dependencies for visl-integration

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service vsphere-client-servicemarker is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service vmsyslogcollector is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service applmgmt is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service analytics is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service eam is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service mbcs is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service rhttpproxy is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service vsphere-client is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service vapi-endpoint is integrated with vmware-vmon

2019-06-22T00:36:37.208Z INFO firstbootInfrastructure Service pschealth is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service cm is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service vpostgres is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service vpxd-svcs is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service sps is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service sca is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Service cis-license is integrated with vmware-vmon

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure Injecting service dependencies for sts

2019-06-22T00:36:37.209Z INFO firstbootInfrastructure /etc/init.d/vmware-stsd depends on vmware-sts-idmd

2019-06-22T00:36:37.210Z WARNING firstbootInfrastructure No service dependency was injected into /etc/rc.d/init.d/vmware-stsd

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Injecting service dependencies for soluser

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Service imagebuilder is integrated with vmware-vmon

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Service vmonapi is integrated with vmware-vmon

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Service perfcharts is integrated with vmware-vmon

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Service vcha is integrated with vmware-vmon

2019-06-22T00:36:37.210Z INFO firstbootInfrastructure Injecting service dependencies for vmware-cis-config

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Service vsphere-ui is integrated with vmware-vmon

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Service vmware-postgres-archiver is integrated with vmware-vmon

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Injecting service dependencies for vmca

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure /etc/init.d/vmcad depends on vmdird

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Changing dependencies in "/etc/rc.d/init.d/vmcad", new deps: "{'$remote_fs', '$network', 'vmdird'}", old deps: "['$network', '$remote_fs', 'vmafdd']"

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Service certificatemanagement is integrated with vmware-vmon

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Service content-library is integrated with vmware-vmon

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Service vmcam is integrated with vmware-vmon

2019-06-22T00:36:37.211Z INFO firstbootInfrastructure Injecting service dependencies for vmdird

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure /etc/init.d/vmdird depends on vmafdd

2019-06-22T00:36:37.212Z WARNING firstbootInfrastructure No service dependency was injected into /etc/rc.d/init.d/vmdird

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure Service updatemgr is integrated with vmware-vmon

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure Injecting service dependencies for likewise

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure Injecting service dependencies for idm

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure /etc/init.d/vmware-sts-idmd depends on vmcad vmdird

2019-06-22T00:36:37.212Z INFO firstbootInfrastructure Changing dependencies in "/etc/rc.d/init.d/vmware-sts-idmd", new deps: "{'vmcad', '$network', '$remote_fs', 'vmdird'}", old deps: "['vmafdd', 'vmcad', 'vmdird', '$network', '$remote_fs']"

2019-06-22T00:36:37.213Z INFO firstbootInfrastructure Injecting service dependencies for vsphere-client-postinstall

2019-06-22T00:36:37.213Z INFO firstbootInfrastructure Injecting service dependencies for dbconfig

2019-06-22T00:36:37.277Z INFO firstbootInfrastructure End processing all service dependencies

2019-06-22T00:36:37.504Z INFO firstbootInfrastructure Changing vMon default start profile to ALL

2019-06-22T00:36:37.512Z WARNING firstbootInfrastructure stopping status aggregation...

Thanks,

Josh

jpotrzeba
Contributor

JXplorer shows that the Administrator is part of the Administrators Builtin Group:

cn=Administrator,cn=Users,dc=vsphere,....

Instead of .local, we have our domain name:

dc=vsphere, dc=ad, dc=...., dc=com

Thanks,

Josh

nirmalgnair
VMware Employee

Hi Josh,

So it's not a permission issue.

Could you send us the firstbootInfrastructure.log? You can use WinSCP to connect to the VCSA.

Regards,

Nirmal Nair

nirmalgnair
VMware Employee

@jpotrzeba

Please also send the logs below:

vmidentity-firstboot.py_XXXX.stdout.log

vmidentity-firstboot.py_XXXX.stderr.log

jpotrzeba
Contributor

nirmalgnair

Attached logs.

Thanks,

Josh

nirmalgnair
VMware Employee

Hi @jpotrzeba,

It seems you uploaded the logs from the source 6.7 VCSA and not from the failed 7.0 appliance.

We need logs from the 7.0 VCSA. Do you still have the failed appliance available?

Regards,

Nirmal Nair

jpotrzeba
Contributor

Hello nirmalgnair,

Yep, my apologies. Sending updated logs now.

Thanks!

nirmalgnair
VMware Employee

Hi @jpotrzeba,

Thank you for uploading the logs. Have you changed the SSO domain (vsphere.local by default) to your actual AD domain, ad.secunetics.com? I guess you did, based on one of your previous updates.

If yes, then it's an unsupported configuration.

Please roll back if you have taken a snapshot, or perform another PNID change back to vsphere.local (after taking a snapshot).

Regards,

Nirmal Nair

daphnissov
Immortal

Instead of .local, we have our domain name:

dc=vsphere, dc=ad, dc=...., dc=com

Should have never done that. The SSO domain name is for internal purposes and is not supposed to match any external identity sources like Active Directory. The default of vsphere.local should not have been altered.

jpotrzeba
Contributor

Hey nirmalgnair,

Thanks for the suggestion on changing back to vsphere.local!

Is there anything I should be wary of when making this change? Are there any typical issues that may come up? I was planning on following the steps in this link:

Repointing vCenter Server to another SSO Domain - VMware vSphere Blog

Thanks,

Josh

jpotrzeba
Contributor

Hello nirmalgnair,

I went ahead and attempted the domain repoint, which failed on the "Re-installing Platform Controller Services" step.

root@vsphere [ ~ ]# cmsso-util domain-repoint -m execute --src-emb-admin Administrator --dest-domain-name vsphere.local

Enter Source embedded vCenter Server Admin Password :

The domain-repoint operation will export License, Tags, Authorization data

before repoint and import after repoint.

WARNING: Global Permissions for the source vCenter Server system will be lost. The

         administrator for the target domain must add global permissions manually.

         Source domain users and groups will be lost after the Repoint operation.

         User 'Administrator@vsphere.local' will be assigned administrator role on the

         source vCenter Server system.

         The default resolution mode for Tags and Authorization conflicts is Copy,

         unless overridden in the conflict files generated during pre-check.

         Solutions and plugins registered with vCenter Server must be re-registered.

         Before running the Repoint operation, you should backupof all nodes

         including external databases. You can use file based backups to restore in

         case of failure. By using the Repoint tool you agree to take the responsibility

         for creating backups, otherwise you should cancel this operation.

         Starting with vSphere 6.7, VMware announced a simplified vCenter Single Sign-On

         domain architecture by enabling vCenter Enhanced Linked Mode support for

         vCenter Server Appliance installations with an embedded Platform Services

         Controller. You can use the vCenter Server converge utility to change the

         deployment topology from an external Platform Services Controller to an

         embedded Platform Services Controller with support for vCenter Enhanced Linked

         Mode. As of this release, the external Platform Services Controller

         architecture is deprecated and will not be available in future releases. For

         more information, see https://kb.vmware.com/s/article/60229

         The following license keys are being copied to the target Single Sign-On

         domain. VMware recommends using each license key in only a single domain. See

         "vCenter Server Domain Repoint License Considerations" in the vCenter Server

         Installation and Setup documentation.

Repoint Node Information:

         Source embedded vCenter Server:vsphere.ad.secunetics.com

All Repoint configuration settings are correct; proceed? [Y|y|N|n]: y

Starting License export                                                         ... Done

Starting Authz Data export                                                      ... Done

Starting Tagging Data export                                                    ... Done

Export Service Data                                                             ... Done

Uninstalling Platform Controller Services                                       ... Done

Stopping all services                                                           ... Done

Updating registry settings                                                      ... Done

Re-installing Platform Controller Services                                      ... Failed

Repoint failed. Restore from backup

root@vsphere [ ~ ]# cmsso-util domain-repoint -m execute --src-emb-admin Administrator --dest-domain-name vsphere.local

Enter Source embedded vCenter Server Admin Password :

Repoint operation failed in an earlier attempt. Restore the backup of the nodes and retry.
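
For longer runs, the failing step can be picked out of saved output with a few lines of Python. This is only a sketch: it assumes each progress line has the "step name ... Done" or "step name ... Failed" shape seen in the output above, and first_failed_step is a hypothetical helper, not something cmsso-util provides.

```python
import re
from typing import Optional

# A step name, padding, a literal "...", then the status word.
STEP_RE = re.compile(r"^(?P<step>.+?)\s+\.\.\.\s+(?P<status>Done|Failed)\s*$")


def first_failed_step(output: str) -> Optional[str]:
    """Return the name of the first step reported as Failed, or None."""
    for line in output.splitlines():
        match = STEP_RE.match(line)
        if match and match.group("status") == "Failed":
            return match.group("step")
    return None
```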

Thanks,

Josh

jpotrzeba
Contributor

Found these entries in /var/log/vmware/vmdird/vmdird-syslog.log

2020-06-09T23:59:40.167553+00:00 err vmdird  t@139909432080128: SASLSessionStep: sasl error (-13)(SASL(-13): authentication failure: client evidence does not match what we calculated. Probably a password error)

2020-06-09T23:59:40.168166+00:00 err vmdird  t@139909432080128: VmDirSendLdapResult: Request (Bind), Error (49), Message ((49)(SASL step failed.)), (0) socket (127.0.0.1)

2020-06-09T23:59:40.168939+00:00 err vmdird  t@139909432080128: Bind Request Failed (127.0.0.1) error 49: Protocol version: 3, Bind DN: "cn=Administrator,cn=Users,dc=vsphere,dc=ad,dc=secunetics,dc=com", Method: SASL
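
LDAP result code 49 is invalidCredentials, and the Bind DN shows which SSO domain the bind was attempted against. A minimal Python sketch that pulls both out of a "Bind Request Failed" line is below. It assumes the line format shown above, and parse_failed_bind and domain_from_dn are hypothetical helper names.

```python
import re
from typing import Optional, Tuple

BIND_RE = re.compile(
    r'Bind Request Failed .*? error (?P<code>\d+): .*?Bind DN: "(?P<dn>[^"]+)"'
)


def domain_from_dn(dn: str) -> str:
    """Join the dc= components of a DN into a dotted domain name."""
    labels = []
    # Naive split: does not handle escaped commas inside attribute values.
    for rdn in dn.split(","):
        key, _, value = rdn.strip().partition("=")
        if key.lower() == "dc":
            labels.append(value)
    return ".".join(labels)


def parse_failed_bind(line: str) -> Optional[Tuple[int, str, str]]:
    """Return (ldap_error_code, bind_dn, domain) or None if the line does not match."""
    match = BIND_RE.search(line)
    if match is None:
        return None
    dn = match.group("dn")
    return int(match.group("code")), dn, domain_from_dn(dn)
```

Applied to the last entry above, this yields code 49 and the domain vsphere.ad.secunetics.com, i.e. the bind still targets the renamed SSO domain rather than vsphere.local.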

nirmalgnair
VMware Employee

Hi Josh,

Did you revert to the snapshot? Is the VCSA up and running normally now?

Regards,

Nirmal Nair
