I have vCenter 6.5 U3v running on Windows 2012. I updated vCenter to U3v a couple of weeks ago, and it had been operating normally since then.
A couple of days ago I found that the vCenter instance had stopped functioning. I tried restarting vCenter and rebooting Windows, to no avail. On the command line, I see the following when trying to start the STS service:
2024-03-28T15:27:14.384Z ERROR Starting service: VMwareSTS, Exception: (1053, 'StartService', 'The service did not respond to the start or control request in a timely fashion.')
Error executing start on service VMwareSTS. Details {
"resolution": null,
"detail": [
{
"args": [
"VMwareSTS"
],
"id": "install.ciscommon.service.failstart",
"localized": "An error occurred while starting service 'VMwareSTS'",
"translatable": "An error occurred while starting service '%(0)s'"
}
],
"componentKey": null,
"problemId": null
}
Service-control failed. Error {
"resolution": null,
"detail": [
{
"args": [
"VMwareSTS"
],
"id": "install.ciscommon.service.failstart",
"localized": "An error occurred while starting service 'VMwareSTS'",
"translatable": "An error occurred while starting service '%(0)s'"
}
],
"componentKey": null,
"problemId": null
}
First, I understand the appliance may be a better solution, but I want to keep this on Windows. Second, the "error 1053" is returned immediately rather than after a 30 s wait. I presume this means it is a default error code thrown by STS, and that something may be interfering with the STS service. The vpxd, vmafdd, and vMon log files are not advancing, so I don't see any information about the error. Which log file should I check for more information?
I haven't performed a reinstall because the 6.5 U3v installer wants to uninstall vCenter first. I don't have any data backups of the vCenter. I could try to rebuild the data, but it would be painful.
What other ideas should I try?
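For anyone else trying to work out which log files are still advancing, here is a minimal Python sketch that lists files under a directory that were modified recently, newest first. The function name `recently_modified` and the 24-hour default are my own choices, and the log directory is passed on the command line because I don't want to guess at anyone's install path.

```python
import os
import sys
import time


def recently_modified(root, max_age_hours=24.0, now=None):
    """Return paths under `root` modified within `max_age_hours`, newest first.

    Hypothetical helper for illustration; `now` can be injected for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if mtime >= cutoff:
                hits.append((mtime, path))
    return [path for _mtime, path in sorted(hits, reverse=True)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python recent_logs.py <log directory>
    for path in recently_modified(sys.argv[1]):
        print(path)
```

Running it right after a failed service start should show which components wrote anything at all during the attempt.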
I ended up uninstalling and re-installing. There may have been certificate issues with the STS service. At least all my certs are valid for another 2 years so I'm OK for now.
I hit several errors along the way, so I had to clean the system with uninstalls and start over. I used several articles, including https://communities.vmware.com/t5/VMware-vCenter-Discussions/Can-t-Uninstall-vCenter-Server/m-p/3011... and https://communities.vmware.com/t5/vCenter-Server-Discussions/Vcenter-Server-6-5d-Installation-error-.... I may have referenced other articles too.
A couple of other references:
- A comment on this blog was useful because it mentions that leaf certs can expire and lead to problems: https://luchodelorenzi.com/2020/05/28/proactively-checking-and-replacing-sts-certificate-on-vsphere-...
Let me first state that you're running an ancient version of vSphere, on top of a Windows Server version that is out of support. I highly recommend you upgrade to a version of vSphere that is still supported and switch to the appliance. You're exposing yourself to unneeded security risks here.
Now that we have that out of the way, given the age of your environment, could you check if your certificates are still valid? It could be that one of the vCenter certs has expired and that's causing the service to fail.
I replaced the certificates on Oct 11, 2023. These are self-signed certificates rather than certs from a certificate authority. AFAIK, the certificates are valid until 2025. I ran a PowerShell script to check certificate expirations and I don't see any that have expired.
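This is not the PowerShell script I ran, just a minimal Python sketch of the same kind of check. It assumes you have already extracted each certificate's notAfter value as text in the usual OpenSSL form (for example `Oct 11 00:00:00 2025 GMT`). The labels and dates in the example are made up for illustration, and `days_until_expiry` is a hypothetical helper name.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days until a certificate's notAfter date (negative if expired).

    `not_after` must be in the "%b %d %H:%M:%S %Y GMT" form that
    ssl.cert_time_to_seconds() accepts.
    """
    now = datetime.now(timezone.utc) if now is None else now
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


if __name__ == "__main__":
    # Made-up labels and dates, for illustration only.
    certs = {
        "example-root": "Oct 11 00:00:00 2025 GMT",
        "example-leaf": "Mar 20 00:00:00 2024 GMT",
    }
    for label, not_after in certs.items():
        remaining = days_until_expiry(not_after)
        status = "EXPIRED" if remaining < 0 else "ok"
        print(f"{label}: {remaining} days ({status})")
```

The point of running it over every certificate you can extract, and not only the root, is that a leaf cert can expire while the root is still valid.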
Some background info and what I've done so far:
vCenter 6.5 is stand-alone on a Windows machine managing 10 nodes and 50 VMs. The VM hosts are running either ESXi 5.5 or 6.5. Others manage those hosts on obsolete hardware, so I don't have authority to upgrade the ESXi versions. With that out of the way, here's what I've done recently. I updated the root CA on Oct 11, 2023 using the "fixsts.py" script as mentioned here: https://kb.vmware.com/s/article/76719. On March 13, I upgraded vCenter 6.5 U3p to 6.5 U3v to fix a log4j issue. vCenter was operational after a server reboot.
Only recently (March 23-ish?) did vCenter go down. I looked through multiple articles but haven't resolved my issue. Here is an excerpt from the vmafdd log:
2024-04-02T00:46:28.246Z:t@24484:ERROR: [Error - 3, ..\vecsserviceapi.c:1507]
2024-04-02T00:57:08.057Z:t@22824:INFO: vmafdd: stop
2024-04-02T00:57:14.423Z:t@14260:ERROR: [Error - 183, ..\vecsserviceapi.c:189]
2024-04-02T00:57:14.424Z:t@14260:ERROR: [Error - 183, ..\authservice.c:36]
2024-04-02T00:57:14.429Z:t@14260:ERROR: [Error - 183, ..\vecsserviceapi.c:189]
2024-04-02T00:57:14.430Z:t@14260:ERROR: [Error - 183, ..\authservice.c:36]
2024-04-02T00:57:14.435Z:t@14260:ERROR: [Error - 183, ..\vecsserviceapi.c:189]
2024-04-02T00:57:14.437Z:t@14260:ERROR: [Error - 183, ..\authservice.c:36]
2024-04-02T00:57:15.625Z:t@21892:INFO: VmAfdRpcServerCheckAccess: request from ncalrpc:[62400]
2024-04-02T00:57:15.627Z:t@14260:INFO: RPC service status (listening)
2024-04-02T00:57:15.629Z:t@14260:INFO: Registry key value for Super Logging: 0
2024-04-02T00:57:15.630Z:t@14260:INFO: Super Logger object is created.
2024-04-02T00:57:15.632Z:t@14260:INFO: Starting Roots Fetch Thread, VmAfdInitCertificateThread
2024-04-02T00:57:15.692Z:t@14260:INFO: Started Roots Fetch Thread successfully, VmAfdInitCertificateThread
2024-04-02T00:57:15.695Z:t@14260:INFO: Starting Pass Refresh Thread, VmAfdInitPassRefreshThread
2024-04-02T00:57:15.697Z:t@14260:INFO: Started Pass Refresh Thread successfully, VmAfdInitPassRefreshThread
2024-04-02T00:57:15.699Z:t@14260:INFO: Starting the CDC State machine, CdcInitStateMachine
2024-04-02T00:57:15.701Z:t@14260:INFO: Started CDC State Machine Thread successfully, CdcInitStateMachine
2024-04-02T00:57:15.703Z:t@14260:INFO: Starting CDC Caching Thread, CdcInitCdcCacheUpdate
2024-04-02T00:57:15.705Z:t@14260:INFO: Started CDC Cache Thread successfully, CdcInitCdcCacheUpdate
2024-04-02T00:57:15.707Z:t@14260:INFO: vmafdd: started!
2024-04-02T00:57:18.754Z:t@21648:ERROR: [Error - 9127, ..\ldap.c:170]
2024-04-02T00:57:18.755Z:t@21648:ERROR: [Error - 9127, ..\rootfetch.c:256]
2024-04-02T00:57:18.757Z:t@21648:INFO: Failed to update trusted roots. Error [9127]
2024-04-02T00:58:18.057Z:t@21648:ERROR: [Error - 4312, ..\rootfetch.c:684]
2024-04-02T00:58:18.119Z:t@21648:INFO: Added cert to VECS DB: 460340545a790dc8d822dbf6ba54623612dbe644
2024-04-02T00:58:18.200Z:t@21648:INFO: VecsSrvDeleteCertificate: Deleted cert (alias 5c4bb762bcc068cd13ce0f9e8e8c37b68f7f717f) from store 3
2024-04-02T00:58:18.204Z:t@21648:INFO: VecsDeleteFileWithRetry: successfully deleted cert file: D:\ProgramData\VMware\vCenterServer\cfg\certs\85db7385.r0
2024-04-02T00:58:18.212Z:t@21648:INFO: VecsFillVacantFileSlot: copied D:\ProgramData\VMware\vCenterServer\cfg\certs\85db7385.r1 to D:\ProgramData\VMware\vCenterServer\cfg\certs\85db7385.r0
2024-04-02T00:58:18.216Z:t@21648:INFO: VecsDeleteFileWithRetry: successfully deleted cert file: D:\ProgramData\VMware\vCenterServer\cfg\certs\85db7385.r1
2024-04-02T00:58:18.222Z:t@21648:INFO: VecsDeleteFileWithRetry: successfully deleted cert file: D:\ProgramData\VMware\vCenterServer\cfg\certs\899b2435.r0
2024-04-02T00:58:18.227Z:t@21648:INFO: VecsFillVacantFileSlot: copied D:\ProgramData\VMware\vCenterServer\cfg\certs\899b2435.r1 to D:\ProgramData\VMware\vCenterServer\cfg\certs\899b2435.r0
2024-04-02T00:58:18.231Z:t@21648:INFO: VecsDeleteFileWithRetry: successfully deleted cert file: D:\ProgramData\VMware\vCenterServer\cfg\certs\899b2435.r1
2024-04-02T00:58:18.237Z:t@21648:INFO: VecsDeleteFileWithRetry: successfully deleted cert file: D:\ProgramData\VMware\vCenterServer\cfg\vmware-vpx\docRoot\certs\899b2435.r1
2024-04-02T00:58:18.241Z:t@21648:ERROR: [Error - 3, ..\vecsserviceapi.c:1507]
Anyone have any ideas?
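To make an excerpt like the one above easier to scan, here is a minimal Python sketch that counts the ERROR lines by error code and source location. The regular expression is written only against the line format shown in this log, so treat it as an assumption for anything else. `summarize_errors` is a hypothetical helper name, and the sketch does not try to interpret what the codes mean.

```python
import re
from collections import Counter

# Matches e.g. "ERROR: [Error - 183, ..\vecsserviceapi.c:189]"
ERROR_RE = re.compile(r"ERROR: \[Error - (\d+), \.\.\\([\w.]+):(\d+)\]")


def summarize_errors(lines):
    """Count ERROR entries keyed by (error code, source file, source line)."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            code, source, lineno = match.groups()
            counts[(int(code), source, int(lineno))] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the excerpt above.
    sample = [
        r"2024-04-02T00:57:14.423Z:t@14260:ERROR: [Error - 183, ..\vecsserviceapi.c:189]",
        r"2024-04-02T00:57:14.424Z:t@14260:ERROR: [Error - 183, ..\authservice.c:36]",
        r"2024-04-02T00:57:14.429Z:t@14260:ERROR: [Error - 183, ..\vecsserviceapi.c:189]",
        r"2024-04-02T00:57:15.707Z:t@14260:INFO: vmafdd: started!",
        r"2024-04-02T00:57:18.754Z:t@21648:ERROR: [Error - 9127, ..\ldap.c:170]",
    ]
    for (code, source, lineno), n in summarize_errors(sample).most_common():
        print(f"{n}x error {code} at {source}:{lineno}")
```

Grouping by source location rather than by code alone keeps apart the same code reported from different files, such as the 183 entries from vecsserviceapi.c and authservice.c.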