For the last couple of days the vmware-sps service has failed to start. This began after the appliance ran out of drive space. I expanded a partition and deleted audit.log, which had grown to an enormous size. That fixed the space issue, but I then noticed a 'cannot connect to profile-driven storage service' error whenever I try to edit the settings of any VM in the vSphere Web Client.
I've tried to start the service manually via SSH, but it fails and drops back to the command line before completing startup.
Any guidance on where to look or what to look at would be greatly appreciated! I'm no expert on either vmware or linux, so please be as specific as possible, to keep me out of trouble.
Thank you!
Go into /var/log/vmware/vmware-sps on the appliance and attach the sps.log, sps-runtime.log.stderr, and sps-runtime.log.stdout files so someone can have a look. Attempt to start the service, watch it fail, then pull the logs so they capture the failure.
I have an sps.log file and a wrapper.log file, but I don't see the other log files you mention and don't know how to create them. I'm also having some trouble getting the log files off the server.
When I look at sps.log I see this, which seems suspicious. At least, this is where the logging sequence appears to break:
2017-11-17T06:40:01.968-08:00 [WrapperSimpleAppMain] ERROR opId= com.vmware.sps.StorageMain - Exception when running SPS service
org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'httpServerEndpoint' defined in class path resource [../conf/pbm-spring-config.xml]: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:211)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer.processProperties(PropertyPlaceholderConfigurer.java:222)
at org.springframework.beans.factory.config.PropertyResourceConfigurer.postProcessBeanFactory(PropertyResourceConfigurer.java:86)
at com.vmware.pbm.util.SpringService.<init>(SpringService.java:52)
at com.vmware.pbm.app.PbmLocalService.initialize(PbmLocalService.java:114)
at com.vmware.pbm.app.PbmLocalService.<init>(PbmLocalService.java:94)
at com.vmware.pbm.app.PbmLocalService.getInstance(PbmLocalService.java:150)
at com.vmware.sps.StorageMain.loadPbmService(Unknown Source)
at com.vmware.sps.StorageMain.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.tanukisoftware.wrapper.WrapperSimpleApp.run(WrapperSimpleApp.java:290)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"
at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer$PlaceholderResolvingStringValueResolver.resolveStringValue(PropertyPlaceholderConfigurer.java:258)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveStringValue(BeanDefinitionVisitor.java:282)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:204)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitGenericArgumentValues(BeanDefinitionVisitor.java:159)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitBeanDefinition(BeanDefinitionVisitor.java:85)
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:208)
... 14 more
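For long traces like the one above, the useful parts are usually the ERROR line itself and the last "Caused by:" line. The following is a minimal sketch, assuming the log keeps the `timestamp [thread] LEVEL ...` layout seen in this thread; the function name `summarize_errors` is hypothetical and not part of any VMware tooling.

```python
import re

# Start of a log entry as it appears in this thread, e.g.
# "2017-11-17T06:40:01.968-08:00 [WrapperSimpleAppMain] ERROR opId= ..."
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2}T\S+) \[([^\]]+)\] (\w+) ")


def summarize_errors(log_text):
    """Return (timestamp, error_line, root_cause) for each ERROR entry.

    root_cause is the text of the last "Caused by:" line that follows the
    ERROR entry, or None if the trace has no such line.
    """
    results = []
    current = None  # the ERROR entry currently being collected, if any
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            # A new entry starts; only track it if its level is ERROR.
            current = None
            if match.group(3) == "ERROR":
                current = [match.group(1), line, None]
                results.append(current)
        elif current is not None and line.startswith("Caused by:"):
            # Later "Caused by:" lines overwrite earlier ones, so the
            # innermost cause is what remains.
            current[2] = line[len("Caused by:"):].strip()
    return [tuple(entry) for entry in results]
```

Reading the file with `open("/var/log/vmware/vmware-sps/sps.log").read()` and passing the text to `summarize_errors` would give one tuple per ERROR entry, which is much easier to paste into a post than the full trace.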
Check whether the pbm.properties file on the VCSA is empty or blank.
Location of pbm.properties: /usr/lib/vmware-vpx/sps/conf/pbm.properties
If you have a different VCSA, you can compare the file on both appliances.
Thanks,
MS
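The stack trace earlier in the thread says the placeholder 'pbm.http.port' could not be resolved, so one quick check is whether the properties file defines that key at all. Below is a minimal sketch, assuming the key is expected to live in pbm.properties (the thread does not confirm this) and that the file uses plain `key = value` lines. The helper names are hypothetical, and full Java properties syntax (`:` separators, line continuations) is not handled.

```python
def load_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(text, required):
    """Return the required keys that are absent or have an empty value."""
    props = load_properties(text)
    return [key for key in required if not props.get(key)]
```

For example, `missing_keys(open("/usr/lib/vmware-vpx/sps/conf/pbm.properties").read(), ["pbm.http.port"])` returns a non-empty list if the key is missing or blank. Comparing the result against the same file from a healthy VCSA is safer than guessing what the value should be.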
Thank you for the reply! I've looked in the file you mention.
I see a pbm.serverGuid = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx value in there.
The error log, to me, looks like something in Java is failing, but I can't make heads or tails of it.
Sent you a PM. Can you try that and share the results?
Thanks,
MS
I tried as you suggested and there was no apparent improvement. PBM errors out in pre-check. I can no longer move or migrate any VMs, nor can I open the 'edit' configuration of any virtual machine. They are running fine, but vCenter can't move them, nor can I.
As a note, I was able to SSH into the appliance before the GUI was fully up, and just one time I was able to start the sps service. But when the GUI finished starting up, the service had stopped and could not be started again.
Hi,
Have you managed to fix this somehow?
Thanks!
JG
Hi All,
The vmware-sps service failed to start after running "Regenerate a new VMCA Root Certificate and replace all certificates". vCenter is running fine, but the sps service fails with the error below. This is VCSA 6.5 with an external PSC setup. Any suggestions, please?
2021-02-01T10:37:23.986Z [main] DEBUG opId=sps-Main-175158-315 com.vmware.vim.storage.common.task.retry.SimpleRetryHandler - Retry count = 69
2021-02-01T10:37:33.986Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.external.VpxdConnectionInitializer - Component manager connection initialized successfully for vSphere Profile-Driven Storage Service.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up SSO info. Use cache is true
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up Component Manager cache for service info.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.external.VpxdConnectionInitializer - Solution user initialized successfully for vSphere SPS.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up vCenter info. Use cache is true
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up Component Manager cache for service info.
2021-02-01T10:37:33.988Z [main] DEBUG opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - vCenter client created successfully.
2021-02-01T10:37:34.019Z [main] ERROR opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - Failed while connecting to vpxd service:
com.vmware.vim.vmomi.client.exception.SslException: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured
at com.vmware.vim.vmomi.client.common.impl.ResponseImpl.setError(ResponseImpl.java:252)
at com.vmware.vim.vmomi.client.http.impl.HttpExchange.run(HttpExchange.java:51)
at com.vmware.vim.vmomi.client.http.impl.HttpProtocolBindingBase.executeRunnable(HttpProtocolBindingBase.java:226)
at com.vmware.vim.vmomi.client.http.impl.HttpProtocolBindingImpl.send(HttpProtocolBindingImpl.java:110)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl$CallExecutor.sendCall(MethodInvocationHandlerImpl.java:613)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl$CallExecutor.executeCall(MethodInvocationHandlerImpl.java:594)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.completeCall(MethodInvocationHandlerImpl.java:345)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.invokeOperation(MethodInvocationHandlerImpl.java:305)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.invoke(MethodInvocationHandlerImpl.java:179)
at com.sun.proxy.$Proxy81.retrieveContent(Unknown Source)
at com.vmware.vim.storage.common.util.VpxdSSOConnection.createVpxdService(VpxdSSOConnection.java:162)
at com.vmware.vim.storage.common.util.VpxdSSOConnection.<init>(VpxdSSOConnection.java:97)
at com.vmware.vim.storage.common.util.VpxdSSOConnectionFactory.initAdminSSOConnection(VpxdSSOConnectionFactory.java:30)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.initVpxdSSOAdminConnection(VpxdConnectionInitializer.java:155)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.call(VpxdConnectionInitializer.java:82)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.call(VpxdConnectionInitializer.java:33)
at com.vmware.vim.storage.common.task.retry.CallableRetryDecorator.call(CallableRetryDecorator.java:43)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.initAdminVpxdConnection(VpxdConnectionInitializer.java:54)
at com.vmware.sps.StorageMain.commonInitialization(StorageMain.java:133)
at com.vmware.sps.StorageMain.main(StorageMain.java:35)
Caused by: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured
at com.vmware.vim.vmomi.client.http.impl.ClientExceptionTranslator.translate(ClientExceptionTranslator.java:54)
... 20 more
Based on the above, I can see two issues.
First, the SPS service cannot talk to the vpxd service. The vpxd service might have crashed or be in a stopped state.
Second, there is a certificate issue; you can try running the lsdoctor tool from KB https://kb.vmware.com/s/article/80469
Make sure to take snapshots of the vCenter before running that tool.
thanks,
MS
Thanks for the response. Vpxd is running fine. I am able to connect to vCenter and perform all operations. However, taking a backup of this appliance requires the sps service to be running. I regenerated the VMCA root certificates one week ago to fix a vCenter login issue. After that, everything seems to be working fine; the only issue is with the sps service. I am getting the "Server certificate chain is not trusted and thumbprint verification is not configured" error in sps.log, and I can't find any VMware KBs about this issue.
I ran the lsdoctor tool and its "Recommended Action" was: "Please run python ls_doctor.py --trustfix option on this node." I did that, but no luck.
Below is the service status:
Running:
applmgmt lwsmd vmafdd vmonapi vmware-cm vmware-content-library vmware-eam vmware-netdumper vmware-perfcharts vmware-rbd-watchdog vmware-rhttpproxy vmware-sca vmware-statsmonitor vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmcam vmware-imagebuilder vmware-mbcs vmware-sps vmware-updatemgr vmware-vcha
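A status listing in the Running:/Stopped: layout above can be checked programmatically rather than by eye. This is a minimal sketch, assuming the output always consists of a section header ending in a colon followed by whitespace-separated service names; `parse_service_status` is a hypothetical helper, not a VMware API.

```python
def parse_service_status(text):
    """Split 'Running:' / 'Stopped:' sections into sets of service names."""
    status = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":") and " " not in line:
            # A header such as "Running:" opens a new section.
            section = line[:-1].lower()
            status[section] = set()
        elif section and line:
            # Every other non-blank line lists service names.
            status[section].update(line.split())
    return status
```

With the listing saved to a string, `"vmware-sps" in parse_service_status(text)["stopped"]` confirms whether the service is down, which is handy when re-checking after each attempted fix.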
I suspect there are other issues on the VC apart from the certificate issue.
Can you get me the output file psc.txt and also the SSO site name of the VC?
/usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost
/usr/lib/vmidentity/tools/scripts/lstool.py list --url http://localhost:7080/lookupservice/sdk > /tmp/psc.txt
thanks,
MS
Please find the psc file at this link: https://drive.google.com/file/d/1Foj96-aaOriuIr8Vlo45OFSJud64VnEv/view?usp=sharing
Below is the output:
root@vprdpsc01 [ /tmp ]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost
PRODVBS3001
Hi, I am not seeing any issue with the service, but the thumbprint needs to be correct. If lsdoctor is not working, please contact GSS.
thanks,
MS
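Since the advice above hinges on the thumbprint being correct, it can help to compute a certificate's thumbprint by hand and compare it with whatever value is registered. The sketch below uses only the Python standard library and assumes you already have the certificate as PEM text. How to export it from the appliance, and which registered value it should match, are not covered in this thread. `cert_thumbprint` is a hypothetical helper name.

```python
import hashlib
import ssl


def cert_thumbprint(pem_text, algorithm="sha1"):
    """Return the colon-separated, upper-case hex digest of a PEM certificate.

    The digest is computed over the DER encoding, which is how certificate
    thumbprints are conventionally defined.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.new(algorithm, der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

If the certificate is only available from a live endpoint, the standard-library call `ssl.get_server_certificate((host, 443))` returns it as PEM text, where `host` is a placeholder for your own server name. Pass `algorithm="sha256"` if the value you are comparing against is a SHA-256 fingerprint.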
Hi,
I ran the lsdoctor list option and am not getting any errors or warnings. Shall I try the --solutionusers and --rebuild options of lsdoctor?
The problem is that we don't have vendor support. Below is the output of lsdoctor, showing no issues.
python lsdoctor.py -l
ATTENTION: You are running a reporting function. This doesn't make any changes to your environment.
You can find the report and logs here: /var/log/vmware/lsdoctor
2021-02-03T14:38:41 INFO main: You are reporting on problems found across the SSO domain in the lookup service. This doesn't make changes.
2021-02-03T14:38:42 INFO live_checkCerts: Checking services for trust mismatches...
2021-02-03T14:38:42 INFO generateReport: Listing lookup service problems found in SSO domain
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprdpsc01.bc.jsplc.net (External PSC).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprodvc01.bc.jsplc.net (Embedded).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprdpsc02.bc.jsplc.net (External PSC).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprodvsa01.bc.jsplc.net (Support Assistant).
Thank you.
Hi,
Thank you so much for fixing the issue.
Nagaravi, it's unclear what the fix was. Can you please elaborate?
Can you please share how it was fixed?