OptoIT
Contributor

vmware-sps service fails to start - vCSA 6U3

Over the last couple of days the vmware-sps service has been failing to start. This began after the appliance ran out of disk space. I expanded a partition and deleted audit.log, which had grown very large. That fixed the space issue, but I then noticed a 'cannot connect to profile-driven storage service' error whenever I try to edit the settings of any VM in the vSphere Web Client.

I've tried to start the service manually via SSH, and it fails: it drops back to the command line before completing startup.

Any guidance on where to look or what to look at would be greatly appreciated! I'm no expert on either VMware or Linux, so please be as specific as possible, to keep me out of trouble.

Thank you!

15 Replies
daphnissov
Immortal

Go into /var/log/vmware/vmware-sps on the appliance and attach the sps.log, sps-runtime.log.stderr, and sps-runtime.log.stdout files so someone can have a look. Attempt to start the service, watch it fail, then pull the logs so they capture the failure.
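
One way to make the logs easier to copy off the appliance is to pack the whole log directory into a single archive right after the failed start. The following is only a minimal sketch, assuming a bash shell on the appliance; `bundle_logs` and the example output path are made-up names, not part of any VMware tool.

```shell
#!/bin/bash
# bundle_logs: hypothetical helper, not a VMware tool.
# Packs a log directory into a gzip-compressed tarball.
#   $1 = log directory (for example /var/log/vmware/vmware-sps)
#   $2 = output tarball path (for example /tmp/sps-logs.tar.gz)
bundle_logs() {
    local log_dir="$1"
    local out_file="$2"

    if [ ! -d "$log_dir" ]; then
        echo "log directory not found: $log_dir" >&2
        return 1
    fi

    # -C switches into the parent directory so the archive holds relative paths.
    tar -czf "$out_file" -C "$(dirname "$log_dir")" "$(basename "$log_dir")"
}
```

Called as `bundle_logs /var/log/vmware/vmware-sps /tmp/sps-logs.tar.gz`, this should leave one file that can be copied off with a tool such as scp.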

OptoIT
Contributor

I have an sps.log file and a wrapper.log file, but I don't see the other log files you mention and don't know how to create them. I'm also having some trouble getting the log files off the server.

When I look at sps.log I see the following, which seems suspicious. At least, this is where the logging sequence appears to break:

2017-11-17T06:40:01.968-08:00 [WrapperSimpleAppMain] ERROR opId= com.vmware.sps.StorageMain - Exception when running SPS service
org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'httpServerEndpoint' defined in class path resource [../conf/pbm-spring-config.xml]: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:211)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer.processProperties(PropertyPlaceholderConfigurer.java:222)
at org.springframework.beans.factory.config.PropertyResourceConfigurer.postProcessBeanFactory(PropertyResourceConfigurer.java:86)
at com.vmware.pbm.util.SpringService.<init>(SpringService.java:52)
at com.vmware.pbm.app.PbmLocalService.initialize(PbmLocalService.java:114)
at com.vmware.pbm.app.PbmLocalService.<init>(PbmLocalService.java:94)
at com.vmware.pbm.app.PbmLocalService.getInstance(PbmLocalService.java:150)
at com.vmware.sps.StorageMain.loadPbmService(Unknown Source)
at com.vmware.sps.StorageMain.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.tanukisoftware.wrapper.WrapperSimpleApp.run(WrapperSimpleApp.java:290)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in string value "${pbm.http.port}"
at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer$PlaceholderResolvingStringValueResolver.resolveStringValue(PropertyPlaceholderConfigurer.java:258)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveStringValue(BeanDefinitionVisitor.java:282)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:204)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitGenericArgumentValues(BeanDefinitionVisitor.java:159)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitBeanDefinition(BeanDefinitionVisitor.java:85)
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:208)
... 14 more
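
In a long Java stack trace like this, the lines that start with `Caused by:` usually name the underlying failure, and the last one in a trace is typically the innermost exception. A small sketch for pulling that line out of a log file; `last_cause` is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# last_cause: hypothetical helper.
# Prints the final "Caused by:" line found in a log file. If the file holds
# several traces, this is the root cause of the most recent one.
#   $1 = path to the log file (for example sps.log)
last_cause() {
    grep 'Caused by:' "$1" | tail -n 1
}
```

Run against the excerpt above, this would print the `IllegalArgumentException` line about the unresolved `pbm.http.port` placeholder.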

msripada
Virtuoso

Check whether the pbm.properties file on the VCSA is empty.

Location of pbm.properties : /usr/lib/vmware-vpx/sps/conf/pbm.properties

If you have another VCSA, you can compare the two files.
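
Since the stack trace complains about an unresolved `pbm.http.port` placeholder, one quick check is whether a given key is defined in the properties file at all. This is only a sketch, and it assumes the placeholder is meant to be resolved from this file, which the thread does not confirm; `has_property` is a made-up helper name.

```shell
#!/bin/bash
# has_property: hypothetical helper.
# Succeeds (exit 0) if a Java-style properties file defines the given key
# using "key=value" or "key: value" syntax; fails otherwise.
#   $1 = properties file, $2 = key name
has_property() {
    awk -v key="$2" '
        /^[ \t]*[#!]/ { next }               # skip comment lines
        {
            line = $0
            sub(/^[ \t]+/, "", line)         # trim leading whitespace
            n = match(line, /[ \t]*[=:]/)    # locate the key/value separator
            if (n > 0 && substr(line, 1, n - 1) == key) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `has_property /usr/lib/vmware-vpx/sps/conf/pbm.properties pbm.http.port && echo present || echo missing`. If a copy of the file from a healthy appliance is available, plain `diff` between the two files shows every difference at once.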

Thanks,

MS

Opto22
Contributor

Thank you for the reply! I've looked in the file you mention.

I see a pbm.serverGuid = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx value in there.

The error log, to me, looks like something in Java is failing, but I can't make heads or tails of it.

msripada
Virtuoso

I sent you a PM. Can you try that and share the results?

Thanks,

MS

Opto22
Contributor

I tried what you suggested, and there was no apparent improvement. PBM errors out in pre-check. I can no longer move or migrate any VMs, nor can I even open the 'edit' configuration of any virtual machine. The VMs are running fine, but vCenter can't move them, and neither can I.

As a note: I was able to SSH into the appliance before the GUI was fully up, and just once I was able to start the sps service. But by the time the GUI finished starting, the service had stopped and could not be started again.

JayGrinch
Contributor

Hi,

Have you managed to fix this?

Thanks!

JG

Nagaravi
Contributor

Hi All, 

The vmware-sps service failed to start after I ran "Regenerate a new VMCA Root Certificate and replace all certificates". vCenter is running fine, but the sps service fails with the error below. This is VCSA 6.5 with an external PSC. Any suggestions, please?

2021-02-01T10:37:23.986Z [main] DEBUG opId=sps-Main-175158-315 com.vmware.vim.storage.common.task.retry.SimpleRetryHandler - Retry count = 69
2021-02-01T10:37:33.986Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.external.VpxdConnectionInitializer - Component manager connection initialized successfully for vSphere Profile-Driven Storage Service.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up SSO info. Use cache is true
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up Component Manager cache for service info.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.external.VpxdConnectionInitializer - Solution user initialized successfully for vSphere SPS.
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up vCenter info. Use cache is true
2021-02-01T10:37:33.987Z [main] INFO opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.ComponentManagerService - Looking up Component Manager cache for service info.
2021-02-01T10:37:33.988Z [main] DEBUG opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - vCenter client created successfully.
2021-02-01T10:37:34.019Z [main] ERROR opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - Failed while connecting to vpxd service:
com.vmware.vim.vmomi.client.exception.SslException: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured
at com.vmware.vim.vmomi.client.common.impl.ResponseImpl.setError(ResponseImpl.java:252)
at com.vmware.vim.vmomi.client.http.impl.HttpExchange.run(HttpExchange.java:51)
at com.vmware.vim.vmomi.client.http.impl.HttpProtocolBindingBase.executeRunnable(HttpProtocolBindingBase.java:226)
at com.vmware.vim.vmomi.client.http.impl.HttpProtocolBindingImpl.send(HttpProtocolBindingImpl.java:110)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl$CallExecutor.sendCall(MethodInvocationHandlerImpl.java:613)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl$CallExecutor.executeCall(MethodInvocationHandlerImpl.java:594)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.completeCall(MethodInvocationHandlerImpl.java:345)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.invokeOperation(MethodInvocationHandlerImpl.java:305)
at com.vmware.vim.vmomi.client.common.impl.MethodInvocationHandlerImpl.invoke(MethodInvocationHandlerImpl.java:179)
at com.sun.proxy.$Proxy81.retrieveContent(Unknown Source)
at com.vmware.vim.storage.common.util.VpxdSSOConnection.createVpxdService(VpxdSSOConnection.java:162)
at com.vmware.vim.storage.common.util.VpxdSSOConnection.<init>(VpxdSSOConnection.java:97)
at com.vmware.vim.storage.common.util.VpxdSSOConnectionFactory.initAdminSSOConnection(VpxdSSOConnectionFactory.java:30)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.initVpxdSSOAdminConnection(VpxdConnectionInitializer.java:155)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.call(VpxdConnectionInitializer.java:82)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.call(VpxdConnectionInitializer.java:33)
at com.vmware.vim.storage.common.task.retry.CallableRetryDecorator.call(CallableRetryDecorator.java:43)
at com.vmware.vim.storage.common.external.VpxdConnectionInitializer.initAdminVpxdConnection(VpxdConnectionInitializer.java:54)
at com.vmware.sps.StorageMain.commonInitialization(StorageMain.java:133)
at com.vmware.sps.StorageMain.main(StorageMain.java:35)
Caused by: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured
at com.vmware.vim.vmomi.client.http.impl.ClientExceptionTranslator.translate(ClientExceptionTranslator.java:54)
... 20 more

msripada
Virtuoso

2021-02-01T10:37:33.988Z [main] DEBUG opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - vCenter client created successfully.
2021-02-01T10:37:34.019Z [main] ERROR opId=sps-Main-175158-315 com.vmware.vim.storage.common.util.VpxdSSOConnection - Failed while connecting to vpxd service:
com.vmware.vim.vmomi.client.exception.SslException: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured

 

Based on the above, I can see two issues.

First, the SPS service cannot talk to the vpxd service. The vpxd service might have crashed or be in a stopped state.

Second, there is a certificate issue. You can try running the lsdoctor tool from KB https://kb.vmware.com/s/article/80469

Make sure you take a snapshot of the vCenter before running that tool.
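
The "thumbprint" in the error is a hash of the server certificate. When a thumbprint read from a live endpoint is compared by eye with one recorded elsewhere, formatting differences such as colons or letter case can hide a match. The sketch below normalizes two thumbprint strings before comparing them. The function names are hypothetical, and the openssl command in the comment is a general-purpose way to read a certificate fingerprint, not a step taken from this thread.

```shell
#!/bin/bash
# A fingerprint can be read from a listening HTTPS endpoint with, for example:
#   echo | openssl s_client -connect <host>:443 2>/dev/null \
#        | openssl x509 -noout -fingerprint -sha1
# which prints a line such as "SHA1 Fingerprint=AB:CD:...".

# normalize_thumbprint: hypothetical helper.
# Drops any "label=" prefix, removes non-hex characters, and upper-cases.
normalize_thumbprint() {
    printf '%s' "${1##*=}" | tr -cd '0-9A-Fa-f' | tr 'a-f' 'A-F'
}

# same_thumbprint: succeeds if both arguments are the same thumbprint
# once formatting differences are ignored.
same_thumbprint() {
    [ "$(normalize_thumbprint "$1")" = "$(normalize_thumbprint "$2")" ]
}
```

A mismatch between the certificate an endpoint actually presents and the value stored for it would be consistent with the trust error in sps.log, though this check only compares strings and does not change anything.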

thanks,

MS

Nagaravi
Contributor

Thanks for the response. vpxd is running fine: I am able to connect to vCenter and perform all operations. However, taking a backup of this appliance requires the sps service to be running. I regenerated the VMCA root certificates a week ago to fix a vCenter login issue. Since then everything seems to be working except the sps service, which logs the "Server certificate chain is not trusted and thumbprint verification is not configured" error in sps.log. I have not found any VMware KB articles about this issue.

I ran the lsdoctor tool, and it reported "Recommended Action": "Please run python ls_doctor.py --trustfix option on this node." I did that, but no luck.

Below is the service status:

Running:
applmgmt lwsmd vmafdd vmonapi vmware-cm vmware-content-library vmware-eam vmware-netdumper vmware-perfcharts vmware-rbd-watchdog vmware-rhttpproxy vmware-sca vmware-statsmonitor vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmcam vmware-imagebuilder vmware-mbcs vmware-sps vmware-updatemgr vmware-vcha
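
With a status listing in the Running:/Stopped: layout shown above, a short script can answer whether one particular service is in the stopped group. This is a sketch only: it assumes the listing has been saved to a text file in exactly that layout, and `is_stopped` is a made-up helper name.

```shell
#!/bin/bash
# is_stopped: hypothetical helper.
# Succeeds (exit 0) if the named service appears under "Stopped:" in a saved
# status listing that uses the "Running:" / "Stopped:" layout.
#   $1 = file holding the status text, $2 = service name
is_stopped() {
    awk -v svc="$2" '
        /^[ \t]*Running:/ { section = "running"; next }
        /^[ \t]*Stopped:/ { section = "stopped"; next }
        section == "stopped" {
            # service names are whitespace-separated on the following lines
            for (i = 1; i <= NF; i++) if ($i == svc) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For the listing above, `is_stopped status.txt vmware-sps` would succeed and `is_stopped status.txt vmware-vpxd` would fail, matching the report that vpxd is up while sps is not.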

msripada
Virtuoso

I suspect there are other issues on the VC apart from the certificate issue.

Can you get me the output file psc.txt and also the SSO site name of the VC?

/usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost

/usr/lib/vmidentity/tools/scripts/lstool.py list --url http://localhost:7080/lookupservice/sdk > /tmp/psc.txt

thanks,

MS

Nagaravi
Contributor

https://drive.google.com/file/d/1Foj96-aaOriuIr8Vlo45OFSJud64VnEv/view?usp=sharing  - Please find the psc file at this link.

Below is the output:


root@vprdpsc01 [ /tmp ]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost
PRODVBS3001

msripada
Virtuoso

Hi, I am not seeing any issue with the service, but the thumbprint needs to be correct. If lsdoctor is not working, please contact GSS.

thanks,

MS

Nagaravi
Contributor

Hi, 

I ran lsdoctor with the list option and am not getting any errors or warnings. Shall I try the -solutionusers and --rebuild options of lsdoctor?

The problem is that we don't have vendor support. Below is the output of lsdoctor, which shows no issues.

python lsdoctor.py -l


ATTENTION: You are running a reporting function. This doesn't make any changes to your environment.
You can find the report and logs here: /var/log/vmware/lsdoctor

2021-02-03T14:38:41 INFO main: You are reporting on problems found across the SSO domain in the lookup service. This doesn't make changes.
2021-02-03T14:38:42 INFO live_checkCerts: Checking services for trust mismatches...
2021-02-03T14:38:42 INFO generateReport: Listing lookup service problems found in SSO domain
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprdpsc01.bc.jsplc.net (External PSC).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprodvc01.bc.jsplc.net (Embedded).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprdpsc02.bc.jsplc.net (External PSC).
2021-02-03T14:38:42 INFO generateReport: No issues detected in the lookup service entries for vprodvsa01.bc.jsplc.net (Support Assistant).

Thank you.

Nagaravi
Contributor

Hi, 

Thank you so much for fixing the issue.
