Hi,
We are getting a 503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x00007fe978004950] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe) error when opening the vCenter login page.
Appliance Management is functional and the health status shows all Good.
Overall Health Good (Last checked Jul 21, 2020, 04:26:43 PM)
CPU Good
Memory Good
Database Good
Storage Good
Swap Good
Seems to be the same issue as posted here 503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x000...
service-control --status
Stopped:
pschealth vmcam vmware-certificatemanagement vmware-content-library vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-perfcharts vmware-rbd-watchdog vmware-sca vmware-sps vmware-topologysvc vmware-updatemgr vmware-vapi-endpoint vmware-vcha vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsan-dps
Running:
applmgmt lwsmd vmafdd vmcad vmdird vmdnsd vmonapi vmware-analytics vmware-cis-license vmware-cm vmware-eam vmware-pod vmware-postgres-archiver vmware-rhttpproxy vmware-statsmonitor vmware-sts-idmd vmware-stsd vmware-vmon vmware-vpostgres vsphere-client vsphere-ui
Hi nettech1,
Please check and validate if the STS certificates are valid.
If not replace the STS certificate.
Regards,
Sudeshna Sarkar
Install-Upgrade Specialist
_______________________________________________________________________________________________________
"Did you find this helpful? Let us know by completing this survey (takes 1 minute!)"
Hi nettech1,
As mentioned, you started seeing this issue after the VCSA upgrade.
Which version was the VCSA upgraded from?
Do you have a snapshot?
You are getting that error because the vpxd service is stopped.
Please stop and start all the services and see which service fails first.
You can then check that service's logs for the details of the failure.
Regards,
Sudeshna Sarkar
Install-Upgrade Specialist
Hi,
No, we don't have a snapshot. The upgrade was from 6.7.0.42200.
service-control --start
Operation not cancellable. Please wait for it to finish...
Performing start operation on profile: ALL...
Service-control failed. Error: Failed to start services in profile ALL. RC=1, stderr=Failed to start sca, vapi-endpoint, vpxd-svcs services. Error: Operation timed out
Investigating if the solution is posted here VMware Knowledge Base
Run service-control --start --all.
Once the operation fails, please paste snippets of vpxd-svcs.log, vpxd, and vsphere_client_virgo.log.
Also check df -h for space-related issues.
Cheers!
vmon-cli starts these services in order: eam, cis-license, rhttpproxy, vmonapi, statsmonitor, applmgmt, sca, vsphere-client, etc.
All services before sca are running, so sca is the next to be started.
Look in /var/log/vmware/sca: vmware-sca is the first service in the start order that fails.
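The reasoning above (walk the start order and stop at the first service that is not running) can be sketched in a few lines of Python. This is only an illustration: the start order is the partial list quoted in this post, the running list is copied from the service-control --status output earlier in the thread, and the idea that the short vMon names are the service-control names minus a "vmware-" prefix is my assumption.

```python
# Partial start order as quoted in the post (the original list ends with "etc.").
START_ORDER = ["eam", "cis-license", "rhttpproxy", "vmonapi",
               "statsmonitor", "applmgmt", "sca", "vsphere-client"]

# "Running:" line from the service-control --status output posted earlier.
RUNNING = ("applmgmt lwsmd vmafdd vmcad vmdird vmdnsd vmonapi vmware-analytics "
           "vmware-cis-license vmware-cm vmware-eam vmware-pod "
           "vmware-postgres-archiver vmware-rhttpproxy vmware-statsmonitor "
           "vmware-sts-idmd vmware-stsd vmware-vmon vmware-vpostgres "
           "vsphere-client vsphere-ui").split()


def short_name(service):
    """Drop a leading 'vmware-' (assumed mapping to the short vMon name)."""
    prefix = "vmware-"
    return service[len(prefix):] if service.startswith(prefix) else service


def first_not_running(start_order, running):
    """Return the first service in start_order that is not running, or None."""
    running_short = {short_name(name) for name in running}
    for service in start_order:
        if service not in running_short:
            return service
    return None


print(first_not_running(START_ORDER, RUNNING))  # sca
```

With the status output from this thread, the first gap in the order is sca, which matches the conclusion above.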
df -h
Filesystem                                Size Used Avail Use% Mounted on
devtmpfs                                  2.9G    0  2.9G   0% /dev
tmpfs                                     3.0G 784K  3.0G   1% /dev/shm
tmpfs                                     3.0G 684K  3.0G   1% /run
tmpfs                                     3.0G    0  3.0G   0% /sys/fs/cgroup
/dev/sda3                                  11G 6.3G  3.8G  63% /
tmpfs                                     3.0G 480K  3.0G   1% /tmp
/dev/mapper/seat_vg-seat                  9.8G 323M  8.9G   4% /storage/seat
/dev/mapper/netdump_vg-netdump            985M 1.3M  916M   1% /storage/netdump
/dev/mapper/imagebuilder_vg-imagebuilder  9.8G  23M  9.2G   1% /storage/imagebuilder
/dev/mapper/dblog_vg-dblog                 15G 134M   14G   1% /storage/dblog
/dev/sda1                                 120M  35M   77M  31% /boot
/dev/mapper/core_vg-core                   25G 735M   23G   4% /storage/core
/dev/mapper/updatemgr_vg-updatemgr         99G  95M   94G   1% /storage/updatemgr
/dev/mapper/autodeploy_vg-autodeploy      9.8G  34M  9.2G   1% /storage/autodeploy
/dev/mapper/db_vg-db                      9.8G 324M  8.9G   4% /storage/db
/dev/mapper/log_vg-log                    9.8G 3.3G  6.0G  36% /storage/log
/dev/mapper/archive_vg-archive             50G  47G   13M 100% /storage/archive
sca-gc.log.0.current ?
Java HotSpot(TM) 64-Bit Server VM (25.241-b07) for linux-amd64 JRE (1.8.0_241-b07), built on Dec 11 2019 02:22:16 by "java_re" with gcc 7.3.0
Memory: 4k page, physical 6093152k(826388k free), swap 27254776k(27253960k free)
CommandLine flags: -XX:CompressedClassSpaceSize=67108864 -XX:ErrorFile=/var/log/vmware/sca/sca-error-%p.log -XX:+ForceTimeHighResolution -XX:GCLogFileSize=1048576 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/vmware/sca -XX:InitialHeapSize=33554432 -XX:-LoopUnswitching -XX:MaxHeapSize=67108864 -XX:NumberOfGCLogFiles=10 -XX:ParallelGCThreads=1 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintReferenceGC -XX:ThreadStackSize=256 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseGCLogFileRotation -XX:+UseParallelGC
2020-07-22T11:25:20.679+0000: 0.391: [GC (Allocation Failure) 2020-07-22T11:25:20.681+0000: 0.393: [SoftReference, 0 refs, 0.0000156 secs]2020-07-22T11:25:20.681+0000: 0.394: [WeakReference, 240 refs, 0.0000099 secs]2020-07-22T11:25:20.681+0000: 0.394: [FinalReference, 588 refs, 0.0001553 secs]2020-07-22T11:25:20.681+0000: 0.394: [PhantomReference, 0 refs, 0 refs, 0.0000037 secs]2020-07-22T11:25:20.681+0000: 0.394: [JNI Weak Reference, 0.0000073 secs][PSYoungGen: 8704K->992K(9728K)] 8704K->1923K(31744K), 0.0024891 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2020-07-22T11:25:21.089+0000: 0.802: [GC (Allocation Failure) 2020-07-22T11:25:21.093+0000: 0.805: [SoftReference, 0 refs, 0.0000191 secs]2020-07-22T11:25:21.093+0000: 0.805: [WeakReference, 11 refs, 0.0000047 secs]2020-07-22T11:25:21.093+0000: 0.805: [FinalReference, 0 refs, 0.0000031 secs]2020-07-22T11:25:21.093+0000: 0.805: [PhantomReference, 0 refs, 0 refs, 0.0000029 secs]2020-07-22T11:25:21.093+0000: 0.805: [JNI Weak Reference, 0.0000030 secs][PSYoungGen: 9696K->1008K(9728K)] 10627K->3441K(31744K), 0.0035093 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2020-07-22T11:25:21.334+0000: 1.047: [GC (Allocation Failure) 2020-07-22T11:25:21.337+0000: 1.049: [SoftReference, 0 refs, 0.0000183 secs]2020-07-22T11:25:21.337+0000: 1.049: [WeakReference, 90 refs, 0.0000070 secs]2020-07-22T11:25:21.337+0000: 1.049: [FinalReference, 490 refs, 0.0005694 secs]2020-07-22T11:25:21.337+0000: 1.050: [PhantomReference, 0 refs, 0 refs, 0.0000043 secs]2020-07-22T11:25:21.337+0000: 1.050: [JNI Weak Reference, 0.0000030 secs][PSYoungGen: 9712K->1024K(9728K)] 12145K->5022K(31744K), 0.0036322 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2020-07-22T11:25:21.820+0000: 1.532: [GC (Allocation Failure) 2020-07-22T11:25:21.829+0000: 1.542: [SoftReference, 0 refs, 0.0000224 secs]2020-07-22T11:25:21.829+0000: 1.542: [WeakReference, 13 refs, 0.0000054 secs]2020-07-22T11:25:21.829+0000: 1.542: [FinalReference, 0 refs, 0.0000030 secs]2020-07-22T11:25:21.829+0000: 1.542: [PhantomReference, 0 refs, 0 refs, 0.0000030 secs]2020-07-22T11:25:21.829+0000: 1.542: [JNI Weak Reference, 0.0000029 secs][PSYoungGen: 9728K->1008K(9728K)] 13726K->6093K(31744K), 0.0097723 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2020-07-22T11:25:22.304+0000: 2.016: [GC (Allocation Failure) 2020-07-22T11:25:22.317+0000: 2.029: [SoftReference, 0 refs, 0.0000231 secs]2020-07-22T11:25:22.317+0000: 2.029: [WeakReference, 17 refs, 0.0000052 secs]2020-07-22T11:25:22.317+0000: 2.029: [FinalReference, 0 refs, 0.0000028 secs]2020-07-22T11:25:22.317+0000: 2.029: [PhantomReference, 0 refs, 1 refs, 0.0000029 secs]2020-07-22T11:25:22.317+0000: 2.029: [JNI Weak Reference, 0.0000100 secs][PSYoungGen: 9712K->1024K(9728K)] 14797K->7555K(31744K), 0.0134825 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
2020-07-22T11:25:22.693+0000: 2.405: [GC (Allocation Failure) 2020-07-22T11:25:22.699+0000: 2.411: [SoftReference, 0 refs, 0.0000232 secs]2020-07-22T11:25:22.699+0000: 2.411: [WeakReference, 38 refs, 0.0000067 secs]2020-07-22T11:25:22.699+0000: 2.411: [FinalReference, 337 refs, 0.0003419 secs]2020-07-22T11:25:22.699+0000: 2.411: [PhantomReference, 0 refs, 1 refs, 0.0000049 secs]2020-07-22T11:25:22.699+0000: 2.411: [JNI Weak Reference, 0.0000058 secs][PSYoungGen: 9728K->2937K(17408K)] 16259K->9468K(39424K), 0.0083775 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2020-07-22T11:25:23.036+0000: 2.748: [GC (Allocation Failure) 2020-07-22T11:25:23.048+0000: 2.760: [SoftReference, 0 refs, 0.0000237 secs]2020-07-22T11:25:23.048+0000: 2.760: [WeakReference, 43 refs, 0.0000086 secs]2020-07-22T11:25:23.048+0000: 2.760: [FinalReference, 699 refs, 0.0007088 secs]2020-07-22T11:25:23.048+0000: 2.761: [PhantomReference, 0 refs, 1 refs, 0.0000050 secs]2020-07-22T11:25:23.048+0000: 2.761: [JNI Weak Reference, 0.0000067 secs][PSYoungGen: 17273K->3552K(17920K)] 23804K->11689K(39936K), 0.0133381 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2020-07-22T11:25:23.319+0000: 3.032: [GC (Allocation Failure) 2020-07-22T11:25:23.327+0000: 3.039: [SoftReference, 0 refs, 0.0000370 secs]2020-07-22T11:25:23.327+0000: 3.039: [WeakReference, 45 refs, 0.0000095 secs]2020-07-22T11:25:23.327+0000: 3.039: [FinalReference, 738 refs, 0.0041054 secs]2020-07-22T11:25:23.331+0000: 3.043: [PhantomReference, 0 refs, 1 refs, 0.0000092 secs]2020-07-22T11:25:23.331+0000: 3.043: [JNI Weak Reference, 0.0000067 secs][PSYoungGen: 17888K->3553K(13824K)] 26025K->14022K(35840K), 0.0119752 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2020-07-22T11:25:23.345+0000: 3.057: [GC (Metadata GC Threshold) 2020-07-22T11:25:23.350+0000: 3.062: [SoftReference, 0 refs, 0.0000199 secs]2020-07-22T11:25:23.350+0000: 3.062: [WeakReference, 12 refs, 0.0000051 secs]2020-07-22T11:25:23.350+0000: 3.062: [FinalReference, 113 refs, 0.0000628 secs]2020-07-22T11:25:23.350+0000: 3.062: [PhantomReference, 0 refs, 0 refs, 0.0000037 secs]2020-07-22T11:25:23.350+0000: 3.062: [JNI Weak Reference, 0.0000058 secs][PSYoungGen: 6323K->2564K(15872K)] 16792K->14426K(37888K), 0.0053359 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2020-07-22T11:25:23.350+0000: 3.063: [Full GC (Metadata GC Threshold) 2020-07-22T11:25:23.366+0000: 3.078: [SoftReference, 0 refs, 0.0000227 secs]2020-07-22T11:25:23.366+0000: 3.078: [WeakReference, 186 refs, 0.0000256 secs]2020-07-22T11:25:23.366+0000: 3.078: [FinalReference, 1033 refs, 0.0002079 secs]2020-07-22T11:25:23.366+0000: 3.078: [PhantomReference, 0 refs, 0 refs, 0.0000038 secs]2020-07-22T11:25:23.366+0000: 3.078: [JNI Weak Reference, 0.0000065 secs][PSYoungGen: 2564K->0K(15872K)] [ParOldGen: 11861K->8773K(22016K)] 14426K->8773K(37888K), [Metaspace: 20987K->20987K(86016K)], 0.0635572 secs] [Times: user=0.04 sys=0.00, real=0.06 secs]
2020-07-22T11:25:23.499+0000: 3.212: [GC (Allocation Failure) 2020-07-22T11:25:23.500+0000: 3.213: [SoftReference, 0 refs, 0.0000107 secs]2020-07-22T11:25:23.500+0000: 3.213: [WeakReference, 0 refs, 0.0000035 secs]2020-07-22T11:25:23.500+0000: 3.213: [FinalReference, 380 refs, 0.0002215 secs]2020-07-22T11:25:23.500+0000: 3.213: [PhantomReference, 0 refs, 0 refs, 0.0000036 secs]2020-07-22T11:25:23.500+0000: 3.213: [JNI Weak Reference, 0.0000056 secs][PSYoungGen: 10240K->1769K(15872K)] 19013K->10542K(37888K), 0.0013215 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
Heap
PSYoungGen total 15872K, used 9941K [0x00000000feb00000, 0x0000000100000000, 0x0000000100000000)
eden space 10240K, 79% used [0x00000000feb00000,0x00000000ff2fb318,0x00000000ff500000)
from space 5632K, 31% used [0x00000000ffa80000,0x00000000ffc3a480,0x0000000100000000)
to space 5632K, 0% used [0x00000000ff500000,0x00000000ff500000,0x00000000ffa80000)
ParOldGen total 22016K, used 8773K [0x00000000fc000000, 0x00000000fd580000, 0x00000000feb00000)
object space 22016K, 39% used [0x00000000fc000000,0x00000000fc8915d8,0x00000000fd580000)
Metaspace used 22842K, capacity 23182K, committed 23296K, reserved 86016K
class space used 2696K, capacity 2783K, committed 2816K, reserved 65536K
Check this:
/dev/mapper/archive_vg-archive 50G 47G 13M 100% /storage/archive
Quick question: during the upgrade, have you changed any certificates/DNS hostnames?
Did you run a file based backup?
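If it helps to spot full partitions quickly, here is a rough Python sketch that scans df -h output for filesystems at or above a usage threshold. The function name and the 90% default are my own choices for illustration, and the sample rows are three lines copied from the output posted above.

```python
def full_filesystems(df_output, threshold=90):
    """Return (mount point, use %) pairs at or above threshold from df -h text."""
    flagged = []
    # Skip the header row; each data row has six whitespace-separated fields.
    for line in df_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue  # ignore rows that do not look like df data
        use_percent = int(fields[4].rstrip("%"))
        if use_percent >= threshold:
            flagged.append((fields[5], use_percent))
    return flagged


SAMPLE = """Filesystem Size Used Avail Use% Mounted on
/dev/sda3 11G 6.3G 3.8G 63% /
/dev/mapper/log_vg-log 9.8G 3.3G 6.0G 36% /storage/log
/dev/mapper/archive_vg-archive 50G 47G 13M 100% /storage/archive
"""

print(full_filesystems(SAMPLE))  # [('/storage/archive', 100)]
```

It assumes mount points contain no spaces, so rows with a wrapped or broken path are simply skipped.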
Yes, I did check archive_vg-archive; it does not seem to be a problem.
DNS has not been changed, but a few months ago we configured cert-based authentication and placed a public cert in the authorized_keys file on vCenter.
Speaking of certs, the self-signed cert generated by VCSA expired Tuesday, July 21, 2020 7:07:30 PM.
Used option 8 to regenerate all certs (VMware Knowledge Base).
vCenter now has a new cert; however, the services still fail to start:
2020-07-23T00:59:40.223Z INFO certificate-manager please see service-control.log for service status
2020-07-23T00:59:48.928Z INFO certificate-manager Command executed successfully
2020-07-23T00:59:48.929Z INFO certificate-manager all services stopped successfully.
2020-07-23T00:59:48.929Z INFO certificate-manager None
2020-07-23T00:59:58.938Z INFO certificate-manager Running command :- service-control --start --all
2020-07-23T00:59:58.938Z INFO certificate-manager please see service-control.log for service status
Service-control failed. Error: Failed to start services in profile ALL. RC=1, stderr=Failed to start vpxd-svcs, vapi-endpoint services. Error: Operation timed out
2020-07-23T01:07:49.961Z ERROR certificate-manager None
2020-07-23T01:07:49.964Z ERROR certificate-manager Error while starting services, please see service-control log for more details
2020-07-23T01:07:49.966Z ERROR certificate-manager {
"detail": [
{
"localized": "An error occurred while invoking external command : 'None'",
"args": [
"None"
],
"id": "install.ciscommon.command.errinvoke",
"translatable": "An error occurred while invoking external command : '%(0)s'"
},
"Error while starting services, please see service-control log for more details"
],
"problemId": null,
"resolution": null,
"componentKey": null
}
2020-07-23T01:07:49.968Z ERROR certificate-manager please see /var/log/vmware/vmcad/certificate-manager.log for more information.
2020-07-23T00:59:40.435Z INFO service-control Perform stop operation. vmon_profile=ALL, svc_names=None, include_coreossvcs=True, include_leafossvcs=True
2020-07-23T00:59:40.437Z INFO service-control Performing stop operation on service vmware-pod...
2020-07-23T00:59:40.889Z INFO service-control Successfully stopped service vmware-pod
2020-07-23T00:59:40.890Z INFO service-control Performing stop operation on profile: ALL...
2020-07-23T00:59:44.482Z INFO service-control Successfully stopped service vmware-vmon
2020-07-23T00:59:44.482Z INFO service-control Successfully stopped profile: ALL.
2020-07-23T00:59:44.484Z INFO service-control Performing stop operation on service vmdnsd...
2020-07-23T00:59:44.706Z INFO service-control Successfully stopped service vmdnsd
2020-07-23T00:59:44.708Z INFO service-control Performing stop operation on service vmware-stsd...
2020-07-23T00:59:46.183Z INFO service-control Successfully stopped service vmware-stsd
2020-07-23T00:59:46.184Z INFO service-control Performing stop operation on service vmware-sts-idmd...
2020-07-23T00:59:47.499Z INFO service-control Successfully stopped service vmware-sts-idmd
2020-07-23T00:59:47.499Z INFO service-control Performing stop operation on service vmcad...
2020-07-23T00:59:47.576Z INFO service-control Successfully stopped service vmcad
2020-07-23T00:59:47.576Z INFO service-control Performing stop operation on service vmdird...
2020-07-23T00:59:47.745Z INFO service-control Successfully stopped service vmdird
2020-07-23T00:59:47.745Z INFO service-control Performing stop operation on service vmafdd...
2020-07-23T00:59:47.959Z INFO service-control Successfully stopped service vmafdd
2020-07-23T00:59:47.959Z INFO service-control Performing stop operation on service lwsmd...
2020-07-23T00:59:48.870Z INFO service-control Successfully stopped service lwsmd
2020-07-23T00:59:59.233Z INFO service-control ********** Start ['--start', '--all'] **********
2020-07-23T00:59:59.234Z INFO service-control Perform start operation. vmon_profile=ALL, svc_names=None, include_coreossvcs=True, include_leafossvcs=True
2020-07-23T00:59:59.236Z INFO service-control Performing start operation on service lwsmd...
2020-07-23T00:59:59.589Z INFO service-control Successfully started service lwsmd
2020-07-23T00:59:59.590Z INFO service-control Performing start operation on service vmafdd...
2020-07-23T01:00:01.487Z INFO service-control Successfully started service vmafdd
2020-07-23T01:00:01.488Z INFO service-control Performing start operation on service vmdird...
2020-07-23T01:00:04.297Z INFO service-control Successfully started service vmdird
2020-07-23T01:00:04.298Z INFO service-control Performing start operation on service vmcad...
2020-07-23T01:00:05.816Z INFO service-control Successfully started service vmcad
2020-07-23T01:00:05.817Z INFO service-control Performing start operation on service vmware-sts-idmd...
2020-07-23T01:00:07.492Z INFO service-control Successfully started service vmware-sts-idmd
2020-07-23T01:00:07.493Z INFO service-control Performing start operation on service vmware-stsd...
2020-07-23T01:00:19.521Z INFO service-control Successfully started service vmware-stsd
2020-07-23T01:00:19.521Z INFO service-control Performing start operation on service vmdnsd...
2020-07-23T01:00:19.550Z INFO service-control Successfully started service vmdnsd
2020-07-23T01:00:19.551Z INFO service-control Performing start operation on profile: ALL...
2020-07-23T01:00:21.0Z INFO service-control Successfully started service vmware-vmon
2020-07-23T01:07:49.890Z ERROR service-control Service-control failed. Error: Failed to start services in profile ALL. RC=1, stderr=Failed to start vpxd-svcs, vapi-endpoint services. Error: Operation timed out
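For anyone comparing several attempts, a small Python sketch like the one below pulls the failed service names out of the service-control error line. The regular expression is based only on the two error lines posted in this thread, so treat the pattern as an assumption rather than a documented log format.

```python
import re

# Anchor on "stderr=" so the earlier "Failed to start services in profile ALL"
# phrase in the same line is not matched by mistake.
FAILED_PATTERN = re.compile(r"stderr=Failed to start (.+?) services")


def failed_services(log_line):
    """Return the service names listed in a service-control failure line."""
    match = FAILED_PATTERN.search(log_line)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",")]


FIRST_ATTEMPT = ("Service-control failed. Error: Failed to start services in "
                 "profile ALL. RC=1, stderr=Failed to start sca, vapi-endpoint, "
                 "vpxd-svcs services. Error: Operation timed out")
AFTER_CERT_REGEN = ("Service-control failed. Error: Failed to start services in "
                    "profile ALL. RC=1, stderr=Failed to start vpxd-svcs, "
                    "vapi-endpoint services. Error: Operation timed out")

print(failed_services(FIRST_ATTEMPT))     # ['sca', 'vapi-endpoint', 'vpxd-svcs']
print(failed_services(AFTER_CERT_REGEN))  # ['vpxd-svcs', 'vapi-endpoint']
```

Comparing the two lists shows that sca is no longer reported as failing after the certificates were regenerated, while vpxd-svcs and vapi-endpoint still time out.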
Hi nettech1,
Please check and validate if the STS certificates are valid.
If not replace the STS certificate.
Regards,
Sudeshna Sarkar
Install-Upgrade Specialist
python checksts.py
1 VALID CERTS
================
LEAF CERTS:
None
ROOT CERTS:
[] Certificate 1A:50:D1:7B:A3:C5:E4:5E:1A:E0:66:FA:A6:A5:68:7D:08:7B:0B:B6 will expire in 2915 days (8 years).
1 EXPIRED CERTS
================
LEAF CERTS:
[] Certificate: FE:08:93:FF:47:B3:19:7B:19:A8:7E:17:F5:6B:5C:37:F5:60:04:9A expired on 2020-07-21 10:57:46 GMT!
ROOT CERTS:
None
WARNING!
You have expired STS certificates. Please follow the KB corresponding to your OS:
The replacement STS certificates were generated with an expiration of 2 years. Is there a way to extend them?
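As a side note on reading the checksts.py output, here is a minimal Python sketch that tests whether an expiry timestamp in that format had already passed at a given moment. It is not part of checksts.py: the function names are made up for illustration, and the comparison time is taken from the failed start in the service-control log above.

```python
from datetime import datetime, timezone


def parse_expiry(text):
    """Parse a timestamp such as '2020-07-21 10:57:46 GMT' as UTC."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S GMT")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(expiry_text, now=None):
    """Return True if the expiry time is at or before 'now' (default: current UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_expiry(expiry_text) <= now


# Expiry reported for the STS leaf certificate in the output above.
STS_LEAF_EXPIRY = "2020-07-21 10:57:46 GMT"
# Time of the failed start in the service-control log (2020-07-23T01:07:49Z).
FAILED_START = datetime(2020, 7, 23, 1, 7, 49, tzinfo=timezone.utc)

print(is_expired(STS_LEAF_EXPIRY, FAILED_START))  # True
```

So the STS leaf certificate had already expired when the services timed out after the certificate regeneration.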
Hi nettech1,
You have to regenerate the STS certs in order to extend the expiration.
Sudeshna Sarkar
Install-Upgrade Specialist
Sorry, I meant to extend the STS certs for a term longer than 2 years.
In my case (VCSA 6.5 U2), after replacing the expired STS certificate and the SSL certificate, the service still cannot start. Can you help me?