VMware Cloud Community
tmichaeli
VMware Employee

Failed to start vmonapi service - VCSA 6.5 - all builds

Gang,

I've been trying to resolve this issue for a long time. My VCSA comes up with the "vmonapi" service down, always after the first lab boot (when all service VMs are starting up: PSC, DNS, NSXM, VRLI, NSXC, etc.).

The issue can be fixed by rebooting the VCSA appliance a second time, which is annoying. Here are the logs, pictures and debug info. Any ideas are welcome!

How can I ensure that vmonapi starts during a boot storm? Any ideas on how to troubleshoot this? Any recommendations for resource pools for the VCSA?

Error from GUI Home->System Config->Summary:

HTTP response with status code 503 (enable debug logging for details): <HTML><BODY><H1>503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http16LocalServiceSpecE:0x00007fdba4003a70] _serverNamespace = /vmonapi action = Allow _port = 8900)</H1></BODY></HTML>

root@vcsa-01a [ /var/log/vmware/vmon ]# service-control --status

Running:

applmgmt lwsmd vmafdd vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sca vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui

Stopped:

vmcam vmonapi vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-vcha

root@vcsa-01a [ /var/log/vmware/vmon ]# vi vmon-syslog.log

17-12-18T11:28:15.825616+01:00 notice vmon  Executing op START on service statsmonitor...

17-12-18T11:28:15.825715+01:00 notice vmon  Constructed command: /usr/lib/vmware-statsmonitor/statsMonitor.sh /etc/vmware/statsmonitor/statsMonitor.xml

17-12-18T11:28:16.814873+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/vmware-rhttpproxy/rhttpproxy-vmon-apihealth.py

17-12-18T11:28:16.815353+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmonApiHealthCmd.py -n vmware-statsmonitor -f /var/vmware/applmgmt/statsmonitor_health.xml

17-12-18T11:28:18.274459+01:00 warning vmon  Service vmonapi pre-start command's stderr: urlopen() failed! Trying force_refresh=True...

17-12-18T11:28:18.274777+01:00 warning vmon

17-12-18T11:28:18.308797+01:00 warning vmon  Service vmonapi pre-start command's stderr: Failed to start vmonapi service. Exception : <urlopen error [Errno 111] Connection refused>

17-12-18T11:28:18.320921+01:00 err vmon  Service vmonapi pre-start command failed with exit code 255.

17-12-18T11:28:19.256395+01:00 warning vmon  Service rhttpproxy api-health command's stderr: Health URL: https://localhost:443/rhttpproxyhealth

17-12-18T11:28:19.256527+01:00 warning vmon  <urlopen error [Errno 111] Connection refused>

17-12-18T11:28:19.256641+01:00 notice vmon  Re-check service rhttpproxy health since it is still initializing.

17-12-18T11:28:19.274526+01:00 notice vmon  Service applmgmt pre-start command completed successfully.

17-12-18T11:28:19.274696+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/applmgmt/base/bin/vherdrunner /usr/lib/applmgmt/transport/bin/multiserve --config /etc/applmgmt/applmgmt.conf

17-12-18T11:28:20.245715+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/vmware-rhttpproxy/rhttpproxy-vmon-apihealth.py

17-12-18T11:28:20.393563+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/applmgmt/applmgmt_vmonhealth.py

17-12-18T11:28:20.393805+01:00 warning vmon  Service rhttpproxy api-health command's stderr: Health URL: https://localhost:443/rhttpproxyhealth

17-12-18T11:28:20.393942+01:00 warning vmon  <urlopen error [Errno 111] Connection refused>

17-12-18T11:28:20.394053+01:00 notice vmon  Re-check service rhttpproxy health since it is still initializing.

17-12-18T11:28:20.825672+01:00 warning vmon  Service applmgmt api-health command's stderr: error(111, 'Connection refused')

17-12-18T11:28:20.832896+01:00 notice vmon  Re-check service applmgmt health since it is still initializing.

17-12-18T11:28:21.322502+01:00 notice vmon  Constructed command: /usr/bin/python /usr/lib/vmware-rhttpproxy/rhttpproxy-vmon-apihealth.py
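For anyone reading a longer vmon-syslog.log, here is a minimal Python sketch that groups the warning and err lines by service name, so a failing pre-start command such as the vmonapi one above stands out. The line layout (timestamp, level, "vmon", message) is inferred only from the excerpt in this post, and the function name `problems_by_service` is made up for illustration.

```python
import re
from collections import defaultdict

# Sample lines copied from the vmon-syslog.log excerpt in this post.
SAMPLE = """\
17-12-18T11:28:18.274459+01:00 warning vmon  Service vmonapi pre-start command's stderr: urlopen() failed! Trying force_refresh=True...
17-12-18T11:28:18.320921+01:00 err vmon  Service vmonapi pre-start command failed with exit code 255.
17-12-18T11:28:19.256641+01:00 notice vmon  Re-check service rhttpproxy health since it is still initializing.
17-12-18T11:28:19.274526+01:00 notice vmon  Service applmgmt pre-start command completed successfully.
"""

# Assumed layout: <timestamp> <level> vmon <message>
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>notice|warning|err)\s+vmon\s*(?P<msg>.*)$")
SERVICE_RE = re.compile(r"[Ss]ervice (\S+)")


def problems_by_service(text, levels=("warning", "err")):
    """Group log entries of the given levels by the service named in the message."""
    found = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("level") not in levels:
            continue
        service = SERVICE_RE.search(match.group("msg"))
        # Continuation lines (e.g. a bare urlopen error) carry no service name.
        name = service.group(1) if service else "(unknown)"
        found[name].append((match.group("ts"), match.group("level"), match.group("msg")))
    return dict(found)


if __name__ == "__main__":
    for name, entries in problems_by_service(SAMPLE).items():
        print(name, len(entries))
```

To run it against the real file, pass the file contents instead of `SAMPLE`, for example `problems_by_service(open("/var/log/vmware/vmon/vmon-syslog.log").read())`.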

tmichaeli
VMware Employee

Just an addition: I'm running the PSC and VCSA on different VMs.

msripada
Virtuoso

Can you check whether a proxy is configured?

/etc/sysconfig/proxy

It looks like the connection was refused because a dependency service could not talk properly to vmonapi while starting up.

Also check the component manager logs from the time the start fails, to see whether there is any connection refused error and whether the component manager started properly.

Also, check whether any NSX firewall rules between the PSC and vCenter need an exception to allow communication; blocked traffic could be leading to a timeout.

Thanks,

MS

Goomi
Contributor

I have this issue as well.  vmonapi won't start on the vCenter, but is running fine on the external PSC.  Did you resolve this?

tmichaeli
VMware Employee

No, I did not. As I said, the first reboot after power-on (or a services restart) on the VCSA always fixes the issue. Proxy and firewall are irrelevant:

  • There is no Firewall (clean port group) between vcsa and psc
  • Proxy is not set, all have direct internet access

Is it something related to process startup timing and timeouts? Maybe DNS lookups? However, they seem to be correct, as shown at the end of my post.
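If it really is a timing problem, one way to test the idea is to wait until the local endpoints accept TCP connections before retrying the service start. Below is a minimal Python sketch of such a wait loop. Which endpoint the vmonapi pre-start command actually calls is not visible in the log, so the ports to probe are an assumption: 443 comes from the rhttpproxy health URL in the log and 8900 from the 503 error in the first post. The function name `wait_for_port` is made up for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connect to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A refused connection raises ConnectionRefusedError (an OSError).
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 443, timeout=300)` blocks until rhttpproxy is listening (or five minutes pass). Only after that would it make sense to retry the start, e.g. with `service-control --start vmonapi`, assuming that command behaves on this build the way it usually does.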

Here is the service status after VCSA power-on and after reboot.

 

Power-on

Running:

applmgmt lwsmd vmafdd vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui

Stopped:

vmcam vmonapi vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-sca vmware-vcha

Reboot

Running:

applmgmt lwsmd vmafdd vmonapi vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sca vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui

Stopped:

vmcam vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-vcha
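The difference between the two listings is easy to miss by eye. Here is a small Python sketch that parses both `service-control --status` outputs from this post and prints which services are running only after the reboot. The helper name `parse_status` is made up for illustration; the service lists are copied verbatim from above.

```python
POWER_ON = """
Running:
applmgmt lwsmd vmafdd vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmcam vmonapi vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-sca vmware-vcha
"""

REBOOT = """
Running:
applmgmt lwsmd vmafdd vmonapi vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sca vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmcam vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-vcha
"""


def parse_status(text):
    """Turn the status text into {'Running': set(...), 'Stopped': set(...)}."""
    state = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Section header such as "Running:" or "Stopped:".
            current = line[:-1]
            state[current] = set()
        elif current is not None:
            state[current].update(line.split())
    return state


before = parse_status(POWER_ON)
after = parse_status(REBOOT)

# Services that only come up after the second boot.
only_after_reboot = sorted(after["Running"] - before["Running"])
print(only_after_reboot)  # ['vmonapi', 'vmware-sca']
```

So in this capture vmware-sca is stopped after power-on as well, not just vmonapi.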

After Power-On:

Netstat for established SOCKETS

root@vcsa-01a [ /etc/sysconfig ]# netstat -anv | grep ESTABLISHED | grep -v 127.0.0.1

netstat: no support for `AF IPX' on this system.

netstat: no support for `AF AX25' on this system.

netstat: no support for `AF X25' on this system.

netstat: no support for `AF NETROM' on this system.

tcp        0      0 192.168.0.110:45960     192.168.0.102:443       ESTABLISHED

tcp        0      0 192.168.0.110:35083     192.168.0.111:2014      ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:40272     ESTABLISHED

tcp        0      0 192.168.0.110:60876     192.168.0.229:443       ESTABLISHED

tcp        0      0 192.168.0.110:51126     192.168.0.101:443       ESTABLISHED

tcp        0      0 192.168.0.110:35398     192.168.0.112:514       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:40046     ESTABLISHED

tcp        0      0 192.168.0.110:45380     192.168.0.102:443       ESTABLISHED

tcp        0      0 192.168.0.110:39810     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:33846     192.168.0.111:389       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:39810     ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.110:50642     ESTABLISHED

tcp        0      0 192.168.0.110:47518     192.168.0.111:443       ESTABLISHED

tcp        0      0 192.168.0.110:46004     192.168.0.111:443       ESTABLISHED

tcp        0      0 192.168.0.110:50174     192.168.0.101:443       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:54820     ESTABLISHED

tcp        0      0 192.168.0.110:54106     192.168.0.202:443       ESTABLISHED

tcp        0      0 192.168.0.110:22        192.168.0.250:52460     ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.113:42740     ESTABLISHED

tcp        0      0 192.168.0.110:54820     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.250:49651     ESTABLISHED

tcp        0      0 192.168.0.110:53750     192.168.0.202:443       ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.113:37084     ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.137:35626     ESTABLISHED

tcp        0      0 192.168.0.110:60852     192.168.0.229:443       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:39056     ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.110:34890     ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.110:35016     ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.110:50770     ESTABLISHED

tcp        0      0 192.168.0.110:40342     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:443       192.168.0.113:42874     ESTABLISHED

tcp        0      0 192.168.0.110:40046     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:34798     192.168.0.111:389       ESTABLISHED

tcp        0      0 192.168.0.110:38584     192.168.0.201:443       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:40498     ESTABLISHED

tcp        0      0 192.168.0.110:40272     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:39056     192.168.0.110:80        ESTABLISHED

tcp        0      0 192.168.0.110:39866     192.168.0.201:443       ESTABLISHED

tcp        0      0 192.168.0.110:80        192.168.0.110:40342     ESTABLISHED

tcp6       0      0 192.168.0.110:40498     192.168.0.110:80        ESTABLISHED

tcp6       0      0 192.168.0.110:50642     192.168.0.110:443       ESTABLISHED

tcp6       0      0 192.168.0.110:34890     192.168.0.110:443       ESTABLISHED

tcp6       0      0 192.168.0.110:50770     192.168.0.110:443       ESTABLISHED

tcp6       0      0 192.168.0.110:35016     192.168.0.110:443       ESTABLISHED
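To back up the "no firewall in the way" point, the netstat output can be reduced to the established connections towards one remote host. The following Python sketch counts them per remote port. The sample lines are the ones from the listing above whose remote address is 192.168.0.111 (the PSC, per the DNS output at the end of this post), plus one local connection for contrast; the function name `connections_to` is made up for illustration.

```python
from collections import Counter

# Lines copied from the netstat output in this post.
SAMPLE = """\
netstat: no support for `AF IPX' on this system.
tcp        0      0 192.168.0.110:35083     192.168.0.111:2014      ESTABLISHED
tcp        0      0 192.168.0.110:33846     192.168.0.111:389       ESTABLISHED
tcp        0      0 192.168.0.110:47518     192.168.0.111:443       ESTABLISHED
tcp        0      0 192.168.0.110:46004     192.168.0.111:443       ESTABLISHED
tcp        0      0 192.168.0.110:34798     192.168.0.111:389       ESTABLISHED
tcp        0      0 192.168.0.110:80        192.168.0.110:40272     ESTABLISHED
"""


def connections_to(text, remote_ip):
    """Count ESTABLISHED connections per remote port for one remote IP."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state.
        if len(fields) != 6 or fields[5] != "ESTABLISHED":
            continue
        ip, _, port = fields[4].rpartition(":")
        if ip == remote_ip:
            counts[int(port)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(connections_to(SAMPLE, "192.168.0.111"))
```

Established sessions to the PSC on 389 and 443 only show that those ports are reachable at the time of the capture; they say nothing about what was reachable during the boot storm itself.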

Proxy settings:

root@vcsa-01a [ /etc/sysconfig ]# cat proxy

# Enable a generation of the proxy settings to the profile.

# This setting allows to turn the proxy on and off while

# preserving the particular proxy setup.

#

PROXY_ENABLED="no"

# Some programs (e.g. wget) support proxies, if set in

# the environment.

# Example: HTTP_PROXY="http://proxy.provider.de:3128/"

HTTP_PROXY=""

# Example: HTTPS_PROXY="https://proxy.provider.de:3128/"

HTTPS_PROXY=""

# Example: FTP_PROXY="http://proxy.provider.de:3128/"

FTP_PROXY=""

# Example: GOPHER_PROXY="http://proxy.provider.de:3128/"

GOPHER_PROXY=""

# Example: SOCKS_PROXY="socks://proxy.example.com:8080"

SOCKS_PROXY=""

# Example: SOCKS5_SERVER="office-proxy.example.com:8881"

SOCKS5_SERVER=""

# Example: NO_PROXY="www.me.de, do.main, localhost"

NO_PROXY="localhost, 127.0.0.1"

root@vcsa-01a [ /etc/sysconfig ]# dig -x 192.168.0.110

; <<>> DiG 9.10.4-P8 <<>> -x 192.168.0.110

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60896

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:

; EDNS: version: 0, flags:; udp: 4096

;; QUESTION SECTION:

;110.0.168.192.in-addr.arpa. IN PTR

;; ANSWER SECTION:

110.0.168.192.in-addr.arpa. 86380 IN PTR vcsa-01a.corp.local.

;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Tue Mar 06 10:21:56 CET 2018

;; MSG SIZE  rcvd: 88

root@vcsa-01a [ /etc/sysconfig ]# dig -x 192.168.0.111

; <<>> DiG 9.10.4-P8 <<>> -x 192.168.0.111

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59059

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:

; EDNS: version: 0, flags:; udp: 4096

;; QUESTION SECTION:

;111.0.168.192.in-addr.arpa. IN PTR

;; ANSWER SECTION:

111.0.168.192.in-addr.arpa. 84202 IN PTR psc-01a.corp.local.

;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Tue Mar 06 10:22:09 CET 2018

;; MSG SIZE  rcvd: 87

root@vcsa-01a [ /etc/sysconfig ]# dig vcsa-01a.corp.local

; <<>> DiG 9.10.4-P8 <<>> vcsa-01a.corp.local

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6654

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:

;vcsa-01a.corp.local. IN A

;; ANSWER SECTION:

vcsa-01a.corp.local. 86400 IN A 192.168.0.110

;; Query time: 1 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Tue Mar 06 10:22:55 CET 2018

;; MSG SIZE  rcvd: 53

root@vcsa-01a [ /etc/sysconfig ]# dig psc-01a.corp.local

; <<>> DiG 9.10.4-P8 <<>> psc-01a.corp.local

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50231

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:

; EDNS: version: 0, flags:; udp: 4096

;; QUESTION SECTION:

;psc-01a.corp.local. IN A

;; ANSWER SECTION:

psc-01a.corp.local. 83962 IN A 192.168.0.111

;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Tue Mar 06 10:23:02 CET 2018

;; MSG SIZE  rcvd: 63
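The four dig queries above amount to a forward and reverse consistency check. For repeating that check during the boot itself, when DNS may still be starting, here is a minimal Python sketch of the same idea. The function name `check_forward_reverse` is made up for illustration; by default it uses the standard library resolver calls `socket.gethostbyname` and `socket.gethostbyaddr`, and both can be swapped out for testing.

```python
import socket


def check_forward_reverse(hostname, expected_ip,
                          resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that hostname resolves to expected_ip and that the PTR points back."""
    ip = resolve(hostname)
    ptr_name = reverse(ip)[0]
    return {
        "ip": ip,
        "ptr": ptr_name,
        "forward_ok": ip == expected_ip,
        # Compare case-insensitively and ignore a trailing root dot.
        "reverse_ok": ptr_name.rstrip(".").lower() == hostname.rstrip(".").lower(),
    }
```

On the appliance this could be called as `check_forward_reverse("vcsa-01a.corp.local", "192.168.0.110")` and `check_forward_reverse("psc-01a.corp.local", "192.168.0.111")`, using the names and addresses from the dig output. A lookup failure raises `socket.gaierror` or `socket.herror`, which is itself a useful signal if it only happens right after power-on.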

tmichaeli
VMware Employee

Log file from vmon after successful service restart.

It would be nice to know the service dependencies and the best way to troubleshoot VCSA startup.
