storage_god
Contributor

VRAC 7.0 installation fails at "Verify that all services are started"

Hello,


I am running into an issue during a brand-new installation of VRAC 7.0. The error below, returned by https://vracFQDN/vcac/services/api/status, also appears in the catalina.out log. Any help would be appreciated.


Thank you!


vrac7.JPG

<serviceRegistryStatus>
  <errorMessage>
    catalog-service service error: 503 Service Unavailable
  </errorMessage>
  <initialized>false</initialized>
  <serviceInitializationStatus>UNAVAILABLE</serviceInitializationStatus>
  <serviceName>shell-ui-app</serviceName>
  <solutionUser>cafe-1F7kY6mCRw</solutionUser>
  <startedTime>2016-01-04T12:35:54.653-08:00</startedTime>
  <serviceRegistrationId>40ea0a62-f91a-4416-a3b2-3262ee447763</serviceRegistrationId>
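For anyone comparing status output like the excerpt above, here is a minimal sketch that prints one "name: status" line per service. The function name summarize_status is hypothetical, and it assumes each entry lists serviceInitializationStatus before serviceName with no attributes or namespace prefixes on the tags, as in the excerpt; real output may differ.

```shell
# summarize_status: read registry status XML on stdin and print
# "serviceName: serviceInitializationStatus" for each entry.
# Assumption: the status element precedes the name element, as in the excerpt.
summarize_status() {
    # Put every tag on its own line, then split each line at '>'.
    tr '<' '\n' | awk -F'>' '
        $1 == "serviceInitializationStatus" { status = $2 }
        $1 == "serviceName" { print $2 ": " status; status = "" }
    '
}
```

Possible usage, assuming curl is available and the appliance certificate is self-signed (hence -k): curl -k -s https://vracFQDN/vcac/services/api/status | summarize_status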

9 Replies
willrodbard
VMware Employee

Hi storage_god

I suffered the same annoying issue.

Can I check: are you building this in an enterprise environment or in your home lab? I ask because I was trying this in my home lab, where resources are scarce. My vRA appliance only had 2 vCPUs and 12 GB of RAM, so I think the action was initially timing out.

After a good number of retries I re-provisioned everything. (I had been trying to use SQL Express, which was also causing SQL connection issues during setup.)

The second time around I increased the RAM for both my vRA appliance and the IaaS server. The first time I ran through everything it timed out again, so I did some digging. I didn't get the same error as you in catalina.out, but I did a manual check of the services from a PuTTY session to the appliance, which also took a lot longer than I would have expected.

I noticed that the Likewise service manager (lwsmd) had failed and that the system had made a number of failed attempts to start it, which I guess is why the verify services task was timing out. Strangely, though, after I ran the manual service --status-all check, the verify task ran through OK.

So I can't say for sure that what I did will fix your issue, but I thought I would list it just in case it helps.

ZahariIvanov
Contributor

The first two steps of the installation configure the appliance. After the "Configure Postgres" step all vRA services start, but in some cases not all of them manage to start within the default health check time of 30 minutes. The health check verifies that all services are up and running. If any service is not, the health check fails, and this repeats until the maximum number of retries is reached. The number of retries and the time between retries can be configured in /etc/vcac/vcac-config.properties.
However, we will try to add a check for vRA resources, since insufficient resources may be one of the reasons for the described failure. It seems that the same issue was hit in https://vmware-com.socialcast.com/messages/28824096.
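To illustrate the retry behavior described above, here is a minimal sketch of a check that is repeated until it passes or a retry limit is reached. This is not the installer's actual code. The function name retry_check and its arguments are hypothetical and do not correspond to keys in vcac-config.properties, which this thread does not name.

```shell
# retry_check MAX_RETRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_RETRIES attempts have been made.
# Returns 0 on the first success, 1 if every attempt failed.
retry_check() (
    max_retries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_retries" ]; do
        if "$@"; then
            echo "check passed on attempt $attempt"
            exit 0
        fi
        echo "check failed (attempt $attempt of $max_retries)" >&2
        # Wait between attempts, but not after the final one.
        if [ "$attempt" -lt "$max_retries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    exit 1
)
```

With a small retry count or a short delay, a slow appliance can exhaust every attempt before its services finish starting, which matches the timeouts reported in this thread.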

Regards,
Zahari

GrantOrchardVMw
Commander

Hey Zahari,

Not everyone will have access to our Socialcast instance :)

Suffice it to say that this has been encountered internally and we're looking at a fix.

Grant

Grant http://grantorchard.com
jeffdelapena
Contributor

I have been experiencing this for days! Any luck getting an answer or a solution?

carlsonc
VMware Employee

Another cause is that the IPs assigned to the vRA VA exceed the varchar length of the database column, causing vRO initialization to fail (this can happen with multiple IPv6 entries). You have this issue if the following apply:

  • The vRealize Automation /var/log/vmware/vcac/catalina.out log shows an error similar to:  Error Message I/O error on GET request for "https://<FQDN>:443/vco/api/status": Read timed out; nested exception is java.net.SocketTimeoutException: Read timed out.
  • The vRealize Automation /var/log/vmware/vco/integration-server.log shows an error similar to:  [ApplicaitonEventHandler-1] ERROR {} [SqlExceptionHelper] Batch entry 0 insert into vmo_clustermember (contentversion, hostaddress, hostname, lastupdated, state, id) values (0, , '<LIST OF IPv6 AND IPv4 ADDRESSES>', 'localhost', 1464817655348, 1, '67e8c8ca-0150-4988-9d29-b7f844e04222') was aborted. Call getNextException to see the cause.
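The two log checks above could be scripted roughly as follows. This is only a sketch: the function name check_vro_ip_symptoms is hypothetical, and the patterns assume each message sits on a single log line, as quoted above.

```shell
# check_vro_ip_symptoms CATALINA_LOG INTEGRATION_LOG
# Reports which of the two symptoms are present; returns 0 only if both are.
check_vro_ip_symptoms() (
    catalina_log=$1
    vro_log=$2
    found=0
    # Symptom 1: the vRO status request times out.
    if grep -q 'vco/api/status.*Read timed out' "$catalina_log" 2>/dev/null; then
        echo "catalina.out: vRO status request timed out"
        found=$((found + 1))
    fi
    # Symptom 2: the cluster member insert is aborted.
    if grep -q 'insert into vmo_clustermember.*was aborted' "$vro_log" 2>/dev/null; then
        echo "integration-server.log: vmo_clustermember insert was aborted"
        found=$((found + 1))
    fi
    [ "$found" -eq 2 ]
)
```

Possible usage on the appliance: check_vro_ip_symptoms /var/log/vmware/vcac/catalina.out /var/log/vmware/vco/integration-server.log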

To work around this (in the case of DHCP):

  • Edit the /etc/sysctl.conf and add the following to disable IPv6:

#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

  • Run 'sysctl -p' to reload the sysctl settings (ifconfig should then show no IPv6 addresses).
  • Run 'service vco-server restart' to restart vRO.
  • Click "Retry Failed" in the installation wizard to proceed.
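The workaround steps above could be collected into a script along these lines. Treat it as a sketch rather than a supported procedure. The function names are hypothetical, the backup copy is an added precaution that the steps do not mention, and it must be run as root on the appliance.

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# count_inet6: count IPv6 address lines in ifconfig output read from stdin.
count_inet6() {
    grep -c 'inet6' || true
}

# apply_ipv6_workaround [SYSCTL_CONF]: disable IPv6, reload, restart vRO.
apply_ipv6_workaround() (
    conf=${1:-/etc/sysctl.conf}
    cp -p "$conf" "$conf.bak" &&          # added precaution, not in the steps
    ensure_line "$conf" 'net.ipv6.conf.all.disable_ipv6 = 1' &&
    ensure_line "$conf" 'net.ipv6.conf.default.disable_ipv6 = 1' &&
    ensure_line "$conf" 'net.ipv6.conf.lo.disable_ipv6 = 1' &&
    sysctl -p "$conf" &&
    service vco-server restart
)
```

After running apply_ipv6_workaround, the output of ifconfig | count_inet6 should be 0 if IPv6 is really off. Because ensure_line only appends missing lines, the function can be rerun without duplicating entries.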

Hope that helps!

Regards,
Cody Carlson
Senior Solution Architect
Global Support Services, VMware Inc.
vvenu
Enthusiast

The lines below for disabling IPv6 already exist in the sysctl.conf file on vRA 7.2.0:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Can someone please share a workaround for this issue?

brandon364
Contributor

I changed the /etc/hosts file on the vRA appliance as shown below.

Original
# VAMI_EDIT_BEGIN
# Generated by Studio VAMI service. Do not modify manually.
127.0.0.1 localhost
192.168.1.17 myhostname.this.domain.com myhostname.this.domain.com
# VAMI_EDIT_END
127.0.0.1 localhost.localdom
127.0.0.1 myhostname.this.domain.com load-balancer-host

New
# VAMI_EDIT_BEGIN
# Generated by Studio VAMI service. Do not modify manually.
#127.0.0.1 localhost
192.168.1.17 myhostname.this.domain.com myhostname.this.domain.com
# VAMI_EDIT_END
#127.0.0.1 localhost.localdom
192.168.1.17 myhostname.this.domain.com load-balancer-host

After making this change and restarting from my snapshot, the installation passed the service check and continued with the next steps.
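To spot the kind of entry that was changed above, a hosts file can be scanned for active lines that map the appliance name to a loopback address. This is a minimal sketch. The function name loopback_entries is hypothetical, and it assumes the usual "address name [aliases...]" hosts format.

```shell
# loopback_entries HOSTS_FILE HOSTNAME
# Print uncommented lines that map HOSTNAME to a 127.x.x.x address.
loopback_entries() {
    awk -v host="$2" '
        /^[ \t]*#/ { next }                # skip comment lines
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # ignore trailing comments
                if ($i == host) { print; break }
            }
        }
    ' "$1"
}
```

Run against the original file above with myhostname.this.domain.com, this prints the last line (the load-balancer-host entry). Run against the new file, it prints nothing.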

kumar6384
Enthusiast

Hi,

If the installation fails at the "Verify that all services are started" step, it means that some service is not up and running. The health check fails, and this repeats until the maximum number of retries is reached. The number of retries, as well as the time between health check retries, can be changed.


To change the number of retries and the time between every health check retry, from the VA console, edit the /etc/vcac/vcac-config.properties file.

# vi /etc/vcac/vcac-config.properties

The default values could be the reason for the failure and may need to be increased.


Set a bigger value and save the file. Go back to the vRA web console and click Install to proceed with the installation again.


With the larger values in place, the installation should be able to complete successfully.
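Before editing the properties file as described above, it may be worth taking a timestamped copy so the change can be reviewed or reverted. This is a minimal sketch. The function name backup_file is hypothetical, and the backup step is a suggested precaution, not part of the steps above.

```shell
# backup_file PATH: copy PATH to PATH.bak.<timestamp> and print the backup path.
backup_file() (
    backup="$1.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$1" "$backup" && printf '%s\n' "$backup"
)
```

Possible usage: run orig=$(backup_file /etc/vcac/vcac-config.properties), edit the file with vi, then run diff "$orig" /etc/vcac/vcac-config.properties to see exactly which values changed.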

daphnissov
Immortal

If your vRA install is failing at that step (which is only the second step out of many), then editing that configuration file is not the solution. Instead, you have a problem with either the inputs you're supplying in the wizard, or other environmental issues pertaining to the cafe appliance or shared infrastructure services. These problems need to be identified and fixed before editing any config files.
