jimmyvandermast
Hot Shot

vRA 8.3 easy installer fails to reach vRSLCM

We have a dev-lab and a test-lab.

A few months ago I installed vRA 8.2 in our test-lab without any issues.
Now I have removed the VMs from vCenter so I can start a fresh deployment of vRA 8.3.

Before this, I used the same vRA 8.3 ISO file to install in our dev-lab without any issues.

Now when I try to install it in our test-lab, it fails:

2021-02-17T11:30:27.378Z - info: copying binaries to lcm va: 192.168.x.x
2021-02-17T11:31:09.347Z - debug: Updated current session with lstActTm Wed Feb 17 2021 12:28:10 GMT+0100 (W. Europe Standard Time)
2021-02-17T11:36:09.354Z - debug: Updated current session with lstActTm Wed Feb 17 2021 12:28:10 GMT+0100 (W. Europe Standard Time)
2021-02-17T11:41:09.364Z - debug: Updated current session with lstActTm Wed Feb 17 2021 12:28:10 GMT+0100 (W. Europe Standard Time)
2021-02-17T11:46:09.370Z - debug: Updated current session with lstActTm Wed Feb 17 2021 12:28:10 GMT+0100 (W. Europe Standard Time)
2021-02-17T11:48:02.814Z - error: vRSLCM services bootstrap failed.
2021-02-17T11:48:02.816Z - error: error has occurred vRSLCM services bootstrap failed. Verify vRSLCM appliance is reachable on given IP and provided network settings are correct
2021-02-17T11:51:09.386Z - debug: Updated current session with lstActTm Wed Feb 17 2021 12:28:10 GMT+0100 (W. Europe Standard Time)
2021-02-17T11:55:07.374Z - info: Exiting from the app
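
To separate the relevant lines from the repeated session debug entries, a small filter along these lines may help. This is only a sketch: it assumes the `timestamp - level: message` layout seen in the excerpt above, and `important_lines` is a hypothetical helper name, not part of the installer.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed line layout, taken from the excerpt above, e.g.:
# 2021-02-17T11:48:02.814Z - error: vRSLCM services bootstrap failed.
LINE_RE = re.compile(r"^(\S+Z) - (\w+): (.*)$")


def important_lines(
    lines: Iterable[str], levels: Tuple[str, ...] = ("error", "warn")
) -> Iterator[Tuple[str, str, str]]:
    """Yield (timestamp, level, message) for lines logged at the given levels."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) in levels:
            yield match.groups()
```

It could be used as `for ts, level, msg in important_lines(open(path)): print(ts, level, msg)`, where `path` is wherever the installer log was saved on the workstation.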

This is really weird. While the easy installer is running, I can ping the newly deployed LCM and SSH to it
(from exactly the same workstation that the easy installer is running on).

Previous 8.2 had no problems here.

I have tried 2 times, with the same result.

 

EDIT: Third attempt.

This time I skipped vIDM, which also skips the vRA appliance.
However, the easy installer still attempts the "Moving binaries" step.
Within a short time, the LCM boots up with its fine blue startup screen saying that I can access it at https://<ipaddr>/vrlcm

I can then again ping it and SSH to it.

However, when I go to the above vrlcm URL, it connects over HTTPS but then the LCM page keeps saying "Waiting for services to start".
I cannot find any logging or any VMware KB to help.

 

In the easyinstaller log, I do see:

2021-02-17T15:36:44.874Z - warn: DNS is not resolved to IPv6: Error: queryAaaa ENODATA <hostname.fqdn>

But that looks like just a warning, and we're not using IPv6 here.

A few lines earlier, it does resolve the FQDN to the correct IPv4 address.
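
The two lookups the installer reports can be reproduced from the workstation with a short check. This is a minimal sketch, assuming Python 3 is available there; `resolve` and `dns_report` are hypothetical helper names, and `socket.getaddrinfo` goes through the system resolver (including the hosts file), so it is not an exact replica of the installer's own query.

```python
import socket
from typing import Dict, List


def resolve(hostname: str, family: int) -> List[str]:
    """Return the sorted unique addresses for hostname in one address family."""
    try:
        infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # No record of this family, or the name does not resolve at all.
        return []
    return sorted({info[4][0] for info in infos})


def dns_report(hostname: str) -> Dict[str, List[str]]:
    """Collect IPv4 (A) and IPv6 (AAAA) results side by side."""
    return {
        "A": resolve(hostname, socket.AF_INET),
        "AAAA": resolve(hostname, socket.AF_INET6),
    }
```

Calling `dns_report` with the LCM hostname and getting a correct `A` entry with an empty `AAAA` list would match the warning above on an IPv4-only network.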


Accepted Solutions
jimmyvandermast
Hot Shot

After fixing postgres as described here:

https://communities.vmware.com/t5/vRealize-Automation-Tools/Clean-LCM-deploy-PSQLException-Connectio...

I applied the same principle to a full vRA install with the easy installer and that worked.

8 Replies
Christiankkcc
Enthusiast

WOW

Talk about a coincidence. I too am having trouble with 8.3 at this very moment.
Either the installation gets stuck at "moving product binaries" (it stayed there for 2 hours while I went and did other things!)

or

I keep getting the error below, which also shows up in the logs

[Screenshot: 2021-02-18 11_37_15-mRemoteNG - confCons.xml - sd-itds01 (Austria).png]

INSTALLER LOG

2021-02-18T14:16:51.622Z - info: LCM is still booting up
2021-02-18T14:17:31.737Z - error: Error occured during updating vRSLCM password Response with status: 0 for URL: null
2021-02-18T14:17:31.737Z - error: error has occurred Error occured during updating vRSLCM password
2021-02-18T14:19:45.000Z - debug: Updated current session with lstActTm Thu Feb 18 2021 15:11:49 GMT+0100 (W. Europe Standard Time)
2021-02-18T14:21:06.957Z - info: Exiting from the app

 

I've also moved over to your other post about removing the old postgres entry. I would like to try that out. If it doesn't work, I guess I'll just go back to 8.2 or below.

jimmyvandermast
Hot Shot

@Christiankkcc one more thing I did, though I'm not sure it was needed, was to (re)set the LCM admin password from the console.

You could also try that, just after LCM is deployed and before it starts to move the binaries.
So, log on to the console (or probably SSH will also do), and then run /opt/vmware/share/vami/vami-vlcm-passwd-reset

Moving the binaries WILL take some time, depending on your connection and storage.
Also via console or SSH, you could check whether vidm.ova and vra.ova show up in /data.
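
That check can be scripted so it is easy to repeat while the copy runs. This is a rough sketch only: the file names and the `/data` directory come from the post above, `find_ovas` is a hypothetical helper name, and it assumes a Python 3 interpreter is available wherever it is run.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def find_ovas(
    directory: str = "/data", names: Tuple[str, ...] = ("vidm.ova", "vra.ova")
) -> Dict[str, Optional[int]]:
    """Map each expected OVA name to its size in bytes, or None if it is absent."""
    base = Path(directory)
    result: Dict[str, Optional[int]] = {}
    for name in names:
        path = base / name
        result[name] = path.stat().st_size if path.is_file() else None
    return result
```

Running `find_ovas()` a few times in a row and seeing the sizes grow would suggest the copy is still progressing rather than stuck.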

Christiankkcc
Enthusiast

@jimmyvandermast 

 

I think my problem is that I removed LCM and IDM from my test vCenter by simply deleting the VMs. These errors didn't happen at all when I first installed the LCM. So I think I broke something in vCenter?

Also, what's weird is that I tried following your password reset advice and I can't even log in to LCM through the console with admin@local. I use the same password for all of my test machines, so I know I'm entering it correctly, but it's as if my password doesn't exist at all.

So I deleted the LCM VM again and am trying to re-install it with 8.0.

 

I keep having the same two problems:

 

First re-install today:

2021-02-19T06:21:59.703Z - info: Guest IP was obtained at loopCount=10: xxx.xxx.xxx.xxx
2021-02-19T06:21:59.703Z - info: copying binaries to lcm va: 10.14.3.201
2021-02-19T06:24:25.172Z - error: Error occured during updating vRSLCM password Response with status: 0 for URL: null
2021-02-19T06:24:25.173Z - error: error has occurred true
2021-02-19T06:25:16.500Z - debug: Updated current session with lstActTm Fri Feb 19 2021 07:20:01 GMT+0100 (W. Europe Standard Time)
2021-02-19T06:26:06.848Z - info: Exiting from the app

 

And second re-install today:

Moving product binaries. Again, I think it's just stuck here. I'm using a 10G connection from my server to storage, and I remember that the first time I installed, it didn't take over 2 hours to complete.

UPDATE

Just as I was writing this post, the installation process successfully moved the binaries. It took about 10 minutes.

But now I have another problem:


2021-02-19T06:42:39.046Z - info: Binary mapping is done properly. Both entroes are present
2021-02-19T06:43:39.313Z - info: response-datacenter: Response with status: 200 OK for URL: https://xxx.xxx.xxx.xxx/lcm/lcops/api/datacenters
2021-02-19T06:43:39.566Z - info: response-regions:Response with status: 200 OK for URL: https://xxx.xxx.xxx.xxx/lcm/lcops/api/datacenters/Default_datacenter/regions
2021-02-19T06:43:39.732Z - info: response-zonesResponse with status: 200 OK for URL: https://xxx.xxx.xxx.xxx/lcm/lcops/api/datacenters/Default_datacenter/regions/default/zones
2021-02-19T06:43:39.897Z - info: response-zonesResponse with status: 200 OK for URL: https://xxx.xxx.xxx.xxx/lcm/lcops/api/datacenters/Default_datacenter/regions/default/zones/default/v...
2021-02-19T06:43:39.897Z - info: In vidm install funcrion
2021-02-19T06:43:39.976Z - info: responseResponse with status: 200 OK for URL: https://10.14.3.201/lcm/lcops/api/environments
2021-02-19T06:43:39.978Z - info: vRA wait time in millSecond: 300000
2021-02-19T06:45:44.860Z - debug: Updated current session with lstActTm Fri Feb 19 2021 07:31:48 GMT+0100 (W. Europe Standard Time)
2021-02-19T06:48:39.984Z - info: In vra install funcrion URL: https://xxx.xxx.xxx.xxx/lcm/lcops/api/environments
2021-02-19T06:48:39.991Z - info: Response with status: 0 for URL: null
2021-02-19T06:48:39.992Z - error: error has occurred ERROR while creating vRealize Automation environment request


[Screenshot: 2021-02-19 07_53_38-Window.png]

My deleting the LCM and IDM VMs really made a mess of things.

 

Thanks for the help

jimmyvandermast
Hot Shot

@Christiankkcc Simply removing the VMs should not be a problem. I do that every time (since I am trying to write a procedure, I need to do things over and over).

Logging in to the console or over SSH is done not with admin@local but with root and the password that you provided in the wizard.

From what I see in your latest logs and in the earlier ones, your LCM for some reason responds normally (HTTP 200) for a while and then suddenly stops responding. Can you check things like a simple ping or, even better, keep checking whether you can continuously reach the LCM machine on HTTPS port 443? It really looks like some kind of network issue, or some other reason why the LCM VM does not respond.

Since it happens at a different point every time, I guess it's not an installation issue or a bug, but rather something in your environment that interrupts the network or the VM itself.
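
The continuous check on port 443 suggested above could be sketched as follows. This is a minimal illustration, assuming Python 3 on the workstation that runs the easy installer; `port_open` and `watch` are hypothetical helper names, and a successful TCP connect only shows that the port answers, not that the LCM services behind it are healthy.

```python
import socket
import time
from datetime import datetime


def port_open(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host: str, port: int = 443, interval: float = 5.0, max_checks=None):
    """Poll host:port and record each change in reachability.

    Runs until interrupted when max_checks is None.
    Returns the list of (timestamp, reachable) changes that were seen.
    """
    changes = []
    last = None
    checks = 0
    while max_checks is None or checks < max_checks:
        state = port_open(host, port)
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            label = "reachable" if state else "UNREACHABLE"
            print(f"{stamp} {host}:{port} {label}")
            changes.append((stamp, state))
            last = state
        checks += 1
        time.sleep(interval)
    return changes
```

Leaving `watch("<lcm-ip>")` running during the install and comparing its timestamps with the first `status: 0` entry in the installer log would show whether the appliance really drops off the network at that moment.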

Christiankkcc
Enthusiast

@jimmyvandermast 

Good to know that simply deleting the VMs is no problem at all. I really didn't see a reason why it would be in the first place, but this whole time I really thought this was the issue.

But in fact..

and this will be typical behavior..😪

it was my Palo Alto firewall 😣

After I deleted the VMs and reinstalled them, they of course received new MAC addresses, and MAC security is enabled on the firewall. I switched it to IP filtering and now everything works properly.

After troubleshooting for so long and looking at the situation from a different angle this whole time, I guess I just got mentally stuck, and looking at the network didn't even cross my mind.

Thanks for all your help, it was very much appreciated.

RonPSSC
Enthusiast

I'm posting this in response to the original thread, which identified suspected network issues and the following vRSLCM install failure error:

"error has occurred vRSLCM services bootstrap failed. Verify vRSLCM appliance is reachable on given IP and provided network settings are correct"

I just discovered that I too was experiencing the above issue, not only via the Easy Installer but also when deploying the installer OVA directly in vCenter. I am posting on the off chance that other users are still troubleshooting this issue.

It appeared, at least in my case, that when I deselected the FIPS Mode Compliance option during setup, all attempts to install would fail and the above error was recorded in the installer log file. I validated this issue with both vRSLCM 8.2 and 8.3.

For info, I also opened a support case with VMware, and the one suggestion from the rep was to ensure that only Thick Disk was used for provisioning the VM. I was not able to confirm precisely whether this has any effect, because three successive install attempts were successful after changing only the FIPS Mode setting. In short, keeping the thin disk option did not cause any problems for me. I am simply mentioning this in case it is relevant for others.

RonP  

 

jimmyvandermast
Hot Shot

@RonPSSC  I've just tested the easy installer with FIPS _on_ and now the deployment continues. So that definitely is part of the problem.

Very weird, because inside the LCM appliance it's the postgres init that fails.

I will add this to the current open SR.
