VMware Cloud Community
Amarokada
Contributor

Deploy Identity Manager 3.3.5 cluster from vrslcm fails during OVA stage 9

 

I've got a brand new VCF deployment, built by following the VVD based on VCF 4.2. I've since updated the components to VCF 4.3 level, and the AVNs (NSX-T segments) were built as part of the original Cloud Builder stage.

I rolled out vRSLCM, which went mostly OK, except that it struggled to sync the binaries with SDDC Manager after install. I monitored the /data/vcf folder during the sync and found that it kept stalling: the OVA files would stop growing. I got past this by cancelling the sync operation or rebooting the appliance, repeating until all the OVAs had finally synced and the packages showed up in vRSLCM.
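For anyone wanting to watch for the same stall, here is a minimal sketch of what I was doing by hand: take periodic snapshots of file sizes under the folder and flag anything that did not grow. The function names, the `.ova` suffix filter and the 60-second interval are my own assumptions, not anything vRSLCM provides, and a file that has simply finished downloading will also show up as "not growing".

```python
import os
import time


def snapshot_sizes(folder, suffix=".ova"):
    """Return {path: size_in_bytes} for every file under folder ending in suffix."""
    sizes = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(suffix):
                path = os.path.join(root, name)
                try:
                    sizes[path] = os.path.getsize(path)
                except OSError:
                    # File vanished between listing and stat; skip it.
                    continue
    return sizes


def stalled_files(before, after):
    """Return sorted paths present in both snapshots whose size did not grow."""
    return sorted(
        path for path, size in after.items()
        if path in before and size <= before[path]
    )


def watch(folder, interval=60, rounds=10):
    """Print files that did not grow between consecutive snapshots."""
    previous = snapshot_sizes(folder)
    for _ in range(rounds):
        time.sleep(interval)
        current = snapshot_sizes(folder)
        for path in stalled_files(previous, current):
            print(f"no growth in {interval}s: {path} ({current[path]} bytes)")
        previous = current
```

Calling `watch("/data/vcf")` on the appliance would report any OVA that sat at the same size for a full interval.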

 

My first task now is to push out the WSA cluster. I'm following the VVD exactly (except that the Identity Manager slider is hidden inside the create environment section, which automatically pre-fills the environment name with globalenvironment).

The request goes in and starts creating the load balancing config, but when it gets to pushing the OVAs to vCenter I only ever see 1 or 2 being deployed at once in vCenter, although the request graphics show all 3 being attempted. Sometimes it only gets as far as 1, and once it eventually managed to push out all 3 (not in parallel), but one of them always seems to hit some kind of issue (not the same one each time). Usually this appears to be Elasticsearch taking more than 13 minutes and then timing out, and the final console message says "deployment failed, please redeploy".

I've deleted the environment about 10 times now and tried again, and each time it fails at a different point during the OVA push. I've saved the deployment options as JSON so the attempt is easy to repeat.

I started suspecting the NSX-T segment, but I've run manual tests with VMs inside and outside the segment and I get healthy pings and transfer speeds (over 4 Gbps).

It feels to me like there is some bug that stops vRSLCM from reliably talking to vCenter to register the 3 OVA deployments at once, and because the nodes then come up at different times it interferes with the cluster setup.

Any ideas?

1 Reply
Amarokada
Contributor

OK, I've come back to answer my own post.

VMware took a quick look at the environment, curl'd a few ports and so on, but couldn't see any issues on the surface, so my ticket was to be escalated.
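If you want to repeat that kind of surface check yourself, a rough equivalent of curl'ing a port is a plain TCP connect test. This is only a sketch: the post doesn't record which hosts and ports support actually checked, so the helper names and any hostname or port you pass in are your own choice.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Return {port: reachable} for each port in ports."""
    return {port: port_open(host, port, timeout) for port in ports}
```

For example, `check_ports("vcenter.example.local", [443])` (hypothetical hostname) tells you whether the HTTPS port answers from wherever you run it. Like the curl checks, a clean result only rules out basic connectivity problems.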

 

However, during the day I went digging and discovered that the version of vRSLCM that got deployed was a slightly later revision than the VCF 4.3 BOM expects. Somehow I had also downloaded the vRSLCM bundle for VCF 4.3.1, and when you deploy vRSLCM you don't get to choose which version it uses; I think it just deploys the latest.
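The check that caught this boils down to comparing the deployed version string against the one listed in the BOM. A small sketch of that comparison is below; the function names are mine, and the version strings in the usage note are made-up placeholders, not real vRSLCM builds. Take the real values from the appliance and from the BOM in the release notes.

```python
import re


def parse_version(text):
    """Split a version string into a tuple of its integer parts."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def compare_to_bom(deployed, expected):
    """Describe how the deployed version relates to the BOM version."""
    deployed_parts = parse_version(deployed)
    expected_parts = parse_version(expected)
    if deployed_parts == expected_parts:
        return "match"
    if deployed_parts > expected_parts:
        return "newer than BOM"
    return "older than BOM"
```

With placeholder values, `compare_to_bom("1.2.3-200", "1.2.3-100")` returns `"newer than BOM"`, which is the situation I ended up in.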

 

I ripped out the old vRSLCM, removed the higher-revision bundle, and redeployed. Now it seems better: I still only get 2 OVAs at the start, but I can retry and the 3rd gets pushed out. So far so good.
