VMware Cloud Community
jkostraba
Contributor

VC 6.5 Install failure - identity management service error on first boot

Is anyone else experiencing a failure when installing VC 6.5? I have this occurring both when installing with the CLI installer and when deploying the OVA using the ESXi web client.

The version of VC is 6.5.0 build 4240420, the recent release candidate.

The exception report looks like:

Encountered an internal error.
Traceback (most recent call last):
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 2017, in main
    vmidentityFB.boot()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 349, in boot
    self.configureSTS(self.__stsRetryCount, self.__stsRetryInterval)
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1478, in configureSTS
    self.startSTSService()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1140, in startSTSService
    returnCode = self.startService(self.__sts_service_name, self.__stsRetryCount * self.__stsRetryInterval)
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 88, in startService
    return service_start(svc_name, wait_time)
  File "/usr/lib/vmware/site-packages/cis/utils.py", line 784, in service_start
    raise ServiceStartException(svc_name)
ServiceStartException: {
    "resolution": null,
    "detail": [
        {
            "args": [
                "vmware-stsd"
            ],
            "id": "install.ciscommon.service.failstart",
            "localized": "An error occurred while starting service 'vmware-stsd'",
            "translatable": "An error occurred while starting service '%(0)s'"
        }
    ],
    "componentKey": null,
    "problemId": null
}


My group and I are curious to hear whether anyone else is experiencing this and whether there are any workarounds.

Thanks!


Accepted Solutions
CSvec
Enthusiast

So I got a response from VMware tech support on this, and the only thing that worked for me was this:

Pause the install process right before phase 2.

Open a console to the appliance and enable SSH.

On the appliance, run: echo "::1 localhost.localdom localhost" >> /etc/hosts

Continue the installer and magically it works. This shouldn't really matter, but it does, so I figured I'd let people see if it fixed their issues.

Their first recommendation covers what is basically most people's problem: forward and reverse DNS should work before you install. But if your DNS is happy and the appliance still hates life, it appears to also care about IPv6 lookups.
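
For anyone who wants to script the /etc/hosts step, here is a minimal Python sketch. It is not something VMware support supplied. It appends the same line only when no ::1 entry naming localhost exists yet, so running it twice is harmless. The function name ensure_ipv6_localhost is made up for illustration, and the sketch assumes a Python 3 interpreter is available wherever you run it and that you have write access to the file.

```python
from pathlib import Path

# The exact line from the workaround above.
IPV6_LOCALHOST_LINE = "::1 localhost.localdom localhost"


def ensure_ipv6_localhost(hosts_path="/etc/hosts"):
    """Append the ::1 localhost line unless one is already present.

    Returns True if the file was changed, False if it was left alone.
    (Hypothetical helper, not part of any VMware tooling.)
    """
    path = Path(hosts_path)
    text = path.read_text() if path.exists() else ""
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == "::1" and "localhost" in fields[1:]:
            return False  # an IPv6 loopback entry already exists
    # Mirror the shell '>>' redirect: always add at the end of the file.
    with path.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(IPV6_LOCALHOST_LINE + "\n")
    return True
```

Calling ensure_ipv6_localhost() with no argument targets /etc/hosts; pass a different path to try it on a copy first.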


30 Replies
AndrewBrinded0
Contributor

Hi,

I've just tried the same thing. I had a working template that I used with 6.0, which I then ran against 6.5. It failed because of some slight changes in the latest release, but the cli-installer generated a new JSON file for me.

  • I took this newly generated template, compared it against the stock defaults and my own, and ran it with --verify-only, which succeeded.
  • I performed a manual install with the same settings that went into the cli-installer via the JSON file, and it deployed perfectly fine.
  • I then ran it without --verify-only, which gave me the same error message as you.

Taking another look at the error messages, I can see the below (unfortunately, running the deploy again with --verbose doesn't give me anything else to go on):

2016-11-17 13:30:44,838 - vCSACliInstallLogger - INFO - Initial Configuration: Progress: 5% Starting VMware Identity Management Service...
2016-11-17 13:36:03,202 - vCSACliInstallLogger - ERROR - Task failed. Status: ERROR
Progress: 5% Starting VMware Identity Management Service...
Error:
    Problem Id: None
    Component key: idm
    Detail:
        Encountered an internal error.
Traceback (most recent call last):
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 2018, in main
    vmidentityFB.boot()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 349, in boot
    self.configureSTS(self.__stsRetryCount, self.__stsRetryInterval)
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1479, in configureSTS
    self.startSTSService()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1141, in startSTSService
    returnCode = self.startService(self.__sts_service_name, self.__stsRetryCount * self.__stsRetryInterval)
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 88, in startService
    return service_start(svc_name, wait_time)
  File "/usr/lib/vmware/site-packages/cis/utils.py", line 784, in service_start
    raise ServiceStartException(svc_name)
ServiceStartException: {
    "resolution": null,
    "detail": [
        {
            "args": [
                "vmware-stsd"
            ],
            "id": "install.ciscommon.service.failstart",
            "localized": "An error occurred while starting service 'vmware-stsd'",
            "translatable": "An error occurred while starting service '%(0)s'"
        }
    ],
    "componentKey": null,
    "problemId": null
}
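
As an aside for anyone collecting these failures across several deploy attempts: the JSON attached to the ServiceStartException can be read mechanically. The sketch below is only an illustration of the structure as printed in this thread. How the installer itself renders the message is not shown here, and summarize_failure is a hypothetical name. It fills the %(0)s placeholder in "translatable" from the "args" list.

```python
import json

# Payload copied from the ServiceStartException in the traceback above.
PAYLOAD = """
{
    "resolution": null,
    "detail": [
        {
            "args": ["vmware-stsd"],
            "id": "install.ciscommon.service.failstart",
            "localized": "An error occurred while starting service 'vmware-stsd'",
            "translatable": "An error occurred while starting service '%(0)s'"
        }
    ],
    "componentKey": null,
    "problemId": null
}
"""


def summarize_failure(payload_text):
    """Return (message id, rendered message) pairs for each detail entry."""
    payload = json.loads(payload_text)
    summary = []
    for entry in payload.get("detail", []):
        # The template uses positional keys such as %(0)s, so map the
        # index of each argument (as a string) to its value.
        mapping = {str(i): arg for i, arg in enumerate(entry.get("args", []))}
        summary.append((entry["id"], entry["translatable"] % mapping))
    return summary
```

The useful part is the "args" value: it names the service that failed to start (vmware-stsd here), which tells you which service log to open next.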

The generated support bundle gives a little bit more information to go on from a variety of log files:

vmware-identity-sts-6.5.0.1351-4594647########################################
mkdir: cannot create directory '/usr/lib/vmware-sso/vmware-sts/conf': No such file or directory
Failed to create dir /usr/lib/vmware-sso/vmware-sts/conf
chmod: cannot access '/usr/lib/vmware-sso/vmware-sts/conf': No such file or directory
mkdir: cannot create directory '/etc/vmware-sso/keys': No such file or directory
Failed to create dir /etc/vmware-sso/keys
2016-11-17T09:48:46.714Z   Stderr: hostname: Host name lookup failure
2016-11-17T09:49:51.036Z   Failure setting accounting for vmware-sts-idmd. Err Failed to set unit properties on vmware-sts-idmd.service: Unit vmware-sts-idmd.service is not loaded.
2016-11-17 09:50:20 5776: [ERROR] Request for http://localhost:7080/afd failed after 1 seconds. Status: /usr/bin/curl status. Response: 000. Host: ;; connection timed out; no servers could be reached
2016-11-17T09:50:09.323Z   Service vmware-stsd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2016-11-17T09:55:09.651Z   VMware Identity Service bootstrap failed.
2016-11-17T09:55:09.676Z INFO firstbootInfrastructure [Failed] /usr/lib/vmidentity/firstboot/vmidentity-firstboot.py is complete
2016-11-17T09:55:09.677Z WARNING firstbootInfrastructure Bug component info file does not exist
2016-11-17T09:55:09.678Z INFO firstbootInfrastructure First boot is a failure
Stderr = Job for vmware-stsd.service failed because a timeout was exceeded. See "systemctl status vmware-stsd.service" and "journalctl -xe" for details.
"Command: ['/sbin/service', 'vmware-stsd', 'start']\nStderr: Job for vmware-stsd.service failed because a timeout was exceeded. See \"systemctl status vmware-stsd.service\" and \"journalctl -xe\" for details.\n"
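
Several of those lines point at name resolution rather than at the STS service itself. A rough way to pull such lines out of a large support bundle is sketched below. The substrings are copied from the excerpts above, while the hints next to them are only this sketch's reading of them, not VMware guidance, and triage is a hypothetical helper name.

```python
# Substrings taken verbatim from the log excerpts above, paired with a
# hint. The hints are this sketch's interpretation, not VMware guidance.
SIGNATURES = [
    ("Host name lookup failure", "appliance cannot resolve its own host name"),
    ("no servers could be reached", "configured DNS server did not answer"),
    ("does not seem to be registered with vMon", "vmware-stsd was never registered with vMon"),
    ("failed because a timeout was exceeded", "systemd gave up waiting for the service"),
]


def triage(log_lines):
    """Return (line number, hint, line) for every line with a known signature."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for needle, hint in SIGNATURES:
            if needle in line:
                hits.append((number, hint, line.strip()))
                break  # one hint per line is enough
    return hits
```

Feed it the lines of any log file from the bundle, for example triage(open(path).read().splitlines()), and read the hits in order.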

  • I then checked the cli --template-help output and William Lam's blog, but I can't see that my JSON file is missing anything in particular that would cause this.
  • The fact that a manual deploy works fine means that the backend/target is working and there are no issues with the install media.
  • I then found "vCenter Server Appliance Upgrade from 6.0 to 6.5 - CareExchange.in", which seems to show a similar error message with a resolution related to credentials. I'm deploying to a single host that is not already managed by vCenter, so I'm not sure how this fits in, but I reset my host password to match those defined for the VCSA OS/SSO anyway, just in case there is something odd going on that I'm missing.

I've tried a few other things but the Identity Management service always seems to stall at the 5% marker before it then blows chunks.

I'll take another look later, maybe someone else will see the above and a light bulb moment will go off.

dgbroadv
Contributor

Having the same issue with a fresh GUI install.

All passwords have been changed to match the one given to the installer (host, installer, etc.).

rainer_schumach
Contributor

Hi,

Same problem here. I've made several attempts to migrate from Windows and SQL Server to VCSA 6.5 via the GUI migration. It stops at 5%, "Starting VMware Identity Management Service..."

Marmotte94
Enthusiast

Hi,

Did you try this KB? https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21444...

Thanks,

Regards,

Please, visit my blog http://www.purplescreen.eu/
dlaemle
Contributor

So I encountered this exact same error with a fresh build and figured it out, at least on my end.

There were three things I either did not do or did 'incorrectly' when setting up this VCSA in our lab at the office. They are listed below, along with what I changed:

1. Did not set up a proper FQDN: I had set it up as an IP address and not a resolving FQDN. Going forward I set it to vc.thegrid.net (testing domain).

2. Configured 2 NTP servers when 1 was offline: I discovered this when checking whether we could take an upgrade path instead. A fresh install of VCSA 6.0U2 threw a warning that it WILL fail because one of the NTP servers listed is unreachable.

3. Set up 2 DNS servers when 1 was offline: same as #2, except there was no warning when setting up 6.0U2. However, I omitted the third server just because.

So my 2 cents would be to set a properly resolvable FQDN and point to 1 NTP server and 1 DNS server that are online, then edit the configuration to use 2+ servers later on.
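
A quick pre-flight check for the FQDN part of that advice can be sketched in Python. This is only an illustration using the standard socket module. check_fqdn is a made-up name, it covers IPv4 only, and it uses whatever resolver the machine running it is configured with, which may not be the DNS server you plan to give the appliance.

```python
import socket


def check_fqdn(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check that fqdn resolves to an IPv4 address and that the address maps back.

    Returns (ok, message). The resolver functions are parameters so the
    check can be exercised without a real DNS server.
    """
    try:
        address = forward(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"forward lookup of {fqdn} failed: {exc}"
    try:
        name, aliases, _ = reverse(address)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"reverse lookup of {address} failed: {exc}"
    known = {n.lower().rstrip(".") for n in [name, *aliases]}
    if fqdn.lower().rstrip(".") not in known:
        return False, f"{address} maps back to {name}, not {fqdn}"
    return True, f"{fqdn} <-> {address}"
```

For example, check_fqdn("vc.thegrid.net") returns a (False, reason) pair if either direction fails, which is the situation the installer seems to punish.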

Marmotte94
Enthusiast

Hi,

You can use this post to solve your problem. I wrote it to help the community.

Erreur lors du déploiement vCenter 6.5 - PurpleScreen.eu ("Error when deploying vCenter 6.5")

Thanks,

Regards,

Please, visit my blog http://www.purplescreen.eu/
AndrewBrinded0
Contributor

Thanks for the link. However, when I run ./vcsa-deploy install --template-help it does not show a parameter that specifies the default SSO domain, and there wasn't one in version 6.0 either.

I'm assuming many of those who install vCSA 6.5 into production would prefer to use the GUI. However, the main benefit of the cli-installer to me (and probably others) is that I can deploy vCSA without an administrator at the helm and rapidly tweak the JSON file for the next time round, which this "workaround" of logging into the web console doesn't help with.

@dlaemle thanks for mentioning the DNS and NTP servers. I came across the issue/design decision with the system name/IP address a while back, but at the time I didn't think it extended to the NTP server actually being up (using the IP address as a system name was still valid). Both then and now, my JSON file was configured to use VMware Tools as the time source via the time.tools-sync property, so I probably wouldn't have noticed.

VMware vCSA 6 Scripted/vcsa-deploy Issue

01004753
Contributor

I also had issues in my homelab, and I was able to replicate the issue.

I had set up all 4 of my DNS servers in the vCenter configuration wizard (2nd step), and 2 of them were powered off.

I would say it's best to set up the vCenter with just one DNS server and then reconfigure it with the others afterwards.

For NTP settings, I chose to sync the vCenter with the local ESX host, but previously I was using pool.ntp.org.

This might also be causing issues, but I doubt it.
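
To find out which DNS servers are actually up before typing them into the wizard, something like the following sketch could be used. It assumes the servers accept TCP connections on port 53, which is common but not guaranteed, so a "down" result only means "did not answer on TCP 53". The names tcp_probe and split_reachable are invented for this example.

```python
import socket


def tcp_probe(server, port=53, timeout=2.0):
    """Return True if a TCP connection to server:port succeeds within timeout."""
    try:
        with socket.create_connection((server, port), timeout=timeout):
            return True
    except OSError:
        return False


def split_reachable(servers, probe=tcp_probe):
    """Partition servers into (reachable, unreachable), keeping the original order."""
    reachable, unreachable = [], []
    for server in servers:
        (reachable if probe(server) else unreachable).append(server)
    return reachable, unreachable
```

Give the installer one server from the reachable list and add the rest once the appliance is up.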

Using VMware products since Workstation 3.0 and loved the way it changed the world!

igorus
Contributor

Worked for me - thank you CSvec.

munishpalmakhij
Contributor

I ran into the same issue when deploying VCSA 6.5 in my lab and came across this thread while researching.

A couple of things I wanted to share. Quick note: I don't have any DNS server set up in my lab. I agree we definitely need DNS in production deployments, but I have never used DNS for VCSA in the lab.

  1. One of my many trial runs was to specify the DNS server IP as the VCSA itself (i.e. the same IP as the VCSA, in my case 10.1.1.49), and it worked perfectly fine.
  2. I also tried using DHCP to assign an IP to my VCSA and leaving the FQDN field empty, so that it automatically uses the IP, and that works fine as well. (My DHCP server doesn't hand out any DNS details.)
    • For some strange reason a static IP causes the issue, hence I tried the DNS server as its own IP.

Hope this helps. Keep sharing!

BoopathiD
Contributor

CSvec,

Thank you, it worked for me.

I gave the DNS IP as VCSA IP itself.

Thivakaran
VMware Employee

GabeGheorghiu
Contributor

CSvec has the right idea.

Additional steps, required in my situation:

- Despite running echo "::1 localhost.localdom localhost" >> /etc/hosts before stage 2, I got the same error.

- After 2 tries, I figured out that the beginning of stage 2 was actually erasing the edits in /etc/hosts.

Solution:

- Wait until stage 2 is at 2%, and then run echo "::1 localhost.localdom localhost" >> /etc/hosts. You may want to be logged in via SSH beforehand in order to be ready.

Option B:

- Set the DNS to the VC itself. This can be updated later.

Good luck.
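
If timing the edit by hand is awkward, the "stage 2 wipes the file" behaviour described above could be countered with a small polling loop that puts the line back whenever it goes missing. This is a sketch under assumptions: keep_entry is a hypothetical helper, a Python 3 interpreter is assumed to be available on the appliance, and nothing here has been confirmed by VMware.

```python
import time
from pathlib import Path

# The line from the workaround in this thread.
ENTRY = "::1 localhost.localdom localhost"


def keep_entry(hosts_path, rounds, interval=1.0, sleep=time.sleep):
    """Re-append ENTRY whenever it disappears; return how many times it was added."""
    path = Path(hosts_path)
    added = 0
    for _ in range(rounds):
        text = path.read_text() if path.exists() else ""
        present = any(line.strip() == ENTRY for line in text.splitlines())
        if not present:
            # Append at the end of the file, like the shell '>>' redirect.
            with path.open("a") as handle:
                if text and not text.endswith("\n"):
                    handle.write("\n")
                handle.write(ENTRY + "\n")
            added += 1
        sleep(interval)
    return added
```

Something like keep_entry("/etc/hosts", rounds=600, interval=1.0), started over SSH just before stage 2 is continued, would watch the file for about ten minutes.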

rindr
Contributor

Hi,

There is an even better solution, which does not require any tricks "on the fly" like echoing the string into the /etc/hosts file when the install is at 2%. Just follow this KB, as advised by support:

"An error occurred while invoking external command : '%(0)s'" when deploying vCenter Server Applianc...

Regards.

R

jkostraba
Contributor

Thank you CSvec!

This worked like a champ!

rindr
Contributor

With the VCSA 6.5a release (2017-02-02), there is no need for the workaround of altering the /etc/hosts file. Neither the fix suggested by CSvec nor the one suggested by GabeGheorghiu is required any longer.

Just download and use VCSA 6.5a for migration and now it goes smoothly right from Stage 1 to Stage 2:

https://my.vmware.com/group/vmware/details?downloadGroup=VC650A&productId=614

anthonybailey
Contributor

No, I can confirm that I'm definitely still getting this issue on the VCSA 6.5a (4944578) installer, upgrading from vCenter 5.5 (1750787) on Windows.

I'd also like to point out that the KB article says to insert the ::1 line into the middle of the hosts file, inside the "VAMI_EDIT" block, which never worked for me -- checking after its failure, the line I'd inserted was absent, so I think that block gets overwritten -- and that appending the line to the end of the file (via the '>>' redirector) did the trick.
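
To check after the fact whether the line ended up somewhere it can survive, a sketch like the one below could be run against the contents of /etc/hosts. The marker strings VAMI_EDIT_BEGIN and VAMI_EDIT_END are an assumption about how the "VAMI_EDIT" block is delimited (the thread only names the block), so adjust them to whatever your file actually shows. Both function names are made up for this example.

```python
def lines_outside_managed_block(hosts_text, begin="VAMI_EDIT_BEGIN", end="VAMI_EDIT_END"):
    """Return the lines that sit outside the begin/end marker comments.

    The default marker strings are assumptions, not confirmed in the thread.
    """
    outside, inside = [], False
    for line in hosts_text.splitlines():
        if begin in line:
            inside = True
        elif end in line:
            inside = False
        elif not inside:
            outside.append(line)
    return outside


def has_safe_ipv6_localhost(hosts_text):
    """True if a '::1 ... localhost' entry exists outside the managed block."""
    for line in lines_outside_managed_block(hosts_text):
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == "::1" and "localhost" in fields[1:]:
            return True
    return False
```

A False result for a file that does contain the ::1 line means the line sits inside the managed block, which matches the case where it vanished after the failure.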

hoFFy84
Contributor

After trying everything from here (DNS, reverse DNS, disabling IPv6, editing /etc/hosts) without success, I tried the newest VCSA 6.5b (available since yesterday), and that worked like a charm. I checked /etc/hosts during Phase 2, and it seems they have corrected the bug there (I found an entry for IPv6).