VMware Cloud Community
sbonds
Contributor

How can I complete a VCSA CLI Install? Error seen: "Failed to authenticate with the guest operating system using the supplied credentials."

Environment

  • HP DL380 G7, 96GB RAM, 2 Xeon L5630 CPUs, 8 total cores available
  • ESXi 6.0U3, HPE version
  • ESXi freshly installed onto internal SD card, no other VMs present
  • 3TB of RAID-10 datastore available (empty)

Problem

When installing the vCenter Server Appliance using the CLI, after a 45-minute pause at "RPM Install: Progress: 95% Configuring the machine", I get a final fatal error: "Failed to authenticate with the guest operating system using the supplied credentials."

Details

First of all, hats off to VMware for providing this article: https://kb.vmware.com/s/article/2106760 ("Triaging a vCenter Server Appliance 6.0 installation Failure"). Those are the sorts of articles that really help by explaining what the software is trying to do and letting us use our own brains to work out why it's not doing what's intended. (Contrast that with guides that try to list a specific solution for every possible problem, which is an impossible goal.)

The first error seen in the /tmp/vcsaCliInstaller-<datestamp>/vcsa-cli-installer.log file seems to suggest a possible problem:

2018-01-11 21:01:38,004 - vCSACliInstallLogger - TRACE - Cannot download file /var/log/firstboot/rpmInstall.json from esx60dl380.mydomain.net. Got error: Failed to download file from guest with error '(vim.fault.InvalidGuestLogin) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}'

The odd part about this message is that the download fails against THE ESXi SERVER but it appears that the file would be one generated on THE VCSA GUEST.

When I download the logs manually from https://<VCSA hostname>/appliance/support-bundle, it works, though it takes about 3 minutes to complete. The bundle appears to be dynamically generated by /bin/vc-support.cgi, which takes about 2 minutes to complete when run from the bash command line. The logs show that the long steps are things like "rpm -qa --verify" (58s) and "/sbin/service --status-all" (8s). No surprises there.

The final errors in the log file:

2018-01-11 21:47:49,075 - vCSACliInstallLogger - INFO - Gathering VC support log bundle. This can take a few minutes.
2018-01-11 21:47:49,365 - vCSACliInstallLogger - WARNING - Collecting the support bundle from the deployed appliance...
2018-01-11 21:47:49,753 - vCSACliInstallLogger - ERROR - Cannot collect the support bundle from the deployed appliance: Failed to run command in guest with error '(vim.fault.InvalidGuestLogin) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}'
2018-01-11 21:47:49,754 - vCSACliInstallLogger - ERROR - Got error while running OVF Tool command: Failed to download file from guest with error '(vim.fault.InvalidGuestLogin) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}'
2018-01-11 21:47:49,756 - vCSACliInstallLogger - DEBUG - The vCenter Server Appliance installer log file is at: /tmp/vcsaCliInstaller-2018-01-12-04-59-U0CFN0/vcsa-cli-installer.log
2018-01-11 21:47:49,756 - vCSACliInstallLogger - DEBUG - The vCenter Server Appliance installer result file is at: /tmp/vcsaCliInstaller-2018-01-12-04-59-U0CFN0/vcsa-cli-installer.json

These messages leave out basic information. Which host (the "deployed appliance") was the installer trying to download from? Which credentials did it use? (Yeah, credentials aren't always something good to put in log files, but they sure are helpful for troubleshooting!)
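Since the individual messages say so little, it can help to boil the log down to which fault types appear and when each one was first and last seen. Here is a minimal Python sketch of that idea. It assumes only the log line layout visible in the excerpts above (timestamp, logger name, level, message); the function name summarize_faults is made up for illustration and is not part of any VMware tool.

```python
import re

# Start of an installer log entry, e.g.
# "2018-01-11 21:01:38,004 - vCSACliInstallLogger - TRACE - Cannot download ..."
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} - \S+ - ([A-Z]+) - ")
# Fault type names such as "(vim.fault.InvalidGuestLogin)"
FAULT = re.compile(r"\(((?:vim|vmodl)\.fault\.\w+)\)")


def summarize_faults(lines):
    """Map each fault type to its count and first/last timestamp, in order of appearance."""
    summary = {}
    current_ts = None
    for line in lines:
        entry = ENTRY.match(line)
        if entry:
            # Remember the timestamp so faults on continuation lines get attributed to it.
            current_ts = entry.group(1)
        for fault in FAULT.findall(line):
            info = summary.setdefault(fault, {"count": 0, "first": current_ts})
            info["count"] += 1
            info["last"] = current_ts
    return summary


# Usage (the path placeholder is the one from the post, fill in your own):
# with open("/tmp/vcsaCliInstaller-<datestamp>/vcsa-cli-installer.log") as fh:
#     for fault, info in summarize_faults(fh).items():
#         print(fault, info["count"], info["first"], info["last"])
```

This only reorganizes what is already in the log. It won't reveal the host or credentials the installer used, but it does make a change from one fault type to another easier to spot.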

I've checked that the hostnames of the ESXi server, the CLI source server, and the DHCP reservation all resolve in DNS, both forward and reverse. I've checked network throughput from the VCSA shell using netcat and it's excellent (100+ MB/s). I've also checked DNS from within the VCSA shell: every host I could think of that might be related (source, ESXi host, etc.) resolves perfectly and pings fine.
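For anyone wanting to repeat the forward/reverse DNS check in a scripted way, here is a rough Python sketch using only the standard socket module. The function name check_forward_reverse is hypothetical, and the resolver functions are passed in as parameters purely so the logic can be exercised without a live DNS server.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ok, detail): does hostname resolve to an IP that resolves back to it?"""
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        return False, "forward lookup failed: %s" % exc
    try:
        name, aliases, _addresses = reverse(ip)
    except OSError as exc:
        return False, "reverse lookup of %s failed: %s" % (ip, exc)
    # Compare case-insensitively and ignore a trailing dot on fully qualified names.
    known = {n.lower().rstrip(".") for n in [name] + list(aliases)}
    if hostname.lower().rstrip(".") in known:
        return True, "%s <-> %s" % (hostname, ip)
    return False, "%s -> %s -> %s (mismatch)" % (hostname, ip, name)


# Usage, run from the machine doing the install and again from the VCSA shell:
# print(check_forward_reverse("esx60dl380.mydomain.net"))
```

A passing result here only rules DNS out. It says nothing about the clocks on the hosts involved.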

I've simplified the CLI deployment template to the bare essentials. Here's what I'm using (with the temp passwords intact because, hey, they're throwaway anyhow):

{
    "__version": "1.2.0",
    "__comments": "Deploy a vCenter 6.0 instance to manage zircon and ruby. Will live on zircon.",
    "target.vcsa": {
        "appliance": {
            "deployment.network": "VM Network",
            "deployment.option": "tiny-lstorage",
            "name": "vcenter",
            "thin.disk.mode": false
        },
        "esxi": {
            "hostname": "esx60dl380.mydomain.net",
            "username": "vcsa-deploy",
            "password": "Temp2_ChangeL8R",
            "datastore": "HP P410i RAID10"
        },
        "network": {
            "ip.family": "ipv4",
            "mode": "dhcp"
        },
        "os": {
            "password": "/Osburl9",
            "ntp.servers": [
                "172.29.55.25"
            ],
            "ssh.enable": true
        },
        "sso": {
            "password": "5-fronLi",
            "domain-name": "vsphere.local",
            "site-name": "TESTING"
        }
    }
}
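As a quick sanity check before a run, the template can be read back to confirm which ESXi host, network mode, and NTP servers it actually specifies. The sketch below is Python and relies only on the key names shown in the template above; deployment_summary is a made-up helper name, and it says nothing about how the installer itself interprets these values.

```python
import json


def deployment_summary(template_text):
    """Pull out the settings most relevant to connectivity and time sync."""
    cfg = json.loads(template_text)["target.vcsa"]
    return {
        "esxi_host": cfg["esxi"]["hostname"],
        "vm_name": cfg["appliance"]["name"],
        "network_mode": cfg["network"]["mode"],
        # The template above lists NTP servers under "os"; default to empty if absent.
        "ntp_servers": cfg["os"].get("ntp.servers", []),
    }


# Usage ("template.json" is a placeholder file name):
# with open("template.json") as fh:
#     print(deployment_summary(fh.read()))
```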

I'm wondering if the installer is getting confused about what's the guest and what's the ESXi server. Has anyone else seen this situation where it looks like the installer is trying to grab guest logs from the ESXi host? How would this installation work for anyone if that's the case?


Accepted Solutions
sbonds

When the problem's not DNS, what is it? Yes, the date was wrong on the reloaded ESXi server. I'm not sure how it happened, but the date was a couple years behind the present. Nowhere in the logs is there any mention of date issues, certificate expiration, etc. so it was pure chance that I noticed it.

So if anyone ever sees the above hot mess of log entries, check the basics: DNS and NTP. 🙂
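One low-tech way to catch a wrong clock without logging in to the host is to compare the Date header of an HTTPS response from it against your own clock. The Python sketch below does that. It assumes the host's web server includes a Date header in its responses (worth verifying on your build), and the function names are made up for illustration. Certificate verification is switched off on purpose, since the host certificate is typically self-signed and this is diagnostics only.

```python
import ssl
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def skew_seconds(date_header, now=None):
    """Seconds the remote clock is ahead of `now` (negative means it is behind)."""
    remote = parsedate_to_datetime(date_header)
    if now is None:
        now = datetime.now(timezone.utc)
    return (remote - now).total_seconds()


def fetch_date_header(host, timeout=10):
    """Return the HTTP Date header sent by https://<host>/, or None if absent."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # diagnostics only: accept a self-signed cert
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request("https://%s/" % host, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
            return resp.headers.get("Date")
    except urllib.error.HTTPError as exc:
        # Even an error response carries headers, which is all we need.
        return exc.headers.get("Date")


# Usage:
# header = fetch_date_header("esx60dl380.mydomain.net")
# if header:
#     print("host clock skew: %.0f seconds" % skew_seconds(header))
```

A skew of more than a few seconds is worth fixing before blaming anything else. A host that is years behind, as in this thread, shows up as a very large negative number.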

sbonds

I tried the same deploy using VCSA 6.5 and got a very similar error:

2018-01-12 22:43:33,070 - vCSACliInstallLogger - INFO - OVF Tool: Powering on VM: vcenter
2018-01-12 22:43:33,966 - vCSACliInstallLogger - INFO - OVF Tool: Task Completed
2018-01-12 22:43:34,101 - vCSACliInstallLogger - INFO - OVF Tool: Completed successfully
2018-01-12 22:43:34,108 - vCSACliInstallLogger - DEBUG - Starting to monitor status JSON file at /tmp/vcsaCliInstaller-2018-01-12-22-40-JDj9sJ/monitor-firstboot-progress-AmaTuX.json
2018-01-12 22:43:34,108 - vCSACliInstallLogger - INFO - =================== [4] Install Services started at 22:43:34 ===================
2018-01-12 22:43:34,109 - vCSACliInstallLogger - INFO - Installing services...
2018-01-12 22:43:34,421 - vCSACliInstallLogger - TRACE - Cannot download file /var/log/firstboot/rpmInstall.json from esx60dl380.mydomain.net. Got error: Failed to download file from guest with error '(vim.fault.GuestOperationsUnavailable) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'The guest operations agent could not be contacted.',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}'

Again, the ESXi server is the target for the download, not the vcenter guest, which seems potentially odd.

Edit:

Later in the log it appears to change over from that error to this one, familiar from last time on vCenter 6.0:

2018-01-12 22:45:32,415 - vCSACliInstallLogger - TRACE - Cannot download file /var/log/firstboot/rpmInstall.json from esx60dl380.mydomain.net. Got error: Failed to download file from guest with error '(vim.fault.InvalidGuestLogin) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}'

About 5 minutes after starting, it seems to enter some sort of error recovery and retries the file download. That leads to a brand new generic error:

2018-01-12 22:48:19,827 - vCSACliInstallLogger - TRACE - Retrying file download ...
2018-01-12 22:48:25,418 - vCSACliInstallLogger - INFO - RPM Install: Progress: 5% Setting up storage
2018-01-12 22:50:07,223 - vCSACliInstallLogger - INFO - RPM Install: Progress: 51% Installed VMware-jmemtool-6.5.0-4944578.x86_64.rpm
2018-01-12 22:50:12,755 - vCSACliInstallLogger - INFO - RPM Install: Progress: 54% Installed VMware-unixODBC-2.3.2.vmw.2-6.5.0.x86_64.rpm
2018-01-12 22:50:23,741 - vCSACliInstallLogger - INFO - RPM Install: Progress: 58% Installed vmware-certificate-client-6.5.0.1306-4580178.x86_64.rpm
2018-01-12 22:50:29,308 - vCSACliInstallLogger - INFO - RPM Install: Progress: 64% Installed VMware-cis-license-6.5.0-4528602.x86_64.rpm
2018-01-12 22:50:34,781 - vCSACliInstallLogger - TRACE - Expecting object: line 3 column 6 (char 278)
2018-01-12 22:50:34,781 - vCSACliInstallLogger - TRACE - Failed to parse the status file as JSON: {"status": "running", "info": [], "question": null,
          "progress_message": {"args": [], "localized": "Installed vmware-psc-health-6.5.0.83-4594646.x86_64.rpm", "translatable": "Installed vmware-psc-health-6.5.0.83-4594646.x86_64.rpm"}, "warning": [], "error": null,
2018-01-12 22:50:34,781 - vCSACliInstallLogger - TRACE - Invalid JSON format. Check the JSON file at: /tmp/vcsaCliInstaller-2018-01-12-22-40-JDj9sJ/monitor-firstboot-progress-AmaTuX.json
2018-01-12 22:50:34,782 - vCSACliInstallLogger - TRACE - Retrying file download ...
2018-01-12 22:50:40,328 - vCSACliInstallLogger - INFO - RPM Install: Progress: 78% Installed VMware-mbcs-6.5.0-4944578.x86_64.rpm
2018-01-12 22:50:45,801 - vCSACliInstallLogger - INFO - RPM Install: Progress: 80% Installed VMware-vpxd-vctop-6.5.0-4944578.x86_64.rpm
2018-01-12 22:50:51,281 - vCSACliInstallLogger - INFO - RPM Install: Progress: 82% Installed vmware-vmrc-6.5.0-4944578.x86_64.rpm
2018-01-12 22:50:56,840 - vCSACliInstallLogger - INFO - RPM Install: Progress: 84% Installed ipxe-1.0.0-1.4446055.vmw.i686.rpm
2018-01-12 22:51:02,256 - vCSACliInstallLogger - INFO - RPM Install: Progress: 87% Installed VMware-vcha-6.5.0-4944578.x86_64.rpm
2018-01-12 22:51:07,748 - vCSACliInstallLogger - INFO - RPM Install: Progress: 90% Installed vsphere-client-6.5.0-4944578.noarch.rpm
2018-01-12 22:51:13,246 - vCSACliInstallLogger - INFO - RPM Install: Progress: 95% Configuring the machine
2018-01-12 22:51:24,111 - vCSACliInstallLogger - TRACE - Cannot download file /var/log/firstboot/rpmInstall.json from esx60dl380.mydomain.net. Got error: Failed to download file from guest with error '(vmodl.fault.SystemError) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'A general system error occurred: vix error codes = (1, 0).\n',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) [],
   reason = 'vix error codes = (1, 0).\n'
}'

You know it's bad when Google can't find the error message.
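A side note on the "Failed to parse the status file as JSON" trace in the 6.5 log above: the logged content stops partway through the object, which would be consistent with the status file being fetched while it was still being written. That is my guess, not something the log confirms. Under that assumption, the Python sketch below shows why such a read fails and why the next retry can succeed. The function names are made up and this is not the installer's actual code.

```python
import json


def read_status(text):
    """Return the parsed status dict, or None if the text is not complete JSON yet."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


def poll_status(read_text, attempts=3):
    """Call read_text() up to `attempts` times until it yields parseable JSON."""
    for _ in range(attempts):
        status = read_status(read_text())
        if status is not None:
            return status
    return None


# A read that catches the file mid-write, followed by a complete one.
partial = '{"status": "running", "info": [], "question": null,'
complete = partial + ' "warning": [], "error": null}'
reads = iter([partial, complete])
print(poll_status(lambda: next(reads)))
```

A real poller would wait a few seconds between attempts. If this guess is right, that particular trace is harmless noise, and the later vix error is the one to chase.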
