6 Replies Latest reply on Oct 22, 2019 10:16 AM by Vijay2027

    Auto Deploy 6.7 U1b - certificate issues at waiter.tgz

    TimR26 Novice

      I'm working on an Auto Deploy setup and have gotten to the point where the ESXi installation begins, but it stalls when trying to install waiter.tgz. I checked /var/log/vmware/rbd/rbd-cgi.log and found this:

       

      beacon:Adding etc/vmware/autodeploy/waiternotify.json

      cache_tables:cache request ....

      sslutil:cert files are missing from <UID> (autodeployhost.contoso.com)

      sslcert:Generating SSL cert for <UID> (autodeployhost.contoso.com)

      ERROR:vmcacertutil:Could not generate certificates for: autodeployhost.contoso.com

      rc:0

      out: b'Error: 5, VMCASignedCertificatePrivate() failedError Code: 5\nMessage: UNKNOWN\n'

       

      I'm not sure what other info to provide to help determine the issue. Any suggestions or guidance?
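
For anyone checking whether they are hitting the same failure, here is a minimal Python sketch that scans the text of rbd-cgi.log for the two lines quoted above. The regular expressions are derived only from this excerpt, so other builds may word these lines differently, and the function name `find_cert_failures` is made up for illustration.

```python
import re

# Patterns copied from the log excerpt in this thread (assumption: the wording
# is stable across builds; adjust if your log differs).
CERT_FAILURE = re.compile(
    r"ERROR:vmcacertutil:Could not generate certificates for: (\S+)"
)
VMCA_CODE = re.compile(r"Error: (\d+), VMCASignedCertificatePrivate\(\) failed")


def find_cert_failures(log_text):
    """Return (hostname, vmca_error_code) pairs found in the log text.

    The error code is None when no 'Error: N' line follows the failure line.
    """
    failures = []
    pending_host = None
    for line in log_text.splitlines():
        match = CERT_FAILURE.search(line)
        if match:
            # A new failure started before the previous one got a code.
            if pending_host is not None:
                failures.append((pending_host, None))
            pending_host = match.group(1)
            continue
        match = VMCA_CODE.search(line)
        if match and pending_host is not None:
            failures.append((pending_host, int(match.group(1))))
            pending_host = None
    if pending_host is not None:
        failures.append((pending_host, None))
    return failures
```

To use it, read the file on the appliance, for example `find_cert_failures(open("/var/log/vmware/rbd/rbd-cgi.log").read())`, and look for entries with code 5.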

        • 1. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
          msripada Expert
          vExpert

          Error 5 is access denied. I'm not sure whether we are actually hitting an access-denied condition here, though, because I don't see a reason for one in Auto Deploy.

           

          Have you tried restarting the rbd service and then tried deploying the ESXi host?

           

          Can you check vpxd.log and rbd.log?

           

          Thanks,

          MS

          • 2. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
            matthewingram Lurker

            Did you find a resolution for this issue?

            • 3. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
              dbuenoparedes Novice

              I have the same problem and see the same log entries in /var/log/vmware/rbd/rbd-cgi.log. I was trying to re-deploy the host using Auto Deploy after upgrading the VCSA from 6.0U3 to 6.7U2c. I even tried removing the host from the vCenter Server inventory, but the problem persists: it gets stuck downloading waiter.tgz.

               

              There seems to be a problem with the new certificate that Auto Deploy tries to issue when the host gets re-deployed. I checked with certool using the following command, and I see there is still a certificate for that host that wasn't deleted when I removed it from the inventory:

              /usr/lib/vmware-vmca/bin/certool --enumcert --filter=all | less
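
The same check can be narrowed to a single host. This is a rough Python sketch that runs the command above and keeps only the output lines mentioning a given hostname. It makes no assumption about the exact layout of the certool output, only that the hostname appears somewhere in it; the helper names `enumerate_certs` and `lines_mentioning` are made up for illustration.

```python
import subprocess

# Path and flags exactly as used in the command above.
CERTOOL = "/usr/lib/vmware-vmca/bin/certool"


def enumerate_certs():
    """Run certool and return its raw text output (must run on the appliance)."""
    result = subprocess.run(
        [CERTOOL, "--enumcert", "--filter=all"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def lines_mentioning(text, hostname):
    """Return stripped output lines that contain the hostname, case-insensitively."""
    needle = hostname.lower()
    return [line.strip() for line in text.splitlines() if needle in line.lower()]
```

For example, `lines_mentioning(enumerate_certs(), "autodeployhost.contoso.com")` returns an empty list when no certificate output references that host.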

               

              Edit: I've tried re-deploying the host after restarting the Auto Deploy waiter service. I also rebooted the VCSA once after removing the ESXi host from the inventory, but it still gets stuck at the same step of the deployment.

              • 4. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
                Vijay2027 Expert
                vExpert

                This requires assigning the necessary permissions to the waiter user, which has to be done by connecting to the vmdird DB with an LDAP browser (JXplorer).

                As the steps involved require modifying the vmdird DB, file an SR with GSS to get this sorted.

                • 5. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
                  dbuenoparedes Novice

                  Thanks for the reply, Vijay2027, you nailed it. I ended up opening a ticket with VMware support. They checked these log files:

                  • /var/log/vmware/rbd/rbd-cgi.log (VCSA)
                  • /var/log/vmware/vmcad/vmcad-syslog.log (PSC)

                   

                  We have an external PSC deployment in our environment. The key was in the following lines of vmcad-syslog.log:

                  2019-10-18T18:27:47.942203+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:27:47.942370+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:27:47.942537+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:28:08.373709+00:00 info vmcad  t@140271253645056: VMCACheckAccessKrb: Authenticated user waiter-d0cef9c5-5f40-4671-83f7-f611d19354cb@vsphere.local

                  2019-10-18T18:28:08.380445+00:00 info vmcad  t@140271253645056: Checking upn: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local against CA admin group: waiter-d0cef9c5-5f40-4671-83f7-f611d19354cb@vsphere.local

                  2019-10-18T18:28:08.380970+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:28:08.381299+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:28:08.381563+00:00 warning vmcad  t@140271253645056: error code: 0x00000005

                  2019-10-18T18:28:09.205803+00:00 info vmcad  t@140271253645056: VMCACheckAccessKrb: Authenticated user waiter-d0cef9c5-5f40-4671-83f7-f611d19354cb@vsphere.local

                  2019-10-18T18:28:09.210938+00:00 info vmcad  t@140271253645056: Checking upn: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local against CA admin group: waiter-d0cef9c5-5f40-4671-83f7-f611d19354cb@vsphere.local

                   

                  What support ended up doing was connecting to the PSC via LDAP (with JXplorer) and creating the waiter-d0cef9c5-5f40-4671-83f7-f611d19354cb@vsphere.local user, which was missing from the CAAdmins group. After this user was created, I was able to re-deploy the ESXi host without any issue. There were two other waiter users with different strings of characters after "waiter-", but for some reason Auto Deploy was looking for this one specifically, and it was missing from that group.
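
To spot the same pattern in your own vmcad-syslog.log, a small Python sketch like the one below can list which users VMCA authenticated and count the 0x00000005 warnings. The patterns come only from the lines quoted above, the function name `summarize_vmcad` is made up for illustration, and the script does not prove that a given warning belongs to a given user; it only shows both in one place so you know which waiter account to look up in the CAAdmins group.

```python
import re
from collections import Counter

# Patterns copied from the vmcad-syslog.log lines quoted in this post
# (assumption: the wording is the same in your build).
AUTH_USER = re.compile(r"VMCACheckAccessKrb: Authenticated user (\S+)")
ACCESS_DENIED = re.compile(r"error code: 0x00000005\b")


def summarize_vmcad(log_text):
    """Return (Counter of authenticated users, number of error-code-5 lines)."""
    users = Counter()
    denied = 0
    for line in log_text.splitlines():
        match = AUTH_USER.search(line)
        if match:
            users[match.group(1)] += 1
        elif ACCESS_DENIED.search(line):
            denied += 1
    return users, denied
```

Calling `summarize_vmcad(open("/var/log/vmware/vmcad/vmcad-syslog.log").read())` on the PSC gives the user names to compare against the members of the CAAdmins group.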

                   

                  I hope this helps.

                  • 6. Re: Auto Deploy 6.7 U1b - certificate issues at waiter.tgz
                    Vijay2027 Expert
                    vExpert

                    Right. Sometimes we end up re-creating the ID using dir-cli if the user doesn't exist.