Jcates28
Contributor

vCenter 6.5 Appliance - replace VMCA Root Cert with Custom Signing and replace all Certs - Failure @85%

Hey folks, I can't find much on this error after scouring the web and many blogs. The certificate replacement process (Option #2) looks like it completes successfully, then hangs at 85% during the "Starting Services" phase. It then reports that services failed to start due to a timeout and tries to roll back, but the rollback also fails, and I'm left with a semi-bricked appliance unless I restart it.

Here's what I'm doing:

I have a small lab on VMware Workstation 12 running ESXi 6.5d and vCenter Server Appliance 6.5d (I've also tried the 6.5a and 6.5b patches to see whether this is a new bug introduced in 6.5d), along with Horizon View 7. I want the vCenter appliance to act as a subordinate CA: replace the appliance's VMCA root cert with a certificate issued by my CA server, and have VMCA automate the replacement of all certificates, including those on the ESXi hosts, with CA-signed certs. I have a single-tier PKI using SHA-384 and 4096-bit keys.

I've spent several days and nights going over many articles and videos, and I don't appear to be missing any critical step in the setup of my certificate template for the vSphere 6.0 VMCA, the configuration of my CA, or the certificate signing process itself, but something is wrong somewhere.

What I've done so far (reference video: vSphere 6.0 Environment with Custom Certificates (External PSC) - YouTube):

1. I've installed a Server 2012 R2 Root CA in Enterprise mode with Certificate Web services and have created the template per guidance of this article and this video.

2. I've patched the vCenter to the latest build which is Version D Build #

3. I've duplicated the Subordinate CA Certificate template and have customized it per VMware guidance

4. I'm using VMCA Cert Tool to generate the CSR

5. I am able to successfully generate a certificate based on this CSR

6. I'm able to upload the cert chain and the key file provided by the vCenter appliance into the cert tool during the process for Option #2 from the main menu

7. The process executes and appears to update and replace all of the certs using the certificate I generated for VMCA

8. The process fails at 85% when attempting to start the services again

9. I've exhausted most of my troubleshooting and knowledge in this area

However, I'm running into this weird error when running through the process as described in this article:

Replacing a vSphere 6.x Machine SSL certificate with a Custom Certificate Authority Signed Certifica...

As soon as I get to 85% (starting services), it hangs for several minutes, then errors out and rolls everything back. Examining the logs, I can find no clear indication of what is failing other than services not starting. What doesn't make sense is that the certificate replacement was successful per the logs, so why would a failure to start these services cause the entire process to roll back?

What baffles me is that /storage/log/vmware/vmcad/certificate-manager.log contains messages that would lead one to believe the certificates were successfully replaced along the way.

2017-05-26T22:51:09.381Z INFO certificate-manager []

2017-05-26T22:51:09.382Z INFO certificate-manager Create a entry using Key and File generated earlier

2017-05-26T22:51:09.382Z INFO certificate-manager Running command :- ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'create', '--store', u'vpxd', '--alias', u'vpxd', '--cert', u'/storage/certmanager/rollback/vpxd_bkp.crt', '--key', u'/storage/certmanager/rollback/vpxd_bkp.priv']

2017-05-26T22:51:09.413Z INFO certificate-manager Command output :-

Entry with alias [vpxd] in store [vpxd] was created successfully

If I search the log for "error", the only items that show up are:

Service-control failed. Error Failed to start vmon services.vmon-cli RC=1, stderr=Failed to start vapi-endpoint, vpxd-svcs services. Error: Operation timed out

There's also mention of this during the rollback, but I don't find it useful at all:

2017-05-26T22:51:09.871Z ERROR certificate-manager 2017-05-26T22:51:09.833Z   Updating certificate for "com.vmware.vim.eam" extension

2017-05-26T22:51:09.871Z INFO certificate-manager Command executed successfully

2017-05-26T22:51:09.871Z INFO certificate-manager Running command : ['/usr/bin/python', '/usr/lib/vmware-vpx/scripts/updateExtensionCertInVC.py', '-e', 'com.vmware.rbd', '-s', 'vc1.lab.local', '-c', u'/storage/certmanager/rollback/vpxd-extension_bkp.crt', '-k', u'/storage/certmanager/rollback/vpxd-extension_bkp.priv', '-u', 'administrator@vsphere.local', '-p', '*****']

2017-05-26T22:51:10.109Z INFO certificate-manager Command output :-

2017-05-26T22:51:10.071Z   Updating certificate for "com.vmware.rbd" extension

2017-05-26T22:51:10.109Z ERROR certificate-manager 2017-05-26T22:51:10.071Z   Updating certificate for "com.vmware.rbd" extensio
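
The log search described above can be sketched in Python. This is a minimal sketch, assuming the log lines follow the `timestamp LEVEL certificate-manager message` layout of the excerpts quoted here and that the service-control failure text matches the line quoted in this post; the function name `summarize` and both regular expressions are illustrative, not part of any VMware tool.

```python
import re

# Assumed layout, taken from the excerpts in this thread:
#   2017-05-26T22:51:09.871Z ERROR certificate-manager <message>
LINE_RE = re.compile(r"^(\S+Z)\s+(\w+)\s+certificate-manager\s?(.*)$")

# Assumed failure text, taken from the service-control message in this thread:
#   ... stderr=Failed to start vapi-endpoint, vpxd-svcs services. Error: ...
FAILED_RE = re.compile(r"stderr=Failed to start ([\w\-, ]+?) services\.")


def summarize(lines):
    """Return (error_entries, failed_services) found in an iterable of log lines.

    error_entries is a list of (timestamp, message) for ERROR-level lines.
    failed_services is a de-duplicated list of service names, in log order.
    """
    errors, failed = [], []
    for raw in lines:
        line = raw.strip()
        match = LINE_RE.match(line)
        if match and match.group(2) == "ERROR":
            errors.append((match.group(1), match.group(3)))
        hit = FAILED_RE.search(line)
        if hit:
            for name in hit.group(1).split(","):
                name = name.strip()
                if name and name not in failed:
                    failed.append(name)
    return errors, failed
```

To use it, pass an open file, for example `summarize(open("/storage/log/vmware/vmcad/certificate-manager.log"))`. Note that some ERROR-level entries in the excerpt above only repeat a command's progress output, so the list of services that failed to start is usually the more useful half of the result.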

Any thoughts folks? 

Accepted Solutions
Jcates28
Contributor

So I wrote a short blog post on my Google+ about this.

VMCA Pitfalls:I'm sure most of you have had some experience with VMware's ...

The post summarizes critical details that don't appear to be in the documentation so far.

Some notes:

1. Make sure to snapshot the vCenter appliance before attempting cert replacement. Once the rollback had failed in my lab, nothing I did could get the operation to complete successfully. I'm sure I could have worked through this via a support call, but I did not want to invest the time.

2. If you are trying to use the VMCA as a subordinate CA, make sure to download the certificate chain and export every certificate in the chain as Base-64 encoded X.509 (see screenshots).

3. Certificate templates: there are a few templates you'll use, but keep in mind what you are doing. Are you trying to just replace the Machine SSL cert, or turn the appliance/vCenter into a subordinate CA so that VMCA will automate the replacement of all the other components' and solution users' certificates? See this post: Creating a Microsoft Certificate Authority Template for SSL certificate creation in vSphere 6.0 (211...

4. Creating the chain: don't rely on copy/paste. Use the command copy vmca.cer+root64.cer chain.cer (see screenshot). Note: your file names may differ. This eliminates the risk of garbage sneaking into the chain file from accidental spaces or other characters.

5. See my blog for the full output of the commands that need to be run, including the required inputs. **This is very critical.**
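
The concatenation in note 4 can also be sketched in Python for anyone not building the chain on Windows. This is a minimal sketch, not the method from the blog post: it assumes both inputs are Base-64 (PEM) exports, keeps only the text between the BEGIN/END CERTIFICATE markers, and drops stray whitespace and carriage returns. The function names `pem_blocks` and `build_chain` are hypothetical, and the sketch does not check that the Base-64 content is a valid certificate.

```python
import re

# Captures the Base-64 body between the PEM markers, ignoring anything
# outside them (stray text, blank lines, Windows line endings).
PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+(.+?)\s+-----END CERTIFICATE-----",
    re.DOTALL,
)


def pem_blocks(text):
    """Return each certificate found in text as a clean PEM block."""
    blocks = []
    for body in PEM_RE.findall(text):
        # Remove every whitespace character, then re-wrap at 64 columns.
        b64 = "".join(body.split())
        wrapped = "\n".join(b64[i:i + 64] for i in range(0, len(b64), 64))
        blocks.append(
            "-----BEGIN CERTIFICATE-----\n"
            + wrapped
            + "\n-----END CERTIFICATE-----\n"
        )
    return blocks


def build_chain(*pem_texts):
    """Concatenate certificates in the order given (VMCA cert first, root last)."""
    return "".join(block for text in pem_texts for block in pem_blocks(text))
```

With the file names from note 4, usage would be along the lines of `open("chain.cer", "w").write(build_chain(open("vmca.cer").read(), open("root64.cer").read()))`.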

8 Replies
JamieGator32
Enthusiast

I have been experiencing this exact scenario and posted this question earlier today and then I saw your post.  I will definitely be trying this next week and I will let you know the results.  Very nice explanation on Google+.

James F Cruce VCP6.5-DCV Gainesville VMUG Leader http://vmug.com/gainesville @jamescruce http://astgl.com
Jcates28
Contributor

Glad you found this useful! Let me know if you were successful! Enjoy the weekend!

JamieGator32
Enthusiast

Well, after much trial and error I had to put in a support request with VMware. I also meant to say earlier that I would be choosing to just replace the Machine SSL certs. I've been working with VMware all week and I still don't have my certificates successfully installed. There's an old saying: "What do you never want to be? An interesting problem." Support did confirm I was following the procedures exactly, but it would still roll back each time. Support also confirmed that all attributes were enabled for the certificates generated by the third-party CA. Right now my case is on hold while they research the unusual events they see in the logs. I'll keep you posted on the results.

crstrickland
Contributor

Try this:

chmod 777 .buildInfo

The file is located at /etc/vmware

GTO455
Enthusiast

I was having a similar issue where the replacement of a certificate on a VCSA (Version 6.5.0.10000 Build 5973321) would hang (and subsequently fail) at 85% and rollback to the original certificate.

EDIT: Only AFTER I performed the procedure below did I find the following in the Release Notes for VMware vCenter 6.5 Update 1, so I have no idea whether it would have fixed my issue. I'm putting this section here in case someone else is experiencing the same issue and wants to try this first. Good luck!

Custom certificate replacement fails on upgraded vCenter Server Appliance 6.5 Update 1

After you upgrade from vCenter Server Appliance 6.5 to 6.5 Update 1 and try to replace the Machine SSL certificate of vCenter Server Appliance, the operation fails because the vSphere Update Manager service cannot access the /etc/vmware/.buildinfo file as the file permission changed from 444 to 640.

Workaround:

  1. Log in as root to the vCenter Server Appliance.
  2. Change the file permission of /etc/vmware/.buildinfo from 640 back to 444 by running the following command:
     chmod 444 /etc/vmware/.buildInfo
  3. Replace the Machine SSL certificate.
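
Before running the workaround, it can help to confirm that the file's mode really has changed. The following is a minimal sketch using only the Python standard library; the helper names `mode_of` and `needs_fix` are hypothetical. The path is passed in as an argument because the release note above spells the file name both as `.buildinfo` and `.buildInfo`, so check which one exists on your appliance.

```python
import os
import stat


def mode_of(path):
    """Return the permission bits of path as an integer, e.g. 0o444."""
    return stat.S_IMODE(os.stat(path).st_mode)


def needs_fix(path, expected=0o444):
    """Return True when the permission bits differ from the expected mode."""
    return mode_of(path) != expected
```

For example, `oct(mode_of("/etc/vmware/.buildInfo"))` shows the current mode, and `needs_fix(...)` returning True suggests the `chmod 444` step in the workaround applies.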

So, after some extensive troubleshooting, I found that it was VUM on the VCSA that was causing the issue. Here's what I did to fix it.

1. Stop and disable VUM.

  1. Log in to the vCenter Server by using the vSphere Web Client.
  2. On the vSphere Web Client Home page, click System Configuration.
  3. Under System Configuration, click Services.
  4. From the Services list, select the VMware vSphere Update Manager service.
  5. From the Actions menu, select Stop.
  6. From the Actions menu, select Edit Startup Type.
  7. Select Disabled

2. Replace the self-signed certificate with your custom certificate.

This should complete successfully now that VUM isn't causing a rollback.

3. Reboot and verify your new certificate is installed.

4. Try to start VUM

Log into vCenter and change VUM back to Automatic startup. You can attempt to start the service but it should fail to start and generate a "sysimage.fault.SSLCertificateError" error everywhere you click in VUM.

5. Reset the VUM database and certificates.

SSH into the vCenter appliance and run the following commands to reset the VUM database, reset the VUM certificates, and re-register VUM with vCenter. Note: These commands are destructive and remove your existing VUM configuration. Use at your own risk!

  1. service-control --stop vmware-updatemgr
  2. /usr/lib/vmware-updatemgr/bin/updatemgr-util reset-db
  3. /usr/lib/vmware-updatemgr/bin/updatemgr-util refresh-certs
  4. /usr/lib/vmware-updatemgr/bin/updatemgr-util register-vc
  5. service-control --start vmware-updatemgr
  6. reboot
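
If you would rather script the sequence above than type it, the sketch below runs the same commands in order and stops at the first one that fails. It is only a sketch under stated assumptions: the command list is copied from the steps above, the wrapper `run_steps` is hypothetical, it defaults to a dry run that executes nothing, and the final reboot is deliberately left as a manual step. The same warning applies: a real run is destructive.

```python
import subprocess

# Commands copied from the steps above; the final reboot is left out on
# purpose so it stays a manual decision.
VUM_RESET_STEPS = [
    ["service-control", "--stop", "vmware-updatemgr"],
    ["/usr/lib/vmware-updatemgr/bin/updatemgr-util", "reset-db"],
    ["/usr/lib/vmware-updatemgr/bin/updatemgr-util", "refresh-certs"],
    ["/usr/lib/vmware-updatemgr/bin/updatemgr-util", "register-vc"],
    ["service-control", "--start", "vmware-updatemgr"],
]


def run_steps(steps, dry_run=True):
    """Handle steps in order and return them as command strings.

    With dry_run=True nothing is executed. With dry_run=False each command
    is run, and a non-zero exit code raises CalledProcessError, so later
    steps never run after a failed one.
    """
    handled = []
    for cmd in steps:
        if not dry_run:
            subprocess.run(cmd, check=True)
        handled.append(" ".join(cmd))
    return handled
```

Calling `run_steps(VUM_RESET_STEPS)` only returns the planned commands; pass `dry_run=False` on the appliance once you have a snapshot.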

You should now be able to get into VUM without issue and reconfigure baselines, download times, etc. Once this was complete, I had a fully functioning vCenter with a custom certificate (Hybrid Mode).

cypherx
Hot Shot

I'm having the same issue where the certificate update hangs at 85% waiting for services to start. The VMware VirtualCenter Server service hangs while starting, and it always creates a dump at C:\ProgramData\VMware\vCenterServer\logs\core.vpxd.PID.dmp. The .dmp file is around 124-127 KB.

I've tried option 8 right now to reset all the certs and start new, and I stopped and disabled the vSphere Update Manager Service which apparently worked for the last poster, and nothing.  I'm still sitting here at 85% waiting, and I see a core.vpxd file was generated so I bet this times out and reverts.

Thing is, everything was working great and trusted with our Windows CA. The cert just expired on 6/13. I opened a case, and we were on the phone and in a remote session for a few hours today but got nowhere. The engineer took the .dmp files that were generated for analysis and needs to get back to me. For now we have no vMotion or vSphere Update Manager function. vMotion does not think we are licensed for any features, so they are all greyed out.

We're on 6.0 build 5318200, which I think is Update 3b.

nneulspirent
Contributor

Not sure if this will help anyone else - but I got stuck on this myself.

Turns out the issue is that vCenter REALLY wants the true root certificate: the chain all the way to the root, not one of the intermediates. In my case, this was with a Comodo/PositiveSSL cert that had two intermediates finally going back to AddTrust.

What finally worked with certificate-manager:

1) When it asks for the machine cert, give it a file containing the new cert, the intermediates, and the ACTUAL root

2) When it asks for the key, give it the key file

3) When it asks for the root, give it just the root. In my case, this was the AddTrust CA root.

It turns out I had been giving it a root that was not all the way to the root authority.

Note - do NOT trust chain/path information from a GUI - as that will show shortest possible path to any trusted root. VCSA wants it going back to the self-signed actual root CA.
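
One way to check a chain file without relying on a GUI is to list each certificate's subject and issuer and confirm that the last one is self-signed. The sketch below is a minimal illustration, not a full validation: it compares names only and does not verify signatures. `read_names` assumes the `openssl` command is on the PATH and Python 3.7 or later; both function names are hypothetical.

```python
import re
import subprocess

PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def read_names(chain_path):
    """Return (subject, issuer) for each certificate in a PEM file, in file order."""
    with open(chain_path) as fh:
        blocks = PEM_RE.findall(fh.read())
    names = []
    for block in blocks:
        # openssl x509 reads one certificate from stdin and prints
        # "subject=..." and "issuer=..." lines.
        out = subprocess.run(
            ["openssl", "x509", "-noout", "-subject", "-issuer"],
            input=block + "\n", capture_output=True, text=True, check=True,
        ).stdout
        fields = dict(
            line.split("=", 1) for line in out.splitlines() if "=" in line
        )
        names.append((fields["subject"].strip(), fields["issuer"].strip()))
    return names


def check_chain(names):
    """Check (subject, issuer) pairs ordered leaf first, root last.

    Returns a list of problems; an empty list means every certificate is
    followed by its issuer and the last certificate is self-signed.
    """
    if not names:
        return ["chain is empty"]
    problems = []
    for i in range(len(names) - 1):
        issuer = names[i][1]
        next_subject = names[i + 1][0]
        if issuer != next_subject:
            problems.append(
                f"certificate {i} is issued by {issuer!r}, "
                f"but the next certificate is {next_subject!r}"
            )
    root_subject, root_issuer = names[-1]
    if root_subject != root_issuer:
        problems.append(
            f"last certificate {root_subject!r} is not self-signed; "
            f"its issuer {root_issuer!r} is missing from the file"
        )
    return problems
```

For example, `check_chain(read_names("chain.cer"))` (file name assumed) reports a "not self-signed" problem when the file stops at an intermediate, which is the situation described above.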
