Box293
Enthusiast

Having issues adding ESXi host to vCenter Server Appliance

I've been spending lots of time working on the new vCenter 5.1.0A.

I have two vCenter environments in my test and dev.

vCenter1 = Windows 2008 R2 vCenter server. It has certificates signed by my local domain Microsoft CA, set up by following Derek Seaman's blog http://derek858.blogspot.com.au/2012/09/vmware-vcenter-51-installation-part-1.html. I can add ESXi hosts to this vCenter with no problems. These ESXi hosts also have certificates signed by the local domain Microsoft CA.

vCenter2 = VMware vCenter Server Appliance. It has certificates signed by my local domain Microsoft CA, set up by following Doug Baer's blog http://www.goitpartners.com/blog/?p=662. I CANNOT add ESXi hosts to this vCenter when they have certificates signed by the local domain Microsoft CA. I CAN add an ESXi host only if I have NOT done anything with its certificates.

The error I get is:

License not available to perform the operation.
License file download from blade001.xxx.yyy to vCenter Server failed due to exception: vim.fault.SSLVerifyFault.

I've attached the screenshot "unable to add host to ESXi - custom cert.png" that shows this.

When I add an ESXi host that has NOT had the certificate replaced, I get a prompt saying that vCenter is unable to verify the authenticity of the host, and it asks me to verify the thumbprint. I click yes to verify and the host is added successfully.

I've attached the screenshot "Adding host blade002 with default cert.png" that shows this.

So this is really puzzling. When a host has a replaced certificate, the thumbprint MUST be getting verified by the vCenter Appliance, because I DO NOT get prompted about the authenticity of the host.
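For anyone who wants to compare thumbprints by hand, here is a minimal sketch. It assumes the host certificate has been copied off the ESXi host as a PEM file (on ESXi 5.x it normally lives at /etc/vmware/ssl/rui.crt) and that the openssl command line tool is available. The function name is made up for illustration and is not a VMware tool.

```shell
#!/bin/sh
# Print the SHA-1 thumbprint of a PEM certificate in colon-separated form,
# which is the form shown in the "verify authenticity" prompt.
# cert_thumbprint is a hypothetical helper name.
cert_thumbprint() {
    # openssl prints "<label> Fingerprint=AA:BB:..."; keep the part after "=".
    openssl x509 -noout -fingerprint -sha1 -in "$1" | cut -d= -f2
}

# Usage (the path is an assumption):
#   cert_thumbprint ./rui.crt
```

Running this before and after replacing the certificate shows whether the thumbprint vCenter would see has actually changed.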

Is anyone else experiencing this?

The one thing I really want to make clear is:

  • I can add the ESXi host with replaced certificates to a vCenter 5.1.0A server running on Windows 2008 R2 (vCenter1)
  • This confirms to me that there is nothing wrong with the certificate on the ESXi host
  • I remove the host from vCenter1
  • I try to add the host to vCenter2 (Appliance) and get the error; the host never gets added to vCenter
VCP3 & VCP4 32846 VSP4 VTSP4
25 Replies
AlexandrZubko
Contributor

Did you resolve this problem?

Box293
Enthusiast

It is currently an unsolved mystery!

AlexandrZubko
Contributor

If you add the host to vCenter (add it, then remove it) before replacing the certificates, the problem does not occur afterwards.

Box293
Enthusiast

That is very interesting.

silverline
Contributor

They still haven't fixed this issue apparently.

I was in an upgrading mood lately and re-installed my domain controller (and CA) on a Win2012 R2 VM. I then upgraded from a Windows-based vCenter to VCSA 5.5, and my ESX host to 5.5.

Since I installed a new CA, I issued a new root cert and set up everything based on that. Everything was looking good until I went to add my host and received this same error message. I no longer have the original SSCs that were installed by vSphere, but I had only renamed the old CA-signed certs, so those were still there.

After reading this thread I went back and made active the old certs signed by the previous CA which is now deleted and hasn't issued any certs for the infrastructure.  Adding the host then worked with the thumbprint acceptance notification.

Afterwards, I went back and switched these certs around one more time, so the correctly signed CA certs are now active.  And vCenter seems fine.

Way to go VMware. Great QA!

cepoon
Contributor

I'm kind of surprised at the lack of resolution to this (and Google doesn't give me a lot of results regarding this specific problem) - for the longest time I thought I had working ESXi hosts that used SSL certificates signed by an internal CA. Has anybody actually opened a case with VMware on this issue, which still exists in both 5.1 and 5.5?

My first set of installs: vCenter Server Appliance 5.1 (original build 5200, since updated to the latest build in the 5.1 release) with a single ESXi 5.1 host. The host was already added into vCenter before the certificate was replaced (so technically the vCenter database would have the thumbprint of the original self-signed certificate generated during install). All operations seem to be fine.

My second set of installs is where I discovered that my first set might not be working properly - same sequence, so the host certificate was replaced AFTER it was added into vCenter. When I tried to deploy a template with customization, it failed. However, without customization, the operation succeeded.

Seeing the symptoms from my 2nd set, I went to build a fresh lab where the host certificate is replaced BEFORE adding it into the lab vCenter - that failed with an SSLVerify error, so basically you cannot add a host that doesn't have a self-signed certificate.

What is interesting is that none of the existing KBs addresses this issue (KB 2036744 covers the vCenter appliance 5.1 certificate change and KB 2015499 covers the ESXi 5.1 host certificate change, but neither says which should go first). Turning on trivia logging in vCenter doesn't add much more info, and it seems to be failing the verification when it invokes the OpenSSL libraries. This is what I see in vpxd.log:

2013-11-13T00:18:40.247Z [7F2FF2E2E700 info 'vpxdvpxdMoLicenseManager' opID=31582879] [LicMgr] Downloading Dlfs for Host 'VMware ESX Server', Version: '5.0', File Version: '5.1.1.0', Dlf Directory Location: '/etc/vmware-vpx//licenses/site//VMware ESX Server/5.0/5.1.1.0'
2013-11-13T00:18:40.258Z [7F2FF2E2E700 warning 'Default' opID=31582879] SSL Error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
-->
2013-11-13T00:18:40.258Z [7F2FF2E2E700 warning 'Default' opID=31582879] SSL: connect failed
-->
2013-11-13T00:18:40.258Z [7F2FF2E2E700 error 'provisioningvpxNfcClient' opID=31582879] [VpxNfcClient] Unable to connect to NFC server: The remote host certificate has these problems:
-->
--> * unable to get local issuer certificate

and proceeds to dump the stack trace.
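To pull just these failures out of a large log, something like the following sketch may help. The patterns are the exact strings from the excerpt above plus the SSLVerifyFault name from the original error. The function name and the log path in the usage line are assumptions.

```shell
#!/bin/sh
# List the certificate-verification failures recorded in a vpxd log.
# ssl_verify_failures is a hypothetical helper name.
ssl_verify_failures() {
    grep -E 'certificate verify failed|unable to get local issuer certificate|SSLVerifyFault' "$1"
}

# Usage on the appliance (log path assumed):
#   ssl_verify_failures /var/log/vmware/vpx/vpxd.log
```

Matching on the opID value of a hit (31582879 in the excerpt) is then a quick way to find the other lines belonging to the same operation.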

For a host that doesn't use a self-signed certificate, vCenter never stores the thumbprint in the database (it should be empty). Now when vpxd starts up, I noticed this message in the log:

2013-11-12T23:43:18.403Z [7F2FFF8B2720 info 'vpxdvpxdMain'] [VpxdMain] Setting OpenSSL verify locations CAFile= CAPath=/etc/ssl/certs

Which suggests to me that the OpenSSL libraries should be able to use the certificate directory /etc/ssl/certs to verify a remote server when making an SSL connection. I had my internal CA cert in that directory, soft-linked to its subject hash, and did the same for the intermediate CA certs (even though that is technically not necessary, because those certs are already returned by the ESXi host in its certificate chain).
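For readers who have not done the subject-hash linking before, this is roughly what it looks like. It is a sketch only: it assumes the openssl command line tool, and that the CA certificate is already sitting in the CApath directory as a PEM file holding exactly one certificate. The function name is hypothetical.

```shell
#!/bin/sh
# Create the <subject-hash>.0 symlink that OpenSSL uses to look up a CA
# certificate inside a CApath directory such as /etc/ssl/certs.
# link_ca_cert is a hypothetical helper name.
link_ca_cert() {
    cert=$1                      # PEM file already inside the CApath directory
    dir=$(dirname "$cert")
    hash=$(openssl x509 -noout -hash -in "$cert") || return 1
    # ".0" is the first slot; a second cert with the same hash would need ".1".
    ln -sf "$(basename "$cert")" "$dir/$hash.0"
}

# Usage (file name is an assumption):
#   link_ca_cert /etc/ssl/certs/my-root-ca.pem
```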

Reading through the KB article still doesn't tell me whether any of the steps will actually help this verification. Any ideas in the community about using a non-self-signed cert on an ESXi host? Right now, for the 2nd set of installs, I have actually reverted to the self-signed cert on the host.

silverline
Contributor

Yeah, I found out later that reconnecting the host after replacing the faulty certificate with the correct CA-signed one broke the ability to apply customization as well.

I set everything up from scratch several times to get these steps right and the certs all correct so it is frustrating that it still doesn't work.

Does anyone have CA-signed certs working with VCSA? If not, I don't understand why VMware doesn't release a KB addressing this.

cepoon
Contributor

I just opened a case with VMware; we'll see how far it goes.

You can replace the certs in VCSA so you don't get a certificate warning when you connect via vSphere client / web client, but you can't replace the cert on the ESXi host. And really, you don't need to finish all the steps in KB2036744 to get the front end to report a signed certificate chain - I really didn't care about the internal certs used by SSO / Inventory Service / VAMI, etc - I only cared about the front end, and that means vSphere Web Client / vSphere Client against the VCSA, and the ESXi host. The last item is not working.

Basically the minute you successfully complete "vpxd_servicecfg certificate change", the vCenter front end should already be reporting the new certs to the client / browser - the rest of it I really didn't care about (not to mention that they all had to have different subject names because they get registered into SSO). Even the first few steps of KB2036744 I have opinions about (why stick the whole cert chain as a single file into /etc/ssl/certs and symlink it to the hash, when each root CA and intermediate CA cert should be in its own file and symlinked separately?)
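To illustrate the "one file per CA cert" point, here is a rough sketch that splits a PEM bundle into individual files so each can be hash-linked on its own. It only needs awk. The function name and the cert-NN.pem output naming are assumptions made for this example.

```shell
#!/bin/sh
# Split a PEM bundle into one file per certificate
# (cert-01.pem, cert-02.pem, ...) inside the given output directory.
# split_chain is a hypothetical helper name.
split_chain() {
    bundle=$1
    outdir=$2
    awk -v dir="$outdir" '
        /-----BEGIN CERTIFICATE-----/ { n++; out = sprintf("%s/cert-%02d.pem", dir, n); inside = 1 }
        inside { print > out }      # copy only lines that belong to a certificate
        /-----END CERTIFICATE-----/ { inside = 0; close(out) }
    ' "$bundle"
}

# Usage (paths are assumptions):
#   split_chain ./chain.pem /tmp/split
```

Anything in the bundle that sits outside a BEGIN/END pair, such as comment text between certificates, is dropped.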

pfurnessSKA
Contributor

So, has VMware bothered to give you any kind of response yet?

I've just come on to this thread after a wasted week trying to generate and install certificates signed by my own CA, and it's making me scream that not only is vcenter server appliance so badly designed that it requires 7 separate certificates (Why the heck can't it just use one certificate with soft links to it?!), but the documentation is so poor that it's almost impossible to get it to actually work, and now it looks like actually secure certificates cannot be used at all on the ESXi hosts that vCenter is meant to manage.

cepoon
Contributor

Well, after about 2+ hours of "looking around", they ended up asking me for a log bundle from both the vCenter appliance and the host that is involved. Since then (which was Monday), no updates via email or phone yet.

For vCenter I really question the need to replace all 7 certs - as I have mentioned above, the minute you run "vpxd_servicecfg certificate change", your vCenter appliance should already be supplying the new certs over port 5480 (appliance management) or 9443 (vSphere web client). The rest in my opinion is just aesthetics as I couldn't see how changing the "front end" cert requires the changing of all those other "back end" certs. Unless of course, some other thing will actually stop working (which shouldn't be a surprise by now)

My guess is that the appliance was made due to the demand for a non-Windows centric solution as historically, the management software suite was Windows-based, and it's so drastically different that it's still going through growing pains. Hopefully VMware won't abandon the non-Windows stream

pfurnessSKA
Contributor

Hmm, not the best support experience, is it? :)

Thanks to the stuff in this thread, I did get it working acceptably, and having played this game for a week, I agree with you cepoon, I think the only certificate that really matters on the vCenter appliance is the first one.

In the end, I got it working like this: Deploy the vCenter appliance, add the ESXi hosts, and then - and ONLY then - replace the self-signed certificates on both the ESXi hosts and the vpxd service on the appliance. I'd massively prefer to put valid certificates everywhere before connecting anything to anything, but at least it's possible to use real certificates eventually.

I do wonder about what you say about linux being a tossed-in afterthought. vCenter is so badly implemented on there, and so outstandingly slow to start up, that it cannot have been designed for that environment. Plus, of course, there's the fact that I found typos in a couple of the linux scripts that you're supposed to run to change certificates, which would never have got through any kind of testing - just try running the script, it throws errors!

I wonder if the reason that it's a linux appliance is related to MS licensing as much as anything else? Hey-ho, never mind. It works (for a particularly low value of "works"), and if only it did some meaningful logging of stuff it wouldn't be that bad at all.

cepoon
Contributor

Sorry no - this is my first time ever invoking VMware support (I just don't deal with vSphere enough to have needed it before) and now we are trying to repeat this KB: http://kb.vmware.com/kb/2015499

I have a hard time convincing myself that following it to a tee will reveal anything that we haven't seen before, but I'm open to being surprised...

pfurnessSKA - what you did will work until you need to deploy templates with customization, as that operation will still fail. In my experiments, you have to revert the host cert to self-signed for that to work. If that's not something you need for your particular infrastructure then it's a feasible workaround. For me, it wasn't.

I will be diplomatic here on their own forums and say that vCenter is probably a very complicated system that just needed to be delivered to the world quickly - we seemingly live in a world of continuously "debugging" software because there just isn't enough time to do it right the first time, because everybody else is trying to beat you to the market. Windows licensing is a concern for some, and in my particular case it's far too much overhead to run the vCenter management platform on Windows, me being the UNIX guy who just loathes Windows Server for complexity that I never really use. It probably just needs time, time that VMware doesn't seem to have in this crazy race of virtualizing any and every piece of hardware you can just to save some money, only to pay it back in license fees.

Back to the problem at hand - I hope someone has actually run the steps in KB 2015499 to a tee and still had it fail, just to prove that I don't actually have to waste time on it again.

silverline
Contributor

I certainly have gone through the steps in that KB many times.

It doesn't take that long to do, but I highly doubt it will address the problem.

Might as well go through it to prove to support that it isn't working.

cepoon
Contributor

Nothing interesting in the logs according to support, so I guess they just had to see me working the KB over WebEx to be convinced it doesn't work

cepoon
Contributor

And after some unrelated changes to vCenter, the hosts are added ok - I cannot isolate what I did that is related:

1) Enable vSphere Auto Deploy

2) Enable atftpd

3) Build Auto Deploy ruleset

4) Added another host (tried it straight with the signed cert)

What I did notice though was when I rebuild another host, and change it to a non-self-signed cert, there is a difference:

a) if the cert has no intermediate CA, vCenter says it cannot verify and prompts you to trust the cert's fingerprint

b) if the cert has intermediate CA (in my case, 2 of them), vCenter just proceeds to add the host
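To tell which of the two cases a given host falls into, one option is to count how many certificates it actually presents. This is a sketch assuming the openssl command line tool and port 443 on the host. The function name and the file name in the usage lines are made up, and the host name is simply the one from the first post.

```shell
#!/bin/sh
# Count the certificates in a saved "openssl s_client -showcerts" capture
# (or in any PEM file). 1 means only the host cert was presented; more than
# 1 means intermediate or root certs came along with it.
# count_presented_certs is a hypothetical helper name.
count_presented_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage (host name and port are assumptions):
#   openssl s_client -connect blade001.xxx.yyy:443 -showcerts </dev/null 2>/dev/null > presented.txt
#   count_presented_certs presented.txt
```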

I probably have to blow away this lab instance of vCenter and repeat the exercise to isolate what made it work

cepoon
Contributor

So I have found a workaround if you really need a non-self-signed cert on the ESXi host - do not install the root CA cert and intermediate certs that signed the ESXi host cert in the vCenter appliance under /etc/ssl/certs. What this will do is cause vCenter to fail SSL verification "early", so that it will prompt you to trust the ESXi host cert. If you do remove the CA certs (root and intermediate), you will have to at least restart vpxd (/etc/init.d/vmware-vpxd restart) - you might be able to get away without doing so (it's the OpenSSL libraries that do the first SSL verification) but I always restarted during my tests.

The caveat with this, by my guess, is that if you replaced the internal certs on vCenter appliance with certs that are signed by the same CA chain, things could break - I haven't tried going without the CA certs in /etc/ssl/certs on my production instance yet

On the other hand - the conclusion I came to is solely based on all the test cases I ran on my own after vCenter magically added my host during a troubleshooting session with VMware - I had to revert to a fresh vCenter appliance 2 or 3 times to verify that this is repeatable. I would speculate that somewhere in the internal routines it forgot to remind OpenSSL that the CA certs are in /etc/ssl/certs, and from the error messages I got out of the logs, I'm fairly certain it's OpenSSL doing all verification rather than some Java routines (which would have required a JKS keystore to be modified) as the error code returned matches an existing OpenSSL error code
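Since the failure looks like plain OpenSSL verification, it can be reproduced outside vpxd. The sketch below assumes the openssl command line tool and a copy of the host certificate as a PEM file. The function name and paths are assumptions, and this says nothing about what vpxd does internally.

```shell
#!/bin/sh
# Check a certificate against a CApath directory the way an OpenSSL-based
# client would. Succeeds only if openssl reports "<file>: OK".
# verify_against_capath is a hypothetical helper name.
verify_against_capath() {
    capath=$1   # directory of hash-linked CA certificates
    cert=$2     # PEM certificate to check
    # Older openssl builds may exit 0 even on failure, so test the output text.
    openssl verify -CApath "$capath" "$cert" 2>&1 | grep -q ': OK$'
}

# Usage (paths are assumptions):
#   verify_against_capath /etc/ssl/certs ./rui.crt && echo "chain verifies"
```

If this fails with the CA certs in place, running openssl verify directly shows the reason, for example the same "unable to get local issuer certificate" text that appears in vpxd.log.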

spea
Contributor

VMware KB:    Configuring Certificate Authority (CA) signed certificates for vCenter Server Applian...

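To clear the SSLVerifyFault error after assigning certificates:

1. Open an SSH session to vCenter and reboot:

shutdown -r now

2. Open an SSH session to vCenter, move the certs directory aside, and reboot:

cd /etc/ssl
mv certs certs_org
shutdown -r now

3. Add your first ESXi host.

4. Open an SSH session to vCenter, restore the certs directory, and reboot:

cd /etc/ssl
mv certs_org certs
shutdown -r now

5. Add the other ESXi hosts.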

DougBaer
VMware Employee

I figured I would add my experience here in case it helps someone.

Adding a new host with a CA-signed certificate to an empty inventory in vCenter seems to give this error, but adding subsequent hosts with certificates signed by the same CA seems to work fine. It looks like there is some linkage that is not being made the first time.

So, if I add the first host to the vCenter using the default self-signed certificates, then swap the certificate for the CA-signed certificate and restart the management agents on the ESXi host, I can add subsequent hosts to the vCenter without issue. I really don't know why, but I'll see what I can find out when I get some time.

-Doug

Doug Baer, Staff Architect, Sr. Manager of vPod Architecture team for the VMware Hands-on Labs | VCDX #019, vExpert 2012-20 | @dobaer
csdibiase
Contributor

cepoon wrote:

Back to the problem at hand - I hope someone has actually run the steps in KB 2015499 to a tee and still had it fail, just to prove that I don't actually have to waste time on it again.

Well as best I can tell I've gone through KB 2015499 completely and the host I tested it on is in exactly that state. I actually only updated certificates on one host to provide a good A/B test for VMware support. Deploy with customization to host 3 (which in my dev cluster is the one with CA signed SSL certificates) and it fails at customization with "Authenticity of the host's SSL certificate is not verified". Deploy the same template and customization spec to host 2, which has the original self signed SSL certificates, and customization is successful.

I'm at the point where VMware support has observed the issue and has logs of the issue but I've not gotten further. I'll post again after giving them a couple of days to mull over the logs.
