VMware Cloud Community
U32Frank
Contributor

SSL issues on single node causing backup issues

Hello everyone,

We recently had a company install a new three-node vSphere environment, which initially seemed to go well. However, I noticed that some Veeam jobs failed when the VM being backed up was on one specific ESXi node, while everything worked fine on the other two. Initially I thought this was a Veeam issue, but digging into the Veeam logs I found the following error:

[04.07.2019 12:24:54] < 3040> vdl| WARN|[vddk] [NFC ERROR] NfcNewAuthdConnectionEx: Failed to connect: The remote host certificate has these problems:
[04.07.2019 12:24:54] < 3040> vdl| WARN|[vddk]
[04.07.2019 12:24:54] < 3040> vdl| WARN|[vddk] * A certificate in the host's chain is based on an untrusted root.

This pointed me towards the ESXi server as the source of the issue. I dug into /var/log/vmauthd.log on the affected ESXi server and found the following:

2019-07-08T13:47:44Z vmauthd[2108903]: lib/ssl: OpenSSL using FIPS_drbg for RAND
2019-07-08T13:47:44Z vmauthd[2108903]: lib/ssl: protocol list tls1.2
2019-07-08T13:47:44Z vmauthd[2108903]: lib/ssl: protocol list tls1.2 (openssl flags 0x17000000)
2019-07-08T13:47:44Z vmauthd[2108903]: lib/ssl: cipher list ECDHE+AESGCM:RSA+AESGCM:ECDHE+AES:RSA+AES
2019-07-08T13:47:44Z vmauthd[2108903]: lib/ssl: curves list prime256v1:secp384r1:secp521r1
2019-07-08T13:47:44Z vmauthd[2108903]: Connect from remote socket (172.18.4.53:61252).
2019-07-08T13:47:44Z vmauthd[2108903]: Connect from 172.18.4.53
2019-07-08T13:47:44Z vmauthd[2108903]: SSL Error: error:14094418:SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca
2019-07-08T13:47:44Z vmauthd[2108903]: recv() FAIL: 1.
2019-07-08T13:47:44Z vmauthd[2108903]: VMAuthdSocketRead: read failed.  Closing socket for reading.
2019-07-08T13:47:44Z vmauthd[2108903]: Read failed.

This looks like an issue with the certificate authority on the affected host, which would tie in nicely with the error I am seeing in Veeam. I then compared the CA store on one of the working nodes with the one on the non-working node using this command:

openssl crl2pkcs7 -nocrl -certfile /etc/vmware/ssl/castore.pem | openssl pkcs7 -print_certs -noout

Working ESXi node

subject=/CN=CA/DC=vsphere/DC=local/C=US/ST=California/O=DWLAN-VCA01.brand.local/OU=VMware Engineering
issuer=/CN=CA/DC=vsphere/DC=local/C=US/ST=California/O=DWLAN-VCA01.brand.local/OU=VMware Engineering
subject=/O=VMware/CN=SMS-190614111842368
issuer=/O=VMware/CN=SMS-190614111842368

Non-working ESXi node

subject=/CN=CA/DC=vsphere/DC=local/C=US/ST=California/O=DWLAN-VCA01.brand.local/OU=VMware Engineering
issuer=/CN=CA/DC=vsphere/DC=local/C=US/ST=California/O=DWLAN-VCA01.brand.local/OU=VMware Engineering
subject=/O=VMware/CN=SMS-190614111842368
issuer=/O=VMware/CN=SMS-190614111842368

And they are identical. I'm not sure how to move forward from here. Can anyone help at all?

Thanks in advance. Frank
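
One caveat on the comparison in this post: matching subject and issuer lines do not prove that two bundles hold the same certificates, because a regenerated CA can carry the same names with a different key. Comparing fingerprints would settle that. The following is a minimal sketch, not a VMware or Veeam tool. It assumes Python 3 and that the text of castore.pem from each node has been copied somewhere the script can read it; the function names are made up for illustration.

```python
import base64
import hashlib
import re

# Matches the base64 body of each certificate in a PEM bundle.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def bundle_fingerprints(pem_text):
    """Return the SHA-256 fingerprint of every certificate in a PEM bundle."""
    fingerprints = []
    for body in PEM_BLOCK.findall(pem_text):
        # Strip line breaks, decode to DER, and hash the raw DER bytes.
        der = base64.b64decode("".join(body.split()))
        fingerprints.append(hashlib.sha256(der).hexdigest())
    return fingerprints


def compare_bundles(pem_a, pem_b):
    """Return (only_in_a, only_in_b) as sorted lists of fingerprints."""
    a = set(bundle_fingerprints(pem_a))
    b = set(bundle_fingerprints(pem_b))
    return sorted(a - b), sorted(b - a)
```

Two empty lists from `compare_bundles` mean the bundles really contain the same certificates. Note also that castore.pem holds the CAs the host trusts, while the Veeam error is about the chain the host presents, so the same check is probably worth running against the host's own certificate as well (on ESXi this is normally /etc/vmware/ssl/rui.crt).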

7 Replies
daphnissov
Immortal

Are these hosts connected to a vCenter Server? If so, regenerate all certs from the vSphere Client and try again.

U32Frank
Contributor

Thanks for the quick response. Yes, they are connected to a vCenter Server.

Is there any risk in doing this, and is this the guide you would follow?

VMware Knowledge Base

daphnissov
Immortal

There shouldn't be any risk if using the VMCA to issue certs. As long as everything talks through vCenter and not the ESXi hosts directly, you're fine (and even then you'll just have to accept the new cert). Depending on your client, you can just right-click a host and go to (in the Flex client) Certificates > Refresh certificates.
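
To confirm that a refresh actually replaced the host certificate, the fingerprint the host presents can be recorded before and after. This is a rough sketch under a few assumptions: Python 3 on a machine that can reach the host on port 443, and that the host presents the same certificate there as it does to NFC clients. `presented_fingerprint` and its `fetch` parameter are hypothetical names, and the hostname in the comment is a placeholder.

```python
import hashlib
import ssl


def presented_fingerprint(host, port=443, fetch=ssl.get_server_certificate):
    """Return the SHA-256 fingerprint of the certificate served on host:port.

    ssl.get_server_certificate does not validate the certificate unless it is
    given a CA file, which is what we want when inspecting an untrusted chain.
    The fetch parameter exists only so the function can be exercised offline.
    """
    pem = fetch((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


# Example (placeholder hostname): run once before and once after the refresh
# and compare the two values.
# print(presented_fingerprint("esxi-node.example.local"))
```

If the value is unchanged after the refresh, the host is still serving the old certificate.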

U32Frank
Contributor

Thanks, let me see how I get on with this and I will report back to you.

Frank

U32Frank
Contributor

It seems we are using our own certificate for this, which has made me very reluctant to issue new SSL certificates for all the hosts. I don't know enough about how SSL is used in vSphere, and I'm concerned that I may negatively affect the live hosts.

Does anyone know a way that I could fix this one host?

daphnissov
Immortal

Based on the cert contents you posted, you are not using custom certificates for the ESXi hosts although you may be using custom certs for the machine cert of the vCenter. If you would like to open a support case with both Veeam and VMware, you're welcome to proceed.
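
For anyone reading the posted listing the same way: each entry there has a subject equal to its issuer, so both are self-signed roots, and the first one carries the DC=vsphere/DC=local naming, which is presumably what this reply relies on. A small sketch of that reading, assuming Python 3 and the `subject=`/`issuer=` output of the openssl command from the original post; the function names are illustrative only.

```python
def parse_listing(text):
    """Pair up subject=/issuer= lines into (subject, issuer) tuples."""
    certs, subject = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("subject="):
            subject = line[len("subject="):]
        elif line.startswith("issuer=") and subject is not None:
            certs.append((subject, line[len("issuer="):]))
            subject = None
    return certs


def self_signed_subjects(certs):
    """Return the subjects of entries whose subject equals their issuer."""
    return [subject for subject, issuer in certs if subject == issuer]
```

This only inspects names. It says nothing about keys or validity dates, so it cannot show on its own which certificate the Veeam proxy is rejecting.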

U32Frank
Contributor

I have already raised a ticket with Veeam, and their response was "this is a VMware issue", which I think is fair enough. I believe that your solution is probably the right one. I'm just reluctant to make this change, as I don't really understand what this certificate is doing, and therefore what effect reissuing it would have on the live nodes.

I have kicked it back to the consultation company that built the cluster for us. Hopefully they can resolve it.
