VMware Cloud Community
MC1903
Enthusiast

Skyline Health checks fail when vSAN Encryption is enabled.

I am testing a new DellEMC CloudLink KMS cluster for vSAN encryption and I am getting two Skyline Health check issues that I just cannot clear.

Environment Overview:

vCenter Server Version: 7.0.2, 17694817

ESXi Host Version: 7.0.2, 17867351 

3 hosts in the vCenter 'Management Cluster' - vSAN Encryption Enabled

3 hosts in the vCenter 'Workload Cluster' - No vSAN Encryption

CloudLink Version: 7.1.0 (build 3.6) - 2 nodes in the KMS cluster

MC1903_0-1625144240325.png

Key Provider Overview:

MC1903_1-1625144593880.png

vSAN Services Overview:

MC1903_2-1625144720211.png

vSAN Skyline Health check #1 - vSAN cluster configuration consistency

Host(s): All 3 hosts in the Management Cluster

Issue: Key Management Servers information is inconsistent with cluster configuration

Recommendation: Remediate inconsistent configuration

The "REMEDIATE INCONSISTENT CONFIGURATION" action runs and completes successfully, but does NOT clear the error.

MC1903_3-1625144869202.png

vSAN Skyline Health check #2 - vCenter and all hosts are connected to Key Management Servers

vCenter KMS status: All OK
Hosts KMS status:
  All 3 hosts have warnings for 'Connection State' and 'Key State'

The "REMEDIATE KMS CONNECTION" action runs and completes successfully, but does NOT clear the error.

MC1903_4-1625145181358.png

MC1903_5-1625145221735.png

 

I tried to analyse vCenter Server with VMware Skyline Health Diagnostics, but it does not support 'Crypto' hosts. Yes, I am aware that the vCenter Skyline Health and the VMware Skyline Health Diagnostics are totally separate tools.

MC1903_6-1625145688440.png

I would appreciate any thoughts or suggestions, as I am stuck.

Cheers,

M

 

Accepted Solutions
astins0n
Contributor

Support has confirmed (in my case) that this is a cosmetic bug and will be fixed in the next release.

9 Replies
MC1903
Enthusiast

@TheBobkin very cheeky ask, but have you seen this before?

I cannot open a SR as this is a test lab environment. I have to do this for real in a few weeks on a live production cluster and I am somewhat apprehensive.

I have also noticed that when I fail one of the KMIP servers to test CloudLink HA, vCenter Server takes approximately 100 seconds to respond to storage-based actions (e.g. "Edit VM Storage Policies", or "Select Storage" in the Deploy OVF Template wizard), whereas they take less than 1 second when all KMIP servers are online.
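
For anyone wanting to repeat the HA test, here is a minimal Python sketch that times a plain TCP connect to each KMIP node. It assumes the nodes listen on the standard KMIP port 5696 (a CloudLink deployment may be configured differently), and the function names and any host names passed in are placeholders of my own. It only checks reachability; it does not perform a KMIP/TLS handshake and will not by itself explain the vCenter delay.

```python
import socket
import time

# Assumption: 5696 is the IANA-registered KMIP port; change it if your KMS differs.
KMIP_PORT = 5696


def probe(host, port=KMIP_PORT, timeout=5.0):
    """Try a plain TCP connect and return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        reachable = False
    return reachable, time.monotonic() - start


def probe_all(hosts, port=KMIP_PORT, timeout=5.0):
    """Probe each host in turn and return {host: (reachable, seconds_elapsed)}."""
    return {host: probe(host, port, timeout) for host in hosts}
```

Calling `probe_all(["kms01.lab.example", "kms02.lab.example"])` (hypothetical names) before and after failing a node gives a rough picture of which node is unreachable and how long each connect attempt takes.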

Cheers,

M

depping
Leadership

I've seen this issue pop up a few times already, and it sounds like a false-positive alert in this case. I would recommend going through support to confirm it is a false positive, and consider silencing the alarm until there's a fix.

MC1903
Enthusiast

Thank you Duncan @depping.

If only I had access to paid support for my lab environment - sadly my pockets are not that deep 😞

I will forewarn the client that this issue was seen in lab testing, that it is believed to be a false-positive alert, and that they will need to open an SR if we see it in their production environment once vSAN encryption has been enabled.

I appreciate your help as always.

Cheers,

M

 

astins0n
Contributor

Just updated our production cluster from 7.0.100 to 7.0.202 and am seeing the exact same issue. Searching for help, I found this thread.

Have an SR open and waiting to see what the response is.

Other things I noticed on my cluster not displaying correctly are the versions under system and updates:

astins0n_0-1625860958721.png

 

astins0n_1-1625860974680.png

 

MC1903
Enthusiast

Thanks for posting @astins0n

What Key Provider vendor/app are you using? What version? How many KMS nodes are registered in vCenter?

I would love to hear back if GSS comes up with an answer or, better, a fix.

Cheers,

M

astins0n
Contributor

@MC1903 

Using CloudLink version 7.0.0 (build 8.7).

Running 2 KMS nodes, one local and another in a remote data center.

Still waiting on them to review uploaded vCenter logs.

astins0n
Contributor

Support has confirmed (in my case) that this is a cosmetic bug and will be fixed in the next release.

MC1903
Enthusiast

Hello @astins0n

Thank you for the follow up.

I am sure mine is also cosmetic, as every other vSAN test/check comes back as OK.

Cheers

M

MC1903
Enthusiast

@astins0n I have just had the opportunity to upgrade to vCenter Server 7.0 Update 2c (7.0.2.00400) build 18356314 and I can confirm that the issue has been resolved. Hopefully you won't have to wait too long for the VxRail upgrade bundle that includes this version to be released.
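
If it helps anyone checking several environments, below is a small Python sketch that compares a vCenter version string in the "version, build" form used earlier in this thread against build 18356314. The threshold is only the build reported in this thread as clearing the alerts, not an official statement of the fix version, and the helper names are my own. Comparing build numbers like this is only meaningful within the vCenter Server 7.0 line.

```python
# Assumption: the vCenter build reported in this thread as resolving the alerts.
FIXED_VCENTER_BUILD = 18356314


def parse_version(text):
    """Split a string such as '7.0.2, 17694817' into ('7.0.2', 17694817)."""
    version, build = (part.strip() for part in text.split(","))
    return version, int(build)


def at_or_after_fix(text, fixed_build=FIXED_VCENTER_BUILD):
    """Return True if the build in `text` is at or beyond `fixed_build`."""
    _, build = parse_version(text)
    return build >= fixed_build
```

For example, `at_or_after_fix("7.0.2, 17694817")` (the vCenter build from the original post) is false, since that build predates 18356314.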

vCenter Server 7.0 U2c b18356314 - Screenshot 1.PNG
vCenter Server 7.0 U2c b18356314 - Screenshot 2.PNG
vCenter Server 7.0 U2c b18356314 - Screenshot 3.PNG
