I am testing a new DellEMC CloudLink KMS cluster for vSAN encryption and I am getting two Skyline Health check issues that I just cannot clear.
Environment Overview:
vCenter Server Version: 7.0.2, 17694817
ESXi Host Version: 7.0.2, 17867351
3 hosts in the vCenter 'Management Cluster' - vSAN Encryption Enabled
3 hosts in the vCenter 'Workload Cluster' - No vSAN Encryption
CloudLink Version: 7.1.0 (build 3.6) - 2 nodes in the KMS cluster
Key Provider Overview:
vSAN Services Overview:
vSAN Skyline Health check #1 - vSAN cluster configuration consistency
Host(s): All 3 hosts in the Management Cluster
Issue: Key Management Servers information is inconsistent with cluster configuration
Recommendation: Remediate inconsistent configuration
The "REMEDIATE INCONSISTENT CONFIGURATION" action runs and completes successfully, but does NOT clear the error.
vSAN Skyline Health check #2 - vCenter and all hosts are connected to Key Management Servers
vCenter KMS status: All OK
Hosts KMS status: All 3 hosts have warnings for 'Connection State' and 'Key State'
The "REMEDIATE KMS CONNECTION" action runs and completes successfully, but does NOT clear the error.
I tried to analyse vCenter Server with VMware Skyline Health Diagnostics, but it does not support 'Crypto' hosts. Yes, I am aware that the vCenter Skyline Health and the VMware Skyline Health Diagnostics are totally separate tools.
I would appreciate any thoughts or suggestions, as I am stuck.
Cheers,
M
@TheBobkin very cheeky ask, but have you seen this before?
I cannot open a SR as this is a test lab environment. I have to do this for real in a few weeks on a live production cluster and I am somewhat apprehensive.
I have also noticed that when I fail one of the KMIP servers to test CloudLink HA, vCenter Server takes approx. 100 seconds to respond to storage-based actions (e.g. "Edit VM Storage Policies", or "Select Storage" in the Deploy OVF Template wizard), whereas they take <1 second when all KMIP servers are online.
Cheers,
M
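One way to narrow down the slow response during a KMIP failover test might be to time a plain TCP connection to each KMS node from a management workstation. The following is only a minimal sketch, not a CloudLink or VMware tool: it assumes the nodes listen on TCP 5696 (the IANA-registered KMIP port, which may differ in a given CloudLink setup), the function names are made up for illustration, and it does not perform a TLS or KMIP handshake, so a successful connect only shows that the port is accepting connections.

```python
import socket
import time

# Assumption: 5696 is the IANA-registered KMIP port; adjust if your KMS uses another.
KMIP_PORT = 5696


def probe(host, port=KMIP_PORT, timeout=5.0):
    """Attempt one TCP connection and return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and connects, honouring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False, time.monotonic() - start


def probe_all(hosts, port=KMIP_PORT, timeout=5.0):
    """Probe each host in turn and return {host: (reachable, seconds_elapsed)}."""
    return {host: probe(host, port, timeout) for host in hosts}
```

Calling `probe_all(["kms1.lab.example", "kms2.lab.example"])` (hypothetical hostnames) before and after failing a node would show whether the failed node refuses connections quickly or hangs until the timeout; a hang would at least be consistent with a client waiting on the dead node before trying the next one.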
I've seen this issue popping up a few times already. And it sounds like a false positive alert triggered in this case. I would recommend going through support to confirm it is a false positive and consider silencing the alarm until there's a fix.
Thank you Duncan @depping.
If only I had access to paid support for my lab environment - sadly my pockets are not that deep 😞
I will prewarn the client that this issue was seen in lab testing, that it is believed to be a false positive alert and that they will need to open an SR if we see it on their production, once vSAN encryption has been enabled.
I appreciate your help as always.
Cheers,
M
I just updated our production cluster from 7.0.100 to 7.0.202 and am seeing exactly the same issue; I found this thread while searching for help.
Have an SR open and waiting to see what the response is.
Other things I noticed on my cluster not displaying correctly are the versions under system and updates:
Thanks for posting @astins0n
What Key Provider vendor/app are you using? What version? How many KMS nodes are registered in vCenter?
I would love to hear back if GSS come up with an answer or, better still, a fix.
Cheers,
M
Using CloudLink version 7.0.0 (build 8.7).
Running 2 KMS nodes, one local and another in a remote data center.
Still waiting on them to review uploaded vCenter logs.
Support has confirmed (in my case) that this is a cosmetic bug and will be fixed in the next release.
Hello @astins0n
Thank you for the follow up.
I am sure mine is also cosmetic, as every other vSAN test/check comes back as OK.
Cheers
M
@astins0n I have just had the opportunity to upgrade to vCenter Server 7.0 Update 2c (7.0.2.00400), build 18356314, and I can confirm that the issue has been resolved. Hopefully you won't have to wait too long for the VxRail upgrade bundle that includes this version to be released.
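For anyone tracking several vCenter Servers, the two data points in this thread can be captured in a small check. This is a rough sketch under stated assumptions: the only builds confirmed here are 17694817 (issue seen) and 18356314 (issue resolved), the function name is hypothetical, and it assumes that builds later than 18356314 keep the fix. Anything else is reported as unknown rather than guessed.

```python
# Build numbers taken from this thread only.
REPORTED_AFFECTED_BUILD = 17694817  # vCenter Server 7.0.2, issue observed
REPORTED_FIXED_BUILD = 18356314     # vCenter Server 7.0 Update 2c, issue resolved


def health_alert_status(vcenter_build: int) -> str:
    """Classify a vCenter build against the builds reported in the thread."""
    if vcenter_build >= REPORTED_FIXED_BUILD:
        # Assumption: later builds retain the fix.
        return "fixed-or-later"
    if vcenter_build == REPORTED_AFFECTED_BUILD:
        return "reported-affected"
    # The thread gives no information about any other build.
    return "unknown"
```

A result of "unknown" simply means the thread has no evidence either way for that build, so a support request would still be the way to confirm.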