MJMVCIX
Enthusiast

vSAN Data at Rest Encryption and KMS Failure

Hi All, 

Does anyone have a document or link detailing the failure scenarios and their impact when a KMS solution fails or goes offline while vSAN D@RE is enabled?

I have found this (The YouTube video at the end): https://www.yelof.com/2017/10/05/key-manager-concepts-and-toplogy-basics-for-vm-and-vsan-encryption/

However, it is a few years old and doesn't detail the scenarios and their impacts.

The vSAN and vCenter in question are version 7.0.3.

The aim is to understand the impact on this cluster if D@RE is enabled and one KMS fails, both KMS fail, and so on. We can then decide whether it should be enabled.

Thanks

bmcb555
Enthusiast

Here you go

https://docs.vmware.com/en/VMware-vSphere/8.0/vsan-monitoring-troubleshooting/GUID-084B3888-499F-4CD...

In an encrypted vSAN cluster, when communication between a host and the KMS is lost, the disk group can become locked if the host reboots. You should be fine until you reboot the hosts.

MJMVCIX
Enthusiast

Yes, I did see that; however, that's very brief.

There must be a more detailed document with failure scenarios such as:

  • What happens if vCenter is offline/failed?
  • What happens if 1 KMS is offline/failed?
  • What happens if both KMS are offline/failed?

These are just a few scenarios. It would be good to see a document that details what impact different failures would have, something like the table in this blog: https://blogs.vmware.com/virtualblocks/2018/12/05/vsan-failure-scenarios/

bmcb555
Enthusiast

I don't believe there is anything more detailed than that. There is, however, a very good session (again, old) on how KMS operates, and it hasn't functionally changed as far as I'm aware.

I've linked straight to the section you will be interested in.

https://youtu.be/I5gR_dVqfz0?t=653

  • What happens if vCenter is offline/failed?
    • It depends. On boot, the hosts are given the keys by vCenter. The keys are stored in ESXi memory, so as long as you do not reboot the hosts, your VMs will be fine. If a host comes up and vCenter is not available, it will not be able to get its host keys to then get the VM keys, and therefore that particular host cannot access the VMs on storage.
  • What happens if 1 KMS is offline/failed?
    • It fails over to the next KMS in the list.
  • What happens if both KMS are offline/failed?
    • Again, it depends on whether you reboot the hosts: the keys are stored in the hosts' memory and are lost on reboot.
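
To illustrate the "fails over to the next KMS in the list" behaviour, here is a minimal Python sketch. It is only a model of the idea, not how ESXi implements it: `fetch_key_with_failover`, `KmsUnavailable`, `demo_fetch` and the server names `kms-a` / `kms-b` are all made up for the example, and the real key request would go over KMIP rather than a Python callback.

```python
from typing import Callable, Optional, Sequence


class KmsUnavailable(Exception):
    """Raised by a fetch function when a KMS server cannot be reached."""


def fetch_key_with_failover(
    servers: Sequence[str],
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each KMS in list order; return the first key obtained, else None."""
    for server in servers:
        try:
            return fetch(server)
        except KmsUnavailable:
            continue  # this KMS is down, move on to the next one in the list
    return None  # every KMS in the list failed, so no key can be retrieved


def demo_fetch(server: str) -> bytes:
    # Hypothetical stand-in for a real key request: only "kms-b" is online.
    if server != "kms-b":
        raise KmsUnavailable(server)
    return b"key-from-" + server.encode()
```

With `["kms-a", "kms-b"]` the call falls through to `kms-b`; with only `["kms-a"]` it returns `None`, which corresponds to the "both KMS offline" case where a rebooted host has nowhere to get its keys from.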
TheBobkin
Champion

Have a read through this - it covers a lot more detail than the VMware docs page:
https://core.vmware.com/resource/vsan-encryption-services


"What happens if vCenter is offline/failed?"

Actually, vCenter is only used for initial configuration and KMS trust establishment. After this the hosts communicate directly with the KMS, so vCenter being down has no consequence other than that you can't make changes to the KMS configuration.

 

"What happens if 1 KMS is offline/failed?"
This depends entirely on the KMS-side configuration. Ideally this is done properly, as a redundant KMS cluster with all nodes able to provide all keys. However, I have seen situations where administrators thought this was the case but sadly it was not: keys were unavailable because one KMS was down and it was the only one holding specific keys.

 

"What happens if both KMS are offline/failed?"
Nothing, unless vSAN nodes are rebooted or some other change unmounts and remounts Disk-Groups. Obviously, avoid doing this if at all possible until the KMS issue is resolved; if it is done, the affected Disk-Groups will be locked until the keys are available again.
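
The scenarios above can be sketched as a small state model in Python. This is a deliberately simplified illustration under stated assumptions, not vSAN code: `DiskGroup`, `remount` and `data_accessible` are hypothetical names, and the model only captures the rule described here (keys stay usable while the Disk-Group remains mounted, and a reboot or remount needs the KMS to supply them again).

```python
from dataclasses import dataclass


@dataclass
class DiskGroup:
    """Simplified view of an encrypted Disk-Group on one host (assumed model)."""
    keys_in_memory: bool = True   # host currently holds the keys it needs
    locked: bool = False          # True when keys could not be obtained at mount


def remount(dg: DiskGroup, kms_reachable: bool) -> DiskGroup:
    """Model a host reboot or any change that unmounts and remounts the Disk-Group."""
    # Keys held in memory do not survive, so they must come from the KMS again.
    if kms_reachable:
        return DiskGroup(keys_in_memory=True, locked=False)
    return DiskGroup(keys_in_memory=False, locked=True)


def data_accessible(dg: DiskGroup) -> bool:
    """Data can be read only while the keys are present and the group is unlocked."""
    return dg.keys_in_memory and not dg.locked


# KMS goes offline but nothing is rebooted: data stays accessible.
running = DiskGroup()

# Host is rebooted while the KMS is still offline: Disk-Group is locked.
after_reboot = remount(running, kms_reachable=False)

# KMS is restored and the Disk-Group is mounted again: data is back.
recovered = remount(after_reboot, kms_reachable=True)
```

Note that vCenter does not appear in the model at all, which matches the point that hosts talk to the KMS directly once trust has been established.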


MJMVCIX
Enthusiast

@TheBobkin, as always, thank you for taking the time to respond and provide this detail.
