I ran some tests on this topic with a cluster created with 3 control plane nodes.
If one control plane node is shut down from vCenter, "get pods -A" continues to work (as expected).
If two control plane nodes are shut down, "get pods -A" no longer works (expected).
After restarting one of the control plane nodes, "get pods -A" works again (expected).
So the basic functionality of a multi-control-plane cluster is working.
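For reference, these three results match plain majority-quorum arithmetic. The sketch below assumes a stacked etcd member on each control plane node (I have not confirmed that this is how CSE lays out etcd), and the function names are just illustrative:

```python
# Majority-quorum arithmetic for an etcd-style cluster.
# Assumption: one etcd member runs on each control plane node.

def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can be down while a majority is still up."""
    return members - quorum(members)


def has_quorum(members: int, up: int) -> bool:
    """True if the members still running form a majority."""
    return up >= quorum(members)


# The three situations from the tests above, for a 3-node control plane.
for up in (3, 2, 1):
    state = "API usable" if has_quorum(3, up) else "no quorum, API unavailable"
    print(f"{up}/3 control plane nodes up: {state}")
```

With 3 members the quorum is 2, so one node can be lost, and losing two takes the API down until one of them comes back.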
One issue is that the CSE plugin reports no errors in the events or in the status of the cluster (the status is "ready").
The only visible signs are at the load balancer level, which shows that some endpoints are down, and in the vApp, which shows that some VMs are down.
Would it be possible to add some kind of "health" indicator to the CSE plugin (for example: all control plane nodes up and running, worker nodes up and running, load balancer associated with the management IP deployed, etc.)?
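To illustrate the kind of "health" summary I mean, here is a rough sketch that works from outside the plugin, using only the output of "kubectl get nodes -o json". It is not how CSE does or would do it. The function names and the expected_control_plane parameter are my own, and I am assuming the control plane nodes carry one of the standard node-role labels:

```python
import json
import subprocess

# Standard Kubernetes role labels (newer and older name).
CONTROL_PLANE_LABELS = (
    "node-role.kubernetes.io/control-plane",
    "node-role.kubernetes.io/master",
)


def is_ready(node: dict) -> bool:
    """True if the node's 'Ready' condition has status 'True'."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def summarize_nodes(node_list: dict, expected_control_plane: int) -> dict:
    """Group nodes by role and readiness and flag missing control plane nodes."""
    summary = {
        "control_plane_ready": [],
        "control_plane_not_ready": [],
        "workers_ready": [],
        "workers_not_ready": [],
    }
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        labels = meta.get("labels", {})
        role = "control_plane" if any(l in labels for l in CONTROL_PLANE_LABELS) else "workers"
        state = "ready" if is_ready(node) else "not_ready"
        summary[f"{role}_{state}"].append(meta.get("name", "<unnamed>"))

    seen = len(summary["control_plane_ready"]) + len(summary["control_plane_not_ready"])
    # Nodes that no longer exist as Node objects at all (hypothetical expected count).
    summary["control_plane_missing"] = max(0, expected_control_plane - seen)
    summary["healthy"] = (
        len(summary["control_plane_ready"]) == expected_control_plane
        and not summary["workers_not_ready"]
    )
    return summary


def fetch_nodes() -> dict:
    """Read the node list through kubectl (needs a reachable API server)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

It could be called as summarize_nodes(fetch_nodes(), expected_control_plane=3). Since it goes through the API server, it only helps while quorum is intact (the one-node-down case), and it says nothing about the load balancer, so a check inside the plugin would still be needed for the rest.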
Second issue: I deliberately deleted one of the control plane VMs.
As mentioned above, the CSE plugin reports nothing; it still shows "3 nodes".
It doesn't recreate the missing node either (there is no "auto-heal", which would be the best option).
Is there a procedure for replacing a failed node in such a case?