I have deployed TKGI version 1.14.0, which per the documentation should include CSI driver v2.5.1 with support for volume snapshots. As recommended, I am using the automatically installed CSI driver rather than a manually installed one.
I am having issues running the deploy-csi-snapshot-components.sh script as described at this site: https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/2.0/vmware-vsphere-csp-getting-s...
It seems to fail because the snapshot controller it tries to deploy cannot be scheduled: no node matches the nodeSelector, which in this case targets the master node.
I noticed that "kubectl get nodes" lists only the worker nodes, not the master node. Is the master node hidden, or is there a configuration file or setting I have to modify for it to be listed? Because the script cannot find the master node, it fails.
Another thing I noticed is that the script looks for the vsphere-csi-controller pod in the vmware-system-csi namespace, but when I run "kubectl get pods -n vmware-system-csi" no resources are found. Again, I am using the automatically installed CSI driver. Am I missing something in the deployment of the cluster and TKGI that prevents the pods defined in vsphere-csi-driver.yaml from being deployed? I also tried to install the CSI driver manually, but it too expects the master node to be listed by "kubectl get nodes".
Any advice would be most helpful.
Thanks,
AJ
The TKGI integrated CSI driver does not include the snapshot services yet, and snapshots are not listed as a supported feature. We plan to add snapshot capability in the next TKGI minor version.
Much thanks for that information. I will look forward to that version once released. However, I got it working by doing the following:
a. Remove the following nodeSelector from the vsphere-csi-controller Deployment section:
   nodeSelector:
     node-role.kubernetes.io/master: ""
b. In the vsphere-csi-node DaemonSet section, replace all occurrences of /var/lib/kubelet with /var/vcap/data/kubelet.
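For anyone repeating steps a and b, here is a minimal sed sketch. The file names are hypothetical, and the manifest is a cut-down stand-in rather than the real vsphere-csi-driver.yaml, so check the result against your own copy before applying it.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical, and the manifest below is a tiny
# stand-in so the commands can be tried without a cluster.
set -eu

cat > csi-driver-sample.yaml <<'EOF'
kind: Deployment
metadata:
  name: vsphere-csi-controller
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations: []
---
kind: DaemonSet
metadata:
  name: vsphere-csi-node
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/csi.vsphere.vmware.com/
EOF

# Step a: drop a "nodeSelector:" line only when the next line is the master
# role label, so other nodeSelectors (here the DaemonSet's) are left alone.
# Step b: point every kubelet path at the TKGI location.
sed -e '/^ *nodeSelector:$/{N;/node-role\.kubernetes\.io\/master/d;}' \
    -e 's#/var/lib/kubelet#/var/vcap/data/kubelet#g' \
    csi-driver-sample.yaml > csi-driver-sample.patched.yaml

# Show what changed (diff exits non-zero when the files differ).
diff csi-driver-sample.yaml csi-driver-sample.patched.yaml || true
```

Writing to a second file instead of editing in place keeps the original manifest around for comparison.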
3. For the snapshot deployment script, I also modified the 2.5.2 deploy-csi-snapshot-components.sh to remove the nodeSelector when patching the snapshot-controller and snapshot-validation-deployment Deployments. I had already done this before; however, as I mentioned to you on the call, it worked only until the script started looking for the vsphere-csi-controller, which is not present with the automatically installed CSI driver. Since the manually installed 2.5.2 CSI driver now has the vsphere-csi-controller pod running, the script now completes. The diff is shown below.
$ diff 2.5.2deploy-csi-snapshot-components.sh.orig 2.5.2deploy-csi-snapshot-components.sh
167c167
< kubectl patch deployment -n kube-system snapshot-controller --patch '{"spec": {"template": {"spec": {"nodeSelector": {"node-role.kubernetes.io/master": ""}, "tolerations": [{"key":"node-role.kubernetes.io/master","operator":"Exists", "effect":"NoSchedule"}]}}}}'
---
> kubectl patch deployment -n kube-system snapshot-controller --patch '{"spec": {"template": {"spec": {"tolerations": [{"key":"node-role.kubernetes.io/master","operator":"Exists", "effect":"NoSchedule"}]}}}}'
228c228
< kubectl patch deployment -n kube-system snapshot-validation-deployment --patch '{"spec": {"template": {"spec": {"nodeSelector": {"node-role.kubernetes.io/master": ""}, "tolerations": [{"key":"node-role.kubernetes.io/master","operator":"Exists", "effect":"NoSchedule"}]}}}}'
---
> kubectl patch deployment -n kube-system snapshot-validation-deployment --patch '{"spec": {"template": {"spec": {"tolerations": [{"key":"node-role.kubernetes.io/master","operator":"Exists", "effect":"NoSchedule"}]}}}}'
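The same edit to the script can be sketched with sed. The sample file below is a hypothetical stand-in that holds just one of the two patch lines from the diff, so this is an illustration and not a replacement for reviewing the real script.

```shell
#!/bin/sh
# Sketch only: "snapshot-patch-sample.sh" is a hypothetical stand-in holding one
# of the patch lines from the diff, so this runs without the real script.
set -eu

cat > snapshot-patch-sample.sh <<'EOF'
kubectl patch deployment -n kube-system snapshot-controller --patch '{"spec": {"template": {"spec": {"nodeSelector": {"node-role.kubernetes.io/master": ""}, "tolerations": [{"key":"node-role.kubernetes.io/master","operator":"Exists", "effect":"NoSchedule"}]}}}}'
EOF

# Remove only the nodeSelector member from the JSON patch; the toleration stays.
sed 's#"nodeSelector": {"node-role\.kubernetes\.io/master": ""}, ##' \
    snapshot-patch-sample.sh > snapshot-patch-sample.patched.sh

cat snapshot-patch-sample.patched.sh
```

Because the expression has no line anchor, it would apply to both the snapshot-controller and the snapshot-validation-deployment patch lines if run against the full script.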
Below are the pods that were deployed after running the CSI driver script and the snapshot deployment script. The number of replicas can be reduced to 1 in the CSI driver YAML if you only have one control-plane (master) node.
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default csisnaps-restore-pod 1/1 Running 0 27m
default example-vanilla-block-pod 1/1 Running 0 56m
kube-system antrea-agent-54b9r 2/2 Running 0 67m
kube-system antrea-agent-m7bfl 2/2 Running 0 67m
kube-system antrea-agent-rqfg4 2/2 Running 0 71m
kube-system antrea-controller-5c788594b9-5s689 1/1 Running 0 73m
kube-system coredns-787c57488d-4vn2f 1/1 Running 0 64m
kube-system coredns-787c57488d-blfqn 1/1 Running 0 64m
kube-system coredns-787c57488d-nghdq 1/1 Running 0 64m
kube-system konnectivity-agent-3cb6ff69-fb01-4d81-8593-e678b2293266-7c7jdlp 1/1 Running 0 73m
kube-system konnectivity-agent-3cb6ff69-fb01-4d81-8593-e678b2293266-7cqbjpw 1/1 Running 0 73m
kube-system metrics-server-66c5bff789-w6pz6 1/1 Running 0 64m
kube-system snapshot-controller-7f5d798964-6g5ql 1/1 Running 0 54m
kube-system snapshot-controller-7f5d798964-qs5c2 1/1 Running 0 54m
kube-system snapshot-validation-deployment-d448fb598-lhgkv 1/1 Running 0 45m
kube-system snapshot-validation-deployment-d448fb598-ngsrj 1/1 Running 0 45m
kube-system snapshot-validation-deployment-d448fb598-zwlff 1/1 Running 0 45m
pks-system cert-generator-91882e178d2f89da53a2344e3b2eee692455c956-dmblp 0/1 Completed 0 64m
pks-system event-controller-795755d67-f7d47 2/2 Running 0 64m
pks-system fluent-bit-l9wt7 2/2 Running 0 63m
pks-system fluent-bit-rhxl7 2/2 Running 0 63m
pks-system fluent-bit-ttm5z 2/2 Running 0 63m
pks-system metric-controller-669f9bc57b-295g2 1/1 Running 0 64m
pks-system observability-manager-64cbd6c45d-f6qxc 1/1 Running 0 64m
pks-system sink-controller-77c8c69d54-52dxh 1/1 Running 0 64m
pks-system telegraf-hxgxt 1/1 Running 0 64m
pks-system telegraf-jht8l 1/1 Running 0 64m
pks-system telegraf-xhkbw 1/1 Running 0 64m
pks-system validator-8567d4b66-qfmbg 1/1 Running 0 64m
vmware-system-csi vsphere-csi-controller-7c9df9dfd9-d977z 7/7 Running 0 57m
vmware-system-csi vsphere-csi-controller-7c9df9dfd9-wdtjm 7/7 Running 0 57m
vmware-system-csi vsphere-csi-controller-7c9df9dfd9-wxgbk 7/7 Running 0 57m
vmware-system-csi vsphere-csi-node-gpdc4 3/3 Running 2 (57m ago) 57m
vmware-system-csi vsphere-csi-node-kvtth 3/3 Running 2 (57m ago) 57m
vmware-system-csi vsphere-csi-node-lv9bt 3/3 Running 1 (57m ago) 57m
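For the replica count mentioned above, a minimal sketch of the YAML edit follows. The file names are hypothetical, and the fragment assumes the controller Deployment is the only object in the manifest with a replicas field; verify that in your own copy first.

```shell
#!/bin/sh
# Sketch only: a cut-down, hypothetical stand-in for the controller Deployment.
set -eu

cat > csi-controller-sample.yaml <<'EOF'
kind: Deployment
metadata:
  name: vsphere-csi-controller
  namespace: vmware-system-csi
spec:
  replicas: 3
EOF

# Rewrite the numeric replicas value to 1, keeping the original indentation.
sed 's/^\( *replicas:\) *[0-9][0-9]*$/\1 1/' \
    csi-controller-sample.yaml > csi-controller-sample.single.yaml

grep 'replicas:' csi-controller-sample.single.yaml
```

On a cluster where the driver is already running, "kubectl scale deployment vsphere-csi-controller -n vmware-system-csi --replicas=1" is another option, although re-applying an unmodified manifest would restore the original count.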
Many thanks for this info