Hi.
I integrated Kubernetes with NSX-T Manager, and I am seeing some problems.
1. The liveness probe of one of the CoreDNS pods fails.
- There are two CoreDNS pods in the system. One is healthy, while the other is not.
Here is the Events section from kubectl describe:
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  10m                    default-scheduler  Successfully assigned kube-system/coredns-5c98db65d4-r2c28 to master
  Normal   Pulled     8m44s (x2 over 10m)    kubelet, master    Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
  Normal   Created    8m44s (x2 over 10m)    kubelet, master    Created container coredns
  Warning  Unhealthy  8m44s (x5 over 9m24s)  kubelet, master    Liveness probe failed: HTTP probe failed with statuscode: 503
  Normal   Killing    8m44s                  kubelet, master    Container coredns failed liveness probe, will be restarted
  Normal   Started    8m43s (x2 over 10m)    kubelet, master    Started container coredns
  Warning  Unhealthy  24s (x61 over 10m)     kubelet, master    Readiness probe failed: HTTP probe failed with statuscode: 503
What is the problem?
2. The nsx-node-agent liveness probe fails.
- The message is similar to the CoreDNS one.
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  20m                default-scheduler  Successfully assigned nsx-system/nsx-node-agent-mkgv9 to master
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-node-agent
  Normal   Started    20m                kubelet, master    Started container nsx-node-agent
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-kube-proxy
  Normal   Started    20m                kubelet, master    Started container nsx-kube-proxy
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-ovs
  Normal   Started    20m                kubelet, master    Started container nsx-ovs
  Warning  Unhealthy  20m (x3 over 20m)  kubelet, master    Liveness probe failed:
But I don't know exactly what the problem is.
When I run ovs-vsctl show, the output is:
    Bridge br-int
        fail_mode: standalone
        Port "coredns-5c98db65d4-9zd85_5367583b9d79cf8"
            tag: 10
            Interface "5367583b9d79cf8"
        Port "ens192"
            Interface "ens192"
        Port "coredns-5c98db65d4-r2c28_0ab9b86af5700af"
            tag: 12
            Interface "0ab9b86af5700af"
        Port br-int
            Interface br-int
                type: internal
        Port nsx_agent_outer
            tag: 4094
            Interface nsx_agent_outer
    ovs_version: "2.10.4.15054368"
So what is the problem, and how can I fix it?
Here is my system information:
docker 18.09.7
kubernetes 1.15.3
nsx-t manager 2.4.1
ncp 2.5.1
ovs 2.10.4
ESXi 6.7.U2
ubuntu 16.04
It is hard to tell from this information whether something is misconfigured. I would suggest checking whether a newer NCP version is available; this does not look like a CoreDNS problem at all. Try this to check the logs:
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
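If you want the logs per pod, the two commands above can be combined into a small helper. This is only a sketch: the function name coredns_logs is made up for illustration, and it assumes the CoreDNS pods carry the k8s-app=kube-dns label used above. Since the events show the container was killed and restarted, it also asks for the previous container's logs with --previous.

```shell
#!/usr/bin/env bash
# Sketch: print current and previous-container logs for every CoreDNS pod.
# coredns_logs is a hypothetical helper name, not part of kubectl.
# Assumes the pods are labelled k8s-app=kube-dns in kube-system.
coredns_logs() {
  local ns="kube-system" pod
  for pod in $(kubectl get pods --namespace="$ns" -l k8s-app=kube-dns -o name); do
    echo "=== $pod (current) ==="
    kubectl logs --namespace="$ns" "$pod" || true
    # --previous shows the container instance that the liveness probe killed;
    # it fails harmlessly if the container never restarted.
    echo "=== $pod (previous) ==="
    kubectl logs --namespace="$ns" --previous "$pod" || true
  done
}
```

Run coredns_logs on a machine with cluster access and compare the output of the healthy pod with the failing one.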
Look at the logs for your nsx-node-agent with kubectl logs nsx-node-agent-mkgv9 -n nsx-system -c nsx-node-agent (the pod has several containers, so you need -c). The agents run per node, and you might have a problem on only one of them.
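To go through every node agent and each of its containers in one pass, something like the following might help. It is a sketch, not an official NCP tool: node_agent_logs is a made-up name, the container names are taken from the events posted above, and pods are matched by name prefix rather than by label because the labels are not shown in this thread.

```shell
#!/usr/bin/env bash
# Sketch: tail the logs of every container in every nsx-node-agent pod.
# node_agent_logs is a hypothetical helper name.
# Container names come from the "Created container ..." events in the question.
node_agent_logs() {
  local ns="nsx-system" pod c
  for pod in $(kubectl get pods --namespace="$ns" -o name | grep nsx-node-agent); do
    for c in nsx-node-agent nsx-kube-proxy nsx-ovs; do
      echo "=== $pod / $c ==="
      kubectl logs --namespace="$ns" "$pod" -c "$c" --tail=50 || true
    done
  done
}
```

Running kubectl get pods -n nsx-system -o wide first shows which pod belongs to which node.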
Also double check the tags of the interfaces.
Have you tagged the segment ports where the K8s nodes are connected? Each port needs two tags, one with the scope ncp/cluster and one with the scope ncp/node_name, with tag values that match your environment.
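As a rough illustration of what the two tags on a node's port should look like, the helper below prints them as scope/tag pairs in JSON. Treat it as a sketch under assumptions: ncp_port_tags is a made-up name, the cluster name k8s-cl1 is a placeholder, and as far as I know the ncp/cluster value has to match the cluster name in your NCP configuration while ncp/node_name has to match the Kubernetes node name (master in the output above). It only prints the tags; it does not call NSX-T.

```shell
#!/usr/bin/env bash
# Sketch: print the two tags expected on the segment port of a K8s node.
# ncp_port_tags is a hypothetical helper; the values passed in are placeholders
# that you must replace with your own cluster and node names.
ncp_port_tags() {
  local cluster="$1" node="$2"
  printf '{"tags":[{"scope":"ncp/cluster","tag":"%s"},{"scope":"ncp/node_name","tag":"%s"}]}\n' \
    "$cluster" "$node"
}
```

For example, ncp_port_tags k8s-cl1 master prints the pair for the node named master; compare it with what the port shows in the NSX-T Manager UI.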