VMware Networking Community
parkjoungwoong
Contributor

Problems after integrating k8s with NSX-T Manager 2.4.1 using NCP 2.5.1

Hi.

I integrated k8s with NSX-T Manager, and there are some problems.

1. The liveness probe of one coredns pod fails.

    - As you know, there are two coredns pods in the system. One is OK while the other is not.

      Here is the kubectl describe output:

Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  10m                    default-scheduler  Successfully assigned kube-system/coredns-5c98db65d4-r2c28 to master
  Normal   Pulled     8m44s (x2 over 10m)    kubelet, master    Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
  Normal   Created    8m44s (x2 over 10m)    kubelet, master    Created container coredns
  Warning  Unhealthy  8m44s (x5 over 9m24s)  kubelet, master    Liveness probe failed: HTTP probe failed with statuscode: 503
  Normal   Killing    8m44s                  kubelet, master    Container coredns failed liveness probe, will be restarted
  Normal   Started    8m43s (x2 over 10m)    kubelet, master    Started container coredns
  Warning  Unhealthy  24s (x61 over 10m)     kubelet, master    Readiness probe failed: HTTP probe failed with statuscode: 503

What is the problem?
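A 503 from both probes usually means the health endpoint inside the pod is not answering. As a first check (a rough sketch; the probe port and path depend on your coredns deployment manifest, and the pod name below is just the one from the describe output), you can look up the configured probe and hit it directly from the node:

    # See which port/path the probes actually use in your manifest
    kubectl -n kube-system get deployment coredns -o yaml | grep -A6 -E 'livenessProbe|readinessProbe'

    # Find the pod IP of the failing replica
    kubectl -n kube-system get pod coredns-5c98db65d4-r2c28 -o wide

    # From the master node, query the endpoint directly (substitute the IP, port and path found above)
    curl -v http://<pod-ip>:8080/health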

2. The nsx-node-agent liveness probe fails.

   - The message is similar to the coredns one.

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  20m                default-scheduler  Successfully assigned nsx-system/nsx-node-agent-mkgv9 to master
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-node-agent
  Normal   Started    20m                kubelet, master    Started container nsx-node-agent
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-kube-proxy
  Normal   Started    20m                kubelet, master    Started container nsx-kube-proxy
  Normal   Pulled     20m                kubelet, master    Container image "nsx-ncp" already present on machine
  Normal   Created    20m                kubelet, master    Created container nsx-ovs
  Normal   Started    20m                kubelet, master    Started container nsx-ovs
  Warning  Unhealthy  20m (x3 over 20m)  kubelet, master    Liveness probe failed:

but I don't know exactly what the problem is.
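Since the event message is cut off, it helps to look at what the liveness probe actually runs and at the agent's own log. A minimal sketch (assuming the DaemonSet is named nsx-node-agent; the pod and container names are taken from the events above):

    # What does the liveness probe for nsx-node-agent execute?
    kubectl -n nsx-system get ds nsx-node-agent -o yaml | grep -B2 -A10 livenessProbe

    # Tail the agent container's log on the failing node
    kubectl -n nsx-system logs nsx-node-agent-mkgv9 -c nsx-node-agent --tail=100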

When I run ovs-vsctl show, the output is:

    Bridge br-int
        fail_mode: standalone
        Port "coredns-5c98db65d4-9zd85_5367583b9d79cf8"
            tag: 10
            Interface "5367583b9d79cf8"
        Port "ens192"
            Interface "ens192"
        Port "coredns-5c98db65d4-r2c28_0ab9b86af5700af"
            tag: 12
            Interface "0ab9b86af5700af"
        Port br-int
            Interface br-int
                type: internal
        Port nsx_agent_outer
            tag: 4094
            Interface nsx_agent_outer
    ovs_version: "2.10.4.15054368"

So, what is the problem, and how can I fix it?
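A couple of extra data points usually help narrow this down (just a sketch; these commands are run on the same node that produced the output above):

    # Are any OpenFlow flows programmed on the integration bridge?
    ovs-ofctl dump-flows br-int

    # Confirm the bridge fail mode reported above
    ovs-vsctl get-fail-mode br-int

    # Link state of the agent's outer port
    ovs-vsctl list Interface nsx_agent_outer | grep -E 'link_state|admin_state'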

Here is my system information:

docker 18.09.7

kubernetes 1.15.3

nsx-t manager 2.4.1

ncp 2.5.1

ovs 2.10.4

ESXi 6.7.U2

ubuntu 16.04

3 Replies
RaymundoEC
VMware Employee

It is kind of hard to tell from this information whether something is not configured correctly. I would suggest checking whether a newer NCP version is available, as this is probably not a coredns problem at all. Try this to check the logs:

kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name

kubectl logs --namespace=kube-system <coredns-pod-name>
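For example, the two can be combined into one loop (a sketch; the k8s-app=kube-dns label matches the default kubeadm coredns pods):

    for p in $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name); do
      echo "=== $p ==="
      kubectl logs --namespace=kube-system "$p"
    done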

+vRay
mauricioamorim
VMware Employee

Look at the logs of your nsx-node-agent with kubectl logs nsx-node-agent-mkgv9 -n nsx-system. They are per node, and you might have a problem in only one of them.
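Since that pod runs several containers (nsx-node-agent, nsx-kube-proxy and nsx-ovs, as visible in the events above), it is worth checking each one; for example:

    for c in nsx-node-agent nsx-kube-proxy nsx-ovs; do
      echo "=== $c ==="
      kubectl logs nsx-node-agent-mkgv9 -n nsx-system -c "$c" --tail=50
    done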

Also, double-check the tags of the interfaces.

serbl
Enthusiast

Have you tagged the segment ports where the K8s nodes are connected? Each port needs to be tagged with the scopes ncp/cluster and ncp/node_name, with tag values matching your environment.
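For illustration, the tag pairs on each node's segment port would look roughly like this (the cluster name below is a placeholder and must match the cluster name NCP is configured with; the node name must match the Kubernetes node name, e.g. master from your events):

    scope: ncp/cluster      tag: <your-cluster-name>
    scope: ncp/node_name    tag: master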

Best regards, Rutger