5 Replies Latest reply on Aug 9, 2019 12:32 AM by Chris Mentjox

    nsx-t and kubernetes

    Chris Mentjox Enthusiast

      Hi,

       

      I have built a small NSX-T and Kubernetes setup.

      Running CentOS 7.6.1810

      Docker 18.06.3-ce

      REPOSITORY                               TAG             IMAGE ID        CREATED         SIZE
      registry.local/2.4.1.13515827/nsx-ncp-rhel   latest          d80a0f1e9112    3 months ago    714MB
      k8s.gcr.io/kube-proxy-amd64              v1.11.4         5071d096cfcd    9 months ago    98.2MB
      k8s.gcr.io/kube-apiserver-amd64          v1.11.4         de6de495c1f4    9 months ago    187MB
      k8s.gcr.io/kube-controller-manager-amd64 v1.11.4         dc1d57df5ac0    9 months ago    155MB
      k8s.gcr.io/kube-scheduler-amd64          v1.11.4         569cb58b9c03    9 months ago    56.8MB
      k8s.gcr.io/coredns                       1.1.3           b3b94275d97c    14 months ago   45.6MB
      k8s.gcr.io/etcd-amd64                    3.2.18          b8df3b177be2    16 months ago   219MB
      k8s.gcr.io/pause-amd64                   3.1             da86e6ba6ca1    19 months ago   742kB
      k8s.gcr.io/pause                         3.1             da86e6ba6ca1    19 months ago   742kB

       

      NAME           STATUS   ROLES    AGE   VERSION
      k8s-master01   Ready    master   3h    v1.11.10
      k8s-node01     Ready    <none>   3h    v1.11.10
      k8s-node02     Ready    <none>   3h    v1.11.10

       

      kube-system   nginx-deployment-67594d6bf6-nvnhd   0/1   ContainerCreating   0    44m
      kube-system   nginx-deployment-67594d6bf6-vtmjp   0/1   ContainerCreating   0    44m
      nsx-system    nsx-ncp-vp4qq                       1/1   Running             1    1h
      nsx-system    nsx-node-agent-hwtd9                2/2   Running             15   44m
      nsx-system    nsx-node-agent-pr472                2/2   Running             15   44m

       

      I keep getting:

      Warning  FailedCreatePodSandBox  8m (x5 over 23m)  kubelet, k8s-node02  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "f4cfd0266b09b4b43f17696a2f2982533999acb0ad5c0f7406be5ec9fa612b2e" network for pod "nginx-deployment-67594d6bf6-nvnhd": NetworkPlugin cni failed to set up pod "nginx-deployment-67594d6bf6-nvnhd_kube-system" network: Failed to receive message header from nsx_node_agent, failed to clean up sandbox container "f4cfd0266b09b4b43f17696a2f2982533999acb0ad5c0f7406be5ec9fa612b2e" network for pod "nginx-deployment-67594d6bf6-nvnhd": NetworkPlugin cni failed to teardown pod "nginx-deployment-67594d6bf6-nvnhd_kube-system" network: Failed to connect to nsx_node_agent: [Errno 111] Connection refused]

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.634Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="INFO"] __main__ Initialized CNI configuration

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.634Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="DEBUG"] __main__ CNI Command in environment: DEL

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.634Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="INFO"] __main__ nsx_cni plugin invoked with arguments: DEL

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.637Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="INFO"] __main__ Reading configuration on standard input

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.637Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="INFO"] __main__ Unconfiguring networking for container 8d9e76c3ef3f9d928fff3a822b75ceb584a12a087daff51c0e61084144189f5c

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.637Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="DEBUG"] __main__ Network config from input: {u'cniVersion': u'0.3.1', u'type': u'nsx', u'name': u'nsx-cni', u'mtu': 1500}

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.637Z k8s-master01 NSX 9685 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="ERROR" errorCode="NCP04002"] __main__ Failed to connect to nsx_node_agent: [Errno 2] No such file or directory

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: E0807 15:29:30.641209    8084 cni.go:280] Error deleting network: Failed to connect to nsx_node_agent: [Errno 2] No such file or directory

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: E0807 15:29:30.641878    8084 remote_runtime.go:115] StopPodSandbox "8d9e76c3ef3f9d928fff3a822b75ceb584a12a087daff51c0e61084144189f5c" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "coredns-78fcdf6894-n2pwr_kube-system" network: Failed to connect to nsx_node_agent: [Errno 2] No such file or directory

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: E0807 15:29:30.641907    8084 kuberuntime_gc.go:153] Failed to stop sandbox "8d9e76c3ef3f9d928fff3a822b75ceb584a12a087daff51c0e61084144189f5c" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "coredns-78fcdf6894-n2pwr_kube-system" network: Failed to connect to nsx_node_agent: [Errno 2] No such file or directory

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: W0807 15:29:30.643632    8084 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "0421c69b6339249854d7ec1885c8094460e3cc7c4048ea70e473c4fc9796cbff"

      Aug 07 15:29:30 k8s-master01 kubelet[8084]: 1 2019-08-07T15:29:30.680Z k8s-master01 NSX 9686 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_cni" level="INFO"] __main__ Initialized CNI configuration

       

       

       

      Any ideas?

        • 1. Re: nsx-t and kubernetes
          daphnissov Guru
          Community Warriors, vExpert

          What version of the NCP are you using? I noticed you're using a fairly old version of Kubernetes (1.11), so there may be a compatibility issue there. Check the release notes for your version of the NCP to ensure you're using one of the compatible versions.

          • 2. Re: nsx-t and kubernetes
            Chris Mentjox Enthusiast

            Running 1.14.5 now.

             

            NAME           STATUS   ROLES    AGE     VERSION
            k8s-master01   Ready    master   8m15s   v1.14.5
            k8s-node01     Ready    <none>   7m14s   v1.14.5
            k8s-node02     Ready    <none>   7m4s    v1.14.5

            Still the same error: "Failed to connect to nsx_node_agent: [Errno 111] Connection refused"

            • 3. Re: nsx-t and kubernetes
              Chris Mentjox Enthusiast

              NCP 2.4.1.13515827

              NSX 2.4.1

              • 4. Re: nsx-t and kubernetes
                daphnissov Guru
                Community Warriors, vExpert

                The earliest Kubernetes version that NCP release supports is 1.13.

                • 5. Re: nsx-t and kubernetes
                  Chris Mentjox Enthusiast

                  Found the problem (thanks to VMware support!)

                   

                  I had set the ncp/node_name (<nodename>) and ncp/cluster tags on the VM instead of on the node's interface on the logical switch.

                  Because of this the hyperbus was unhealthy.
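
                  For anyone who wants to sanity-check their own tagging, here is a minimal sketch. It assumes the tags on the node's logical switch port have been exported as NSX-style scope/tag pairs. The scope names follow the tags mentioned above, written here as ncp/node_name and ncp/cluster. The helper name missing_ncp_tags and the sample values (k8s-node01, k8s-cluster1) are hypothetical and only for illustration.

```python
# Scopes expected on the logical switch port of each Kubernetes node's
# interface (not on the VM object itself), per the fix described above.
REQUIRED_SCOPES = ("ncp/node_name", "ncp/cluster")


def missing_ncp_tags(port_tags):
    """Return the required scopes that are absent or empty in port_tags.

    port_tags is assumed to be a list of {"scope": ..., "tag": ...} dicts.
    """
    present = {t.get("scope") for t in port_tags if t.get("tag")}
    return [scope for scope in REQUIRED_SCOPES if scope not in present]


if __name__ == "__main__":
    # Hypothetical tag sets, for illustration only.
    tagged_port = [
        {"scope": "ncp/node_name", "tag": "k8s-node01"},
        {"scope": "ncp/cluster", "tag": "k8s-cluster1"},
    ]
    untagged_port = []  # what the port looks like if only the VM was tagged
    print(missing_ncp_tags(tagged_port))
    print(missing_ncp_tags(untagged_port))
```

                  An empty result means both tags are present on the port. Anything else lists the scopes that still need to be set there.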

                   

                  On the ESXi host, you can run 'get hyperbus connection info' in 'nsxcli'.

                  For me this showed nothing.

                  That was exactly why I got the connection refused error.

                  After I corrected the tagging, the hyperbus was healthy and everything works.

                   

                  xxxxx.infra.test> get hyperbus connection info

                                  VIFID                            Connection                         Status

                  198c008e-dc61-406e-bf75-688c4dae0a24         169.254.1.12:2345                     HEALTHY

                  4601b498-1ee7-4232-8ead-a70663a221e1         169.254.1.11:2345                     HEALTHY
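
                  If you want to script this check, here is a rough sketch that parses the text output shown above. It assumes you have already captured the output of 'get hyperbus connection info' into a string. The function names are made up for this example, and the column layout is taken only from the output in this post.

```python
def parse_hyperbus_connections(cli_output):
    """Parse the connection table into (vif_id, connection, status) tuples."""
    rows = []
    for line in cli_output.splitlines():
        fields = line.split()
        # Data rows have exactly three columns and a UUID-shaped VIF ID;
        # this skips the prompt line, the header and blank lines.
        if len(fields) == 3 and fields[0].count("-") == 4:
            rows.append(tuple(fields))
    return rows


def hyperbus_is_healthy(cli_output):
    """True only if at least one connection is listed and all are HEALTHY."""
    rows = parse_hyperbus_connections(cli_output)
    return bool(rows) and all(status == "HEALTHY" for _, _, status in rows)


if __name__ == "__main__":
    sample = """
    xxxxx.infra.test> get hyperbus connection info
                    VIFID                            Connection                         Status
    198c008e-dc61-406e-bf75-688c4dae0a24         169.254.1.12:2345                     HEALTHY
    4601b498-1ee7-4232-8ead-a70663a221e1         169.254.1.11:2345                     HEALTHY
    """
    print(hyperbus_is_healthy(sample))  # both rows are HEALTHY
    print(hyperbus_is_healthy(""))      # empty output, as before the fix
```

                  An empty table is treated as unhealthy on purpose, since that was the symptom here before the tagging was fixed.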

                   

                  The nsx-node-agent also reports healthy now:

                   

                  kubectl exec -n nsx-system -it  nsx-node-agent-hclhs  nsxcli
                  Defaulting container name to nsx-node-agent.
                  Use 'kubectl describe pod/nsx-node-agent-hclhs -n nsx-system' to see all of the containers in this pod.
                  NSX CLI (Node Agent). Press ? for command list or enter: help
                  k8s-node01> get node-agent-hyperbus status
                  HyperBus status: Healthy

                  k8s-node01>