VMware Networking Community
akumbalakandy
Enthusiast

NSX-T Application Platform Deployment Failed

Hello All,

I am trying to install the NSX-T Application Platform. I am using vSphere with Tanzu as my Kubernetes cluster. Also, I am using the public registry to pull the Docker images.

I am getting the following error:

NSX Application Platform deployment failed! Contour chart installation failed

 

 

1 Solution

Accepted Solutions
akumbalakandy
Enthusiast

Hello Shank & p0wertje,

The problem has been fixed.

Assumption: As I stated, I was getting an internal error in the Contour pods, so I assumed it had something to do with the Contour container images.

Solution: I upgraded NSX-T from 3.2.0 to 3.2.1 and used the VMware registry to pull the container images. Once that was done, the installation went through without any issues.

Thanks a lot to you both for spending so much time on this; it guided me down the right path.

 

 

19 Replies
ShahabKhan
VMware Employee

Hi,

At which stage are you getting this error? Could you please share the logs or screenshots?

p0wertje
Hot Shot

You might be able to see why it fails with kubectl:

kubectl get pods -o wide -n projectcontour

and see if a pod is in an Error state:

kubectl describe pods <POD name> -n projectcontour

Screenshots and logs would be useful.
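If one of the pods is failing, a rough sequence like the following (the pod name below is hypothetical; use whatever name your cluster shows) usually surfaces the reason, since the Events section of describe and the previous container's logs tend to name the cause:

```shell
# List the contour pods with their nodes and IPs.
kubectl get pods -o wide -n projectcontour

# For any pod in Error/CrashLoopBackOff (name is hypothetical),
# the Events section at the end of describe usually names the cause:
kubectl describe pod contour-6b5f8d7c9d-abcde -n projectcontour

# Logs from the current container, and from the previous
# instance if the container has already crashed and restarted:
kubectl logs contour-6b5f8d7c9d-abcde -n projectcontour
kubectl logs contour-6b5f8d7c9d-abcde -n projectcontour --previous
```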

Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
shank89
Expert

Contour is the ingress, which suggests a load-balancer issue.

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3
akumbalakandy
Enthusiast

Hello p0wertje,

Firstly, I don't have many skills in Tanzu or K8s. I am following the VMware docs and blogs (thanks to Shank Mohan).

I am getting the below error when I use the command kubectl get pods -o wide -n projectcontour:

Warning SyncLoadBalancerFailed 3m3s (x2 over 3m8s) service-controller Error syncing load balancer: failed to ensure load balancer: VirtualMachineService IP not found

 

And the output of kubectl describe pods <POD name> -n projectcontour is attached.

Thanks a ton again, all.

 

 

 

p0wertje
Hot Shot

Does kubectl get services -n projectcontour show an external IP on projectcontour-envoy?

It might be this (taken from Shank's website):

The service name needs to be correct or else the deployment will fail almost immediately. From what I can see, a DNS lookup is performed with the FQDN entered, and whatever IP address comes back is added as the ingress / contour / envoy load balancer IP. So ensure you create an appropriate DNS entry and IP address. In my case, I have created a DNS entry in the vip-tkg range.

The service name you use when installing NAPP needs to resolve to an IP that is in the VIP range you configured (I assume you use AVI, since you are following the blog):
NSX Application Platform Part 3: NSX-T, NSX-ALB (Avi), and Tanzu (lab2prod.com.au)
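As a rough sanity check of that point, something like the snippet below compares what DNS returns for the service name against the VIP range. The FQDN and prefix are placeholders for illustration, not values from your environment:

```shell
# Hypothetical check: does the NAPP interface service name resolve
# into the Avi VIP range? Substitute your own FQDN and prefix.
FQDN="napp.tanzu.lab"          # placeholder service name
VIP_PREFIX="172.25.92."        # vip-tkg range mentioned in this thread

# First address DNS returns for the name (empty if no record exists).
ip=$(getent hosts "$FQDN" | awk '{print $1; exit}')

case "$ip" in
  "$VIP_PREFIX"*) echo "OK: $FQDN -> $ip is inside the VIP range" ;;
  *)              echo "CHECK DNS: $FQDN -> '${ip:-no record}'" ;;
esac
```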

Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
akumbalakandy
Enthusiast

Hello,

Yes, I do see the external IP when I use the kubectl get services command.

akumbalakandy_0-1656335002065.png

Also, the VIP is being created in the AVI. 

akumbalakandy_1-1656335048856.png

And this is in the vip-tkg range

akumbalakandy_2-1656335091402.png

akumbalakandy_3-1656335106356.png

When I use kubectl describe services -n projectcontour projectcontour-envoy, I can see the below error. I am not sure if this is causing the issue:

akumbalakandy_0-1656335597278.png

 

 

 

 

p0wertje
Hot Shot

What does AVI say about why it is down?

Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
akumbalakandy
Enthusiast

Hi,

Because it is unable to reach the back-end pool members.

akumbalakandy_0-1656339315761.png

 

p0wertje
Hot Shot

The service looks fine.
The warning you get is the first event it shows; then it moves to "ensuring" and finally "ensured".
How are the pods?

kubectl get pods -o wide -n projectcontour


Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
akumbalakandy
Enthusiast

The pods look OK.

 

akumbalakandy_0-1656339659467.png

 

p0wertje
Hot Shot

I am running out of ideas a bit.

Maybe a stupid question (but I have to ask): could it be the NSX firewall blocking things?

Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
akumbalakandy
Enthusiast

Hello, 

I had the same doubt, so I moved the networking from NSX-T to vCenter. Still the same. Now I have an any-to-any allow policy in NSX-T.

shank89
Expert

If you are using AVI for the LB, and name resolution / IPAM in AVI and the VIP / workload are reachable, did you ensure that the static route in AVI is correctly configured?

It sounds like a routing issue.

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3
akumbalakandy
Enthusiast

Hello Shank,

The routing is in place. The VS and VIP are working fine for the workload cluster. I can reach and manage the workload cluster through the Avi VIP.

 

akumbalakandy_0-1656392794664.png

172.25.93.0/24 --> this is my workload segment

172.25.92.1 --> this is the DG for VIP network. 

 

I am getting the below error in Contour. It looks like a communication issue within the service network.

akumbalakandy_1-1656392927085.png

Can you please advise how to get the logs specific to the Contour pods? As I said before, I have very limited knowledge of K8s.

Thanks a lot. 

 

p0wertje
Hot Shot

You can get the logs by using:

kubectl logs <POD name> -n projectcontour
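If the pod restarts before you can read anything, a couple of variations help. The label and flag below assume the stock Contour manifests, which label the contour pods app=contour; adjust if your deployment differs:

```shell
# Logs from all contour pods at once, via the label the stock manifests use:
kubectl logs -n projectcontour -l app=contour --all-containers

# If a container already crashed, ask for the previous instance's logs:
kubectl logs <POD name> -n projectcontour --previous
```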

Cheers,
p0wertje | VCIX6-NV | JNCIS-ENT | vExpert
Please kudo helpful posts and mark the thread as solved if solved
akumbalakandy
Enthusiast

Hello,

I am attaching the Contour pod logs. It looks like some internal error.

 

 

shank89
Expert

To me it still seems like a routing / network issue, based on the error you are getting. What topology are you deploying (NSX / vDS / AVI)? Have you tested connectivity between nodes and ensured all the basics for NSX functionality are in place?

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3
akumbalakandy
Enthusiast

Hello Shank,

Yes, I do have the basic NSX-T infrastructure working properly. I can reach the Supervisor cluster VIP from the external network; the Supervisor cluster VIP is connected to an NSX-T segment. I can also reach the VIP of the new TKC deployed for NAPP. The VIP for NAPP is being created in AVI, but the back-end pool deployment is not completing.

 

 

 

 
