VMware Networking Community
June9858
Contributor

NSX-T Application Platform Deployment Failed (Failed to install NSX Application Platform chart)

Hello,

I'm trying to deploy NAPP, but the deployment fails with the message "Failed to install NSX Application Platform chart":

June9858_0-1685673141223.png

June9858_0-1685672353669.png

June9858_1-1685672379424.png

I passed the pre-checks and tried to install from the VMware repository. I would like some advice on what I should check.

I'm using NSX 4.1 and TKG 1.21.

best regards

19 Replies
shank89
Expert

Did you look at the logs for the crashed pods and the napps.log on the manager?

 

We'll need more information to assist.
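
A minimal sketch of one way to list the pods worth inspecting, assuming the output of `kubectl get pods -A -o json` has been saved to a string. The function name `unhealthy_pods` and the sample pod names are hypothetical; only the standard Pod status fields are relied on.

```python
import json


def unhealthy_pods(pod_list_json: str) -> list:
    """Return (namespace, pod, reason, restarts) for pods that are not ready."""
    rows = []
    for pod in json.loads(pod_list_json).get("items", []):
        meta = pod["metadata"]
        status = pod.get("status", {})
        statuses = status.get("containerStatuses", [])
        # A pod that was never scheduled has no container statuses yet.
        if not statuses and status.get("phase") == "Pending":
            rows.append((meta["namespace"], meta["name"], "Pending", 0))
            continue
        for cs in statuses:
            if cs.get("ready"):
                continue
            state = cs.get("state", {})
            # Waiting/terminated states carry a short reason, e.g. CrashLoopBackOff.
            detail = state.get("waiting") or state.get("terminated") or {}
            reason = detail.get("reason", "NotReady")
            if reason == "Completed":
                continue  # finished job containers are not a problem
            rows.append((meta["namespace"], meta["name"], reason, cs.get("restartCount", 0)))
    return rows


if __name__ == "__main__":
    # Hypothetical sample; real input would come from kubectl.
    sample = json.dumps({"items": [{
        "metadata": {"namespace": "nsxi-platform", "name": "example-pod"},
        "status": {"containerStatuses": [{
            "ready": False, "restartCount": 3,
            "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        }]},
    }]})
    for row in unhealthy_pods(sample):
        print(*row)
```

For each pod it reports, `kubectl describe pod` and `kubectl logs --previous` in that namespace are the usual next step.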

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3

June9858
Contributor

Hello,
Thanks for the help.

I didn't know which pod belongs to Helm, so I only checked the events of the crashed pods.

monitor-5c54d9494b-dxkbn

June9858_0-1685675021156.png

postgresql-ha-pgpool-6965f8dc6c-7btnl

June9858_3-1685675176851.png

June9858_1-1685675091797.png

cluster-api-75479ffd94-z76n6

June9858_2-1685675132302.png

common-agent-669fbf8d57-x6hll

June9858_4-1685675182066.png

regards

 

shank89
Expert

What does the network setup look like? Are there any firewall, MTU, or routing issues?

It seems like it may be a networking problem. What percentage does the deployment reach before it fails?

Are the ingress IPs being created?

What topology are you deploying NAPP into?

June9858
Contributor

Hello,

The installation hangs at 40% (platform distribution), and the error occurs after about 10 minutes.

I'm deploying to a Supervisor environment with NSX, using the following topology:

June9858_0-1685679455563.png

There seems to be no problem with routing or the firewall. The MTU is set to jumbo frames (9000).
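
Since MTU came up: one common way to confirm that jumbo frames work end to end is a don't-fragment ping sized to the MTU. The arithmetic is sketched below, assuming plain IPv4 without IP options; the helper name `max_ping_payload` is made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, max_ping_payload(mtu))
```

For an MTU of 9000 this gives a payload of 8972 bytes, which would be used as `ping -M do -s 8972 <target>` on Linux or `vmkping -d -s 8972 <target>` on ESXi. If that ping fails while a smaller one succeeds, some hop on the path is not passing jumbo frames.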

Connectivity to the external IP (ingress), which can be checked with kubectl get services -n projectcontour, is confirmed to be working.

June9858_1-1685679940810.png
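
To answer the "are the ingress IPs being created" question without a screenshot, the same service listing can be read as JSON. This is a rough sketch that assumes the output of `kubectl get services -n projectcontour -o json` is available as a string; `external_ips` and the sample service are hypothetical names and values.

```python
import json


def external_ips(service_list_json: str) -> dict:
    """Map each LoadBalancer service to the addresses it has been assigned."""
    result = {}
    for svc in json.loads(service_list_json).get("items", []):
        if svc.get("spec", {}).get("type") != "LoadBalancer":
            continue
        ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        # An empty list means the load balancer has not assigned an address yet.
        result[svc["metadata"]["name"]] = [
            entry.get("ip") or entry.get("hostname") for entry in ingress
        ]
    return result


if __name__ == "__main__":
    # Hypothetical sample; the address is from a documentation range.
    sample = json.dumps({"items": [{
        "metadata": {"name": "example-lb"},
        "spec": {"type": "LoadBalancer"},
        "status": {"loadBalancer": {"ingress": [{"ip": "192.0.2.10"}]}},
    }]})
    print(external_ips(sample))
```

A service that maps to an empty list is still pending, which would point at the load balancer integration rather than at NAPP itself.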

Please let me know if I am doing something wrong 

 

shank89
Expert

Have you checked the napps log yet?

Can you please detail your entire IP scheme, including what is used for the managers?

June9858
Contributor

Sorry for the late reply. I'm a bit clumsy because this is my first time deploying NAPP.

Where can I find the napps log? It doesn't seem to exist in the manager's /var/log.

June9858_0-1685682846092.png

I will organize the IP scheme and respond right away.


 
shank89
Expert

It's in the proton directory, which from memory is /var/log/VMware/proton.

June9858
Contributor

Thanks for the kind explanation. I found the following in the logs, and I think it matches my current problem.

#######


2023-06-01 09:47:59,624 ERROR nsx_kubernetes_lib.vmware.kubernetes.common.utility[36]:execute Error executing command ['kubectl', 'delete', 'namespaces', 'projectcontour', '--timeout=5m', '--kubeconfig=/config/vmware/napps/.kube/config', '-n', 'projectcontour'], 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "projectcontour" not found\n'
2023-06-01 09:47:59,624 WARNING nsx_kubernetes_lib.vmware.kubernetes.service.kubectl.kubectl_117_service[151]:delete_namespace Error occurred during namespace delete: 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "projectcontour" not found\n'
2023-06-01 12:20:18,322 ERROR nsx_kubernetes_lib.vmware.kubernetes.common.utility[36]:execute Error executing command ['helm', 'install', 'cert-manager', '/config/vmware/napps/charts/cert-manager', '-f', '/config/vmware/napps/charts/cert-manager/values.yaml', '--set', 'webhook.image.registry=https://projects.registry.vmware.com/nsx_application_platform/clustering', '--set', 'image.registry=https://projects.registry.vmware.com/nsx_application_platform/clustering', '--set', 'postinstall.image.registry=https://projects.registry.vmware.com/nsx_application_platform/clustering', '--set', 'cainjector.image.registry=https://projects.registry.vmware.com/nsx_application_platform/clustering', '--kubeconfig=/config/vmware/napps/.kube/config', '-n', 'cert-manager', '--registry-config', '/image/napps/.config/helm/registry.json', '--repository-cache', '/image/napps/.cache/helm/repository', '--repository-config', '/image/napps/.config/helm/repositories.yaml', '--create-namespace'], 'W0601 12:15:15.477211 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.480765 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.484388 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.487031 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.995478 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.997144 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.998379 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.999588 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nError: INSTALLATION FAILED: failed post-install: timed out waiting for the condition\n'
2023-06-01 12:20:18,322 ERROR __main__[62]:main Error executing function deploy_chart. Error message: W0601 12:15:15.477211 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.480765 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.484388 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.487031 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.995478 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.997144 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.998379 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nW0601 12:15:15.999588 3860308 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+\nError: INSTALLATION FAILED: failed post-install: timed out waiting for the condition\n
2023-06-02 00:11:37,042 ERROR nsx_kubernetes_lib.vmware.kubernetes.common.utility[36]:execute Error executing command ['kubectl', 'delete', 'namespaces', 'nsxi-platform', '--timeout=10m', '--kubeconfig=/config/vmware/napps/.kube/config', '-n', 'nsxi-platform'], 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "nsxi-platform" not found\n'
2023-06-02 00:11:37,043 WARNING nsx_kubernetes_lib.vmware.kubernetes.service.kubectl.kubectl_117_service[151]:delete_namespace Error occurred during namespace delete: 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "nsxi-platform" not found\n'
2023-06-02 00:11:38,069 ERROR nsx_kubernetes_lib.vmware.kubernetes.common.utility[36]:execute Error executing command ['kubectl', 'delete', 'namespaces', 'projectcontour', '--timeout=5m', '--kubeconfig=/config/vmware/napps/.kube/config', '-n', 'projectcontour'], 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "projectcontour" not found\n'
2023-06-02 00:11:38,069 WARNING nsx_kubernetes_lib.vmware.kubernetes.service.kubectl.kubectl_117_service[151]:delete_namespace Error occurred during namespace delete: 'warning: deleting cluster-scoped resources, not scoped to the provided namespace\nError from server (NotFound): namespaces "projectcontour" not found\n'
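
The entries above are hard to read because the repeated deprecation warnings bury the final message. A small sketch for pulling out only the fragments that begin with "Error" from one napps.log entry follows; the function name `error_fragments` is hypothetical, and it assumes, as in the excerpt above, that the log writes newlines as a literal backslash-n.

```python
def error_fragments(log_entry: str) -> list:
    """Return the fragments of one napps.log entry that begin with 'Error'."""
    # The captured command output has its newlines written as a literal backslash-n.
    fragments = log_entry.split("\\n")
    cleaned = [fragment.strip(" '") for fragment in fragments]
    # Warning fragments start with a prefix such as 'W0601', so they drop out here.
    return [fragment for fragment in cleaned if fragment.startswith("Error")]


if __name__ == "__main__":
    # Shortened copy of the deploy_chart entry above (one warning kept).
    entry = (
        "2023-06-01 12:20:18,322 ERROR __main__[62]:main Error executing function "
        "deploy_chart. Error message: W0601 12:15:15.477211 3860308 warnings.go:70] "
        "policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+"
        "\\nError: INSTALLATION FAILED: failed post-install: timed out waiting for the condition\\n"
    )
    for fragment in error_fragments(entry):
        print(fragment)
```

Applied to the cert-manager entries, this leaves only "Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition", so the post-install step of the cert-manager chart is where the deployment timed out.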

shank89
Expert

Looks like it. Is this a production or supported environment?

I've seen success with mucking around with TKRs to get past this issue, or with looking into role bindings. GSS might be a good place to start.

June9858
Contributor

Unfortunately, this is my test environment.

I'm so confused about where to start.

shank89
Expert

The log file points to an issue with PodSecurityPolicy. Which TKR version are you using exactly?

June9858
Contributor

Sorry for the late reply. According to the guide, I am using version 1.12.6.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.2/nsx-application-platform/GUID-D54C1B87-8EF3-...

 

shank89
Expert

Did you create the service account and role binding when creating the non-expiring token?

June9858
Contributor

Yes, I created the account, token, and kubeconfig file by referring to the KB below.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.2/nsx-application-platform/GUID-52A52C0B-9575-...

shank89
Expert

Do any of the nodes have taints that prevent pods from being scheduled?

Which form factor are you deploying, and what resources have you configured?
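
As a rough illustration of the taint check, the sketch below lists nodes carrying a taint that blocks scheduling, assuming the output of `kubectl get nodes -o json` is available as a string. The function name `blocking_taints` and the sample node names are hypothetical.

```python
import json

# Taint effects that keep pods without a matching toleration off a node.
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


def blocking_taints(node_list_json: str) -> dict:
    """Map node name to its scheduling-blocking taints, formatted key=value:effect."""
    result = {}
    for node in json.loads(node_list_json).get("items", []):
        taints = node.get("spec", {}).get("taints") or []
        blocking = [
            f'{t["key"]}={t.get("value", "")}:{t["effect"]}'
            for t in taints
            if t.get("effect") in BLOCKING_EFFECTS
        ]
        if blocking:
            result[node["metadata"]["name"]] = blocking
    return result


if __name__ == "__main__":
    # Hypothetical sample; real input would come from kubectl.
    sample = json.dumps({"items": [
        {"metadata": {"name": "example-control-plane"},
         "spec": {"taints": [{"key": "node-role.kubernetes.io/master",
                              "effect": "NoSchedule"}]}},
        {"metadata": {"name": "example-worker"}, "spec": {}},
    ]})
    print(blocking_taints(sample))
```

Control plane nodes usually carry a NoSchedule taint by design, so the thing to look for is a taint on the worker nodes that NAPP is expected to run on.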

June9858
Contributor

I proceeded with best-effort-large. I will check the taints and share the results.

shank89
Expert

Pay close attention to the resource requirements here and everything else required for the platform as well.

https://docs.vmware.com/en/VMware-NSX/4.1/nsx-application-platform/GUID-85CD2728-8081-45CE-9A4A-D72F...

June9858
Contributor

Aha! Thank you. I will try again referring to the KB and share the results.
