I’m struggling to deploy the NSX Application Platform in our environment. The deployment prechecks all pass, but the deployment consistently fails at the ‘Registering Platform’ step. Admittedly I’m a newbie when it comes to Tanzu and k8s, but I’m hoping someone can point me in the right direction.
I have deployed a Tanzu CE cluster with 3 control plane nodes and 3 worker nodes, all meeting the spec required to deploy NSX Intelligence (16 CPUs, 64 GB RAM, 1 TB disk). Kube VIP + Antrea are used for networking.
MetalLB has been configured to provide an entry point for the service name / fqdn. It has been given a pool of 15 addresses, and I have configured 2 A records to point to the first two addresses from this range:
Service name - nsx-application-platform.domain.com
Messaging Service Name - nsx-application-platform-msn.domain.com
(To be honest, I’m not exactly clear what the ‘messaging service name’ is; it seems to be new with NSX 3.2.x. I’m also just taking it on faith that the deployment will somehow assign the correct IPs from the MetalLB pool to correspond with the A records I have created...)
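Rather than taking the IP assignment on faith, it can be checked by comparing the A records against the external IPs that MetalLB has actually handed to LoadBalancer Services. The following is a minimal sketch, not a definitive tool. It assumes `kubectl` is on the PATH and pointed at the Tanzu cluster, and all four function names are made up for illustration.

```python
from __future__ import annotations

import json
import socket
import subprocess


def load_balancer_ips(svc_list: dict) -> dict[str, list[str]]:
    """Map 'namespace/name' -> external IPs for every LoadBalancer Service.

    Expects the parsed output of `kubectl get svc --all-namespaces -o json`.
    A Service whose IP is still pending maps to an empty list.
    """
    result: dict[str, list[str]] = {}
    for item in svc_list.get("items", []):
        if item.get("spec", {}).get("type") != "LoadBalancer":
            continue
        meta = item["metadata"]
        ingress = item.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        key = f"{meta['namespace']}/{meta['name']}"
        result[key] = [entry["ip"] for entry in ingress if "ip" in entry]
    return result


def unmatched_records(dns_records: dict[str, str],
                      lb_ips: dict[str, list[str]]) -> dict[str, str]:
    """Return the FQDN -> IP records whose IP no LoadBalancer Service holds."""
    assigned = {ip for ips in lb_ips.values() for ip in ips}
    return {fqdn: ip for fqdn, ip in dns_records.items() if ip not in assigned}


def fetch_services() -> dict:
    """Hypothetical helper: read all Services from the current kubectl context."""
    out = subprocess.run(
        ["kubectl", "get", "svc", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def resolve(fqdns: list[str]) -> dict[str, str]:
    """Hypothetical helper: resolve each FQDN to its first IPv4 address."""
    return {name: socket.gethostbyname(name) for name in fqdns}
```

For example, `unmatched_records(resolve(["nsx-application-platform.domain.com", "nsx-application-platform-msn.domain.com"]), load_balancer_ips(fetch_services()))` returns an empty dict when both A records point at addresses that are currently assigned. If `load_balancer_ips` itself comes back empty, no LoadBalancer Service has been created (or kept) by the deployment, which is worth knowing in its own right.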
For context, I’ve been using this chap’s guide, and found it very helpful - https://lumberjackwizard.com/2022/03/09/deploying-nsx-application-platform-part-six-metallb/
Aside from the obvious symptom / error of ‘NSX Application Platform Registration failed’ during deployment, the only other errors I can see are these, which occur on the MetalLB speaker pods:
Events:
  Type     Reason     Age  From     Message
  ----     ------     ---- ----     -------
  Warning  Unhealthy  45m  kubelet  Liveness probe failed: Get "http://10.50.16.169:7472/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  45m  kubelet  Readiness probe failed: Get "http://10.50.16.169:7472/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
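To tell a one-off probe timeout from a recurring one, the events in the namespace can be tallied per pod. This is a rough sketch under stated assumptions: it expects the parsed output of `kubectl get events -n metallb-system -o json`, and the function name `summarize_probe_failures` is made up for illustration. Events are normally kept for only about an hour by default, so an empty result means nothing recent, not that there has never been a problem.

```python
from __future__ import annotations

from collections import Counter


def summarize_probe_failures(event_list: dict) -> dict[str, Counter]:
    """Count liveness/readiness probe failures per pod from an Event list."""
    summary: dict[str, Counter] = {}
    for event in event_list.get("items", []):
        # kubelet reports failed probes with reason "Unhealthy".
        if event.get("reason") != "Unhealthy":
            continue
        message = event.get("message", "")
        if message.startswith("Liveness probe failed"):
            kind = "liveness"
        elif message.startswith("Readiness probe failed"):
            kind = "readiness"
        else:
            kind = "other"
        pod = event.get("involvedObject", {}).get("name", "<unknown>")
        # "count" can be missing or null; treat that as a single occurrence.
        summary.setdefault(pod, Counter())[kind] += event.get("count") or 1
    return summary
```

One way to feed it is to save the kubectl output to a file and pass `json.load(open(path))` to the function. A single speaker with a count of 1, as in the events above, points to a transient timeout, while rising counts across several speakers would suggest the nodes are slow to answer on port 7472.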
That said, all the pods in the metallb namespace look like they are running ok:
>kubectl get pods -n metallb-system
NAME                          READY   STATUS    RESTARTS   AGE
controller-66445f859d-589zw   1/1     Running   0          20h
speaker-c6gqt                 1/1     Running   0          20h
speaker-dnrbh                 1/1     Running   0          20h
speaker-ncpcl                 1/1     Running   0          20h
speaker-qg6zz                 1/1     Running   0          20h
speaker-qt7mw                 1/1     Running   0          20h
speaker-r6kgs                 1/1     Running   0          20h
I appreciate these scenarios are very difficult to diagnose and troubleshoot - but I’d really appreciate any pointers you could throw my way!
Thanks in advance