VMware Networking Community
TryllZ
Expert

NSX-T 4.0.0.1 Edge Transport Node registration timed out, only 1 of 2 nodes facing this issue?

Hi All,

This is my 3rd try to deploy a 2nd Edge Node but for some reason it keeps timing out.

I have deployed both edge nodes more than once just to confirm, this only happens with the 2nd node.

What I don't understand is why it fails only on the second one.

Any thoughts on where to start troubleshooting?

[Attached screenshot: TryllZ_0-1690291332930.png]

Thank You

9 Replies
TryllZ
Expert

I tried to edit some settings on the node and it shows the error in the screenshot below.

[Attached screenshot: TryllZ_1-1690291702643.png]

TryllZ
Expert

Somebody had suggested changing the accept/reject security settings on the Distributed Switch. I tried setting all 3 to Reject, which did not resolve the issue, so I went into the logs and found the following.

The one that registers successfully shows:

Node successfully registered as Fabric Node: dd279991-b731-4a57-9b93-8d869b52078e 

cmd: su admin -c join management-plane 10.10.15.6:443 thumbprint b81fe78dab07ea11aa62358eac55a575311972ed693e818e4a544c165ab5d86f token <obfuscated> node-uuid dd279991-b731-4a57-9b93-8d869b52078e

While the one that fails shows:

NAPI server is not ready, unable to register with manager.

Something that came to mind is the UUID, since I ran into this long ago when cloning nested ESXi VMs, where VMware Workstation would not generate a new UUID.

So I checked the UUID of both nodes.

The working node has a UUID

edge1> get node-uuid
Tue Jul 25 2023 UTC 18:54:26.140
uuid: dd279991-b731-4a57-9b93-8d869b52078e

The failed node does not have a UUID at all

edge2> get node-uuid
Tue Jul 25 2023 UTC 18:54:04.293
uuid: None

I need to try joining manually from the Edge.
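The comparison above can be scripted when several Edge nodes need checking. Below is a minimal Python sketch, assuming the `get node-uuid` output and the registration log line look exactly like the ones quoted in this post. The function names `parse_node_uuid` and `registered_uuid_from_log` are made up for illustration and are not part of any NSX tooling.

```python
import re
from typing import Optional

# Canonical lower-case UUID, as it appears in the log line quoted in this thread.
UUID_RE = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"


def parse_node_uuid(cli_output: str) -> Optional[str]:
    """Return the UUID from `get node-uuid` output, or None if the node has none."""
    match = re.search(r"^uuid:\s*(\S+)\s*$", cli_output, re.MULTILINE)
    if match is None or match.group(1) == "None":
        return None
    return match.group(1)


def registered_uuid_from_log(log_text: str) -> Optional[str]:
    """Return the UUID from the 'successfully registered' log line, if present."""
    match = re.search(
        r"Node successfully registered as Fabric Node:\s*(" + UUID_RE + r")",
        log_text,
    )
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample outputs copied from the two Edge nodes in this thread.
    edge1 = "Tue Jul 25 2023 UTC 18:54:26.140\nuuid: dd279991-b731-4a57-9b93-8d869b52078e\n"
    edge2 = "Tue Jul 25 2023 UTC 18:54:04.293\nuuid: None\n"
    print("edge1:", parse_node_uuid(edge1))
    print("edge2:", parse_node_uuid(edge2))
```

A node whose CLI reports `uuid: None` and whose log has no "successfully registered" line never completed registration, which matches what edge2 shows here.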

NicoRenard
Enthusiast

Hello!

Strange error...

What I have noticed on older versions is that when communication with the manager is not working well, the customisation of the Edge does not finish.

- Have you tried to validate with curl that communication is possible between your Edge and the manager on port 443?

- Have you also tried to deploy a standalone Edge and attach it to the manager afterwards?

- Have you rechecked that all the prerequisites are OK for your Edge?

- No IP conflict? Is communication with DNS OK? NTP, etc.?

- Same network, no firewall in the path?

Good luck

[Attached image: NicoRenard_0-1690536134558.png]

 

CyberNils
Hot Shot

Put a VM on an Overlay Segment and make sure the tunnels come up between the Edge and the Hosts. Check with vmkping that you can get proper sized frames across. Try again with the VM running on the segment.
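On "proper sized frames": an ICMP echo carries a 20-byte IPv4 header and an 8-byte ICMP header, so the payload size passed to ping has to be the MTU minus 28 bytes. The sketch below does that arithmetic and prints a vmkping command line. The 1600-byte MTU is an assumption (a commonly used minimum for Geneve overlay; check your own uplink profile and switch MTU), the TEP address is a documentation placeholder, and the `vxlan` netstack name is what NSX-prepared ESXi hosts typically use, so verify it on your host.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ping_payload_size(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet of size `mtu`."""
    if mtu <= IPV4_HEADER + ICMP_HEADER:
        raise ValueError("MTU too small for an ICMP echo")
    return mtu - IPV4_HEADER - ICMP_HEADER


def vmkping_command(tep_ip: str, mtu: int = 1600, netstack: str = "vxlan") -> str:
    """Build a vmkping command: -d sets don't-fragment, -s sets the payload size."""
    return f"vmkping ++netstack={netstack} -d -s {ping_payload_size(mtu)} {tep_ip}"


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute the TEP IP of the Edge or another host.
    print(vmkping_command("192.0.2.10"))
```

If the ping only succeeds with a smaller `-s` value, something in the path has a lower MTU than the overlay needs.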



Nils Kristiansen
https://cybernils.net/

TryllZ
Expert

- Have you tried to validate with curl that communication is possible between your Edge and the manager on port 443?

No, I am not sure how to do this with curl.

- Have you also tried to deploy a standalone Edge and attach it to the manager afterwards?

Yes, this has been tried and it works fine. Setting up the TNP, TZ, and other configuration works fine as well, but once a Tier-0 is deployed it does not communicate with the outer firewall for BGP peering. The Edge deployed via the NSX UI does not have this issue.

- Have you rechecked that all the prerequisites are OK for your Edge?

The 1st Edge deploys fine, and both Edge nodes have exactly the same configuration, so yes.

- No IP conflict? Is communication with DNS OK? NTP, etc.?

All of this is good, checked from within the failed Edge node.

- Same network, no firewall in the path?

I will redeploy without a firewall and check again, but again, 1 node installs fine, so I am not sure why the 2nd one fails.
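On the curl question: the point of that check is only to confirm that a TCP/TLS connection to the manager on port 443 can be opened from the Edge management network. Where curl is available, something like `curl -kv https://10.10.15.6/` shows whether the handshake completes (`-k` skips certificate validation, `-v` prints connection details). An equivalent Python sketch is below, runnable from any machine on the same network. It also computes the SHA-256 fingerprint of the certificate the manager presents, which could be compared with the thumbprint in the join command from the log. That the join thumbprint is the SHA-256 of the API certificate is an assumption based on its 64 hex digits, and the function names are illustrative only.

```python
import hashlib
import socket
import ssl


def tcp_port_open(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


def sha256_thumbprint(der_cert: bytes) -> str:
    """Lower-case hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def server_thumbprint(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Fetch the server certificate without validating it and return its thumbprint."""
    context = ssl.create_default_context()
    # Validation is disabled on purpose: we only want to read the certificate.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    return sha256_thumbprint(der)
```

For example, `tcp_port_open("10.10.15.6")` should return `True` from the Edge management network if nothing is blocking port 443, and the string from `server_thumbprint("10.10.15.6")` can then be compared with the thumbprint value in the logged join command.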

TryllZ
Expert

@CyberNils 

Sorry, I could not understand what you mean by the below. What VM are you referring to? Any VM deployed for testing?

Put a VM on an Overlay Segment and make sure the tunnels come up between the Edge and the Hosts.

CyberNils
Hot Shot

Just any random VM so that you can get the Geneve tunnels up.



Nils Kristiansen
https://cybernils.net/

TryllZ
Expert

Thanks @CyberNils 

Will try that..

TryllZ
Expert

Hi All,

I can confirm this issue was due to iSCSI storage (SATA SAS with SSD, but not NVMe) over a 1G connection, combined with network traffic.

I had 8 nested ESXi hosts and multiple iSCSI storage devices.

I removed everything except 4 nested ESXi hosts and 2 iSCSI storage devices, then tested this twice. The Edge installed from the NSX Manager web UI both times without any issues.
