VMware Networking Community
lucsan
Contributor

Duplicate IPs from TEP pool, orphaned edges, NSX-T 3.2.0.1

Hi all, 

1st issue

This is the second or third time I have been getting duplicate IPs from the TEP IP pool assigned to edge nodes. In the NSX Manager UI, under Networking > IP Address Pools, the "Allocated IPs" tab shows zero allocated IPs when in fact more than 20 are in use.
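Since the UI tab cannot be trusted here, one way to confirm the duplicates is to collect the TEP IPs reported by each edge node and group them by address. The following is a minimal sketch only: the node names and addresses are hypothetical, and how the inventory is gathered (CLI, API, or by hand) is left open.

```python
from collections import defaultdict


def find_duplicate_teps(node_teps):
    """Return {ip: [node names]} for every TEP IP used by more than one node."""
    owners = defaultdict(list)
    for node, ips in node_teps.items():
        for ip in ips:
            owners[ip].append(node)
    return {ip: sorted(nodes) for ip, nodes in owners.items() if len(nodes) > 1}


# Hypothetical inventory: edge-3 was handed the same TEP IPs as edge-1.
node_teps = {
    "edge-1": ["172.16.10.11", "172.16.10.12"],
    "edge-2": ["172.16.10.13", "172.16.10.14"],
    "edge-3": ["172.16.10.11", "172.16.10.12"],
}

duplicates = find_duplicate_teps(node_teps)
for ip, nodes in sorted(duplicates.items()):
    print(f"{ip} is assigned to: {', '.join(nodes)}")
```

An empty result means every TEP IP in the inventory has a single owner.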

2nd issue

I have two edge transport nodes in a cluster with the same uplink profile and NSX VLAN-backed segment uplinks; the general MTU is 9000. Controller connectivity on edge-1 is down, while edge-2 is fine. I see a discrepancy on the edge-1 network interfaces fp-eth0 and fp-eth1: the MTU shows 1500 when it is supposed to be 9000. The edge-2 uplinks have the correct values.
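To make the discrepancy explicit, the observed values can be compared against the expected MTU per interface. This is only a sketch of the comparison: the dictionary below is typed in by hand from the values described above, and no NSX CLI or API call is assumed.

```python
def mtu_mismatches(observed, expected=9000):
    """Return {(node, interface): mtu} for every interface whose MTU differs from expected."""
    return {
        (node, iface): mtu
        for node, ifaces in observed.items()
        for iface, mtu in ifaces.items()
        if mtu != expected
    }


# Values as described in this post: edge-1 shows 1500, edge-2 shows 9000.
observed = {
    "edge-1": {"fp-eth0": 1500, "fp-eth1": 1500},
    "edge-2": {"fp-eth0": 9000, "fp-eth1": 9000},
}

mismatches = mtu_mismatches(observed)
for (node, iface), mtu in sorted(mismatches.items()):
    print(f"{node} {iface}: MTU {mtu}, expected 9000")
```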

I was looking for a command to force the MTU setting from the CLI, but I have not found one in https://vdc-download.vmware.com/vmwb-repository/dcr-public/28fdf409-4954-4ada-b9b7-63c2490af81d/aa81...

I deployed a new edge and replaced the edge cluster member with it, but the problem is still not fixed. I also redeployed the edge through the API and it is still not right. If anyone has a hint, please let me know.

3rd issue 😞

I still have 2 old deleted edges listed under Edge Transport Nodes in the UI, and neither the API nor the UI delete action will remove them. I wanted to try the Corfu DB trick, but I want to be sure it is a last-resort thing.

Many Thanks.

Lucian

5 Replies
engyak
Enthusiast

First off, obligatory statement. You need to contact GSS if you have this many issues in production.

If not production:

Issue #1: I actually have the same issue on 3.2.1.0, where I cannot view TEP allocations. Mine are created in Policy (I'd recommend you do the same), and if I switch to Manager mode the usage is visible.

Issue #2:

  • Are those edges in sync with the manager?
  • Is the MTU on the host (uplink) VDS set to 9000?
  • How does this guy look for you?
    • [screenshot: engyak_0-1655737470619.png]

Issue #3:

I have seen this issue with 3.2.0/3.2.0.1/3.2.1, and the force delete option has been deprecated, along with the Corfu client required to do the trick. For my home lab I'm going to wait a bit and see if these options become available later...

lucsan
Contributor

Thanks a lot for your reply, 

It is a POC for now. I think there were a few bugs in the 3.2.0 version, and some of the config mappings between vCenter and NSX Manager were off.

The MTU in Global Fabric Settings was 9000, and on the switch infrastructure too.

I have uninstalled everything and redeployed with the new 3.2.1 version; we will see how it goes.

Many Thanks Again. 

 

Bengtsonb
Enthusiast

Thanks for the information; I will try to find out more. Keep sharing such informative posts.

kanecharles92
Contributor

Hi @lucsan,

I wanted to reply to this thread and let you know that we have (twice now) experienced the symptom outlined in the first half of your "1st issue", i.e. duplicate TEP IPs being handed out to Edge Nodes.

The first time this issue occurred we did not have logs that went back far enough to allow GSS/Engineering to identify/resolve the issue. The second time it occurred we (luckily) had the relevant logs saved from a previous SR that then allowed Engineering to do a log analysis/code walk to identify the root cause.

This issue was introduced with 3.2.0 and affects all 3.2.0, 3.2.1, and 3.2.2 releases. It will be fixed in 3.2.3 and 4.1.1.0.

This issue only affects Edge Nodes; it does not affect TEP IP allocation for other Host TN types.

The issue occurs when a host switch (N-VDS) name change occurs while an Edge Node is in Maintenance Mode. Upon releasing the Edge Node from Maintenance Mode, the issue is primed and will occur the next time you deploy a fresh Edge Node, i.e. the new Edge Node will be handed the same IPs as the ones assigned to the existing Edge Node that had the N-VDS name change.

The workaround is to ensure that you aren't doing any operations where the N-VDS host switch name will change while the Edge Node is in maintenance mode. Some examples would be:

  • An Edge redeploy operation where the message body is modified to have a new N-VDS name (automatically enters MM)
  • An Edge replacement operation where the new Edge Node has a different N-VDS name to the Edge that is being replaced (automatically enters MM)
  • Reconfiguring the Edge N-VDS (perhaps changing uplink profile, etc) while in MM

There are probably more examples, however the above ones were discussed with Engineering as triggers.
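The workaround above could be turned into a simple pre-check before a redeploy or replace call: compare the host switch names in the Edge Node's current transport node body with the ones in the body you are about to send. This is a rough sketch, not something from the KB. The field names (`host_switch_spec`, `host_switches`, `host_switch_name`) reflect my understanding of the transport node body and should be verified against the API reference for your version; the two bodies and switch names below are hypothetical.

```python
def host_switch_names(transport_node):
    """Collect the host switch names from a transport node body (a dict)."""
    spec = transport_node.get("host_switch_spec") or {}
    return {
        hs["host_switch_name"]
        for hs in spec.get("host_switches", [])
        if "host_switch_name" in hs
    }


def nvds_name_change(current, proposed):
    """Return (removed, added) host switch names between two bodies.

    Both lists are empty when the set of names is unchanged.
    """
    before = host_switch_names(current)
    after = host_switch_names(proposed)
    return sorted(before - after), sorted(after - before)


# Hypothetical bodies, trimmed to the fields this check reads.
current = {"host_switch_spec": {"host_switches": [{"host_switch_name": "nvds-edge"}]}}
proposed = {"host_switch_spec": {"host_switches": [{"host_switch_name": "nvds-edge-new"}]}}

removed, added = nvds_name_change(current, proposed)
if removed or added:
    print(f"Host switch name change detected: removed={removed}, added={added}")
```

If either list is non-empty, the operation would change the N-VDS name and is best avoided while the Edge Node is in maintenance mode.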

VMware have put together KB 91378 for this.

For the intervening period until 3.2.3 and 4.1.1.0 are made available, we will need to ensure that no Edge N-VDS name changes are made where IP Pools are being used for TEP IP allocation.

Hopefully this helps anyone coming across this issue as well.

For reference, when engaging with support should you have the same issue, we had SR 23405990302 and PR 3104308.

 

Regards,

Kane.

kanecharles92
Contributor

Additionally, for the second part of problem #1, this is a known issue and will not be fixed in the 3.2.x stream. It is fixed in 4.1.0.

For reference, our SR# was 22364583509.

For now, you will need to use the Manager Mode view for seeing assigned IPs from the IP Pools. Whilst this is frustrating, at least it is fixed in 4.1.0.

Regards,

Kane.
