Achmo
Contributor

Issues between newly created T1 and T0

Hello all,

I am having a weird issue and I would like your assistance to solve it.
So, we recently purchased a private cloud solution that comes with vSphere and NSX-T.
The provider created a single T0 gateway that connects to their infrastructure for Internet connectivity, a T1 gateway (let's call it T1-A), and a couple of segments, one of which (let's call it segment-A) we can use to give a VM Internet connectivity.
T1-A also comes with an SNAT rule, and all the firewalls allow everything.
We created a new VM (VM-A), attached it to segment-A, configured the correct IPs, and everything worked straight away: we can ping, install updates, etc.

Now, we created our own segment (segment-B) and a new T1 (T1-B), then linked segment-B to T1-B and T1-B to the T0. We added the SNAT rules and checked that the firewall and configuration match the working ones, yet a VM on segment-B (VM-B) cannot ping anything outside our infrastructure.
We tried connecting segment-B to T1-A instead (after creating on T1-A the same SNAT rules we had created on T1-B) and everything works, so we deduced that the fault lies in T1-B.

We checked the NAT, the firewall, and even the configuration (since we can compare it with the working example), and everything seems the same.
Now here is where the issue gets even weirder:
Between each T1 and the single T0, we use 100.64.0.0/16 (which I believe is the default).
When we ping from VM-B the T0's IP on that subnet (in our example, 100.64.32.4), it does not work, whereas from VM-A we can ping its corresponding IP (100.64.32.0).
To make things even weirder, we used the Traceflow tool within NSX, and it says that both VM-A and VM-B can reach Google's IP (8.8.8.8), but in reality we cannot ping it from VM-B.
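
For anyone wanting to sanity-check the two transit addresses mentioned above, here is a minimal sketch using Python's standard ipaddress module. It assumes each T0-T1 link is a small point-to-point subnet (a /31 here, which is an assumption and not something confirmed in this thread) carved out of 100.64.0.0/16. The function name describe_link is made up for illustration.

```python
import ipaddress

# Transit pool between each T1 and the T0, as quoted in the post.
TRANSIT_POOL = ipaddress.ip_network("100.64.0.0/16")


def describe_link(t0_side_ip: str, prefix_len: int = 31) -> dict:
    """Describe the assumed point-to-point subnet behind a T0-side link IP.

    prefix_len=31 is an assumption (one two-address subnet per T0-T1 link).
    """
    ip = ipaddress.ip_address(t0_side_ip)
    # strict=False lets us pass a host address and get its enclosing subnet.
    link = ipaddress.ip_network(f"{t0_side_ip}/{prefix_len}", strict=False)
    # Every other address in the link subnet is a candidate T1-side peer.
    peers = [str(addr) for addr in link if addr != ip]
    return {
        "in_pool": ip in TRANSIT_POOL,
        "link": str(link),
        "peers": peers,
    }


if __name__ == "__main__":
    for name, ip in (("T1-A", "100.64.32.0"), ("T1-B", "100.64.32.4")):
        print(name, describe_link(ip))
```

This only checks the arithmetic: both addresses sit inside the transit pool and, under the /31 assumption, belong to separate links. It says nothing about whether the T1-B link is actually forwarding traffic.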

Do you guys have any suggestions on what to check next?

Thank you.

shank89
Expert

Route advertisement on the T1 is often forgotten. Have you made sure that SNAT, connected segments, and anything else you need are actually being advertised?
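
Rather than comparing two gateways by eye, it can help to diff their settings mechanically. The following is a minimal sketch, assuming you can get each T1's settings into a nested dict (for example from an export); the helper name diff_settings and the field names in the sample data are hypothetical placeholders, not taken from this environment or from the NSX API.

```python
def diff_settings(working: dict, broken: dict, path: str = "") -> list:
    """Return human-readable differences between two nested settings dicts."""
    diffs = []
    for key in sorted(set(working) | set(broken)):
        here = f"{path}.{key}" if path else key
        if key not in broken:
            diffs.append(f"{here}: only on working gateway")
        elif key not in working:
            diffs.append(f"{here}: only on broken gateway")
        else:
            a, b = working[key], broken[key]
            if isinstance(a, dict) and isinstance(b, dict):
                # Recurse into nested sections.
                diffs.extend(diff_settings(a, b, here))
            elif isinstance(a, list) and isinstance(b, list):
                # Treat lists as unordered sets of flags.
                missing = sorted(set(map(str, a)) - set(map(str, b)))
                extra = sorted(set(map(str, b)) - set(map(str, a)))
                if missing or extra:
                    diffs.append(
                        f"{here}: missing on broken {missing}, "
                        f"extra on broken {extra}"
                    )
            elif a != b:
                diffs.append(f"{here}: {a!r} != {b!r}")
    return diffs


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    t1_a = {"advertise": ["connected", "nat"], "tier0": "T0"}
    t1_b = {"advertise": ["connected"], "tier0": "T0"}
    for line in diff_settings(t1_a, t1_b):
        print(line)
```

An empty result would mean the two exports really are identical, which points the investigation away from the T1 settings themselves.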

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3
Achmo
Contributor

Hello,

Thanks for the swift reply. Yes, I followed the same configuration as the working T1 (T1-A).
Here is a screenshot:

Achmo_0-1624613521368.png


It's really confusing because T1-A works and T1-B, with the same configuration, does not, so I am a bit lost.

 

shank89
Expert

Can you get east-west traffic working, e.g. a VM on segment-B pinging a VM on segment-A?

Achmo
Contributor

Hello,

East-west is not working in the current setup. The only way the two VMs can ping each other is if I connect both of them to the same T1.
Otherwise, it fails.
However, if I use the Port Connectivity Tool, it shows that there is (or there should be) port connectivity between the two:

connectivity.png

The Traceflow tool also says that the packet is delivered.
Just so I understand: do these tools infer that something works based on the configuration, or do they actually inject and follow real traffic?
I find it very weird that, for example, the VMs cannot ping each other, yet Traceflow says the packet is delivered without any issues. The same happens when I try to ping Google: Traceflow says the packet is delivered on the uplink, but no reply ever arrives.

 

shank89
Expert

So it looks like you have workloads on different host transport nodes. If that is the case, are your GENEVE tunnels up?
Have you tested pings from TEP to TEP using jumbo frames? From an ESXi host: vmkping ++netstack=vxlan <destTEPIP> -s 8972 -d
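
The -s 8972 in that command is the ICMP payload size for a 9000-byte MTU: 9000 minus a 20-byte IPv4 header and an 8-byte ICMP header. If the TEP network runs a different MTU (worth confirming with whoever manages the underlay), the size changes accordingly. A small sketch of that arithmetic, where the function names and the 192.0.2.10 address are placeholders for illustration:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} leaves no room for an ICMP payload")
    return mtu - overhead


def vmkping_command(dest_tep_ip: str, mtu: int, netstack: str = "vxlan") -> str:
    """Build the vmkping command line quoted above for a given MTU."""
    size = icmp_payload_for_mtu(mtu)
    # -d is kept from the original command so oversized packets fail
    # instead of being fragmented.
    return f"vmkping ++netstack={netstack} {dest_tep_ip} -s {size} -d"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for <destTEPIP>.
    print(vmkping_command("192.0.2.10", 9000))
    print(vmkping_command("192.0.2.10", 1600))
```

If the ping only succeeds with a smaller size than the MTU should allow, that would suggest an MTU mismatch somewhere on the path between the TEPs.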

Achmo
Contributor

I checked the tunnels and they all show as UP on the NSX-T monitoring page.
Unfortunately, I do not have access to any console/SSH session, so all the troubleshooting and testing I can do is via the manager GUI and the console of the VM itself.

shank89
Expert

It might be helpful to contact someone who can check things in the CLI for you as well.

SrVMoussa
VMware Employee

Hi, 

 

Which firewall rules are applied to the VM? Also, you have configured your T1s to advertise their routes to their respective T0s, correct?

 

 

Regards,
Khalid Moussa
Gregor2
Contributor

Did you find a fix for this issue? I have the same problem.

 

Achmo
Contributor

Hey @SrVMoussa, the firewall rule applied on every T1 and on the T0 allows all traffic from any to any, because so far this is a staging environment.
And yes, we have enabled everything under route re-distribution on the T1.

@Gregor2, nothing so far; we are now waiting for further checks from the provider's side.

SrVMoussa
VMware Employee

Hi, 

 

I am sorry I didn't follow up on this one. I assume you have resolved it by now.

If you did, please let us know; if not, we can keep working through it in this thread.

Regards,
Khalid Moussa
zlq5001
Contributor

Was there a solution to this issue? I am facing the same problem.
