I'm doing my first implementation of NSX-T and have an issue with the Tier-0 and Tier-1 Gateways that I think is caused by a mistake of mine.
I deployed a Tier-0 Gateway with a single uplink to a physical core switch, using a VLAN segment (192.168.16.0/24) as the uplink. For its downlinks it has two LIFs on overlay segments (10.10.101.0/24 and 192.168.17.0/24) and one more interface connected to a Tier-1 Gateway (the auto-plumbed NSX network).
The Tier-1 Gateway has the auto-plumbed network as its uplink and two overlay segments (10.10.102.0/24 and 10.10.103.0/24) as its downlinks.
The problem is that each gateway routes traffic between its own downlinks without issue, but nothing works across the two gateways. That is, hosts on 192.168.17.0/24 can ping hosts on 10.10.101.0/24 and also the auto-plumbed network (100.64.112.0/31). The same happens on the Tier-1 Gateway: hosts on 10.10.102.0/24 can ping hosts on 10.10.103.0/24. But no one can ping hosts or gateway interfaces when the traffic has to pass from one gateway to the other. For example, hosts on overlay segments connected to the Tier-1 Gateway cannot ping hosts on the Tier-0 Gateway's overlay segments. They cannot ping the inter-gateway interfaces either.
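To restate the layout and the symptom compactly, here is the same topology as a small Python sketch. The labels ("tier0", "tier1", "downlinks", and the function names) are my own and are not NSX object names. The sketch only records which subnet hangs off which gateway and whether a ping would have to cross the inter-gateway link. It does not model how NSX-T actually forwards traffic.

```python
import ipaddress

# Subnets exactly as described above; the keys are illustrative labels only.
TOPOLOGY = {
    "tier0": {
        "uplink": ["192.168.16.0/24"],
        "downlinks": ["10.10.101.0/24", "192.168.17.0/24"],
    },
    "tier1": {
        "downlinks": ["10.10.102.0/24", "10.10.103.0/24"],
    },
}

# Auto-plumbed link between the Tier-0 and Tier-1 Gateways.
TRANSIT = "100.64.112.0/31"


def owner_of(ip):
    """Return (gateway, role, subnet) for the segment containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for gateway, links in TOPOLOGY.items():
        for role, subnets in links.items():
            for subnet in subnets:
                if addr in ipaddress.ip_network(subnet):
                    return gateway, role, subnet
    return None


def crosses_gateways(src, dst):
    """True when src and dst are attached to different gateways."""
    src_owner = owner_of(src)
    dst_owner = owner_of(dst)
    if src_owner is None or dst_owner is None:
        raise ValueError("address is not on any segment in TOPOLOGY")
    return src_owner[0] != dst_owner[0]


if __name__ == "__main__":
    # Host addresses below are made up; only the subnets come from my setup.
    for src, dst in [
        ("10.10.102.10", "10.10.103.10"),  # both on Tier-1: ping works
        ("192.168.17.10", "10.10.101.10"),  # both on Tier-0: ping works
        ("10.10.102.10", "10.10.101.10"),  # Tier-1 to Tier-0: ping fails
    ]:
        print(src, "->", dst, "crosses gateways:", crosses_gateways(src, dst))
```

In my environment, every pair for which crosses_gateways returns True is a pair that cannot ping.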
My question is: could this be happening because each gateway is not a single device but actually two components, the DR and the SR, and the DRs don't know how to reach their other half in the SR?
One more thing: I'm using neither BGP nor static routes. What I have observed is that when I choose to advertise the connected subnets on the Tier-1 Gateway, I can see those subnets as "t1c" (Tier-1 connected) in the output of the "get routes" command in the VRF CLI of the Tier-0 Gateway SR. But if I ping those LIFs on the Tier-1 DR from that same CLI on the Tier-0 SR, I cannot reach them.
It seems that there is no connectivity between the DR and SR components, even though there is a subnet for it (169.254.0.0/24).
The other problem is that I don't know what the Tier-1 Gateway's routing table looks like, because the "get router" command is only available on the Tier-0 Gateway and not on the Tier-1 one. This might be a conceptual error on my part, but I don't understand why that is.
All the documents and videos I have read and watched show this type of implementation being done with BGP enabled, which perhaps solves all the issues I'm having. But because the customer doesn't currently have a core that supports BGP, I preferred not to use a dynamic routing protocol. If the problem is that I'm not using BGP, I could enable it just for the internal virtual networking without advertising any routes to the physical network. That is, I could use BGP for NSX networking and static routes between the Edge and the physical network (I know it is not optimal). For the moment, production will have only one or two NSX segments, which is manageable through static routes. In the future the customer will replace the core with a BGP-capable one.
I'd appreciate it if you could help me understand why this is happening.
I attach a network diagram below.
Thanks in advance.