Hi
I'm trying to set up a nested (on vSphere) NSX-T 3.1 environment. Everything worked fine until I tried to connect the VMs in the segments to each other via a T1 gateway. It is not possible to ping the different VMs over the T1 gateway. The ESXi hosts, the NSX Manager, and all TEPs are reachable and working. The T1 gateway is set to advertise all static routes and all connected segments & service ports. The segments are configured to use the T1 gateway. I made no firewall configuration; everything is as it was after installation.
For a better understanding of the setup, I attached a picture of it.
web01 and web02 are in the same segment (I can ping web01 from web02, but not in the other direction), and I cannot ping app01 or db01 from any system. If I connect web01 and web02 to a normal virtual portgroup (not NSX-managed), I can ping from both sides.
An IP pool for the TEPs is also configured (192.168.10.200-240).
I'm new to the NSX field and appreciate any help.
What are the IPs of the VMs and their default gateway?
Looking carefully at everything you sent, it seems that the VMs are using .1 or .2 in the last octet and are configured with .254 as their default gateway. If this is correct, you have a problem with your Tier-1s, as they are also configured with .1 instead of .254.
The drawing you sent in the initial post also shows that the default gateway is .254.
If my findings are correct, you should change the gateway IP under each segment to .254, as this is the IP of the Tier-1 interface attached to that segment.
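To illustrate the mismatch (the subnet used here is hypothetical; only the .1 vs .254 pattern comes from this thread), a quick sanity check in Python:

```python
import ipaddress

def gateway_matches(segment_gateway_cidr: str, vm_default_gateway: str) -> bool:
    """Return True if the segment's configured gateway IP equals the
    default gateway the VMs are actually pointing at."""
    segment_gw = ipaddress.ip_interface(segment_gateway_cidr).ip
    return segment_gw == ipaddress.ip_address(vm_default_gateway)

# Hypothetical values matching the situation described above:
# the Tier-1 downlink is configured as .1, but the VMs point at .254.
print(gateway_matches("172.16.10.1/24", "172.16.10.254"))    # False - mismatch
print(gateway_matches("172.16.10.254/24", "172.16.10.254"))  # True - after the fix
```

With the gateway on .1 and the VMs pointing at .254, traffic never reaches the Tier-1 downlink, which matches the "can't ping anything outside the segment" symptom.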
Make sure you can ping with 1600 MTU (if you use the default MTU) between all your Host TEPs, all your Edge TEPs, and between your Host and your Edge TEPs.
Check this out: https://spillthensxt.com/how-to-validate-mtu-in-an-nsx-t-environment/
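The payload size for those test pings comes from subtracting the standard IPv4 and ICMP headers from the overlay MTU; nothing here is assumed beyond the 1600 MTU mentioned above:

```python
# Geneve overlay transport needs at least 1600 bytes MTU end to end.
# vmkping sends ICMP echoes, so the payload size passed via -s must
# leave room for the headers inside that MTU.
MTU = 1600
IP_HEADER = 20   # IPv4 header without options
ICMP_HEADER = 8  # ICMP echo header

payload = MTU - IP_HEADER - ICMP_HEADER
print(payload)  # 1572 -> e.g. vmkping ++netstack=vxlan <dstTEPIP> -s 1572 -d
```

The -d flag sets the don't-fragment bit, so the ping only succeeds if the full 1600-byte packet actually fits the path MTU.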
Can the VMs ping their default gateways? If you put all VMs on the same host, are you still unable to ping between VMs?
Also, because this is a nested environment, you may need to look into enabling promiscuous mode and forged transmits on the portgroups.
That is generally required in a nested environment.
Hi Nils
Thanks for your help. Indeed, I forgot to set the vSwitches and VMkernel ports of the ESXi hosts to an MTU of 1600. I recreated the environment from scratch, but I still have the connection issues between the VMs.
If web01 and web02 are on the same host, they can ping each other - but not the VMs in the other subnets.
Hi
Thanks for your post. When I put all VMs on the same host, web01 and web02 in the same segment can ping each other, but not the VMs in the other segments. No, they cannot ping the default gateways.
Any idea where I can look for the problem?
Hi Shashank
Promiscuous mode and forged transmits are enabled on the portgroup where the virtual ESXi hosts are connected.
Thanks, regards
Patrick
Can you send a screen shot of the configuration of these segments and of the Tier-1 Gateway they are attached to? You might be missing something there.
Those screenshots seem fine.
Try some troubleshooting commands on your ESXi host. SSH to the ESXi host where the VMs are and type nsxcli.
You will enter NSX CLI mode where you can try commands like:
get logical-switches
get logical-switches [VNI] arp-table
get logical-routers
get logical-routers [UUID] interfaces
get logical-routers [UUID] forwarding
Take a look and see if all tables are being built correctly. Is your host preparation all green with no alarms in System > Fabric > Nodes?
What ESXi and NSX versions are you using? With your configuration you should at least be able to ping the default gateways.
Can you ping with 1600 MTU between your Host TEPs and your Edge TEPs?
There is no edge here, as his Tier-1 Gateway screenshot showed. He can't even ping the default gateway which is a DR inside the ESXi host.
Hi Mauricio
Thank you for your input. In my eyes everything looks as it should - see the attachment.
Could it be a problem with the nested environment? I also made a sketch of how the setup looks. The vSwitch where all the virtual ESXi hosts are connected has no uplink into a physical environment, because I don't want to affect the rest of the network - could that be a problem?
Regards, Patrick/Chnoili
You should be seeing ARP entries for the VMs connected to the NSX overlay segments. If you go to Networking > Segments and click on the number under the Ports column, do you see your VMs?
Have you specifically tested vmkping ++netstack=vxlan <dstTEPIP> -s 1572 -d between each host?
I know a link was suggested, but I don't recall seeing that you had successfully done this.
Tunnels showing "Not available" indicates you have no VMs attached to the segments, but it looks like you verified that earlier.
Hi Nils
I noticed this too, but the VMs are connected to the segment. I see the VMs configured on the ESXi side on the "portgroup", and I see the VMs in the NSX Manager.
Is it a problem that the management IPs of the ESXi hosts and the IP pool for the TEPs are in the same IP range?
Example:
Host ip: 192.168.10.101
TEP ip: 192.168.10.200
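Whatever the answer on sharing the subnet, the addresses above at least don't collide; a quick check with the values from this thread (pool boundaries taken from the first post):

```python
import ipaddress

host_ip = ipaddress.ip_address("192.168.10.101")      # ESXi management IP
pool_start = ipaddress.ip_address("192.168.10.200")   # TEP pool start
pool_end = ipaddress.ip_address("192.168.10.240")     # TEP pool end

# True would mean the management IP sits inside the TEP pool range.
overlaps = pool_start <= host_ip <= pool_end
print(overlaps)  # False - no direct address conflict
```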
Regards, Patrick