Long post ahead. I'm trying to configure an NSX-T environment for the purpose of home learning. This is a nested environment.
I have a physical ESXi host, which hosts the nested ESXi VMs and the NSX Manager VM.
The three nested ESXi VMs (ESXi 7) are the transport nodes, and a pair of Edge VMs resides on that cluster.
The ESXi hosts and Edge VMs get their IP addresses from the DHCP pool and are on the same subnet.
I can successfully ping all TEP addresses using the VXLAN stack.
I have created a Tier 0 gateway, with a VRF, and a segment attached to the VRF.
I can ping all the way from my home laptop (outside the nested environment) down to the segment level. I have a VM attached to the segment, but for the life of me I just cannot ping this VM. From the VM, I also cannot ping past the segment. (Yes, I checked Windows Firewall, etc.)
All MTUs are set to 9000.
Promiscuous mode, forged transmits, etc. are all enabled on the vDS.
BGP is configured on pfSense and on the T0 and VRF interfaces.
Whenever a VM that is connected to the segment is powered on, the ESXi host shows a 'node status' of down for some reason. The tunnels also show as down.
When I select Edge Transport Nodes, they show as degraded while the VM is on.
On the ESXi hosts, there is a vDS with 2 x trunk port groups (0-4094) for overlay traffic that the Edge nodes are connected to. (I have also tried connecting the Edge nodes to 2 x trunk segments instead, but this doesn't work either.)
Any help in being able to ping that VM would be much appreciated!
Some pics of my config:
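Try setting your nested VDS MTU lower than 9000, for example 4000, then set your outer VDS to 9000. Set your NSX-T MTU to 1700. Try pinging between all your TEPs (hosts and Edges) with MTU 1700. Note that -s is the ICMP payload size, so with -d (don't fragment) use 1700 minus 28 bytes of IP and ICMP headers:
vmkping -S vxlan -s 1672 -d <dst IP>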
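For any MTU you want to test this way, the -s value is the MTU minus 28 bytes (a 20-byte IPv4 header without options plus an 8-byte ICMP header). Below is a minimal Python sketch of that arithmetic, plus a sanity check that the MTUs get larger as you go outward (NSX-T, then nested VDS, then outer VDS). The function names are made up for illustration, and the 1700/4000/9000 values are just the ones suggested above.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo")
    return payload


def mtu_layers_ok(nsx_mtu: int, nested_vds_mtu: int, outer_vds_mtu: int) -> bool:
    """True if each outer layer can carry the layer inside it."""
    return nsx_mtu <= nested_vds_mtu <= outer_vds_mtu


if __name__ == "__main__":
    # Values from the suggestion above; adjust to your own lab.
    print("layers ok:", mtu_layers_ok(1700, 4000, 9000))
    for mtu in (1600, 1700, 9000):
        size = icmp_payload_for_mtu(mtu)
        print(f"MTU {mtu}: vmkping -S vxlan -s {size} -d <dst IP>")
```

If a ping at the computed size fails with -d but a smaller one works, something in the path has a lower MTU than you think.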
ESX host TEP interfaces can ping each other. Should the ESX host TEPs be able to ping the Edge TEPs if they are on a different subnet? The only way I can get ping working from ESX to Edge TEPs is if they are on the same subnet.
Yes, they should be able to ping each other with at least 1600 MTU, depending on what you have set as MTU in NSX-T.
If they are on different subnets you need to route between them on an external router.
If they are on the same subnet, you need to configure according to this KB:
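To work out which of the two cases applies to a given host TEP and Edge TEP, you can compare their networks. Here is a small Python sketch using the standard ipaddress module. The addresses and /24 prefixes are hypothetical placeholders, not taken from the screenshots.

```python
import ipaddress


def same_subnet(tep_a: str, tep_b: str) -> bool:
    """Compare two TEPs given as 'address/prefix', e.g. '172.16.10.11/24'."""
    net_a = ipaddress.ip_interface(tep_a).network
    net_b = ipaddress.ip_interface(tep_b).network
    return net_a == net_b


if __name__ == "__main__":
    host_tep = "172.16.10.11/24"  # hypothetical ESXi host TEP
    edge_tep = "172.16.20.21/24"  # hypothetical Edge TEP
    if same_subnet(host_tep, edge_tep):
        print("Same subnet: no external routing between the TEPs")
    else:
        print("Different subnets: an external router must route between them")
```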
How is CPU/memory usage on the physical host? Is it very high? I can vaguely see an Edge NIC "Receive out of buffer" alarm in the 3rd screenshot, which usually occurs when there is very high CPU usage on the Edge nodes.