I have deployed NSX in a lab environment. The problem is that a VM cannot reach the ESGW or the outside network. The VM can ping the DLR, and the ESGW can ping the DLR on the transit link, but the VM's traffic does not pass through the ESGW. I verified that the VXLAN VTEPs on each host can ping each other with a packet size >1600. I have done some troubleshooting but I am still stuck, so any tips or help would be appreciated. The uplinks are in a LAG.
The layout of the SDDC looks like this:
VM (172.16.100.1/26 GW: 172.16.100.62)
DLR (INTERNAL: 172.16.100.62/26, UPLINK: 172.16.100.193/28)
ESGW (INTERNAL: 172.16.100.194/28, UPLINK: 172.16.100.225/28 with Default GW: 172.16.100.226)
A ping from the VM (172.16.100.1) to 172.16.100.194 or 172.16.100.226 either times out or returns destination host unreachable.
Note: The firewall is disabled on the DLR and the ESGW, and the traffic is allowed on the DFW. Also, in a traceroute from 172.16.100.1 to 172.16.100.194 the path is green.
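As a quick sanity check of the addressing in the layout above, here is a small Python sketch using the standard `ipaddress` module. It is not NSX tooling, and the variable names are only illustrative; it simply derives the subnets from the interface addresses listed in the post.

```python
import ipaddress

# Interface addresses copied from the layout in the post.
vm = ipaddress.ip_interface("172.16.100.1/26")
dlr_internal = ipaddress.ip_interface("172.16.100.62/26")
dlr_uplink = ipaddress.ip_interface("172.16.100.193/28")
esg_internal = ipaddress.ip_interface("172.16.100.194/28")
esg_uplink = ipaddress.ip_interface("172.16.100.225/28")
esg_default_gw = ipaddress.ip_address("172.16.100.226")

print(vm.network)          # 172.16.100.0/26   (VM segment)
print(dlr_uplink.network)  # 172.16.100.192/28 (transit link)
print(esg_uplink.network)  # 172.16.100.224/28 (ESGW uplink)

# The VM and its gateway share a segment, as do both ends of the transit link.
print(vm.network == dlr_internal.network)          # True
print(dlr_uplink.network == esg_internal.network)  # True
# The VM's address is not on any network the ESGW is directly attached to.
print(vm.ip in esg_internal.network or vm.ip in esg_uplink.network)  # False
```

So the addressing itself is consistent; the VM segment is simply not directly connected to the ESGW.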
If you place the VM and the edge on the same host, are they able to reach each other? I would also recommend removing the LAG; that is not a supported design.
There are a few tests you can do to isolate the issue:
1. Move the VM to the EDGE host and test the ping.
2. Put the VM in the exclusion list and test.
3. Move the DLR-VM and the EDGE-VM to a new host and test the connectivity.
3a. Follow step 1.
I was wondering whether you have configured a static route or a default gateway on the DLR. It was mentioned that a default gateway is configured on the ESG and that it worked fine when the VM was connected directly to the ESG, so it is worth taking a look at the DLR for static routes or a default gateway.
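aggarwalvinay31 already gave you a hint. Since you're using a default gateway on the DLR, packets from your VM can reach the DLR and then the ESG, but you don't have a return route. The ESG is only directly connected to 172.16.100.192/28 and 172.16.100.224/28, so its replies to 172.16.100.1 follow its own default gateway (172.16.100.226) instead of going back to the DLR. You can configure a static route on the ESG that sets the next hop for 172.16.100.0/26 to 172.16.100.193.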
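To illustrate why the return path fails, here is a minimal Python sketch of a longest-prefix-match lookup over the ESG's routes as described in this thread. It only models the routing decision and is not NSX code; the `lookup` and `table` helpers and the route labels are made up for the example.

```python
import ipaddress

def table(entries):
    """Turn (prefix, next_hop) string pairs into (IPv4Network, next_hop) pairs."""
    return [(ipaddress.ip_network(prefix), hop) for prefix, hop in entries]

def lookup(routes, dst):
    """Return the (network, next_hop) entry with the longest prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)

# ESG routes from the thread: two connected /28s plus the default gateway.
esg_before = table([
    ("172.16.100.192/28", "connected (internal)"),
    ("172.16.100.224/28", "connected (uplink)"),
    ("0.0.0.0/0", "172.16.100.226"),
])
# The same table with the suggested static route added.
esg_after = esg_before + table([("172.16.100.0/26", "172.16.100.193")])

vm = "172.16.100.1"
for name, routes in (("before", esg_before), ("after", esg_after)):
    net, hop = lookup(routes, vm)
    print(f"{name}: reply to {vm} uses {net} via {hop}")
# before: reply to 172.16.100.1 uses 0.0.0.0/0 via 172.16.100.226
# after: reply to 172.16.100.1 uses 172.16.100.0/26 via 172.16.100.193
```

Without the static route the only match for the VM's address is the default route, so the reply leaves through the uplink; with it, the more specific /26 wins and the reply goes back to the DLR.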
Ideally you'd want to configure dynamic routing between the DLR and the ESG, using something like BGP.