This is a long shot, but I'm slowly running out of ideas. I have a cluster with NSX running on it. I have some segments whose routes are advertised over BGP to my router, a physical box running VyOS that connects to my home network; the default route goes out through my DrayTek router to the internet.
I can't curl files from the internet on VMs on these NSX segments, but I can ping anything on the internet.
I have VLANs on the same cluster (mgmt etc.) going out through the same VyOS router, so the same route, and they can get files just fine.
VMs on the NSX segments can curl files from physical machines on the home network, through VyOS, just fine.
Monitoring traffic on VyOS while I curl a file from the internet, I can see the reply packets coming back and leaving VyOS toward the NSX edge uplink, but they never reach the VM, as seen with tcpdump on the initiating VM. I am using jumbo frames and thought it might be an MTU problem, but the mgmt network uses the same jumbo frames and a VM on it can curl files fine (and I messed around changing the MTU back to 1500 all up and down the path, with no change at all). It's only through the NSX edge gateways that I have this problem, and only when traffic goes out through my DrayTek router (there is a Cisco 2950 switch between VyOS and the DrayTek; I have checked ARP on that and the learned addresses in the cache look good).
Basically, packets are coming back from the internet in response to curl but not getting to the VM, yet small packets such as ping obviously work.
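To put numbers on "small packets work": a default Linux ping carries 56 bytes of payload, so it is far smaller than the full-size TCP segments a download sends back. Below is a minimal sketch of that arithmetic, assuming IPv4 with no IP or TCP options. The function names are made up for illustration, the MTU values are examples rather than measurements from this lab, and the generated command assumes the Linux iputils `ping` (`-M do` sets Don't Fragment).

```python
# Packet-size arithmetic for comparing ping with a TCP download.
# Assumes IPv4 without options; all names here are illustrative.

IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8   # bytes, echo request/reply header
TCP_HEADER = 20   # bytes, no TCP options


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IP packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload per segment for the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def df_ping_command(target: str, mtu: int) -> list:
    """Build a Linux iputils ping command that sends full-size packets with DF set."""
    size = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", "3", "-s", str(size), target]


if __name__ == "__main__":
    default_ping_ip_size = 56 + ICMP_HEADER + IPV4_HEADER
    print("default ping IP packet size:", default_ping_ip_size)
    for mtu in (1500, 9000):
        print(mtu, "ICMP payload:", icmp_payload_for_mtu(mtu),
              "TCP MSS:", tcp_mss_for_mtu(mtu))
    # 192.0.2.1 is a documentation address; substitute a real internet host.
    print(" ".join(df_ping_command("192.0.2.1", 1500)))
```

Running the printed command from a VM on an NSX segment and again from a VM on a VLAN port group would show whether full-size packets behave differently on the two paths, which a default ping cannot reveal.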
I'm still poking and prodding but running out of ideas. I checked the outgoing interface on VyOS toward the VLAN that the NSX edge uplink is on, and the packets are leaving there. I'm not really sure how to diagnose further on the edge; I guess I will try looking at the logs on there next.
Any ideas on what else I can be looking at would certainly be appreciated! Thanks, Bill
1. Is this an NSX-V or NSX-T environment?
2. Are the Edges deployed in A-A or A-S?
3. Can you test with a VLAN-backed NSX network, connected directly to your VyOS, and perform a curl test?
4. When the machines are on the overlay network, are you seeing any drops reported on the ESXi host where the Edge VMs reside, or any drops on the Edge interface itself?
5. If I understand your description correctly, ICMP works fine. Is that true?
It's NSX-T 3.1.
It's a lab environment with only one edge. I haven't come across the terms A-A or A-S, or I'm too brain dead to know what they are at the moment.
I'm also not sure what you mean by a VLAN-backed NSX network. These VMs are on segments, and I haven't tried using a VLAN in the segment creation at this point. From the segments, I can curl files from the VyOS machine fine, and I can also curl large files from a Mac on the other side of VyOS on my home network, but not from anything on the internet out through the home router. On the same VDS as the host overlay and edge uplink port groups, I also have VLAN port groups going out through VyOS and the home router, and VMs on those can curl large files fine. That is how I finally got my LEMP stack installed on the VMs for demonstrating a 3-tier app with micro-segmentation.
I am looking at the edge logs and have not seen drops, but I only just started looking. I have not looked on the ESXi hosts yet; will do so.
Yes, ICMP works perfectly everywhere, and I tried mturoute, which is not reporting any fragmentation or any other problems.
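As a cross-check on the mturoute result, the largest unfragmented packet that gets through a path can be found with a simple binary search over payload sizes. This is only a sketch: `largest_passing_size` and `make_ping_probe` are hypothetical helper names, the search assumes that if one size passes then every smaller size passes too, and the probe assumes the Linux iputils `ping` (`-M do` for Don't Fragment, `-W` for the reply timeout in seconds).

```python
# Binary search for the largest ICMP payload that crosses a path with DF set.
# Helper names are illustrative; the real probe needs Linux iputils ping.
import subprocess
from typing import Callable, Optional


def largest_passing_size(probe: Callable[[int], bool],
                         low: int, high: int) -> Optional[int]:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes monotonic behaviour: if a size passes, all smaller sizes pass.
    Returns None when even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always terminates
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def make_ping_probe(target: str) -> Callable[[int], bool]:
    """Build a probe that sends one DF-set echo request of the given payload size."""
    def probe(size: int) -> bool:
        result = subprocess.run(
            ["ping", "-M", "do", "-c", "1", "-W", "1", "-s", str(size), target],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    return probe


if __name__ == "__main__":
    # Simulated path that passes payloads up to 1472 bytes (a 1500-byte MTU).
    simulated = lambda size: size <= 1472
    print(largest_passing_size(simulated, 0, 8972))
```

For a real run, `largest_passing_size(make_ping_probe(host), 0, 8972)` would be called with an actual internet host, once from a VM on an NSX segment and once from a VM on a VLAN port group. Adding 28 bytes (IPv4 plus ICMP headers) to the result gives the effective packet size for that path.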
I am doing some demos, and now that I have the software I need installed on the VMs, it is working well enough to demo for a couple of days. After that I will start to dig into this again.