I am working on an NSX-T lab. I am having trouble: I cannot ping the vmkernel IPs of the TEPs (even from the host itself; a host can't ping its own TEP). The IPs are assigned via an IP pool. When I run "esxcli network ip interface ipv4 get" I can see the assigned IP on vmk10, but the gateway shows 0.0.0.0. Shouldn't the gateway set in the IP pool be configured there?
Here is the IP Pool:
Transport Node Profile, Overlay. I assigned a physical nic vmnic1.
I am out of ideas. Any hints to get me moving again?
What you observe is expected. The TEP vmkernel interfaces live in a separate TCP/IP stack, so a local ping will not work; the traffic has to go out and route back in.
Can you try to ping the TEP gateway IP from the host CLI: vmkping ++netstack=vxlan -I vmk10 10.1.1.1
Check if your vmnic1 is physically UP.
If this doesn't work, the most likely cause is a TEP VLAN misconfiguration. Check the TEP VLAN configured in your uplink profile and make sure the same VLAN is trunked on the physical infrastructure.
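If it helps, here is a quick sketch of those checks from the ESXi shell (vmk10, vmnic1, and the 10.1.1.1 gateway come from this thread; the dry-run stubs at the top are only there so the snippet can be pasted and run outside an ESXi shell):

```shell
#!/bin/sh
# Dry-run stubs so this sketch can also run outside an ESXi shell.
if ! command -v esxcli >/dev/null 2>&1; then
  esxcli() { echo "[dry-run] esxcli $*"; }
fi
if ! command -v vmkping >/dev/null 2>&1; then
  vmkping() { echo "[dry-run] vmkping $*"; }
fi

# 1. Is the TEP uplink physically up?
esxcli network nic list

# 2. Does the vxlan netstack exist, and does vmk10 show the pool IP/gateway?
esxcli network ip netstack list
esxcli network ip interface ipv4 get -N vxlan

# 3. Ping the TEP gateway from the vxlan stack (10.1.1.1 = the pool gateway here).
vmkping ++netstack=vxlan -I vmk10 10.1.1.1
```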
I have built this out in VMware Workstation and I just cannot find where my issue is. I am unable to get VMs to communicate across the overlay network, and I have looked everything over. Since it is Workstation, I am not using any VLANs/trunks. Here is a simplified version: the vCenter and NSX Manager VMs run on Workstation directly, plugged into the "management segment".
Both ESXi hosts' management vmkernel interfaces are in the management segment in Workstation.
Both ESXi hosts' TEP physical NICs are vmnic1, plugged into the "Underlay Segment" in Workstation.
In NSX Manager, I am using the default single-NIC uplink profile.
I have posted my transport node profiles above.
Am I missing something? I have confirmed communication across all segments (host to host, host to router, etc.). Everything responds.
I recall this being a defect in VMware Workstation: the MTU setting is broken. If that's the case, the ping isn't working because there isn't enough room in the frame for the GENEVE headers. Not 100% sure about that, but nested labs like these are always problematic. It'd be far better if you could model this on vSphere.
It will work even with a lower MTU, as long as the actual unencapsulated frame is under about 1400 bytes.
A normal ping packet is 50-100 bytes; with the Geneve header added it will be around 200 bytes, so we should not see an MTU problem with a plain ping in this scenario.
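For reference, here is the arithmetic behind those numbers as a sketch (these are the standard header sizes for a base Geneve header with no options; real tunnels may add option TLVs on top):

```shell
# Per-packet overhead Geneve adds on the wire (base header, no options):
OUTER_ETH=14; OUTER_IP=20; OUTER_UDP=8; GENEVE_HDR=8
OVERHEAD=$((OUTER_ETH + OUTER_IP + OUTER_UDP + GENEVE_HDR))   # 50 bytes
echo "encapsulation overhead: ${OVERHEAD} bytes"
# A ping with a 100-byte payload: 14 (Eth) + 20 (IP) + 8 (ICMP) + 100 = inner frame
echo "100-byte ping encapsulated: $((14 + 20 + 8 + 100 + OVERHEAD)) bytes"
# A full 1500-byte inner frame needs this much outer MTU (hence the 1600 guidance):
echo "1500-byte inner frame on the wire: $((1500 + OVERHEAD)) bytes"
```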
It was *said* to resolve the issue, but I heard it didn't. The test to do here is to ping with an explicit packet size, starting low and walking up. Bottom line: regardless of what Workstation says, if you cannot ping between TEPs with a 1572-byte payload, your tunnels are not going to come up and it won't work. Easy way to check: vmkping -S vxlan <TEP> -d -s 1572 -c 10
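As a sketch, the walk-up test could look like this (the TEP address is a placeholder; vmk10 and the vxlan stack are from earlier in the thread, and a dry-run stub is included so the snippet runs outside ESXi). Note that 1572 bytes of payload plus 28 bytes of ICMP/IP headers is exactly a 1600-byte packet, matching the uplink profile MTU:

```shell
#!/bin/sh
# Dry-run stub so the sketch also runs outside an ESXi shell.
if ! command -v vmkping >/dev/null 2>&1; then
  vmkping() { echo "[dry-run] vmkping $*"; }
fi

TEP=192.168.100.12   # placeholder: the remote host's TEP IP
# -d = don't fragment; -s = ICMP payload size. Payload + 28 bytes of
# ICMP/IP headers gives the on-wire packet size, so -s 1572 tests a 1600 MTU.
for SZ in 1000 1200 1400 1472 1572; do
  echo "== payload ${SZ} (packet $((SZ + 28)) bytes) =="
  vmkping ++netstack=vxlan -I vmk10 -d -s "${SZ}" -c 3 "${TEP}"
done
```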
I am unable to ping between the TEP interfaces, or from the router to a TEP interface. I don't think it is MTU-related at this point. Here is my uplink profile: default configuration from the install.
This uplink profile says you're not tagging a VLAN for that traffic. The profile defines an MTU of 1600, but the underlying network has to support that as well. Focus on testing between the TEPs on the ESXi hosts at this point (this uplink profile doesn't apply to that test anyway). As I said, if you cannot ping between the TEPs on the ESXi hosts using the command I gave you, nothing is going to work. So don't pass go and don't collect $200 until that works.
Well, the NSX-T configuration looks perfect.
I don't have Workstation experience, but I am assuming that for the overlay network (which is on the default VLAN) you are connecting ESXi-1's vmnic1 and ESXi-2's vmnic1 via some network in Workstation.
Thanks for the input. No VLAN tagging is used since the connection is not coming in on a trunk port, so it is equivalent to a default access port on VLAN 1. As a next step I am going to remove the NSX install from these hosts, set up a distributed switch between the hosts on the network segment, uplinked to the same pNICs I am currently using for NSX, and see if two VMs can communicate across it. Basically, verify the network.
That'd be a good place to start. If you can ping across there, increase the MTU on the vDS to 1600 and try a ping with a larger packet size. Your goal is to reach at least 1572 bytes, which leaves enough room for the encapsulation overhead.
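To confirm what the host side actually sees during that test, a quick sketch from the ESXi shell (the dry-run stub just lets the snippet run outside ESXi; vmnic1 is the uplink from this thread):

```shell
#!/bin/sh
# Dry-run stub so this sketch also runs outside an ESXi shell.
if ! command -v esxcli >/dev/null 2>&1; then
  esxcli() { echo "[dry-run] esxcli $*"; }
fi

# MTU on the distributed switch as the host sees it:
esxcli network vswitch dvs vmware list
# Link state and MTU on the physical uplink (vmnic1 in this lab):
esxcli network nic list
```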
I did this: set up two Windows VMs, one on each host, using a distributed switch and the same NIC I had configured in NSX, and enabled jumbo frames in Windows. I can do a standard ping between the VMs, but cannot ping between them with a 1572-byte packet size. I need to dig into Workstation.
Hold on! I forgot to increase the MTU on the distributed switch in my test. With that fixed, I can ping with an 8000-byte packet size, so it does appear I am getting jumbo frames between the host servers.
So I was able to prove communication between the two ESXi hosts over the physical NICs using jumbo frames. But I just noticed, buried in the status, that the GENEVE tunnel is down. Are there any logs I can look at to determine why it is down? I have a running VM on each host on the segment, trying to communicate with the other; the tunnel should be up.
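Here is a sketch of where to look from the ESXi shell (the log file names are assumptions that vary between NSX-T builds, so browse /var/log on your hosts; the dry-run stub just lets the snippet run outside ESXi):

```shell
#!/bin/sh
# Dry-run stub so this sketch also runs outside an ESXi shell.
if ! command -v esxcli >/dev/null 2>&1; then
  esxcli() { echo "[dry-run] esxcli $*"; }
fi

# Geneve runs over UDP 6081; see whether the host has the tunnel port open.
esxcli network ip connection list | grep 6081 || echo "no 6081 entries listed"

# Host-side NSX logs. The exact file name varies by NSX-T build, so treat
# these paths as assumptions and check /var/log on your hosts:
for f in /var/log/nsx-syslog.log /var/log/nsx-syslog; do
  [ -f "$f" ] && tail -n 50 "$f" || true
done
```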