Unable to get VMs communicating in NSX environment

I'm trying to get a VM on one host to ping a VM on another host across a VXLAN tunnel.

VM u1 is on one host; VM knoppix2 is on another host.

I ping from one VM to the other, but don't even get an ARP response.




Anything else I can provide?

Any suggestions are welcome.


Did you have VXLAN working previously?

There's an exclamation mark on the host; you may want to check that too.

You can perform these checks and see if any of them fail.

I think you should be able to troubleshoot further based on these checks.

Check the NSX Dashboard to see if there are any issues with host preparation or logical switch status


Check the communication channel health of both ESXi hosts



Do a logical switch ping test from the UI using both the minimum and the VXLAN standard packet sizes


Do a vmkping test between the VTEPs from the ESXi host CLI

     vmkping ++netstack=vxlan -d -s <packet size> <remote VTEP vmknic IP>

See this KB: Testing VMkernel network connectivity with the vmkping command (1003728) | VMware KB


To validate this, ping using an MTU smaller than 1500, e.g. 1470, then try again using an MTU higher than 1500, e.g. 1570.

If the ping works with the smaller size (1470) but not with 1570, then you have an MTU issue on your physical switch.
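As a sketch, the two probe sizes above translate into vmkping payload sizes like this. Note that `vmk3` and `10.100.28.55` are hypothetical placeholders (not from the original post); substitute your own VTEP vmknic and the remote host's VTEP IP. The snippet only prints the commands to run on the ESXi host shell:

```shell
# ICMP payload size = target MTU - 20 (IP header) - 8 (ICMP header)
# -d sets the don't-fragment bit so an undersized physical MTU fails loudly.
for MTU in 1470 1570; do
  PAYLOAD=$((MTU - 28))
  echo "vmkping ++netstack=vxlan -I vmk3 -d -s $PAYLOAD 10.100.28.55"
done
```

If the 1442-byte payload (MTU 1470) succeeds but the 1542-byte payload (MTU 1570) fails, the physical path is not carrying jumbo-enough frames for VXLAN.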

Do a Traceflow


Bayu Wibowo | VCIX6-DCV/NV
Author of VMware NSX Cookbook | twitter @bayupw

You can automate the validation of your environment explained by Bayu using the healthchecks in NSX-PowerOps

Sounds like you need to do VTEP-to-VTEP tests.


One thing I don't see in all of this: is the problem purely within NSX, or are you having problems at the physical layer?

One thing to make sure of is that the 10.100.28.x subnet is not on the native VLAN; from the looks of your NICs, they all ride over one network.

From the pictures, everything is in the 10.100.28.x subnet, so one thing to check is the physical switch. Are the links from the physical switch to the vmnics set up as access ports or as trunk uplinks, and what is the native VLAN relative to the 10.100.28.x subnet? If the 10.100.28.x subnet is on the native VLAN, try changing the native VLAN to something like 4000. If it's not on the native VLAN, make sure you have encapsulation enabled on the uplink, since it should be a trunk. It's also worth asking: are you using unicast, hybrid, or multicast replication mode?

Is your 10.100.28.x range super-subnetted?
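If you want to check the VLAN side from the ESXi hosts themselves, a couple of standard ESXi CLI commands can help. This is a sketch that just prints the commands to run on each host's shell; the output fields will vary with your setup:

```shell
# Print the ESXi shell commands for checking VLAN configuration on each host.
echo "esxcli network vswitch standard portgroup list    # VLAN ID per portgroup"
echo "vim-cmd hostsvc/net/query_networkhint             # CDP/LLDP hints seen from the physical switch"
```

The portgroup listing shows which VLAN each portgroup tags (0 means untagged/native), and the network hint output can reveal what the upstream physical switch is actually presenting.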

Just some observations.

VCP5/6-DCV, VCP6-NV, vExpert 2015/2016/2017, A+, Net+, Sec +, Storage+, CCENT, ICM NSX 6.2, 70-410, 70-411

So I decided to start fresh, and it now works fine.

I had been using nested ESXi; now there's no nesting.

I think I'll stay clear of nested ESXi for a while.

Bayu - many thanks for the suggestions.
