abhisheksha
Enthusiast

Cannot ping VMs on different ESXi hosts

Hi,

I'm running a nested NSX homelab on vCloud Director. I've completed the installation and host preparation, as you'll see from the images below. I'm trying to ping between two VMs that are connected to the same VNI but sit on different ESXi hosts. The ping fails when they're on different hosts; when they're on the same host, the ping works, which suggests something is wrong with the VTEPs. Can you please have a look at the config below and let me know what I'm doing wrong?

Many thanks for your help!

All the ESXi hosts are connected to MGMT_NW on VCD.

Screen Shot 2018-02-12 at 1.10.08 AM.png

Settings Of MGMT_NW

Screen Shot 2018-02-12 at 1.10.41 AM.png

All NSX controllers are seeing each other here.

Screen Shot 2018-02-12 at 1.18.33 AM.png

Host Preparation is complete.

Screen Shot 2018-02-12 at 1.18.41 AM.png

TRANSPORT VXLAN VLAN is set to 0.

Screen Shot 2018-02-12 at 1.18.50 AM.png

VMs M1 and M2 are connected to the same VNI.

Screen Shot 2018-02-12 at 1.19.34 AM.png

The logical switch (VNI) replication mode is set to Unicast.

Screen Shot 2018-02-12 at 1.20.35 AM.png

8 Replies
bayupw
Leadership

Make sure to enable promiscuous mode on the portgroups; see these two blog posts:

Why is Promiscuous Mode & Forged Transmits required for Nested ESXi? https://www.virtuallyghetto.com/2013/11/why-is-promiscuous-mode-forged.html

How To Enable Nested ESXi Using VXLAN In vSphere & vCloud Director

If you are still experiencing the issue, you might have a similar situation to the one explained in this blog post: NSX and nested ESXi environments: caveats and layer-2 troubleshooting – vLenzker

Bayu Wibowo | VCIX6-DCV/NV Author of VMware NSX Cookbook http://bit.ly/NSXCookbook https://github.com/bayupw/PowerNSX-Scripts https://nz.linkedin.com/in/bayupw | twitter @bayupw
abhisheksha
Enthusiast

Hi bayupw,

Thank you for your reply. I think the settings you're talking about are already enabled, because when I connected the VMs to a non-NSX portgroup on the same DSwitch, the VMs were able to talk to each other even though they were on different ESXi hosts.

abhisheksha
Enthusiast

UPDATE - So, there was an issue with the default gateway. I rectified it, and now the VTEPs on the individual hosts are able to communicate with each other, but the VMs themselves are still unable to communicate. I'm stumped. :smileyconfused:

Hi,

I think the issue is that from any of the ESXi hosts, I'm not able to ping the other host's VTEP IP. The pings themselves are failing. What do you reckon?

I'm not even able to ping my own VTEP.
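For reference, here's roughly how I've been testing VTEP reachability from the ESXi shell (the vmk number and remote VTEP IP below are just examples from my lab; yours will differ):

```shell
# List the VMkernel interfaces on the VXLAN netstack to find the VTEP vmk
esxcli network ip interface list --netstack=vxlan

# Ping the remote VTEP over the vxlan netstack; -d sets "don't fragment"
# and -s 1572 (payload) + 28 bytes of ICMP/IP headers = 1600-byte packet,
# which tests that the underlay actually carries VXLAN-sized frames
vmkping ++netstack=vxlan -d -s 1572 -I vmk3 192.168.10.52
```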

Here are the settings from the portgroup that was created when preparing the host.

Screen Shot 2018-02-13 at 9.16.36 PM.png

Screen Shot 2018-02-13 at 9.17.02 PM.png

Screen Shot 2018-02-13 at 9.17.10 PM.png

Screen Shot 2018-02-13 at 9.17.15 PM.png

Screen Shot 2018-02-13 at 9.17.24 PM.png

I tried tweaking the security settings already, and I don't think it's a VCD underlay issue, as the underlay is already passing traffic on non-NSX portgroups between VMs on different hosts, even with packet sizes up to 1700 bytes.

Thank you,

Abhishek

abhisheksha
Enthusiast

UPDATE 2:

I have 2 clusters - Management and Compute. On the Management cluster, the communication channel health is showing as clear.

On the Compute cluster, I can see the below:

Screen Shot 2018-02-14 at 12.22.59 AM.png

Based on this, I moved M1 to Host1 of the Management cluster and M2 to Host2 of the Management cluster, and they were able to ping each other. As far as I understand, the L2 side is fine, but there's some network issue on the Compute cluster, as shown above.

Mid_Hudson_IT
Contributor

Can I ask your reasoning for setting the VLAN setting to none?

As for the physical L2 layer, is your management layer on the default VLAN 1?

If you have a physical layer, what is your native VLAN?

Are the links for the VXLAN going over the management NICs, or over their own vDS switch and vmnics?

VCP5/6-DCV, VCP6-NV, vExpert 2015/2016/2017, A+, Net+, Sec +, Storage+, CCENT, ICM NSX 6.2, 70-410, 70-411
bayupw
Leadership

Hi, what did you tweak on the security settings?

As explained in the blog posts in my previous reply, in a nested environment you want to set Promiscuous Mode to Accept, not leave everything at Reject as in your screenshot.

Bayu Wibowo | VCIX6-DCV/NV Author of VMware NSX Cookbook http://bit.ly/NSXCookbook https://github.com/bayupw/PowerNSX-Scripts https://nz.linkedin.com/in/bayupw | twitter @bayupw
jimmyeys
Contributor

Please check the subnet mask on the VMkernel ports created by NSX, or check the MTU of the switch.
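To expand on the MTU point, here is a quick sanity check on the arithmetic (this assumes the usual ~50-byte VXLAN overhead; you can verify the vmk's actual MTU with `esxcli network ip interface list --netstack=vxlan`):

```shell
# VXLAN encapsulation adds roughly 50 bytes (outer Ethernet + IP + UDP +
# VXLAN headers) to every guest frame, so a 1500-byte inner MTU needs an
# underlay MTU of at least 1550; NSX configures 1600 by default for headroom.
inner_mtu=1500
vxlan_overhead=50
echo $((inner_mtu + vxlan_overhead))   # prints 1550
```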

sarvp
Contributor

I'm new, so please pardon me if someone has already mentioned this. Did you check the DFW rules? Maybe some rule is blocking this even though the VMs are on the same subnet, perhaps a deny-all rule if DFW is enabled.

In addition, are both clusters part of the same Transport zone?
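If it helps, one way to see which DFW rules are actually applied on the host is from the ESXi shell (the filter name below is only an example; use the one shown for your VM's vNIC in the summarize-dvfilter output):

```shell
# On the ESXi host, list the dvfilters attached to each VM vNIC
summarize-dvfilter

# Then dump the firewall rules applied to a given filter
# (replace the filter name with the one from summarize-dvfilter)
vsipioctl getrules -f nic-39377-eth0-vmware-sfw.2
```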

Thanks

Sarvjit
