Hello all,
I have some issues with a nested NSX lab. I've created 4 clusters:
1) management cluster - 3 hosts, not prepared for VXLAN
2) edge cluster - 2 hosts, prepared for VXLAN
3) compute cluster 01 - 2 hosts, prepared for VXLAN
4) compute cluster 02 - 2 hosts, prepared for VXLAN
The management cluster and the edge cluster each have a dedicated VDS (Mgmt_VDS and Edge_VDS respectively); the compute clusters share a single VDS (Compute_VDS).
A single transport zone has been created for the edge and compute clusters.
I'm able to ping all VTEPs between hosts (even with a 1570-byte packet size), but if I attach 2 test VMs to a logical switch, traffic between the VMs does not work.
Some additional info:
The VIB seems to be correctly installed:
esx-nsxv 6.5.0-0.0.7119877 VMware VMwareCertified 2017-12-06
The VTEP teaming policy is set to failover.
I've attached the output of some debug commands from the two hosts on which the VMs reside.
Thank you very much for your help,
Nicola
Hi, does the physical host/cluster hosting the nested ESXi hosts have VXLAN installed?
Just want to make sure that you are not facing this issue: From the dept of the knowledge arcane: NSX-v with nested ESXi | Telecom Occasionally
From the logs I can see that you captured the output from two hosts. The VTEP MAC table is not populated with the destination VM's MAC on either host, although the controllers do show the MACs:
172.18.0.102 00:50:56:af:dd:f0
172.18.0.101 00:50:56:af:d4:7a
show control-cluster logical-switches mac-table 15000
VNI MAC VTEP-IP Connection-ID
15000 00:50:56:af:b8:fd 172.28.30.1 3
15000 00:50:56:af:d4:7a 172.28.30.1 3
15000 00:50:56:af:dd:f0 172.28.30.2 13
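For anyone following along, here is a minimal Python sketch for reading that controller output. It is not an NSX tool: the function name `macs_by_vtep` is made up for illustration, and it only assumes the four-column layout pasted above. It groups the MACs by the VTEP IP the controller has registered for them, which makes it easy to see that the two test VMs (`...d4:7a` and `...dd:f0`) sit behind different VTEPs.

```python
from collections import defaultdict

# Controller output as pasted in this thread (VNI 15000).
CONTROLLER_OUTPUT = """\
VNI MAC VTEP-IP Connection-ID
15000 00:50:56:af:b8:fd 172.28.30.1 3
15000 00:50:56:af:d4:7a 172.28.30.1 3
15000 00:50:56:af:dd:f0 172.28.30.2 13
"""


def macs_by_vtep(text):
    """Group the MACs of a pasted controller mac-table by VTEP IP.

    Hypothetical helper: assumes whitespace-separated rows of
    VNI, MAC, VTEP-IP, Connection-ID.
    """
    table = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 4-column data row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        _vni, mac, vtep_ip, _connection_id = fields
        table[vtep_ip].append(mac.lower())
    return dict(table)


if __name__ == "__main__":
    for vtep_ip, macs in macs_by_vtep(CONTROLLER_OUTPUT).items():
        print(vtep_ip, macs)
```

The same function could be run on the host-side output to compare what each host has learned with what the controller reports, provided the columns are adjusted to that command's layout.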
Can you check the uplink order/policy for those logical switches? Also, can you migrate these VMs to the same host and confirm whether they can reach each other? After these tests, please provide the output of the command below from both hosts:
esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
Both VMs on vesxi04:
[root@vesxi04:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi04:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi04:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi04:~]
And the VMs can ping each other.
The same for vesxi05:
[root@vesxi05:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi05:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi05:~] esxcli network vswitch dvs vmware vxlan network mac list --vds-name Edge_VDS --vxlan-id=15000
[root@vesxi05:~]
Thanks for the help,
N
Thanks for the update. Can you check and post the uplink policy for those logical switches? Also check the uplinks on hosts 4 and 5 that are connected to logical switch 15000, and perform a basic connectivity test.
vesxi04 uplinks:
vesxi05 uplinks:
From vesxi04 I can ping vesxi05 VTEP:
[root@vesxi04:~] ping ++netstack=vxlan 172.28.30.2 -s 1570 -d
PING 172.28.30.2 (172.28.30.2): 1570 data bytes
1578 bytes from 172.28.30.2: icmp_seq=0 ttl=64 time=1.638 ms
1578 bytes from 172.28.30.2: icmp_seq=1 ttl=64 time=0.779 ms
1578 bytes from 172.28.30.2: icmp_seq=2 ttl=64 time=0.751 ms
Also, I can ping vesxi04 VTEP from vesxi05:
[root@vesxi05:~] ping ++netstack=vxlan 172.28.30.1 -s 1570 -d
PING 172.28.30.1 (172.28.30.1): 1570 data bytes
1578 bytes from 172.28.30.1: icmp_seq=0 ttl=64 time=0.887 ms
1578 bytes from 172.28.30.1: icmp_seq=1 ttl=64 time=0.816 ms
1578 bytes from 172.28.30.1: icmp_seq=2 ttl=64 time=0.727 ms
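As a side note on the sizes in these pings, here is a small worked example in Python. It is only a sketch with made-up function names, and it assumes IPv4 without options and an untagged inner frame. It shows why `-s 1570` with `-d` (don't fragment) produces the "1578 bytes" replies and a 1598-byte IP packet, and how large the outer IP packet becomes when a guest frame with a 1500-byte MTU is VXLAN-encapsulated.

```python
ICMP_HEADER = 8       # bytes
IPV4_HEADER = 20      # bytes, assuming no IP options
UDP_HEADER = 8        # bytes
VXLAN_HEADER = 8      # bytes
ETHERNET_HEADER = 14  # bytes, untagged inner frame (a VLAN tag would add 4)


def ping_ip_packet_size(icmp_payload):
    """IP packet size generated by a ping with the given -s payload size."""
    return icmp_payload + ICMP_HEADER + IPV4_HEADER


def vxlan_outer_ip_size(guest_mtu):
    """Outer IP packet size after VXLAN-encapsulating a full guest frame."""
    inner_frame = guest_mtu + ETHERNET_HEADER
    return inner_frame + VXLAN_HEADER + UDP_HEADER + IPV4_HEADER


def fits(ip_packet_size, transport_mtu):
    """True if the packet can cross the transport network unfragmented."""
    return ip_packet_size <= transport_mtu


if __name__ == "__main__":
    print(ping_ip_packet_size(1570))              # size of the test ping on the wire (IP level)
    print(vxlan_outer_ip_size(1500))              # encapsulated 1500-MTU guest traffic
    print(fits(ping_ip_packet_size(1570), 1600))  # does the ping fit in a 1600 MTU?
```

Under these assumptions, a successful 1570-byte ping with don't-fragment set means the VTEP-to-VTEP path carries packets larger than a fully encapsulated 1500-MTU guest frame, so the MTU on that path does not look like the limiting factor.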
But if I run a test from vCenter:
The same from vesxi05 to vesxi04:
Thanks for your help,
N.
Your test is failing because you are sending standard VXLAN packets, and I don't think you have set the MTU to the minimum value of 1600.
Yes, it is 1600.
Can you flip the packet size to standard in the same test and check the results?
It fails again, both ways:
So you have a potential connectivity issue between these two hosts, which is why both standard and VXLAN packets are getting dropped. Since this is a nested setup, I'm curious to know the topology. Also, for simplicity's sake, can you remove those VLANs from the VXLAN port group and test with one uplink connected to the logical switches on each host?
Hi, yes! VXLAN is also installed on the physical hosts.
Thank you very much for your help!