Hi all,
I am having problems with a new install of NSX-V 6.2. I am unable to ping between VMs on the same Logical Switch.
Here are my build steps:
- Registered NSX Manager with vCenter
- Deployed three NSX Controllers
- Prepared hosts by installing the VIBs
- Created VTEP VMkernel interfaces
- Set Segment ID Pool (5000-5999)
- Created a new Transport Zone (unicast mode)
- Created a new Logical Switch (unicast mode)
- Migrated VM vNICs to Logical Switch
Everything looks OK in the vSphere Web Client. For example, the Controller status is normal and VXLAN ping tests are fine between hosts.
There is just no communication between VMs. Here are some outputs from the controller:
nsx-controller # show control-cluster status
Type Status Since
--------------------------------------------------------------------------------
Join status: Join complete 09/29 16:24:25
Majority status: Connected to cluster majority 09/29 16:23:58
Restart status: This controller can be safely restarted 09/29 16:24:20
Cluster ID: ef3e531e-cc5a-4086-a1f4-d9f3b69077e7
Node UUID: ef3e531e-cc5a-4086-a1f4-d9f3b69077e7
Role Configured status Active status
--------------------------------------------------------------------------------
api_provider enabled activated
persistence_server enabled activated
switch_manager enabled activated
logical_manager enabled activated
directory_server enabled activated
nsx-controller # show control-cluster connections
role port listening open conns
--------------------------------------------------------
api_provider api/443 Y 2
--------------------------------------------------------
persistence_server server/2878 - 0
client/2888 Y 0
election/3888 Y 0
--------------------------------------------------------
switch_manager ovsmgmt/6632 Y 0
openflow/6633 Y 0
--------------------------------------------------------
system cluster/7777 Y 0
nsx-controller # show control-cluster logical-switches vni 5000
VNI Controller BUM-Replication ARP-Proxy Connections
5000 192.168.100.50 Enabled Enabled 0
nsx-controller # show control-cluster logical-switches connection-table 5000 <—— Empty output
nsx-controller # show control-cluster logical-switches arp-table 5000 <—— Empty output
nsx-controller # show control-cluster logical-switches vtep-table 5000 <—— Empty output
Can anyone suggest any troubleshooting tips?
Thanks,
Bobby
There are a few things to check. Did you set the MTU to at least 1600 on your physical LAN? Which load balancing (teaming) policy are you using for the DVS NICs you have chosen as transport for the VTEP? How many NICs are you using per host?
If you use multiple VTEPs, each one needs its own IP address.
Hi,
I am using 1 NIC per host. Teaming policy is set to Failover.
Any more suggestions?
Thanks,
Bobby
Try logging in to the other two controllers and running the commands you have in red there.
One of the controllers is responsible for the VTEP and MAC tables, so it is possible that in your case it is controller-2.
You can also log in to the ESXi host and ping the other VTEP
(vmk2 is the VTEP interface and 192.168.249.241 is the VTEP IP on the other host):
ping ++netstack=vxlan -I vmk2 192.168.249.241 -s 1572 -d
PING 192.168.249.241 (192.168.249.241): 1572 data bytes
1580 bytes from 192.168.249.241: icmp_seq=0 ttl=64 time=0.563 ms
1580 bytes from 192.168.249.241: icmp_seq=1 ttl=64 time=0.447 ms
If this ping fails, the MTU is probably set incorrectly on the physical switch.
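For reference, the 1572 passed to -s in the ping above is the 1600-byte MTU minus the 20-byte IPv4 header and the 8-byte ICMP header, and -d sets the don't-fragment bit so an undersized path fails instead of fragmenting. A minimal sketch of that arithmetic follows; the function name payload_for_mtu is only illustrative, and it assumes an IPv4 header without options:

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet for a given MTU.
# 20 bytes = IPv4 header (no options, an assumption), 8 bytes = ICMP echo header.
payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Size to pass to -s when the VTEP path MTU is expected to be 1600
payload_for_mtu 1600
```

If the physical network is configured for a larger MTU, the same calculation gives the value to test with.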
From version 6.2 you also have centralized CLI commands (run them from the NSX Manager CLI).
For example:
manager> show controller list all
NAME IP State
controller-3 192.168.249.251 RUNNING
controller-1 192.168.249.250 RUNNING
nsx-controller-node3 192.168.249.252 RUNNING
manager> show logical-switch list vni 5000 host
ID HostName VdsName
host-83 192.168.249.15 management dvs
host-11 192.168.249.25 management dvs
host-186 192.168.249.20 compute dvs 1
host-176 192.168.249.10 compute dvs 1
manager> show logical-switch list all
NAME UUID VNI Trans Zone Name Trans Zone ID
transport 9ce11b50-9af4-4d96-b3bf-65fdf847a8e5 5000 compute vdnscope-2
web_tier 08bde95a-1e72-4ff2-b0de-bdd293610b41 5001 compute vdnscope-2
network_test_lab01 dea2a8d0-fde3-468f-aa96-163fae13be70 5002 compute vdnscope-2
network_test_lab02 afad1b08-d3d9-4393-ace5-a28a3d8b65c7 5003 compute vdnscope-2
test_lab_management_network 61d52086-3f75-42f0-ac2f-7ae76536d4c0 5004 compute vdnscope-2
network_test_lab03 6a1afc92-25d0-45d7-bf68-d9d8cc14432b 5005 compute vdnscope-2
manager> show logical-switch controller controller-1 vni 5002 brief
VNI Controller BUM-Replication ARP-Proxy Connections
5002 192.168.249.250 Enabled Enabled 1
Hi - I would check that the netcpa agent on the hosts is communicating with the Controller cluster over TCP port 1234. You can run 'esxcli network ip connection list | grep 1234' on one of the hosts in question to see whether there are any established sessions. It seems a bit suspicious that your tables are empty on the Controllers. Check /var/log/netcpa.log on the hosts as well: for the hosts running VMs on your logical switch you should see some VTEP joins.
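A plain grep for 1234 can also match local ports or world IDs that happen to contain those digits. As a rough sketch, the filter below only looks at the foreign address column. The function name controller_sessions is made up for illustration, and it assumes that awk is available in the host shell and that the foreign address and state are the 5th and 6th whitespace-separated fields of each connection row, so check that against your own output first:

```shell
# Reads `esxcli network ip connection list` output on stdin and prints the
# foreign address and state of every row whose foreign port is 1234.
# Assumption: field 5 = foreign address, field 6 = connection state.
controller_sessions() {
    awk '$5 ~ /:1234$/ { print $5, $6 }'
}

# Usage on an ESXi host:
#   esxcli network ip connection list | controller_sessions
```

Each line printed should be a Controller IP followed by the session state. No output at all would suggest the host has no control-plane session open, which would fit the empty tables on the Controllers.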