Hi all,
I'm looking for some assistance with a nested NSX lab I'm trying to build. I have followed the examples on the net but seem to be hitting a silly problem getting basic VXLAN up and running. I'm sure it's something simple. Any help would be greatly appreciated.
I have a single ESXi 5.5 running...
- vCenter Server Appliance
- Nested ESXi host 1 (compute node 1)
- Windows Server 1
- Nested ESXi host 2 (compute node 2)
- Windows Server 2
- Nested ESXi host 3 (management and edge node)
- NSX Manager
I have kept the base networking simple: the vCenter and the nested ESXi hosts each have a single vNIC (VM Network), and everything sits in 192.168.1.0/24. I then put all three hosts in a single cluster and set up a single Distributed Switch.
I have managed to:
- Register the vCenter with the NSX Manager
- Deploy a single NSX Controller (installed on nested host 3, also on 192.168.1.0/24)
- Prepare the hosts by installing the VIBs
- Create VTEP VMkernel interfaces (again I kept these on the same subnet, 192.168.1.0/24)
- Set the Segment ID Pool (5000-5999)
- Create a new Transport Zone (unicast mode)
- Create a new Logical Switch (unicast mode)
- Connect the Windows Server vNICs to the Logical Switch
Ping tests between the VTEP IPs work fine, but VM traffic over VXLAN is not working.
Can anyone see anything obviously wrong with the above? Or could anyone point me in the direction of what to check? I have hit the wall.
Many thanks
Bobby
Also keep the following issue in mind
BR,
Spas Kaloferov
Note: for Nested LAB the VTEP NIC teaming must be "failover"
You need to check what the NSX Controller sees in your lab.
Let's say you connect VM1 and VM2 to VXLAN 5001.
VM1 IP 172.16.10.11
VM2 IP 172.16.10.12
Find which controller manages the VXLAN that VM1 and VM2 are on (if you have 3 controllers). SSH to one of your 3 NSX controllers and type:
nvp-controller # show control-cluster logical-switches vni 5001
VNI   Controller       BUM-Replication  ARP-Proxy  Connections  VTEPs
5001  192.168.110.201  Enabled          Enabled    6            3
So now we know 192.168.110.201 is the controller that manages VXLAN 5001.
SSH to 192.168.110.201 and type:
nvp-controller # show control-cluster logical-switches arp-table 5001
VNI   IP            MAC                Connection-ID
5001  172.16.10.11  00:50:56:a6:7a:a2  3
5001  172.16.10.12  00:50:56:a6:a1:e3  4
Does the controller in your lab know the IP/MAC/VXLAN of VM1 and VM2?
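If you want to script that check, here is a minimal sketch in Python. It assumes you have copied the arp-table output from the controller into a string; the function names `parse_arp_table` and `missing_vms` are made up for illustration and are not part of any NSX tooling.

```python
def parse_arp_table(output):
    """Return {ip: (vni, mac)} parsed from pasted arp-table output."""
    entries = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have four columns and start with a numeric VNI;
        # the header row and the prompt line are skipped.
        if len(fields) == 4 and fields[0].isdigit():
            vni, ip, mac, _connection_id = fields
            entries[ip] = (int(vni), mac)
    return entries


def missing_vms(output, expected_ips):
    """Return the expected VM IPs that the controller has NOT learned."""
    known = parse_arp_table(output)
    return [ip for ip in expected_ips if ip not in known]


# Sample text taken from the output shown above.
sample = """\
VNI IP MAC Connection-ID
5001 172.16.10.11 00:50:56:a6:7a:a2 3
5001 172.16.10.12 00:50:56:a6:a1:e3 4
"""

print(missing_vms(sample, ["172.16.10.11", "172.16.10.12"]))  # prints []
```

An empty list means the controller has learned every VM you expected; any IP that comes back is a VM the controller does not know about on that VNI.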
Although it sounds like you did from the way you phrase your question, just to check: when you say ping tests between VTEPs are working, is that with the minimum packet size or with a VXLAN-sized packet?
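For reference, the arithmetic behind a "VXLAN-sized" ping can be sketched like this. It assumes IPv4 without IP options and the usual 50 bytes of VXLAN encapsulation overhead; the function names are only illustrative.

```python
IPV4_HEADER = 20     # bytes, IPv4 header without options
ICMP_HEADER = 8      # bytes, ICMP echo header
VXLAN_OVERHEAD = 50  # bytes added by VXLAN encapsulation (IPv4, untagged inner frame)


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented packet at this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def min_transport_mtu(guest_mtu=1500):
    """Smallest transport MTU that carries a full-size guest packet over VXLAN."""
    return guest_mtu + VXLAN_OVERHEAD


print(max_ping_payload(1500))  # 1472
print(max_ping_payload(1600))  # 1572
print(min_transport_mtu())     # 1550
```

So with the transport network set to MTU 1600, a ping between VTEPs with the don't-fragment bit set and a 1572-byte payload (for example `vmkping ++netstack=vxlan -d -s 1572 <VTEP IP>`, if your build has the vxlan netstack) should succeed. If only small pings get through, the MTU somewhere on the path is too low.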
Hi all,
Many thanks for the responses.
It turned out to be the agents not starting on the hosts as the NSX manager was not available during boot time. After restarting the hosts things started working. It seems the boot order of the nested hosts is important.
Thanks also for the useful controller verification commands. I can now see the VM MACs.
nsx-controller # show control-cluster logical-switches arp-table 5000
VNI   IP           MAC                Connection-ID
5000  172.16.10.1  00:0c:29:fe:3c:20  1
5000  172.16.10.2  00:0c:29:6a:ff:4b  2
Regards,
Bobby
Hi,
The boot order is not so important. It fails because the hosts need access to the NSX Manager when they boot, while they reinitialize the NSX agent. So if you are rebooting hosts, make sure the NSX Manager VM is always available. You might want to reboot in groups and migrate the NSX Manager VM so that it is always on. If not, you can use the workaround I pointed to in the article above. The same will happen if you remove a host from the cluster or add a new host to the cluster while the NSX Manager is not accessible.
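As a rough pre-reboot check, something like the following Python sketch can confirm the NSX Manager VM is answering before you restart a host. The port (443, the Manager's HTTPS interface) is an assumption, and the function name is hypothetical. A successful connect only shows the Manager is up and listening, not that the host agents will register correctly.

```python
import socket


def is_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_reachable("192.168.1.50")` (a made-up address for the NSX Manager) would be called before rebooting each group of hosts, and the reboot skipped if it returns False.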
BR,
Spas Kaloferov
Hello all,
I also built a nested lab that only uses Workstation 10. Inside that Workstation I run:
6 ESXi 5.5
1 vCenter Appliance
1 Win 7 (for FTP server and Oracle DB)
1 Vyatta router
1 Openfiler (for NFS)
My story: every time I labbed NSX, I ended up reinstalling the NSX Manager the next day, so I was building the same lab over and over again. My topology is simple:
VM guest --- NSX distributed router --- NSX edge router --- vyatta router --- internet
Right after installation, everything worked perfectly. But when I started the lab again the next day, the VM guest could not ping its default gateway on the NSX distributed router, and the NSX Edge router could not ping the distributed router.
First I checked connectivity. I am using OSPF. All the routes were in the routing table, but from the distributed router I could not ping 8.8.8.8, even though there was a default route in the routing table. So routing was not the issue, and I checked something else.
Then I suspected the controller nodes. I searched Google and found this blog >> Some useful NSX Troubleshooting Tips | CormacHogan.com
I checked with the CLI, and this is what I found:
~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=VM_NSX_VXLAN
VXLAN ID  Multicast IP               Control Plane  Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
--------  -------------------------  -------------  ---------------------  ----------  ---------------  ---------------
5000      N/A (headend replication)  Enabled ()     0.0.0.0 (down)         2           0                0
I did what he suggested, switching from unicast to multicast and then back again, but the problem was still there.
Then I went "stalking" this forum and found this thread.
Using this blog >> ESXi host Enable Agent error "Cannot complete the operation." | Spas Kaloferov's ...
I followed the steps, and it worked!!
~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=VM_NSX_VXLAN
VXLAN ID  Multicast IP               Control Plane                        Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
--------  -------------------------  -----------------------------------  ---------------------  ----------  ---------------  ---------------
5000      N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  10.0.99.1 (up)         1           1                0
Now the situation is "normal" again.
TL;DR:
If your lab (or maybe a production environment running VMware) suddenly loses all its machines, including the NSX Manager, and afterwards everything checks out "like it should be", the problem may not be in your interconnection; most likely it is in the controller.
The first thing to do is SSH to the ESXi host and run this command: esxcli network vswitch dvs vmware vxlan network list --vds-name=<YOUR VDS NAME>
If the output looks like this:
~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=VM_NSX_VXLAN
VXLAN ID  Multicast IP               Control Plane  Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
--------  -------------------------  -------------  ---------------------  ----------  ---------------  ---------------
5000      N/A (headend replication)  Enabled ()     0.0.0.0 (down)         2           0                0
Then run this on your ESXi host >> /etc/init.d/netcpad restart
And now it should look like this:
~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=VM_NSX_VXLAN
VXLAN ID  Multicast IP               Control Plane                        Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
--------  -------------------------  -----------------------------------  ---------------------  ----------  ---------------  ---------------
5000      N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  10.0.99.1 (up)         1           1                0
Troubleshooting steps:
1. Check whether the NSX Manager is up or down.
2. Check the controller status.
3. Check the installation status under host preparation: is it green, or is there a red "Resolve" on it?
4. SSH to the ESXi host and use this command to determine whether the controller connection is up or down >> esxcli network vswitch dvs vmware vxlan network list --vds-name=<YOUR VDS NAME>
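Step 4 can also be checked by script. This is a minimal sketch in Python, assuming you feed it the text printed by the esxcli command above. The function names are hypothetical, and the regular expression assumes the controller column always looks like an IPv4 address followed by (up) or (down), as in the outputs in this post.

```python
import re

# A data row starts with the numeric VXLAN ID and contains the controller
# address followed by its state in parentheses, e.g. "10.0.99.1 (up)".
ROW = re.compile(
    r"^\s*(?P<vni>\d+)\s+.*?"
    r"(?P<controller>\d{1,3}(?:\.\d{1,3}){3})\s+\((?P<state>up|down)\)"
)


def controller_states(output):
    """Return {vni: (controller_ip, state)} from the vxlan network list text."""
    states = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            states[int(match.group("vni"))] = (
                match.group("controller"),
                match.group("state"),
            )
    return states


def vnis_needing_attention(output):
    """Return the VNIs whose controller connection is not up."""
    return [vni for vni, (_ip, state) in controller_states(output).items()
            if state != "up"]


broken = "5000 N/A (headend replication) Enabled () 0.0.0.0 (down) 2 0 0"
healthy = ("5000 N/A (headend replication) "
           "Enabled (multicast proxy,ARP proxy) 10.0.99.1 (up) 1 1 0")

print(vnis_needing_attention(broken))   # prints [5000]
print(vnis_needing_attention(healthy))  # prints []
```

Any VNI in the returned list is a candidate for the netcpad restart described above, once the NSX Manager and controller are confirmed to be up.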
Hi,
I'm glad the post helped.
BR,
Spas Kaloferov
Hi rbenhaim,
>Note: for Nested LAB the VTEP NIC teaming must be "failover"
Do you mean the VTEP NICs must be teamed *AND* set to "failover"?
OR
that VTEP NIC teaming must be "failover" if there are more than 2 NICs?
I'm trying to set up nested NSX on my desktop PC and it has only 2 physical NICs. Is it possible somehow?
Thanks in advance.
Regardless of the number of links you have, VTEP NIC teaming must be "failover" in a nested environment.
Thank you sir.
In a nested environment you can also use Load Balance - SRC ID or SRC MAC for the VXLAN teaming policy. You just can't use EtherChannel/LACP.
Hello everyone!
I also plan to build an NSX lab but could not find any evaluation version.
I even contacted my VMware partner, but they did not have it either.
How did you find the NSX installers?
Thanks in advance.
There isn't one yet.
One simple way is to attend the ICM course, after which you will be given access on the Nicira web site.