I can't for the life of me get the OSPF neighbour relationship between the DLR and ESG to work. I strongly suspect the root cause lies somewhere within the LR protocol address config.
- I have a transit logical switch (192.168.100.0/24) which connects the forwarding address (.254) (DLR uplink) to the ESG internal LIF (.1)
- I have also configured the DLR control VM protocol interface to .253 on the same LS
- I have disabled the firewall on both the ESG and DLR
- From the ESG I can ping the DLR forwarding address
- From the NSX controllers I can ping the ESG internal LIF & the DLR forwarding uplink address
- I placed a VM inside of the transit logical switch and it can ping ESG and DLR forwarding and protocol address
However, the DLR can't ping the ESG interface (.1) and the ESG can't ping the protocol address (but can ping the forwarding address), despite all components being on the same logical transit switch (NSX controllers aside).
When I run "debug ip ospf" on both the ESG and DLR VMs, all I can see are the following entries: "OSPF 1 sent OSPF message type 1 to IP addr 220.127.116.11 on interface i/f idx 0x0000005"
When I run "show ip ospf neighbor", nothing is returned on either the ESG or the DLR.
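For anyone reading along: message type 1 in that debug entry is an OSPF Hello (per RFC 2328), so the output suggests Hellos are being sent but none are arriving. Below is a rough Python sketch for picking such lines apart. The regular expression is an assumption built only from the single "sent" line quoted above; the format of a "received" line is not shown in this thread, so it is not handled.

```python
import re

# OSPF packet types as defined in RFC 2328.
OSPF_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgment",
}

# Assumption: pattern derived solely from the "sent" line quoted in this post.
SENT_RE = re.compile(
    r"OSPF (\d+) sent OSPF message type (\d+) "
    r"to IP addr (\S+) on interface i/f idx (\S+)"
)


def parse_sent_line(line):
    """Return a dict describing a 'sent' debug line, or None if it does not match."""
    match = SENT_RE.search(line)
    if match is None:
        return None
    process, msg_type, dest, if_idx = match.groups()
    return {
        "process": int(process),
        "type": OSPF_TYPES.get(int(msg_type), "Unknown"),
        "dest": dest,
        "if_idx": if_idx,
    }


sample = ("OSPF 1 sent OSPF message type 1 to IP addr 220.127.116.11 "
          "on interface i/f idx 0x0000005")
print(parse_sent_line(sample)["type"])  # prints Hello
```

If every line in the capture parses as a sent Hello and nothing else shows up, the two routers are not hearing each other, which points at the transport between them rather than at the OSPF settings.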
Both the DLR uplink and the ESG internal lif are part of the same OSPF area.
The DLR has multiple internal LIFs attached and does not propagate those routes to the ESG.
Any ideas?? I'm a bit lost on this one, any help would be greatly appreciated.
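As a quick sanity check on the addressing described above, here is a minimal Python sketch that confirms the three addresses sit in the transit subnet and do not collide. The dictionary labels and the helper name `check_transit` are just illustrative; only the subnet and the .1/.253/.254 host addresses come from this post.

```python
import ipaddress

# Transit logical switch subnet and addresses as described in the post.
TRANSIT = ipaddress.ip_network("192.168.100.0/24")

ADDRESSES = {
    "ESG internal LIF": "192.168.100.1",
    "DLR protocol address": "192.168.100.253",
    "DLR forwarding address": "192.168.100.254",
}


def check_transit(addresses, network):
    """Return a list of addressing problems; an empty list means none were found."""
    problems = []
    seen = {}
    for name, addr in addresses.items():
        ip = ipaddress.ip_address(addr)
        if ip not in network:
            problems.append(f"{name} ({ip}) is outside {network}")
        if ip in seen:
            problems.append(f"{name} duplicates {seen[ip]} ({ip})")
        seen.setdefault(ip, name)
    return problems


print(check_transit(ADDRESSES, TRANSIT))  # prints []
```

An empty result only rules out an addressing mistake; it says nothing about whether traffic actually flows between the hosts.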
-Are your ESG and DLR on the same host?
-When you do the "debug ip ospf" do you see any received messages from 192.168.100.1?
-Not sure if you've already done this, but in order for the locally connected internal networks of the DLR to be shared with the ESG via OSPF, route redistribution would need to be configured.
-May want to try a force sync and/or redeploy on both the DLR and ESG. I've had a similar issue resolved this way.
- The ESG & DLRs are on different hosts, however I have tried to see if having them on the same host would make any difference and unfortunately, it didn't
- No, I see no received messages from 192.168.100.1
- Route redistribution has been configured (at both ESG & DLR) for both OSPF and static
- I've already tried the redeploy and force sync to no avail.
Appreciate the response, larsonm. It's a strange one and I can't get my head around it. Last night I blew away my DLR and re-deployed from scratch. I haven't got around to configuring the dynamic routing just yet. One thing that may be causing this issue, which I didn't think of prior to deleting the DLR, is licensing. I'm currently running NSX in eval mode, which may or may not reduce functionality??
Anyway, I'll re-configure the OSPF routing and see if the clean deployment has made any difference. If not, I will license NSX appropriately.
In the meantime, if you (or anyone else) have any ideas, I would be extremely grateful.
What version of NSX are you using in your deployment?
Yes!!! It appears there is a "bug" in NSX 6.2 where the VXLAN installer creates 2 VTEPs even when only one is selected. I resolved this by uninstalling VXLAN from each affected host, then removing the "rogue" VTEP by SSHing onto each ESXi host. I then installed VXLAN onto the hosts again and all was resolved.
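If you have more than a handful of hosts, a small script can help spot which ones ended up with an extra VTEP. This is only a sketch: the host names and vmk names are hypothetical, and it assumes you have already collected the VTEP VMkernel interfaces per host by hand (for example from each host's VMkernel interface listing).

```python
def hosts_with_extra_vteps(vteps_by_host, expected=1):
    """Return only the hosts that have more VTEP interfaces than expected."""
    return {
        host: vmks
        for host, vmks in vteps_by_host.items()
        if len(vmks) > expected
    }


# Hypothetical inventory: host name -> VTEP vmknics found on that host.
inventory = {
    "esxi-01": ["vmk3"],
    "esxi-02": ["vmk3", "vmk4"],
}

print(hosts_with_extra_vteps(inventory))  # prints {'esxi-02': ['vmk3', 'vmk4']}
```

Any host that shows up in the result would be a candidate for the uninstall, clean-up and reinstall steps described above.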