I have a scenario where routes learned from a Juniper SRX firewall are being reflected back to it from a DLR behind an ESG in NSX. The SRX peers with two ESGs using ECMP. The ESGs advertise the routes to the DLR, the DLR advertises them BACK to one of the two ESGs (the "second one"), and that ESG advertises them back to the SRX. Presto, a route loop. AS path loop prevention does not catch this, because the ESG advertises only its own next-hop AS rather than the whole AS path.
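To illustrate why the truncated AS path matters, here is a minimal Python sketch of the usual eBGP loop check (reject any route whose AS path already contains your own AS). The AS numbers and both example paths are hypothetical and are not taken from my environment; the point is only that the check cannot fire once the originating AS has been dropped from the path.

```python
def accepts_route(local_as, as_path):
    """eBGP loop check: reject a route whose AS path already contains our own AS."""
    return local_as not in as_path


# Hypothetical private AS numbers, for illustration only.
SRX_AS, ESG_AS, DLR_AS = 65001, 65002, 65003

# A path that still records the SRX's AS: the SRX can detect the loop.
full_path = [ESG_AS, DLR_AS, SRX_AS]

# A path carrying only the advertising ESG's AS: nothing left to detect.
truncated_path = [ESG_AS]

print(accepts_route(SRX_AS, full_path))       # loop detected, route rejected
print(accepts_route(SRX_AS, truncated_path))  # loop not detected, route accepted
```

With the full path the SRX would discard the looped advertisement; with only the ESG's AS present it has no way to tell the route originated from itself.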
Scenario #1 – physical networks advertised to ESG.
2 x ESG (PLR) learning from Physical Firewall and distributed to UDLR and local DLR.
The DLR is then reflecting the learned route BACK to the second PLR.
Second PLR re-advertises the network it’s learning from the Physical Firewall back to the Physical Firewall.
Physical Firewall is installing the route! (luckily benign as OSPF has the cheapest cost)
Scenario #2 – virtual wires advertised to ESG from UDLR.
2 x ESG (PLR) learning Virtual Wires from UDLR and distributed to local DLR.
The DLR is then reflecting the learned route BACK to the second PLR.
ESG installs the route, but it’s benign as it can see the real preferred path via its local AS direct connected network instead of the peer AS path.
I've attached a diagram for clarity. I've already logged a support request with VMware, but figured I'd post here to see if anyone else has observed this issue. The odd thing is that I can't reproduce it in an NSX 6.3.2 environment set up in practically the same fashion. As a workaround in 6.3.5, I've set up ingress route filters on the second ESG to prevent it from receiving the routes.
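Conceptually, the ingress filter workaround just drops the looped prefixes before the second ESG can install or re-advertise them. The Python sketch below models that logic only; it is not NSX configuration, the function name `filter_inbound` is my own, and the prefixes are hypothetical documentation ranges.

```python
import ipaddress


def filter_inbound(advertised, deny_list):
    """Return the advertised prefixes that are not covered by any denied network."""
    denied = [ipaddress.ip_network(p) for p in deny_list]
    kept = []
    for prefix in advertised:
        net = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
        if any(net.version == d.version and net.subnet_of(d) for d in denied):
            continue  # drop the looped prefix
        kept.append(prefix)
    return kept


# Hypothetical example: the DLR reflects a physical network back to the second ESG.
from_dlr = ["192.0.2.0/24", "198.51.100.0/24"]
physical_networks = ["192.0.2.0/24"]

print(filter_inbound(from_dlr, physical_networks))
```

Here the second ESG would keep only the prefix that genuinely lives behind the DLR and never learn the physical network from it.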
For the benefit of the community, I can now advise that GSS investigated this issue and published a KB in the VMware Knowledge Base. I've pasted it below.
Document Id
Symptoms
This article applies if your NSX BGP configuration satisfies all of the following conditions:
When all of the above conditions are observed, you might experience routing loops in your NSX domain:
Cause
There are two parts to this problem. Together they cause this issue.
Impact / Risks
Resolution
Workaround
To work around this issue, either do one of the following: