Has anyone successfully deployed local egress and local ingress in a 3-site scenario?
I need a local egress and local ingress routing solution for the use case below:
==========================================================
- 3 sites (Site1 + Site2 in a stretched cluster CA primary setup, and Site3 as DR).
- Site1 and Site2 have one vCenter and one NSX manager. (Primary)
- Site3 has one vCenter and one NSX manager. (DR)
- There are physical firewalls in all sites and asymmetric routing is not supported.
- Physical network running on OSPF/EIGRP routing.
- Two ECMP-enabled ESGs in each site. The ESGs have BGP peering with the UDLR and the physical network.
- One UDLR and all universal logical switches should be stretched across the 3 sites.
- A VM should send outbound traffic and receive inbound traffic through the site where it currently lives. For example, if a VM is living in Site1, its outbound/inbound traffic should go through Site1; if the VM is vMotioned to Site2 or Site3, its outbound/inbound traffic should go through Site2 or Site3 respectively.
My Thoughts:
- Deploy UDLR with Local egress and ECMP.
- The NSX local egress feature will ensure that 100% of each VM's egress traffic is routed locally in each site.
- Site1 will rely on subnet-based routing.
- Site2 will rely on /32 routes injected by a script on the Site2 ESGs.
- Site3 will also rely on /32 routes injected by a script on the Site3 ESGs.
- vMotion events or periodic VM placement scans will trigger the script to add or remove /32 routes (one route per VM) on the ESGs of the respective sites.
- The ESGs in each site will advertise the VM subnets learned from the UDLR to the physical network. The Site2 and Site3 physical network gear is to be configured to manipulate OSPF/EIGRP route metrics during redistribution so that ingress traffic for subnet-based routing is not preferred via Site2 or Site3, and Site1 is preferred for ingress traffic for subnet-based routing.
- Script-based /32 route injections on the Site2 ESGs will keep ingress traffic for VMs living in Site2 on the Site2 ESGs.
- Script-based /32 route injections on the Site3 ESGs will keep ingress traffic for VMs living in Site3 on the Site3 ESGs.
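To make the /32 idea above a bit more concrete, here is a rough Python sketch of the two pieces of logic involved. It is only an illustration under my own assumptions, not a tested implementation: the site labels, the function names plan_route_changes and ingress_site, and the input dictionaries are all hypothetical, and the sketch deliberately does not call any NSX or vCenter API. It only computes which /32 routes the script would need to add or remove per site, and shows why the physical network would prefer the /32 over the subnet route (longest-prefix match).

```python
import ipaddress

# Hypothetical site labels. Site1 relies on the subnet route, so only
# Site2 and Site3 get per-VM /32 routes in this sketch.
HOST_ROUTE_SITES = ("site2", "site3")


def plan_route_changes(vm_sites, injected):
    """Return {site: (to_add, to_remove)} as sorted lists of /32 prefixes.

    vm_sites: dict of VM IP string -> site label where the VM runs now
              (assumed to come from a vMotion event or placement scan).
    injected: dict of site label -> set of /32 prefixes currently injected.
    """
    plan = {}
    for site in HOST_ROUTE_SITES:
        desired = {f"{ip}/32" for ip, s in vm_sites.items() if s == site}
        current = set(injected.get(site, ()))
        plan[site] = (sorted(desired - current), sorted(current - desired))
    return plan


def ingress_site(dest_ip, routes):
    """Pick the site whose advertised prefix is the longest match.

    routes: list of (prefix string, site label) as the physical network
            would see them; returns None if nothing matches.
    """
    addr = ipaddress.ip_address(dest_ip)
    matches = []
    for prefix, site in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, site))
    if not matches:
        return None
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    # Example: one VM per site; a stale /32 is still injected on Site2
    # for a VM that has since moved to Site3.
    vm_sites = {"10.0.1.10": "site1", "10.0.1.11": "site2", "10.0.1.12": "site3"}
    injected = {"site2": {"10.0.1.12/32"}, "site3": set()}
    for site, (add, remove) in plan_route_changes(vm_sites, injected).items():
        print(site, "add:", add, "remove:", remove)
```

The reconciliation is written as a diff between desired and current state so that the same function could serve both triggers (a vMotion event or a periodic scan). How the resulting routes actually get pushed to the ESGs is left out on purpose, since that depends on the environment.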
Challenges:
- The thoughts discussed above would work in a straightforward way if all 3 sites had dedicated vCenters and NSX managers. But they don't work in a straightforward way for Site1 & Site2, which are in a stretched cluster setup with one vCenter and one NSX manager.
- As per the design guide, only static routing is supported in the case of Site1 & Site2, and there is no universal control VM. But in my case, considering the Site3 requirement, we need dynamic routing and a universal control VM.
- The design guide discusses only two-site scenarios: Active/Active (one vCenter & one NSX manager) and Active/Standby (two vCenters and two NSX managers). My case is a hybrid Active/Active/Standby, and this scenario is not mentioned.
- Any workarounds?
Design guide I am referring to: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/nsx/vmware-multi-site-sol...
If anyone can share their thoughts, it would be really helpful.