NSX-T Multisite 101 ToI

This document highlights NSX-T MultiSite capabilities including:

  . Latest enhancements

  . What is NSX-T Multisite

  . NSX-T Multisite Capabilities

  . Recorded Demos

For more in-depth information, we also offer the "NSX-T Federation Multi-Location Design Guide (Federation + Multisite)" here.

Also, FYI: VMworld 2019 had a public session presenting NSX-T Multisite, "NSX-T Design for Multi-Site [CNET1334BU]" (Recording + Deck).

 

Note1:

This ToI may be updated in the future so always check you have the latest version.

    . NSX-T 4.0 Multisite 101 ToI: version 1.1, updated on 01/10/2023.

    . NSX-T 3.2 Multisite 101 ToI: version 1.1, updated on 01/10/2023.

 

Note2:

The NSX-T Multisite solution is perfect for customers who want a "Smaller NSX-T Management Footprint" (with only 3x NSX-T Mgr VMs for all their locations) and accept a "DR Recovery Procedure with a few more requirements or steps".

For other use cases, NSX-T 3.0 introduced a second Multi-Location solution: NSX-T Federation.

The NSX-T Federation solution is based on a new component: the "NSX-T Global Manager Cluster" (GM).

GM offers central global configuration of multiple (local) NSX-T Manager Clusters, each offering Network & Security services for a location.

The NSX-T Federation solution addresses "Specific Site Management / GDPR / Policy Requirements" and offers "Simplified DR".

Comments

Thank you very much for this presentation.

For "Current NSX-T Multisite Lite Requirements and Limitations" it is mentioned: "No Local-egress capabilities (100% North/South traffic goes via 1 site)"

  • Is there any plan to support this in future releases?

There is no committed date/release for that feature yet.

Could you contact ddesmidt@vmware.com about your "Local Egress" use case, so we can understand how you are looking to use it?

Thanks,

Dimitri

thanks guys!

Is this going to become part of the product as a deployment option, or will it remain a design/manual deployment?

It's a design you choose.

You can do it manually (click-click-click) or orchestrate it via your orchestration tool (API).
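
Purely as an illustration of the API route, here is a minimal Python sketch against the NSX-T Policy API (the manager address, credentials, and segment name below are made-up placeholders; check the API guide for your NSX-T version for the exact payloads):

import requests

# Hypothetical NSX-T Manager VIP and credentials -- replace with your own.
NSX_MGR = "https://nsx-mgr.corp.example"
AUTH = ("admin", "********")

# Create (or update) an overlay segment declaratively via the Policy API.
# PATCH is idempotent: running the same call twice leaves the same config.
segment = {
    "display_name": "seg-app-blue",
    "subnets": [{"gateway_address": "172.16.10.1/24"}],
    "connectivity_path": "/infra/tier-1s/T1-Blue",
}

resp = requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/segments/seg-app-blue",
    json=segment,
    auth=AUTH,
    verify=False,  # lab only; use proper certificates in production
)
resp.raise_for_status()
print("Segment seg-app-blue configured")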

I hear your voice saying "click click click" 🙂

Hi Dimitri,

I have seen the published presentation about the NSX-T 2.5 Multisite and DR recovery procedure (https://communities.vmware.com/docs/DOC-39405). It is a great job and it helps us a lot. Thanks a lot for sharing this document.

We have a question and we need your support and assistance please if possible.

Through the data plane recovery steps when the production site is down, we have noted only these two actions to be performed:

  1. Attach the Tier-1 to the DR Tier-0.
  2. Change the Tier-1 Edge cluster to the DR site Edge cluster.

Because the transit subnet between Tier-1 and Tier-0 is assigned automatically (from subnet 100.64.0.0/10), normally when moving the Tier-1 Gateway from the production site to the DR site this transit subnet will change to a new one, and we will have to change the static routing on both the Tier-1 and Tier-0 gateways. Is that correct?

Thanks a lot

Ahmed

Hi Dimitri,

We are seeking to deploy NSX-T with multi-site failover where stretched L2 is not an option for the NSX-T Manager cluster.

What does the failover process look like without stretched L2? We understand that DNS entries are required and must be changed to new IPs as the nodes move to the new site. We are wondering if this is simply a case of restarting the NSX-T Manager node at the other site and changing its IP and DNS entry, or if it is more complex? Documentation is hard to find on this one.

thanks

You're correct. Those "internal T0/T1 subnets" will be different after the move of T1 to a T0-DR.

However, T0/T1 routing is something "internal".

There is nothing to do on the NSX-T Managers or the physical fabric.

You can watch the embedded demo videos of the recovery to validate that for yourself.
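
For reference, the two recovery actions discussed above map to two declarative changes on the Tier-1. A rough Python sketch against the Policy API (the gateway, manager, and edge-cluster references are hypothetical placeholders; verify the paths against your own environment):

import requests

NSX_MGR = "https://nsx-mgr.corp.example"   # hypothetical manager VIP
AUTH = ("admin", "********")

# 1. Re-attach the Tier-1 to the DR site Tier-0.
requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-1s/T1-App",
    json={"tier0_path": "/infra/tier-0s/T0-DR"},
    auth=AUTH, verify=False,
).raise_for_status()

# 2. Move the Tier-1 services component to the DR site Edge cluster
#    (the Edge cluster lives on the Tier-1 locale-services object).
requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-1s/T1-App/locale-services/default",
    json={"edge_cluster_path":
          "/infra/sites/default/enforcement-points/default/edge-clusters/<dr-edge-cluster-uuid>"},
    auth=AUTH, verify=False,
).raise_for_status()

# The T0/T1 transit subnet (from 100.64.0.0/10) is re-allocated automatically
# after the re-attach, which is why nothing else needs to change.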

Dimitri

The exact steps of NSX-T Manager recovery are well detailed in the deck.

And you even have embedded videos for each step (ppt section "Demo Script/Manual").

I'll let you look at those in detail.
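
On the DNS part of the question specifically: the full NSX-T Manager recovery steps are in the deck, but the DNS repoint itself can be scripted. A minimal sketch, assuming dnspython and a DNS server that accepts dynamic updates (the zone, record name, and addresses are made up):

import dns.query
import dns.update

# Hypothetical zone, record, and DNS server -- adjust for your environment.
ZONE = "corp.example"
DNS_SERVER = "10.2.0.53"

# Repoint the NSX Manager FQDN to its new IP at the recovery site.
update = dns.update.Update(ZONE)
update.replace("nsx-mgr-01", 300, "A", "10.2.0.41")  # name, TTL, type, new IP

response = dns.query.tcp(update, DNS_SERVER, timeout=10)
print(response.rcode())  # 0 (NOERROR) means the update was accepted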

Thanks,

Dimitri

Very useful guide! Is it really correct that the whole traffic has to go over one site?

I mean: the blue data plane goes out one site and, if you want, with AS prepend the green data plane goes out the other.

Then the statement "100% north/south traffic goes via 1 site" is wrong in my opinion; it's more like:

all blue traffic has to use the same north/south site and all green traffic has to use the same north/south site (if you want, the other one with AS prepend).

Hint: in this scenario you of course have to use /24 BGP external IP address blocks for the blue and green sites.

I don't see the limitation here. It's limited to the blue and green data planes but not to the whole traffic, right?

How do you configure the additional overlay uplink for the blue and green sites?

It's a Geneve segment where you route between the sites, right?
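
Purely to illustrate the AS-prepend idea above, here is one way it could be pushed via the NSX-T Policy API in Python (the Tier-0, prefix-list, prefix, and ASN values are made-up examples; the exact payload fields should be verified against the API guide for your version):

import requests

NSX_MGR = "https://nsx-mgr.corp.example"   # hypothetical manager VIP
AUTH = ("admin", "********")

# Prefix list matching the green /24 that should be de-preferred on this site.
requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-Blue/prefix-lists/pl-green",
    json={"prefixes": [{"network": "203.0.113.0/24", "action": "PERMIT"}]},
    auth=AUTH, verify=False,
).raise_for_status()

# Route map that prepends the local ASN so the green /24 looks worse via T0-Blue.
requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-Blue/route-maps/rm-prepend-green",
    json={"entries": [{
        "action": "PERMIT",
        "prefix_list_matches": ["/infra/tier-0s/T0-Blue/prefix-lists/pl-green"],
        "set": {"as_path_prepend": "65001 65001 65001"},
    }]},
    auth=AUTH, verify=False,
).raise_for_status()

# The route map is then referenced as an out-filter on T0-Blue's BGP neighbors.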

Thanks for this document, it's very helpful.

I do have a question: if I want an active/active multi-site cross-vCenter topology, Site A and Site B are going to host their own site-specific workloads, and then SRM will protect some workloads from Site A to Site B, and some workloads from Site B to Site A.

I don't want to have to re-IP the workloads during failover.

Is this possible?

Correct. The "interlink" between the T0-Blue and T0-Green is an Overlay Segment connected on those T0 as uplink (so BGP can be configured).

Dimitri

Yes that's possible. That's what the deck talks about in the "Active/Active Use Case" 🙂

Dimitri

Hi Dimitri, 


Excellent deck, two quick questions:

1. Is protecting the NSX Managers with vSphere Replication and SRM a valid approach where a stretched management vSphere cluster is not an option (we do have an NSX-V universal LS to place the managers on)? Any drawbacks to using vSphere Replication?

2. For the T0, is there local site recoverability from Blue to Green?

Thanks

DB

Hey DB,

  1. What sort of RTT do you have between sites?
  2. By site recovery, do you mean workloads and segments fail over from blue to green? If so, the answer is no. The edge clusters attached to their respective T0s maintain their own segments and workloads; when the active edge fails, the data plane comes up in the standby site. As Dimitri mentioned earlier, there is a segment between the two for BGP and route exchange.

 

Cheers

Slide 26 is a little misleading; it makes it look like you have 4x Edge nodes when in fact you can only use 2x Edge nodes for automated active-standby T0 failover, I believe?

 

The topic I'm describing below is to do with local egress.

What I'm depicting is a pretty common datacenter scenario. Maybe I have not fully understood it.

The VCF design document states "local egress in VCF is not recommended".

I want to understand how others are handling below scenarios.

DC1 is running Domain Controller DomContr01, and DC2 is running another Domain Controller, e.g. DomContr02.

Server01 is deployed on an overlay segment. Server01 wants to communicate with DomContr02. In this case, will the traffic move out of the NSX domain, go to the preferred site, and come back to DomContr02?

Since local egress is not allowed, all traffic will hairpin to the preferred site.

I'm concerned about the local services (not deployed on overlay or VLAN-backed segments) deployed in DC2. Will this communication always hairpin to the preferred site?

Recording Link not working!

@ddesmidt 

I have VMC on cloud as my third DC; latency between the sites is as follows:

DC1 <--> DC2: 1-3ms

DC1 <--> DC3: 2-4ms

DC2 <--> DC3: 3-6ms

I'm deploying an NSX-T environment for the clusters in DC1 and DC2 (single VC, with a stretched vSAN cluster and a couple of non-stretched clusters).

Can I consider deploying my NSX-T Managers one per location, given my latency is less than 10ms between each site? I'd be hosting the NSX-T Manager quorum for the on-prem infrastructure on the VMC on cloud DC.
