NSX Multisite 101 ToI

The ToI ppt documents highlight NSX Multisite capabilities, including:

  . Latest enhancements

  . What is NSX Multisite

  . NSX Multisite Capabilities

  . Embedded Recorded Demos

I also attached a full recorded demo of the use case "Multisite + vSphere-HA".

 

For more in-depth information, we also offer the "NSX Federation Multi-Location Design Guide (Federation + Multisite)" here.

Also, FYI: VMworld 2019 had a public session presenting NSX Multisite, "NSX-T Design for Multi-Site [CNET1334BU]" (Recording + Deck).

 

Note1:

This ToI may be updated in the future, so always check that you have the latest version.

    . The NSX 4.0-4.1 Multisite 101 ToI is at version 1.0, published on 03/02/2023.

    . The NSX-T 3.2 Multisite 101 ToI is at version 1.1, published on 01/10/2023.

 

Note2:

The NSX Multisite solution is a perfect fit for customers who want a "Smaller NSX Management Footprint" (only 3x NSX Manager VMs for all their locations) and who accept a "DR Recovery Procedure with a few more requirements or steps".

For other use cases, NSX-T 3.0 introduced a second Multi-Location solution: NSX Federation.

The NSX Federation solution is based on a new component: the "NSX Global Manager Cluster" (GM).

The GM offers central, global configuration of multiple (local) NSX Manager clusters, each offering network and security services for a location.

The NSX Federation solution addresses "Specific Site Management / GDPR / Policy Requirements" and offers "Simplified DR".

Comments

Thank you very much for this presentation.

For "Current NSX-T Multisite Lite Requirements and Limitations" it is mentioned: "No Local-egress capabilities (100% North/South traffic goes via 1 site)"

  • Is there any plan to support this in future releases?

There is no committed date/release for that feature yet.

Could you contact ddesmidt@vmware.com about your "Local Egress" use case, so we can understand how you are looking to use it?

Thanks,

Dimitri

thanks guys!

Is this going to become part of the product as a deployment option, or will it remain a design/manual deployment?

It's a design you choose.

You can do it manually (click-click-click) or orchestrate it via your orchestration tool (API).
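For example, the two data plane recovery actions from the DR procedure (attach the Tier-1 to the DR Tier-0, then move the Tier-1 to the DR site Edge Cluster) could be scripted against the NSX Policy API roughly like this. This is only a sketch: the Manager FQDN, the credentials, the "T1-App"/"T0-DR" names, the locale-services ID and the Edge Cluster UUID are placeholders to replace with your own values.

import requests

NSX_MGR = "https://nsx-mgr.example.com"   # placeholder NSX Manager VIP/FQDN
session = requests.Session()
session.auth = ("admin", "<password>")    # placeholder credentials
session.verify = False                    # lab only; use a proper CA bundle in production

# 1. Attach the Tier-1 to the DR Tier-0
session.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-1s/T1-App",
    json={"tier0_path": "/infra/tier-0s/T0-DR"},
)

# 2. Move the Tier-1 to the DR site Edge Cluster (locale-services ID is often "default")
session.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-1s/T1-App/locale-services/default",
    json={
        "edge_cluster_path": "/infra/sites/default/enforcement-points/default/"
                             "edge-clusters/<dr-edge-cluster-uuid>"
    },
)

The same two changes can of course be clicked in the UI; the API calls just make them repeatable in a DR runbook.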

I hear your voice saying "click click click" 🙂

Hi Dimitri,

I have seen the published presentation about the NSX-T 2.5 Multisite and DR recovery procedure (https://communities.vmware.com/docs/DOC-39405). It is a great piece of work and it helps us a lot. Thanks a lot for sharing this document.

We have a question and would appreciate your support and assistance, if possible.

In the data plane recovery steps when the production site is down, we noticed only these two actions to be performed:

  1. Attach the Tier-1 to the DR Tier-0.
  2. Change the Tier-1 Edge cluster to the DR site Edge cluster.

Because the transit subnet between the Tier-1 and Tier-0 is assigned automatically (from subnet 100.64.0.0/10), when moving the Tier-1 gateway from the production site to the DR site this transit subnet will change to a new one, and we will have to change the static routing on both the Tier-1 and Tier-0 gateways. Is that correct?

Thanks a lot

Ahmed

Hi Dimitri,

We are seeking to deploy NSX-T with multi-site failover where stretched L2 is not an option for the NSX-T Manager cluster.

What does the failover process look like without stretched L2? We understand that DNS entries are required and must be changed to new IPs as the nodes move to the new site. We are wondering if this is simply a case of restarting the NSX-T Manager node at the other site and changing its IP and DNS entry, or if it is more complex? Documentation is hard to find on this one.

thanks

You're correct. Those "internal T0/T1 subnets" will be different after the move of T1 to a T0-DR.

However, T0/T1 routing is something "internal".

There is nothing to do on the NSX-T Managers or on the physical fabric.

You can watch the embedded demo videos of the recovery to validate that by yourself.
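If you want to double-check it through the API after the move, reading the Tier-1 back is enough (a sketch with the same kind of placeholder names as the earlier example; the T0/T1 router-link subnet is re-allocated automatically from 100.64.0.0/10 and is not something you configure):

import requests

NSX_MGR = "https://nsx-mgr.example.com"    # placeholder NSX Manager VIP/FQDN

t1 = requests.get(
    f"{NSX_MGR}/policy/api/v1/infra/tier-1s/T1-App",
    auth=("admin", "<password>"),          # placeholder credentials
    verify=False,                          # lab only
).json()

print(t1["tier0_path"])                     # expected: /infra/tier-0s/T0-DR after the move
print(t1.get("route_advertisement_types"))  # e.g. ["TIER1_CONNECTED", "TIER1_NAT"]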

Dimitri

The exact steps of NSX-T Manager recovery are well detailed in the deck.

And you even have embedded videos for each step (ppt section "Demo Script/Manual").

I'll let you look at those in detail.

Thanks,

Dimitri

Very useful guide! Is it really correct that the whole traffic has to go through one site?

I mean: the blue data plane goes out one site, and if you want, with AS prepend, the green data plane goes out the other...

Then the statement "100% of north/south traffic goes via one site" is wrong in my opinion. It's more like:

all blue traffic has to use the same north/south site, and all green traffic has to use the same north/south site (if you want the other one, with AS prepend).

Hint: in this scenario, of course, you have to use /24 BGP external IP address blocks for the blue and green sites.

I don't see the limitation here. It's limited to the green and blue data planes, but not to the whole traffic, right?

How do you do the configuration of the additional overlay uplink for the blue and green sites?

It's a Geneve segment where you route between the sites, right?

Thanks for this document, it's very helpful.

I do have a question about an active/active multi-site cross-vCenter topology. Site A and Site B are going to host their own site-specific workloads, and then SRM will protect some workloads from Site A to Site B, and some workloads from Site B to Site A.

I don't want to have to re-IP the workloads during failover.

Is this possible?

Correct. The "interlink" between T0-Blue and T0-Green is an overlay segment connected to those T0s as an uplink (so BGP can be configured).
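For reference, a rough sketch of what that interlink could look like through the NSX Policy API (the segment, transport zone, Edge and IP values are placeholders; the mirror configuration is done on T0-Green with its own interface IP and BGP neighbor):

import requests

NSX_MGR = "https://nsx-mgr.example.com"   # placeholder NSX Manager VIP/FQDN
s = requests.Session()
s.auth = ("admin", "<password>")          # placeholder credentials
s.verify = False                          # lab only

EP = "/infra/sites/default/enforcement-points/default"

# Overlay segment used as the "interlink" between T0-Blue and T0-Green
s.patch(
    f"{NSX_MGR}/policy/api/v1/infra/segments/interlink-blue-green",
    json={"transport_zone_path": f"{EP}/transport-zones/<overlay-tz-uuid>"},
)

# External (uplink) interface of T0-Blue on that segment; T0-Green gets a peer IP, e.g. .2
s.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-Blue/locale-services/default/interfaces/interlink",
    json={
        "type": "EXTERNAL",
        "segment_path": "/infra/segments/interlink-blue-green",
        "subnets": [{"ip_addresses": ["192.168.250.1"], "prefix_len": 24}],
        "edge_path": f"{EP}/edge-clusters/<blue-edge-cluster-uuid>/edge-nodes/<edge-node-uuid>",
    },
)

# BGP neighbor from T0-Blue towards T0-Green over the interlink
s.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-Blue/locale-services/default/bgp/neighbors/to-green",
    json={"neighbor_address": "192.168.250.2", "remote_as_num": "65002"},
)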

Dimitri

Yes that's possible. That's what the deck talks about in the "Active/Active Use Case" 🙂

Dimitri

Hi Dimitri, 


Excellent deck. Two quick questions:

1. Is protecting the NSX Managers with vSphere Replication and SRM a valid approach in the case where a stretched management vSphere cluster is not an option (we do have an NSX-V universal LS to place the managers on)? Are there any drawbacks to using vSphere Replication?

2. For the T0, is there local site recoverability from Blue to Green?

Thanks

DB

Hey DB,

  1. What sort of RTT do you have between the sites?
  2. By site recovery, do you mean workloads and segments failing over from blue to green? If so, the answer is no. The Edge clusters attached to their respective T0s maintain their own segments and workloads; when the active Edge fails, the data plane comes up at the standby site. As Dimitri mentioned earlier, there is a segment between the two for BGP and route exchange.

 

Cheers

Slide 26 is a little misleading; it makes it look like you have 4x Edge Nodes, when in fact you can only use 2x Edge Nodes for automated active-standby T0 failover, I believe?

 

The topic I'm describing below is to do with local egress.

What I'm depicting is a pretty common datacenter scenario. Maybe I have not fully understood it.

The VCF design document states that "local egress in VCF is not recommended".

I want to understand how others are handling the scenarios below.

DC1 is running Domain Controller DomContr01, and DC2 is running another Domain Controller, e.g. DomContr02.

Server01 is deployed on an overlay segment. Server01 wants to communicate with DomContr02. In this case, will traffic move out of the NSX domain, go to the preferred site, and come back to DomContr02?

Since local egress is not allowed, all traffic will hairpin to the preferred site.

I'm concerned about the local services (not deployed on overlay or VLAN-backed segments) deployed in DC2. Will this communication always hairpin to the preferred site?

Recording Link not working!

@ddesmidt 

I have a VMC on cloud as my third DC; the latency between the sites is as follows:
DC1 <--> DC2: 1-3 ms

DC1 <--> DC3: 2-4 ms

DC2 <--> DC3: 3-6 ms

I'm deploying an NSX-T environment for the clusters in DC1 and DC2 (single VC, with a stretched vSAN cluster and a couple of non-stretched ones).

Can I consider deploying my NSX-T Managers one per location, given that my latency is less than 10 ms between each site? I'd be hosting the NSX-T Manager quorum for the on-prem infrastructure on the VMC on cloud.

What do 101 and ToI mean, respectively?

101, 201, 301 indicate the level of content detail, for example: 101 - Core/Basic, 201 - Advanced, 301 - Expert, etc.

 

ToI - Transfer of Information

 

Hope it helps.

@chandrakm Thank you for your answer!

In the scenario of dual sites in a metropolitan region, two Edge Nodes are deployed at each site, and the four Edge Nodes belong to the same Edge Cluster.
Then the T0 uses an A/A architecture, so are the T0-SRs of the 4 Edge Nodes all in the Active state? After ECMP is enabled, which Edge Nodes participate in north-south data forwarding, and which Edge Nodes are configured with the T0 uplink?

 

 

Next, if the VMs of a segment are located at two sites, is there a way for NSX-T to make the north-south traffic of VMs in the same network segment but at different sites go out via the corresponding site's T0-SR Edge Node?
For example, VM 192.168.1.1 is located at Site A and VM 192.168.1.2 is located at Site B, so that the north-south traffic of .1 goes via Site A's T0 Edge and the north-south traffic of .2 goes via Site B's T0 Edge.

Does this require a T1 (with SR) architecture?

@SilenCN 

Question1:

You configure interfaces on your T0, and for each interface you explicitly specify on which Edge Node that T0 interface sits.

Then, based on that configuration, all the Edge Nodes with an interface will have that T0.

And if you have configured the T0 in A/A, then all those Edge Nodes will participate in N/S via ECMP.
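A rough Policy API sketch of that (all names, UUIDs and IPs are placeholders; add one interface block per Edge Node/uplink you want in the ECMP set):

import requests

NSX_MGR = "https://nsx-mgr.example.com"   # placeholder NSX Manager VIP/FQDN
s = requests.Session()
s.auth = ("admin", "<password>")          # placeholder credentials
s.verify = False                          # lab only

EP = "/infra/sites/default/enforcement-points/default"

# T0 in Active/Active: every Edge Node that carries an uplink interface takes part in ECMP
s.patch(
    f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-AA",
    json={"ha_mode": "ACTIVE_ACTIVE"},
)

# One uplink interface per Edge Node; "edge_path" pins the interface to that node
uplinks = [
    ("uplink-en1", "<edge-node-1-uuid>", "10.1.1.1"),
    ("uplink-en2", "<edge-node-2-uuid>", "10.1.1.2"),
]
for if_id, node_uuid, ip in uplinks:
    s.patch(
        f"{NSX_MGR}/policy/api/v1/infra/tier-0s/T0-AA/locale-services/default/interfaces/{if_id}",
        json={
            "type": "EXTERNAL",
            "segment_path": "/infra/segments/uplink-vlan-segment",
            "subnets": [{"ip_addresses": [ip], "prefix_len": 24}],
            "edge_path": f"{EP}/edge-clusters/<edge-cluster-uuid>/edge-nodes/{node_uuid}",
        },
    )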

Question2:

No, there is no ability to do "Local-Egress" with the NSX Multisite design, as you can see in the "NSX Multi-Location Design Guide", chapter 3.3.2.3 (Google it to find that document).
