VMware Networking Community
PhoenixVM
Contributor

Active/Active T0: What's the point? When is it appropriate?

Hey there,

I have been debating between an Active/Active (w/ECMP) and Active/Standby T0 design in my environment.  

After considering my options, I'm left wondering why one would use an Active/Active T0 topology at all.  An Active/Active topology means that you can't run stateful services at the T0, which leaves you with two undesirable workarounds:

1. Don't run stateful services at all (what use is this system without them?)

2. Deploy a T1 to one of your edge clusters and run services there.  This would cause all N-S traffic destined to those services to funnel through the *single* active T1 SR, which defeats the purpose of load-balancing with an Active/Active T0 + ECMP to begin with.  This could also create an inefficient traffic path, depending on which T0 SR the traffic enters the environment through.

The only scenario I can conceive of where this would be acceptable or desirable is if you are a service provider creating multiple T1s for different tenants (in which case, it's common sense that a tenant's traffic would only enter/leave via their dedicated T1).

Outside of that scenario, is there any reason to run T0 Active/Active at all? 
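
For context on why stateful services and an A/A T0 don't mix, here is a toy model of ECMP next-hop selection. It is only a sketch and not NSX code: the edge names, the `ecmp_pick` and `asymmetric_flows` helpers, and the hash inputs are all made up for illustration. The assumption is that the outbound direction is hashed by one device and the return direction by another (the upstream router), so the two directions of one connection can land on different edges, and neither edge then sees the full conversation it would need to track state.

```python
import hashlib

# Hypothetical names for the two edge nodes hosting the A/A Tier-0 SRs.
EDGES = ["edge-1", "edge-2"]

def ecmp_pick(src, dst, sport, dport, salt):
    """Pick an edge from a hash of the flow tuple (a stand-in for a real ECMP hash).

    `salt` models the fact that different devices hash independently.
    """
    key = f"{salt}|{src}|{dst}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return EDGES[digest[0] % len(EDGES)]

def asymmetric_flows(n):
    """Count flows whose outbound and return packets use different edges."""
    count = 0
    for i in range(n):
        vm, client, port = "10.0.0.10", "198.51.100.7", 40000 + i
        # Outbound packet: next hop chosen inside the NSX fabric (assumed).
        out_edge = ecmp_pick(vm, client, 443, port, salt="t0-dr")
        # Return packet: next hop chosen by the physical upstream router (assumed).
        back_edge = ecmp_pick(client, vm, port, 443, salt="upstream-router")
        if out_edge != back_edge:
            count += 1
    return count

if __name__ == "__main__":
    total = 1000
    print(f"{asymmetric_flows(total)} of {total} flows took asymmetric paths")
```

With two edges and independent hashing, roughly half of the flows in this model end up asymmetric, which is fine for plain routing but not for a service that has to see both directions.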


Accepted Solutions
shank89
Expert

Hi PhoenixVM,

It's definitely something to consider, but it is sometimes overthought.

To be honest, the only time I have seen customers go with A/S is for a specific reason, e.g. their upstream devices only support A/S.

A few pointers:

  • A/A is VVD compliant: https://docs.vmware.com/en/VMware-Validated-Design/6.2/sddc-architecture-and-design-for-a-virtual-in...
  • Aside from ECMP, A/A also allows higher throughput
  • Rather than enabling a stateful service on the Tier-0, which would pin all tenancies / T1's to a specific Edge for ingress and egress, the better option is to configure it on a Tier-1, ensuring that only traffic that needs to route through a specific Edge does
  • Depending on your edge cluster design, from the many deployments of NSX-T, the impact of traffic ingressing the Edge not active for the stateful service is non-existent.  That is, the impact of traffic ingressing this edge is unnoticeable.  It is common for this to be overthought and over-engineered.  Again, this comment depends on appropriate edge cluster design (A/A SRs local and not spanning sites).
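
To put a rough number on the last point, here is a small sketch that estimates what share of ingress flows arrive on an Edge that is not hosting the active T1 SR and therefore take one extra hop between Edges. It is a toy model and not NSX code: the edge names, the `ingress_edge` and `extra_hop_share` helpers, and the idea that the upstream router spreads flows evenly by hash are assumptions for illustration.

```python
import hashlib

def ingress_edge(flow_id, edges):
    """Pick the ingress edge from a hash of the flow ID (stand-in for upstream ECMP)."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return edges[int.from_bytes(digest[:4], "big") % len(edges)]

def extra_hop_share(edges, active_t1_edge, flows=10000):
    """Fraction of flows that ingress on an edge other than the active T1 SR's edge."""
    off_path = sum(
        1 for i in range(flows)
        if ingress_edge(f"flow-{i}", edges) != active_t1_edge
    )
    return off_path / flows

if __name__ == "__main__":
    two = ["edge-1", "edge-2"]                        # hypothetical 2-node cluster
    four = ["edge-1", "edge-2", "edge-3", "edge-4"]   # hypothetical 4-node cluster
    print(f"2 edges: {extra_hop_share(two, 'edge-1'):.0%} of flows take the extra hop")
    print(f"4 edges: {extra_hop_share(four, 'edge-1'):.0%} of flows take the extra hop")
```

The model only counts how often the extra hop happens; it says nothing about its cost. That cost is what depends on the edge cluster design: when the Edges sit in the same site, the additional hop is a local one, which is why it tends to go unnoticed.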

At the end of the day, the deployment model is up to the customer and their specific requirements.  But if you are talking about a single site, with SRs not spanning WAN links, then it shouldn't really be a concern.

Cheers

Shashank Mohan

VCIX-NV 2022 | VCP-DCV2019 | CCNP Specialist

https://lab2prod.com.au
LinkedIn https://www.linkedin.com/in/shankmohan/
Twitter @ShankMohan
Author of NSX-T Logical Routing: https://link.springer.com/book/10.1007/978-1-4842-7458-3


3 Replies
PhoenixVM
Contributor

Hey Shashank,

"Depending on your edge cluster design, from the many deployments of NSX-T, the impact of traffic ingressing the Edge not active for the stateful service is non-existent."

This insight is very helpful.  I've come across a few blogs now where admins discourage deploying a T1 SR in order to avoid the flow above.  That gives me something to think about if the universe will not collapse on itself. 🙂

"Rather than enabling a stateful service on the Tier-0 which would pin all tenancies / T1's to a specific Edge for ingress and egress. "

This wasn't top of mind for me, given that we're not a traditional service provider.  That being said, I'm going with a two-tier architecture to give me options for multiple T1s down the road, so this is a consideration worth bearing in mind.

Thanks for sharing your experience and your logic.  It's great to get to the 'why's behind the design decisions.

 

shank89
Expert

Not a problem, glad I could help.

If you find the answer adequate, please mark the post as resolved 🙂
