VMware Networking Community
rajeevsrikant
Expert

NSX Edge ECMP -> HA

Currently I have 2 NSX Edge Gateways running Active-Active (ECMP with OSPF).

I want to change this setup to HA. Can this be achieved by a configuration change, or do the NSX Edges have to be redeployed?

Let me know which option is best.

rajeevsrikant
Expert

To add to the above: my understanding is that I can delete one of the Edge gateways.

That leaves only 1 gateway. Is it then technically possible to configure it for HA, or does it have to be redeployed?

bayupw
Leadership

I haven't tried this in detail, but you could possibly follow the steps below:

- delete the 2nd NSX Edge gateway

- disable ECMP on the 1st NSX Edge gateway

- disable ECMP on the physical router and DLR

- enable HA on the 1st NSX Edge gateway


I have tried this in the VMware Hands-on Labs: I can disable ECMP on an ECMP-enabled Edge and enable HA on that Edge.

Bayu Wibowo | VCIX6-DCV/NV
Author of VMware NSX Cookbook http://bit.ly/NSXCookbook
https://github.com/bayupw/PowerNSX-Scripts
https://nz.linkedin.com/in/bayupw | twitter @bayupw
rajeevsrikant
Expert

Thanks.

Where is the option to enable HA on the NSX Edge?

bayupw
Leadership

Under Edge > Manage > Settings > Configuration > HA Configuration > Change

See the screenshots below:

edge-ha.PNG

edge-ha2.PNG
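
The same setting can in principle be scripted instead of clicked through. Below is a minimal sketch, assuming the NSX for vSphere REST API exposes Edge HA at /api/4.0/edges/{edgeId}/highavailability/config and accepts a highAvailability XML body. The endpoint path, the element names, the 15-second dead time, and the hostname and Edge ID in the usage note are assumptions to check against the API guide for your NSX version. The code only builds the URL and the payload; it does not contact NSX Manager.

```python
import xml.etree.ElementTree as ET


def ha_config_url(manager, edge_id):
    """Build the (assumed) HA configuration URL for one Edge."""
    return f"https://{manager}/api/4.0/edges/{edge_id}/highavailability/config"


def build_ha_payload(enabled=True, declare_dead_time=15):
    """Build the (assumed) XML body that turns Edge HA on or off.

    declare_dead_time is the number of seconds after which the standby
    declares the active Edge dead; 15 is an assumed default.
    """
    root = ET.Element("highAvailability")
    ET.SubElement(root, "enabled").text = "true" if enabled else "false"
    ET.SubElement(root, "declareDeadTime").text = str(declare_dead_time)
    return ET.tostring(root, encoding="unicode")
```

For example, ha_config_url("nsxmgr.example.local", "edge-1") and build_ha_payload() give the URL and body that would be sent with an HTTP PUT; both names are placeholders.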

rajeevsrikant
Expert

Thanks, got it.

Regarding the management IP that needs to be configured for HA: does it need to be on a closed network (not reachable from any other network)?

What is the usual recommendation for the network used for the management IP (which logical switch or port group should it belong to)?

bayupw
Leadership

You don't need to specify or allocate an IP; you can just create a new logical switch for Edge HA.

The recommendation in the VMware® NSX for vSphere Network Virtualization Design Guide ver 3.0 is to use VXLAN for HA.

HA-interface.PNG

rajeevsrikant
Expert

Thanks

DaleCoghlan
VMware Employee

You should also make sure that Graceful Restart is enabled when running an ESG in HA mode. When running ECMP, Graceful Restart should be disabled.

Also make sure you adjust your dynamic routing protocol timers accordingly once you move from ECMP back to an Edge HA type deployment. This goes for both the ESG and the DLR.

And if you did ECMP correctly you will also have some floating static routes configured on the ECMP Edges which you should be able to remove once you fix up all the OSPF/BGP timers.

Dale
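
The Graceful Restart rule above can be written down as a small pre-change check. This is only a sketch of the rule as stated in this post (enabled for HA, disabled for ECMP); the function name and the "ha"/"ecmp" labels are hypothetical and do not come from any NSX tool.

```python
def check_graceful_restart(mode, graceful_restart_enabled):
    """Return None if the setting matches the advice, else a warning string.

    mode is "ha" or "ecmp" (hypothetical labels for the Edge deployment type).
    """
    if mode not in ("ha", "ecmp"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Advice from the post: Graceful Restart on for HA, off for ECMP.
    expected = mode == "ha"
    if graceful_restart_enabled == expected:
        return None
    wanted = "enabled" if expected else "disabled"
    return f"Graceful Restart should be {wanted} in {mode.upper()} mode"
```

Called as check_graceful_restart("ecmp", True), it returns a warning, which is the situation described later in this thread.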

rajeevsrikant
Expert

Thanks.

You should also make sure that Graceful Restart is enabled when running an ESG in HA mode. When running ECMP, Graceful Restart should be disabled.

[Reply] - In my current setup I am using ECMP and Graceful Restart is enabled. Could you explain why Graceful Restart should be disabled when using ECMP?

Also make sure you adjust your dynamic routing protocol timers accordingly once you move from ECMP back to an Edge HA type deployment. This goes for both the ESG and the DLR.

[Reply] - Does this mean that the OSPF hello and dead intervals should match between Physical Router <-> Edge and Edge <-> DLR?

And if you did ECMP correctly you will also have some floating static routes configured on the ECMP Edges which you should be able to remove once you fix up all the OSPF/BGP timers.

[Reply] - Sorry, I didn't get this point. Could you please give more insight into this?

rajeevsrikant
Expert

bayuwibowo

I have a query regarding your steps below:

- delete the 2nd NSX Edge gateway

- disable ECMP on the 1st NSX Edge gateway

[Query] - My Edge gateways have OSPF neighbor relationships with 2 physical L3 switches.

There are 2 OSPF paths between the Edge gateways and the physical L3 switches.

So from my understanding I need to keep ECMP enabled on the active NSX Edge Gateway. Please clarify.

- disable ECMP on the physical router and DLR

- enable HA on the 1st NSX Edge gateway


Further to the above, since I am planning to do this in a production environment, I want to take a backup of the NSX Edge gateways before making any changes.

I would like to know how to do this. If anything goes wrong, I need a backup or some other way to revert to the original settings.

bayupw
Leadership

Hi, if you are going to change from ECMP to HA, then you will disable ECMP after changing the Edge to HA.

Those steps are high level; the additional details are mentioned by DaleCoghlan above.

For example, with ECMP you may have tuned the routing timers, e.g. OSPF hello/dead timers of 1/3 seconds; for HA the recommendation is 30/120 seconds.

The same goes for the summarized floating static routes that are normally used to handle DLR Control VM failure.

The static routes are no longer required in HA, as the dynamic routing protocol timers are long enough.
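
To make the timer change concrete, here is a small sketch that compares current OSPF timers with the values quoted in this post (1/3 seconds for ECMP, 30/120 seconds for HA). The table and the function name are illustrative only; they are not part of any NSX tooling, and the numbers should be confirmed against the design guide.

```python
# (hello, dead) in seconds, as quoted in this thread.
RECOMMENDED_OSPF_TIMERS = {
    "ecmp": (1, 3),
    "ha": (30, 120),
}


def timer_changes(mode, hello, dead):
    """Return {"hello": (old, new), "dead": (old, new)} for timers that differ.

    An empty dict means the current timers already match the recommendation.
    """
    want_hello, want_dead = RECOMMENDED_OSPF_TIMERS[mode]
    changes = {}
    if hello != want_hello:
        changes["hello"] = (hello, want_hello)
    if dead != want_dead:
        changes["dead"] = (dead, want_dead)
    return changes
```

An Edge still running the common 10/40 defaults would get both timers flagged when checked with timer_changes("ha", 10, 40).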

In terms of NSX Edge backup, you can't back up an individual NSX Edge using snapshots or backup software.

VMware NSX for vSphere 6.2 Documentation Center - Back Up NSX Edges

"Taking individual NSX Edge backups is not supported."

The NSX Edge configuration is part of NSX Manager; if you restore manually from a snapshot or backup software, the config will be out of sync with NSX Manager.

Redeploying the Edge through the vSphere Web Client will restore your NSX Edge to the latest config.

One possible way to back up the existing configuration is through the REST API: get the Edge configuration and save the XML.

To restore, edit the XML and redeploy through REST API.

Here's a blog on how to do it: NSX Edge Backup and Restore – VMTECHIE
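
As a rough illustration of the REST approach, the sketch below fetches one Edge's configuration and writes the XML to a file. It assumes the NSX for vSphere endpoint GET /api/4.0/edges/{edgeId} with HTTP basic authentication; the hostname, Edge ID, credentials, and file name are placeholders, and the endpoint should be verified against the API guide for your NSX version.

```python
import base64
import urllib.request


def build_edge_get_request(manager, edge_id, user, password):
    """Build (but do not send) the GET request for one Edge's configuration."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = f"https://{manager}/api/4.0/edges/{edge_id}"  # assumed endpoint
    headers = {"Authorization": f"Basic {token}", "Accept": "application/xml"}
    return urllib.request.Request(url, headers=headers, method="GET")


def backup_edge(manager, edge_id, user, password, path):
    """Fetch the Edge configuration XML and save it to path."""
    request = build_edge_get_request(manager, edge_id, user, password)
    with urllib.request.urlopen(request) as response:
        xml_bytes = response.read()
    with open(path, "wb") as handle:
        handle.write(xml_bytes)
    return path
```

A call such as backup_edge("nsxmgr.example.local", "edge-1", "admin", "secret", "edge-1.xml") would be run before the change. Note that urlopen validates the TLS certificate, so an NSX Manager with a self-signed certificate needs its CA trusted first.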

rajeevsrikant
Expert

Thanks.

Below are the current OSPF timers on my Edge and DLR.

OSPF Hello Interval – 10 seconds

OSPF Dead Interval – 40 seconds

So as per the recommendation it has to be changed as below.

OSPF Hello Interval – 30 seconds

OSPF Dead Interval – 120 seconds

I will make this change and ensure that the physical network devices use the same values.
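
Since OSPF neighbors only form an adjacency when their hello and dead intervals agree, it can help to list every device on a segment that still has the old values before and after the change. A minimal sketch, with made-up device names:

```python
def mismatched_devices(timers, expected):
    """Return the sorted names of devices whose (hello, dead) differ from expected.

    timers maps a device name to its (hello, dead) tuple in seconds.
    """
    return sorted(name for name, value in timers.items() if value != expected)
```

With timers = {"edge": (30, 120), "dlr": (30, 120), "l3-switch-1": (10, 40)}, calling mismatched_devices(timers, (30, 120)) points at the switch that still needs updating.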

Regarding the other question I asked: my NSX Edge Gateway has 2 uplinks with OSPF adjacencies to 2 physical L3 switches.

So there will be 2 paths from the NSX Edge Gateway to the physical network. Given this, does it count as ECMP or not?

bayupw
Leadership

FYI, the timers are from the design guide.

ha timers.PNG ecmp timers.PNG

Regarding your OSPF, it depends on your setup.

If you have multiple OSPF links on different networks and you would like to load balance across them, then use ECMP.

But when you want active/standby, or it's a single network connected to two physical routers, you do not need ECMP.

edge activestandby ha.PNG
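
The difference can be pictured with a simplified model of route installation: with ECMP, every equal-cost next hop is used for forwarding; without it, only one of them is installed and the other neighbor is only a fallback. This is a generic illustration, not NSX's actual selection logic; picking the first candidate as the tie-break is an assumption.

```python
def installed_next_hops(candidates, ecmp_enabled):
    """Return the next hops that would be installed for one prefix.

    candidates is a list of (next_hop, cost) tuples learned from neighbors.
    """
    if not candidates:
        return []
    best_cost = min(cost for _, cost in candidates)
    equal_cost = [hop for hop, cost in candidates if cost == best_cost]
    # Without ECMP only a single best path is kept (first one, by assumption).
    return equal_cost if ecmp_enabled else equal_cost[:1]
```

With two physical routers advertising the same cost, ECMP installs both next hops, while a non-ECMP Edge keeps one and still has the second neighbor available if the first fails.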

rajeevsrikant
Expert

Thanks

My setup is similar to the one in the design guide diagram you posted.

The active Edge will have OSPF neighbor relationships with 2 physical routers. So from the ESG's perspective, it has 2 equal paths to any network via the 2 physical routers.

In this case, is it required to enable ECMP on the Edge Gateway?

rajeevsrikant
Expert

bayupw

- enable HA on the 1st NSX edge gateway

Does enabling HA on the NSX Edge gateway involve any downtime?

bayupw
Leadership

When you have one NSX Edge, the traffic will pass through that Edge.

Once you enable HA on that Edge, a new Edge will be deployed and act as the standby, so there should be no downtime involved.
