Single NSX-T Edge Node N-VDS with correct VLAN pinning

Dear readers

Welcome to a new blog post talking about a specific NSX-T Edge Node VM deployment with only a single Edge Node N-VDS. You may have seen the 2019 VMworld session "Next-Generation Reference Design with NSX-T: Part 1" (CNET2061BU or CNET2061BE) from Nimish Desai. On one of his slides he mentions how we could deploy a single NSX-T Edge Node N-VDS instead of the three Edge Node N-VDS. This new approach (available since NSX-T 2.5 for Edge Node VM) with a single Edge Node N-VDS has the following advantages:

  • Multiple TEPs to load balance overlay traffic for different overlay segments
  • Same NSX-T Edge Node N-VDS design for VM-based and Bare Metal (with 2 pNIC)
  • Only two Transport Zones (Overlay & VLAN) assigned to a single N-VDS

The diagram below shows the slide with a single Edge Node N-VDS from one of the VMware sessions (CNET2061BU):

Edge Support with Multi-TEP-Nimish-Desai-VM.png

However, the single NSX-T Edge Node N-VDS design comes with one additional requirement and one recommendation:

  • vDS port group Trunks configuration to carry multiple VLANs (requirement)
  • VLAN pinning for deterministic North/South flows (recommendation)

This blog post is mainly about the second bullet point and how we can achieve correct VLAN pinning. Correct VLAN pinning requires multiple individual configuration steps at different levels, for example the vDS trunk port group teaming or the N-VDS Named Teaming Policy configuration. The goal behind this VLAN pinning is a deterministic end-to-end path.

When configured correctly, the BGP session is forced to follow the data forwarding path, and hence the MAC addresses of the Tier-0 Gateway Layer 3 Interfaces (LIF) are only learnt on the expected ToR/Leaf switch trunk interfaces.

In this blog post the NSX-T Edge Node VMs are deployed on ESXi hosts which are NOT prepared for NSX-T. The two ESXi hosts belong to a single vSphere Cluster exclusively used for NSX-T Edge Node VMs. There are a few good reasons NOT to prepare these ESXi hosts with NSX-T where you host only NSX-T Edge Node VMs:

  • It is not required
  • Better NSX-T upgrade-ability (you don't need to evacuate the NSX-T Edge Node VM with vMotion so that the host can enter maintenance mode during the host NSX-T software upgrade; every vMotion of the NSX-T Edge Node VM causes a short, unnecessary data plane glitch)
  • Shorter NSX-T upgrade cycles (for every NSX-T upgrade you only need to upgrade the ESXi hosts which are used for the payload VMs and the NSX-T Edge Node VMs themselves, but not the ESXi hosts where you have your Edge Nodes deployed)
  • vSphere HA can be turned off (do we want to move a highly loaded packet forwarding node with vMotion in a host vSphere HA event? No I don't think so - as the routing HA model is much quicker)
  • Simplified DRS settings (do we want to move an NSX-T Edge Node with vMotion to balance the resources?)
  • Typically a resource pool is not required

We should never underestimate how important smooth upgrade cycles are. Upgrade cycles are time consuming events and are typically required multiple times per year.

Not preparing the ESXi hosts for NSX-T is considered best practice and should be the choice in any NSX-T deployment that can afford a dedicated vSphere Cluster only for NSX-T Edge Node VMs. Installing NSX-T on the ESXi hosts where the NSX-T Edge Node VMs are deployed (called a collapsed design) is appropriate for customers who have a low number of ESXi hosts and want to keep CAPEX low.

The diagram below shows the lab test bed of a single ESXi host with a single Edge Node appliance which uses only a single N-VDS. The relevant configuration steps are marked with 1 to 4.

Networking – NSX-T Edge Topology-NEW.png

The NSX-T Edge Node VM is configured with two transport zones. The same overlay transport zone is used for the compute ESXi hosts where I host the payload VMs. Both transport zones are assigned to a single N-VDS, called NY-HOST-NVDS. The name might be slightly confusing, but the same NY-HOST-NVDS is used for all compute ESXi hosts prepared with NSX-T, which indicates that only a single N-VDS is required regardless of whether it is an Edge Node or a compute ESXi host. You are free to choose a different name for the N-VDS.

Screen Shot 2020-04-11 at 11.40.18.png

The single N-VDS (NY-HOST-NVDS) on the Edge Node is configured with an Uplink Profile (please see more details below) with two static TEP IP addresses, which allow us to load balance the Geneve encapsulated overlay traffic for North/South. Both Edge Node FastPath interfaces (fp-eth0 & fp-eth1) are mapped to a labelled Active Uplink name as part of the default teaming policy.

Screen Shot 2020-04-11 at 11.40.26.png

There are 4 areas where we need to take care of the correct settings.

<1> - At the physical ToR/Leaf Switch Level

The trunk ports allow only the required VLANs:

  • VLAN 60 - NSX-T Edge Node management interface
  • VLAN 151 - Geneve TEP (Edge Nodes) VLAN
  • VLAN 160 - Northbound Uplink VLAN for NY-N3K-LEAF-10
  • VLAN 161 - Northbound Uplink VLAN for NY-N3K-LEAF-11

The resulting interface configuration along with the relevant BGP configuration is shown in the table below. Please note that for redundancy reasons both Northbound Uplink VLANs 160 and 161 are allowed on the trunk configuration. Under normal conditions, NY-N3K-LEAF-10 will learn only MAC addresses from VLAN 60, 151 and 160 on its ESXi-facing trunk ports, and NY-N3K-LEAF-11 will learn only MAC addresses from VLAN 60, 151 and 161 on its ESXi-facing trunk ports.

Table 1 - Nexus ToR/LEAF Switch Configuration

NY-N3K-LEAF-10 Interface Configuration

interface Ethernet1/2
  description *NY-ESX50A-VMNIC2*
  switchport mode trunk
  switchport trunk allowed vlan 60,151,160-161
  spanning-tree port type edge trunk

interface Ethernet1/4
  description *NY-ESX51A-VMNIC2*
  switchport mode trunk
  switchport trunk allowed vlan 60,151,160-161
  spanning-tree port type edge trunk

router bgp 64512
  router-id 172.16.3.10
  log-neighbor-changes
  ---snip---
  neighbor 172.16.160.20 remote-as 64513
    update-source Vlan160
    timers 4 12
    address-family ipv4 unicast
  neighbor 172.16.160.21 remote-as 64513
    update-source Vlan160
    timers 4 12
    address-family ipv4 unicast

NY-N3K-LEAF-11 Interface Configuration

interface Ethernet1/2
  description *NY-ESX50A-VMNIC3*
  switchport mode trunk
  switchport trunk allowed vlan 60,151,160-161
  spanning-tree port type edge trunk

interface Ethernet1/4
  description *NY-ESX51A-VMNIC3*
  switchport mode trunk
  switchport trunk allowed vlan 60,151,160-161
  spanning-tree port type edge trunk

router bgp 64512
  router-id 172.16.3.11
  log-neighbor-changes
  ---snip---
  neighbor 172.16.161.20 remote-as 64513
    update-source Vlan161
    timers 4 12
    address-family ipv4 unicast
  neighbor 172.16.161.21 remote-as 64513
    update-source Vlan161
    timers 4 12
    address-family ipv4 unicast

As part of the Cisco Nexus 3048 BGP configuration we see that only NY-N3K-LEAF-10 terminates the BGP session on VLAN 160 and only NY-N3K-LEAF-11 terminates the BGP session on VLAN 161.
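The "one uplink VLAN per leaf" rule can also be checked mechanically against a saved configuration. The following is only a minimal Python sketch: the helper names bgp_neighbors and peers_only_on are made up for this illustration, and the parser only understands the simple neighbor/update-source layout used in Table 1.

```python
import re
from typing import Dict, Optional

# BGP part of the NY-N3K-LEAF-10 configuration from Table 1 (snipped lines omitted).
LEAF_10_BGP = """
router bgp 64512
  router-id 172.16.3.10
  neighbor 172.16.160.20 remote-as 64513
    update-source Vlan160
    timers 4 12
    address-family ipv4 unicast
  neighbor 172.16.160.21 remote-as 64513
    update-source Vlan160
    timers 4 12
    address-family ipv4 unicast
"""


def bgp_neighbors(config: str) -> Dict[str, str]:
    """Map each BGP neighbor IP to the interface named in its update-source line."""
    neighbors: Dict[str, str] = {}
    current: Optional[str] = None
    for raw in config.splitlines():
        line = raw.strip()
        match = re.match(r"neighbor (\S+) remote-as \d+", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"update-source (\S+)", line)
        if match and current is not None:
            neighbors[current] = match.group(1)
    return neighbors


def peers_only_on(config: str, svi: str) -> bool:
    """True if every neighbor with an update-source uses exactly this SVI."""
    return set(bgp_neighbors(config).values()) == {svi}


print(bgp_neighbors(LEAF_10_BGP))
print(peers_only_on(LEAF_10_BGP, "Vlan160"))
```

Running the same check with "Vlan161" against the NY-N3K-LEAF-11 configuration should give the mirrored result.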

<2> - At the vDS Port Group Level

The vDS is configured with four vDS port groups in total:

  • Port Group (Type VLAN): NY-VDS-PG-ESX5x-NSXT-EDGE-MGMT60: carries only VLAN 60 and has an active/standby teaming policy
  • Port Group (Type VLAN): NY-vDS-PG-ESX5x-EDGE2-Dummy999: this dummy port group is used for the remaining unused Edge Node FastPath interface (fp-eth2) so that NSX-T does not report it as admin status down
  • Port Group (Type VLAN trunking): NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA: Carries the Edge Node TEP VLAN 151 and Uplink VLAN 160
  • Port Group (Type VLAN trunking): NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB: Carries the Edge Node TEP VLAN 151 and Uplink VLAN 161

The two trunk port groups have only one vDS-Uplink active; the other vDS-Uplink is set to standby. This is required so that, under normal conditions, the Uplink VLAN traffic along with the BGP session is forwarded only on the specific vDS-Uplink (each vDS-Uplink is mapped to the corresponding pNIC). With these settings we achieve the following:

  • A deterministic failover order
  • Symmetric bandwidth for both overlay and North/South traffic
  • The BGP session between the Tier-0 Gateway and the ToR/Leaf switches should stay UP even if one or both physical links between the ToR/Leaf switches and the ESXi hosts go down (the BGP session is then carried over the trunk link between the ToR/Leaf switches).
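To make the active/standby behaviour of the two trunk port groups more tangible, here is a small Python model. It is a sketch only: the vDS uplink names "Uplink1" and "Uplink2" and the assumption that Uplink1 faces NY-N3K-LEAF-10 are mine, and the class TrunkPortGroup is not a vSphere object.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class TrunkPortGroup:
    name: str
    uplink_vlan: int
    active: str   # vDS uplink used under normal conditions
    standby: str  # vDS uplink used only when the active one is down

    def forwarding_uplink(self, links_up: Set[str]) -> Optional[str]:
        """Return the uplink that carries the traffic, or None if no uplink is up."""
        for uplink in (self.active, self.standby):
            if uplink in links_up:
                return uplink
        return None


# Assumed uplink names; Uplink1 is assumed to be cabled to NY-N3K-LEAF-10.
TRUNK_A = TrunkPortGroup("NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA", 160, "Uplink1", "Uplink2")
TRUNK_B = TrunkPortGroup("NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB", 161, "Uplink2", "Uplink1")

for links_up in ({"Uplink1", "Uplink2"}, {"Uplink2"}, {"Uplink1"}):
    for pg in (TRUNK_A, TRUNK_B):
        print(sorted(links_up), "VLAN", pg.uplink_vlan, "->", pg.forwarding_uplink(links_up))
```

With both links up, each uplink VLAN stays on its own pNIC. When one link fails, both VLANs share the surviving link, and the affected BGP session reaches its leaf switch over the trunk link between the two ToR/Leaf switches.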

The table below highlights the relevant VLAN and Teaming settings:

Table 2 - vDS Port Group Configuration

NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA Configuration
Trunka-vlan-Screen Shot 2020-04-11 at 10.38.25.png
Trunka-teaming-Screen Shot 2020-04-11 at 10.38.06.png

NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB Configuration
Trunkb-vlan-Screen Shot 2020-04-11 at 10.39.49.png
Trunkb-teaming-Screen Shot 2020-04-11 at 10.39.58.png

<3> - At the NSX-T Edge Uplink Profile Level

The NSX-T Uplink Profile is a global construct that defines how traffic leaves a Transport Node or an Edge Transport Node.

The single Uplink Profile used for the two Edge Node FastPath interfaces (fp-eth0 & fp-eth1) needs to be extended with two additional Named Teaming Policies to steer the North/South uplink traffic to the corresponding ToR/Leaf switch.

  • The default teaming policy must be configured as Source-port-ID with the two Active Uplinks (I am using the labels EDGE-UPLINK1 & EDGE-UPLINK2)
  • An additional teaming policy called NY-Named-Teaming-N3K-LEAF-10 is configured with failover teaming policy with a single Active Uplink (label EDGE-UPLINK1)
  • An additional teaming policy called NY-Named-Teaming-N3K-LEAF-11 is configured with failover teaming policy with a single Active Uplink (label EDGE-UPLINK2)

Please note, the Active Uplink labels for the default and the additional Named Teaming Policies need to be the same.
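The label rule is easy to get wrong with a typo, so here is a minimal Python sketch that checks it. The dictionary is a hand-written model of the Uplink Profile described above: its key names are only loosely modelled on how NSX-T represents an uplink profile and should be treated as assumptions, and the function unknown_uplink_labels is a hypothetical helper, not part of any NSX-T tooling.

```python
from typing import Dict, Set

# Hand-written model of the Edge Uplink Profile (key names are assumptions).
UPLINK_PROFILE = {
    "teaming": {
        "policy": "LOADBALANCE_SRCID",
        "active_list": [
            {"uplink_name": "EDGE-UPLINK1"},
            {"uplink_name": "EDGE-UPLINK2"},
        ],
    },
    "named_teamings": [
        {
            "name": "NY-Named-Teaming-N3K-LEAF-10",
            "policy": "FAILOVER_ORDER",
            "active_list": [{"uplink_name": "EDGE-UPLINK1"}],
        },
        {
            "name": "NY-Named-Teaming-N3K-LEAF-11",
            "policy": "FAILOVER_ORDER",
            "active_list": [{"uplink_name": "EDGE-UPLINK2"}],
        },
    ],
}


def unknown_uplink_labels(profile: dict) -> Dict[str, Set[str]]:
    """Return, per Named Teaming Policy, the labels missing from the default teaming."""
    default_labels = {u["uplink_name"] for u in profile["teaming"]["active_list"]}
    problems: Dict[str, Set[str]] = {}
    for named in profile["named_teamings"]:
        used = {u["uplink_name"] for u in named["active_list"]}
        missing = used - default_labels
        if missing:
            problems[named["name"]] = missing
    return problems


print(unknown_uplink_labels(UPLINK_PROFILE))
```

An empty result means every Named Teaming Policy only refers to uplink labels that also exist in the default teaming policy.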

Screen Shot 2020-04-11 at 10.58.49.png

<4> - At the NSX-T Uplink VLAN Segment Level

To activate the previously configured Named Teaming Policies for the Tier-0 VLAN segments 160 and 161, we first need to assign the Named Teaming Policies to the VLAN transport zone.

Screen Shot 2020-04-11 at 11.07.12.png

The last step is to configure each of the two Uplink VLAN segments (160 & 161) with the corresponding Named Teaming Policy. With NSX-T 2.5.1 the Named Teaming Policy has to be configured on the VLAN segment in the "legacy" Advanced Networking & Security UI. The recently released NSX-T 3.0 will support this in the Policy UI.
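The dependency between the two steps (policy assigned to the VLAN transport zone first, then referenced by the segment) can be written down as a simple consistency check. This is a plain Python sketch with hand-typed data taken from this post; it does not talk to NSX-T, and the function name segments_with_unassigned_policy is made up for the illustration.

```python
from typing import Dict, List, Set

# Named Teaming Policies assigned to the VLAN transport zone.
VLAN_TZ_POLICIES: Set[str] = {
    "NY-Named-Teaming-N3K-LEAF-10",
    "NY-Named-Teaming-N3K-LEAF-11",
}

# Uplink VLAN segments with their VLAN ID and Named Teaming Policy.
SEGMENTS: Dict[str, dict] = {
    "NY-T0-EDGE-UPLINK-SEGMENT-160": {"vlan": 160, "teaming": "NY-Named-Teaming-N3K-LEAF-10"},
    "NY-T0-EDGE-UPLINK-SEGMENT-161": {"vlan": 161, "teaming": "NY-Named-Teaming-N3K-LEAF-11"},
}


def segments_with_unassigned_policy(segments: Dict[str, dict], tz_policies: Set[str]) -> List[str]:
    """List segments whose Named Teaming Policy is not assigned to the transport zone."""
    return sorted(
        name for name, cfg in segments.items() if cfg["teaming"] not in tz_policies
    )


print(segments_with_unassigned_policy(SEGMENTS, VLAN_TZ_POLICIES))
```

An empty list means both uplink segments reference a policy that the VLAN transport zone knows about.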

Table 3 - NSX-T VLAN Segment Configuration

VLAN Segment NY-T0-EDGE-UPLINK-SEGMENT-160
VLAN Segment NY-T0-EDGE-UPLINK-SEGMENT-161

Screen Shot 2020-04-11 at 11.09.50.png

Screen Shot 2020-04-11 at 11.09.37.png
Screen Shot 2020-04-11 at 11.29.17.png

Screen Shot 2020-04-11 at 11.29.25.png

Verification

The resulting topology with both NSX-T Edge Nodes and the previously shown steps is shown below. It shows how the Tier-0 VLAN segments 160 and 161 are "routed" through the different levels from the Tier-0 Gateway towards the Nexus Leaf switches via the vDS trunk port groups.

Networking – NSX-T Edge Pinned VLAN.png

The best option to verify if all your settings are correct is to validate on which ToR/Leaf trunk port you learn the appropriate MAC address of the Tier-0 Gateway Layer 3 interfaces. These Layer 3 interfaces belong to the Tier-0 Service Router (SR). You can get the MAC address via CLI.

Table 4 - NSX-T Tier-0 Layer 3 Interface Configuration

ny-edge-transport-node-20(tier0_sr)> get interfaces

Interface: 2f83fda5-0da5-4764-87ea-63c0989bf059
Ifuid: 276
Name: NY-T0-LIF160-EDGE-20
Internal name: uplink-276
Mode: lif
IP/Mask: 172.16.160.20/24
MAC: 00:50:56:97:51:65
LS port: 40102113-c8af-4d4e-a94d-ca44f9efe9a5
Urpf-mode: STRICT_MODE
DAD-mode: LOOSE
RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)
Admin: up
Op_state: up
MTU: 9000

Interface: a1f0d5d0-3883-4e04-b985-e391ec1d9711
Ifuid: 281
Name: NY-T0-LIF161-EDGE-20
Internal name: uplink-281
Mode: lif
IP/Mask: 172.16.161.20/24
MAC: 00:50:56:97:a7:33
LS port: d180ee9a-8e82-4c59-8195-ea65660ea71a
Urpf-mode: STRICT_MODE
DAD-mode: LOOSE
RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)
Admin: up
Op_state: up
MTU: 9000

ny-edge-transport-node-21(tier0_sr)> get interfaces

Interface: a3d7669a-e81c-43ea-81c0-dd60438284bc
Ifuid: 289
Name: NY-T0-LIF160-EDGE-21
Internal name: uplink-289
Mode: lif
IP/Mask: 172.16.160.21/24
MAC: 00:50:56:97:84:c3
LS port: 045cd486-d8c5-4df5-8784-2e49862771f4
Urpf-mode: STRICT_MODE
DAD-mode: LOOSE
RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)
Admin: up
Op_state: up
MTU: 9000

Interface: 2de46a54-3dba-4ddc-abe7-5b713260e7d4
Ifuid: 296
Name: NY-T0-LIF161-EDGE-21
Internal name: uplink-296
Mode: lif
IP/Mask: 172.16.161.21/24
MAC: 00:50:56:97:ec:1b
LS port: c32e2109-32d0-4c0f-a916-bfba01fdd6ac
Urpf-mode: STRICT_MODE
DAD-mode: LOOSE
RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)
Admin: up
Op_state: up
MTU: 9000
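NSX-T prints MAC addresses in colon notation, while NX-OS prints them in dotted notation, so comparing the two by eye is error-prone. The Python sketch below pulls the LIF name and MAC address out of a saved "get interfaces" output and converts the MAC to the NX-OS style. It assumes the simple "Key: value" layout shown in Table 4; lif_macs and to_nxos are hypothetical helper names.

```python
from typing import Dict, Optional

# Shortened "get interfaces" output of ny-edge-transport-node-20 (see Table 4).
EDGE_20_OUTPUT = """
Interface: 2f83fda5-0da5-4764-87ea-63c0989bf059
Name: NY-T0-LIF160-EDGE-20
Internal name: uplink-276
IP/Mask: 172.16.160.20/24
MAC: 00:50:56:97:51:65
Interface: a1f0d5d0-3883-4e04-b985-e391ec1d9711
Name: NY-T0-LIF161-EDGE-20
Internal name: uplink-281
IP/Mask: 172.16.161.20/24
MAC: 00:50:56:97:a7:33
"""


def lif_macs(cli_output: str) -> Dict[str, str]:
    """Map each interface name to its MAC address (colon notation)."""
    macs: Dict[str, str] = {}
    name: Optional[str] = None
    for line in cli_output.splitlines():
        # Split at the first colon only, so the colons inside the MAC survive.
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if key == "Name":
            name = value
        elif key == "MAC" and name is not None:
            macs[name] = value
            name = None
    return macs


def to_nxos(mac: str) -> str:
    """Convert 00:50:56:97:51:65 to the NX-OS form 0050.5697.5165."""
    digits = mac.replace(":", "").lower()
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


for lif, mac in lif_macs(EDGE_20_OUTPUT).items():
    print(lif, mac, to_nxos(mac))
```

The converted values can then be searched for directly in the "show mac address-table" output of the ToR/Leaf switches.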

The MAC address tables show that ToR/Leaf switch NY-N3K-LEAF-10 learns the Tier-0 Layer 3 MAC addresses from VLAN 160 locally and from VLAN 161 via Portchannel 1 (Po1).

And the MAC address tables show that ToR/Leaf switch NY-N3K-LEAF-11 learns the Tier-0 Layer 3 MAC addresses from VLAN 161 locally and from VLAN 160 via Portchannel 1 (Po1).

Table 5 - ToR/Leaf Switch MAC Address Table for Northbound Uplink VLAN 160 and 161

ToR/Leaf Switch NY-N3K-LEAF-10
ToR/Leaf Switch NY-N3K-LEAF-11

NY-N3K-LEAF-10# show mac address-table dynamic vlan 160

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  160     0050.5697.5165   dynamic  0         F      F    Eth1/2

*  160     0050.5697.84c3   dynamic  0         F      F    Eth1/4

NY-N3K-LEAF-11# show mac address-table dynamic vlan 160

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  160     0050.5697.5165   dynamic  0         F      F    Po1

*  160     0050.5697.84c3   dynamic  0         F      F    Po1

*  160     780c.f049.0c81   dynamic  0         F      F    Po1

NY-N3K-LEAF-10# show mac address-table dynamic vlan 161

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  161     0050.5697.a733   dynamic  0         F      F    Po1

*  161     0050.5697.ec1b   dynamic  0         F      F    Po1

*  161     502f.a8a8.717c   dynamic  0         F      F    Po1

NY-N3K-LEAF-11# show mac address-table dynamic vlan 161

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  161     0050.5697.a733   dynamic  0         F      F    Eth1/2

*  161     0050.5697.ec1b   dynamic  0         F      F    Eth1/4

*  161     780c.f049.0c81   dynamic  0         F      F    Po1
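The same check can be scripted for larger environments. The Python sketch below parses a saved "show mac address-table dynamic" output and tests whether the Tier-0 LIF MAC addresses are learnt on local Ethernet ports or only via the port channel. The regular expression is written for the column layout shown in Table 5 and is an assumption for any other NX-OS release; learned_ports and pinned_locally are made-up helper names.

```python
import re
from typing import Dict, Iterable

# One table row: "*", VLAN, MAC, type, age, Secure, NTFY, port.
ENTRY = re.compile(
    r"^\*\s+(\d+)\s+([0-9a-f.]{14})\s+dynamic\s+\S+\s+\S+\s+\S+\s+(\S+)$"
)

LEAF_10_VLAN_160 = """
*  160     0050.5697.5165   dynamic  0         F      F    Eth1/2
*  160     0050.5697.84c3   dynamic  0         F      F    Eth1/4
"""

LEAF_11_VLAN_160 = """
*  160     0050.5697.5165   dynamic  0         F      F    Po1
*  160     0050.5697.84c3   dynamic  0         F      F    Po1
*  160     780c.f049.0c81   dynamic  0         F      F    Po1
"""

# Tier-0 LIF MAC addresses in VLAN 160 (Table 4), in NX-OS notation.
LIF_160_MACS = ("0050.5697.5165", "0050.5697.84c3")


def learned_ports(output: str) -> Dict[str, str]:
    """Map each MAC address in the table output to the port it was learnt on."""
    ports: Dict[str, str] = {}
    for line in output.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            ports[match.group(2)] = match.group(3)
    return ports


def pinned_locally(output: str, macs: Iterable[str]) -> bool:
    """True if every given MAC is learnt on a local Ethernet port."""
    ports = learned_ports(output)
    return all(ports.get(mac, "").startswith("Eth") for mac in macs)


print("LEAF-10 local:", pinned_locally(LEAF_10_VLAN_160, LIF_160_MACS))
print("LEAF-11 local:", pinned_locally(LEAF_11_VLAN_160, LIF_160_MACS))
```

For VLAN 160 the expectation is local learning on NY-N3K-LEAF-10 only; for VLAN 161 the result should be mirrored.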

As we have seen in the Edge Transport Node configuration, each Edge Node has two statically configured TEP IP addresses. Both FastPath interfaces load balance the Geneve encapsulated overlay traffic. Table 8 lists the MAC addresses of the FastPath interfaces so that the Edge Node TEP MAC addresses learnt on the switches can be verified.

Table 7 - ToR/Leaf Switch MAC Address Table for Edge Node TEP VLAN 151

ToR/Leaf Switch NY-N3K-LEAF-10
ToR/Leaf Switch NY-N3K-LEAF-11

NY-N3K-LEAF-10# show mac address-table dynamic vlan 151

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  151     0050.5697.5165   dynamic  0         F      F    Eth1/2

*  151     0050.5697.84c3   dynamic  0         F      F    Eth1/4

*  151     0050.5697.a733   dynamic  0         F      F    Po1

*  151     0050.5697.ec1b   dynamic  0         F      F    Po1

*  151     502f.a8a8.717c   dynamic  0         F      F    Po1

NY-N3K-LEAF-11# show mac address-table dynamic vlan 151

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*  151     0000.0c9f.f097   dynamic  0         F      F    Po1

*  151     0050.5697.5165   dynamic  0         F      F    Po1

*  151     0050.5697.84c3   dynamic  0         F      F    Po1

*  151     0050.5697.a733   dynamic  0         F      F    Eth1/2

*  151     0050.5697.ec1b   dynamic  0         F      F    Eth1/4

*  151     780c.f049.0c81   dynamic  0         F      F    Po1

Table 8 - NSX-T Edge Node TEP MAC Addresses

ny-edge-transport-node-20>
ny-edge-transport-node-21>

ny-edge-transport-node-20> get interface fp-eth0 | find MAC

  MAC address: 00:50:56:97:51:65

ny-edge-transport-node-20> get interface fp-eth1 | find MAC

  MAC address: 00:50:56:97:a7:33

ny-edge-transport-node-21> get interface fp-eth0 | find MAC

  MAC address: 00:50:56:97:84:c3

ny-edge-transport-node-21> get interface fp-eth1 | find MAC

  MAC address: 00:50:56:97:ec:1b

For the sake of completeness, the table below shows that only ToR/Leaf switch NY-N3K-LEAF-10 learns the two Edge Node management MAC addresses from VLAN 60 locally; ToR/Leaf switch NY-N3K-LEAF-11 learns them only via Portchannel 1 (Po1). This is expected, as we have configured the teaming policy as active/standby on the vDS port group. The Edge Node N-VDS is not relevant for the Edge Node management interface.

Table 9 - ToR/Leaf Switch MAC Address Table for Edge Node Management VLAN 60

ToR/Leaf Switch NY-N3K-LEAF-10
ToR/Leaf Switch NY-N3K-LEAF-11

NY-N3K-LEAF-10# show mac address-table dynamic vlan 60

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*   60     0050.5697.1e49   dynamic  0         F      F    Eth1/4

*   60     0050.5697.4555   dynamic  0         F      F    Eth1/2

*   60     502f.a8a8.717c   dynamic  0         F      F    Po1

NY-N3K-LEAF-11# show mac address-table dynamic vlan 60

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link,

        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan

   VLAN     MAC Address      Type      age     Secure NTFY Ports

---------+-----------------+--------+---------+------+----+------------------

*   60     0000.0c9f.f03c   dynamic  0         F      F    Po1

*   60     0050.5697.1e49   dynamic  0         F      F    Po1

*   60     0050.5697.4555   dynamic  0         F      F    Po1

Please note, I highly recommend always running a few failover tests to confirm that the NSX-T Edge Node deployment works as expected.

I hope you had a little bit of fun reading this blog post about a single N-VDS on the Edge Node with VLAN pinning.

Software Inventory:

vSphere version: VMware ESXi, 6.5.0, 15256549

vCenter version: 6.5.0, 10964411

NSX-T version: 2.5.1.0.0.15314288 (GA)

Cisco Nexus 3048 NX-OS version: 7.0(3)I7(6)

Blog history

Version 1.0 - 13.04.2020 - first published version

Version 1.1 - 14.04.2020 - minor changes (license)

Version 1.2 - 25.04.2020 - minor changes (typos)

Version 1.3 - 04.06.2020 - adding the 2nd Edge TEP in the second diagram and minor changes (typos)
