VMware Networking Community
boomchke
Contributor

VTEP VLAN limitation

When configuring NSX the second time around, I decided to try using separate networks for each VTEP interface across three hosts. Two of the hosts happened to be on one DVS and the third on a second DVS. It appears that NSX won't let you use two separate VLANs on the same DVS for the VTEPs. I received this error message:

VLAN <number> can not be used.  Other VLAN IDs are in use on the specified DVS.

Am I interpreting that limitation correctly?  If so, is this restriction in place to facilitate local flooding of traffic between hosts for multicast and broadcast?  I suppose two different VLANs would be a problem if traffic has to be flooded from one host in the VLAN to the others.

admin
Immortal

Normally hosts in the same DVS would have a consistent set of VLANs on their uplinks, which is the premise that the current release of NSX is using.
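To make that premise concrete, here is a purely illustrative Python sketch of the kind of per-DVS check that would produce the error above. The names and structure are invented and are not the actual NSX implementation:

```python
# Hypothetical illustration of the per-DVS VTEP VLAN check that appears to
# produce the "Other VLAN IDs are in use on the specified DVS" error.
# This is NOT NSX code -- just a sketch of the observed behaviour.

class VtepVlanConflict(Exception):
    pass

def register_vtep_vlan(dvs_vtep_vlans, dvs_name, vlan_id):
    """Record the VTEP VLAN for a DVS, rejecting a second, different VLAN ID."""
    existing = dvs_vtep_vlans.get(dvs_name)
    if existing is not None and existing != vlan_id:
        raise VtepVlanConflict(
            f"VLAN {vlan_id} can not be used. Other VLAN IDs are in use "
            f"on the specified DVS ({dvs_name} already uses VLAN {existing})."
        )
    dvs_vtep_vlans[dvs_name] = vlan_id

vlans = {}
register_vtep_vlan(vlans, "DVS1", 100)   # host1 on DVS1 -> OK
register_vtep_vlan(vlans, "DVS1", 100)   # host2 on DVS1, same VLAN -> OK
register_vtep_vlan(vlans, "DVS2", 200)   # host3 on a second DVS -> OK
# register_vtep_vlan(vlans, "DVS1", 200) # would raise VtepVlanConflict
```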

boomchke
Contributor

Agreed - However, why wouldn't I be able to use separate VLANs for VTEPs?  I'm thinking it's because of local flooding, but I'd like to confirm that.

admin
Immortal

> why wouldn't I be able to use separate VLANs for VTEPs?

VTEPs are in essence vmk interfaces that sit on a dvPg, which is a DVS-wide construct, and thus should have a single VLAN ID associated with it.

The only within-VTEP-subnet BUM replication optimisation that exists is the Hybrid (or Multicast) control plane mode for Logical Switches.
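If it helps to see that vmk-to-dvPg mapping, here is a rough, untested pyVmomi sketch that lists each host's vmk interfaces together with the dvPortgroup and VLAN ID they land on. The vCenter address and credentials are placeholders, and you may need to adjust the SSL handling for your environment:

```python
# Rough, untested pyVmomi sketch: list each host's VMkernel (vmk) interfaces
# together with the dvPortgroup and VLAN ID they land on, to see how the VTEP
# vmknics map onto a DVS-wide portgroup.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.local",             # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# Map dvPortgroup key -> (DVS name, portgroup name, VLAN id) for every DVS.
pg_info = {}
dvs_view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
for dvs in dvs_view.view:
    for pg in dvs.portgroup:
        vlan = pg.config.defaultPortConfig.vlan
        vlan_id = getattr(vlan, "vlanId", None)   # VlanIdSpec carries a single ID
        pg_info[pg.key] = (dvs.name, pg.name, vlan_id)

# Walk every host and print its vmknics that connect to a dvPortgroup.
host_view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
for host in host_view.view:
    for vnic in host.config.network.vnic:
        dvport = vnic.spec.distributedVirtualPort
        if dvport is None:
            continue                              # vmknic on a standard vSwitch
        dvs_name, pg_name, vlan_id = pg_info.get(dvport.portgroupKey, ("?", "?", "?"))
        print(f"{host.name} {vnic.device} {vnic.spec.ip.ipAddress} "
              f"-> {dvs_name}/{pg_name} (VLAN {vlan_id})")

Disconnect(si)
```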

rbudavari
Community Manager

There is a dataplane requirement for VXLAN to use the same VLAN ID within the same VDS (this is not related to flooding). That being said, as long as a global VLAN ID is used, this doesn't need to be the same L2 domain, and the VXLAN transport can cross L3 boundaries. This is described in the NSX design guide:

http://www.vmware.com/files/pdf/products/nsx/vmw-nsx-network-virtualization-design-guide.pdf
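To illustrate why the transport can cross L3 boundaries, here is a small Scapy sketch (assuming Scapy and its VXLAN layer are available; all addresses and the VNI are made up). The outer IP header between the VTEPs is ordinary routable IP, which is why only the VTEP dvPg's VLAN ID needs to be consistent within a VDS:

```python
# Illustrative Scapy sketch of a VXLAN-encapsulated frame: the outer IP header
# (VTEP to VTEP) is plain routable IP, so the transport network can cross L3
# boundaries. Addresses and the VNI are invented for illustration.
from scapy.all import Ether, IP, UDP
from scapy.layers.vxlan import VXLAN

outer = (
    Ether()
    / IP(src="10.10.1.11", dst="10.20.1.12")   # VTEPs in different routed subnets
    / UDP(sport=41000, dport=8472)             # NSX-v default VXLAN port (IANA: 4789)
    / VXLAN(vni=5001)                          # logical switch segment ID
)
inner = (
    Ether(src="00:50:56:aa:bb:01", dst="00:50:56:aa:bb:02")
    / IP(src="192.168.50.10", dst="192.168.50.20")   # VMs on the same logical L2
)
frame = outer / inner
frame.show()
```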

boomchke
Contributor

Understood - However, why can't I have multiple distributed port groups and VTEP interfaces in each distributed port group?  For example...

Host1 - DVS1 - Distributed Port Group VTEP1 - VMK VTEP 1

Host1 - DVS1 - Distributed Port Group VTEP2 - VMK VTEP 2

Seems like that should work considering how VXLAN allows layer 2 extension across layer 3.

admin
Immortal

Just to clarify - did you mean VTEP2 to be on Host2? Or are you talking about multiple VTEPs on the same host?

boomchke
Contributor

The problems with copy and paste....  This is what I meant....

Host1 - DVS1 - Distributed Port Group VTEP1 - VMK VTEP 1

Host2 - DVS1 - Distributed Port Group VTEP2 - VMK VTEP 1

So: two different hosts that are on the same DVS.  I would think I could have the VTEP interfaces for each host on different networks (port groups).

admin
Immortal

Thank you for clarifying.

At this point VXLAN is enabled on a per-cluster basis, and using the same dvPg for all hosts in the cluster looks like a logical choice. Adding further granularity would complicate things for no obvious benefit.

In your particular situation, one way to achieve what you're after would be to create three clusters and three DVSes, with one host in each. This will allow you to configure a different VLAN and a different IP subnet for each.

A bit of background on this: one of the popular ways of building "cloudy" DCs is in "pods", where, say, 30-32 servers are connected in a redundant manner to a pair of ToR switches. All VLANs are localised to the pod (i.e., they don't extend beyond the pair of ToRs). Pod VLANs are the usual suspects - Management, vMotion, FT, IP Storage, VSAN, and VTEP. The pair of ToRs is in turn connected to multiple "spine" switches via multiple L3 links, providing equal-cost multi-path (L3 ECMP) connectivity between pods.

This is (somewhat) described in the "Layer 3 in the Data Center Access Layer" section of the guide Ray pointed out above (page 12), with some more detail, especially around hosts with multiple NICs, in the "NSX on UCS/Nexus" design guide: http://www.vmware.com/files/pdf/products/nsx/vmware-nsx-on-cisco-n7kucs-design-guide.pdf

When infrastructure is built following this model, having the 1:1:1 relationship between DVS : VTEP dvPg : IP Subnet feels much more natural.

P.S. When using hosts with multiple NICs and port-based or MAC-based teaming, NSX will create more than one VTEP per host, but all VTEPs will be connected to the same dvPg and use the same source of IP configuration (IP Pool or DHCP).
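For illustration only, the three-clusters/three-DVS workaround could be planned with something like the sketch below; pod names, VLANs and subnets are invented:

```python
# Hypothetical sketch of the "one cluster + one DVS per pod" layout suggested
# above: each pod keeps its own VTEP VLAN and IP subnet, and the L3 ECMP fabric
# routes between the pod subnets. Names and numbers are invented.
import ipaddress

PODS = {
    "pod-a": {"dvs": "DVS-PodA", "vtep_vlan": 101, "vtep_subnet": "10.10.1.0/24"},
    "pod-b": {"dvs": "DVS-PodB", "vtep_vlan": 102, "vtep_subnet": "10.10.2.0/24"},
    "pod-c": {"dvs": "DVS-PodC", "vtep_vlan": 103, "vtep_subnet": "10.10.3.0/24"},
}

def check_plan(pods):
    """Pod VTEP subnets must not overlap, and each pod gets its own DVS."""
    nets = [ipaddress.ip_network(p["vtep_subnet"]) for p in pods.values()]
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            assert not a.overlaps(b), f"overlapping VTEP subnets: {a} and {b}"
    # One DVS per pod means the per-DVS single-VLAN constraint is trivially met.
    assert len({p["dvs"] for p in pods.values()}) == len(pods)

check_plan(PODS)
for name, p in PODS.items():
    print(f"{name}: {p['dvs']} VTEP VLAN {p['vtep_vlan']} subnet {p['vtep_subnet']}")
```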

boomchke
Contributor

Thanks for your detailed response.  While I see your point, I still see this as an unnecessary limitation.  I mean no offense here, but 'feeling natural' doesn't seem like a valid reason to enforce an architectural decision.  This prevents me from running the DVS across multiple pods that are in different layer 3 segments.  The days of extending layer 2 up to distribution (and across DCs, in my opinion) are over.  Good data center design includes limiting your layer 2 segments (failure domains) to the smallest area possible.  It seems that VMware acknowledges this by using a layer 3 encapsulation for layer 2 traffic with NSX.  Couple this with the fact that I can't vMotion from DVS to DVS, and I'm starting to miss the entire point of all of this.

Am I way off here?  I mean, extending a DVS past a layer 3 segment seems to be the only way to maintain the ability to vMotion.  But at the same time you're saying I can't do that with NSX since I have to use the same DVS port group.

What am I missing here?

admin
Immortal

I believe Ray talks about this in post #5 in this thread.

You can extend your DVS across different pods, where VLANs in one pod are not connected to VLANs in the other, and VTEPs in one pod can use a different subnet from those in other pods. The only requirement is that the VLAN ID for the VTEP dvPg is the same across all pods.

The above will require addressing your VTEPs via DHCP; that way you can supply them with IP addresses from different subnets.
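As a rough sketch of that stretched-DVS arrangement (one VTEP VLAN ID everywhere, per-pod subnets handed out by DHCP scopes behind a relay on the pod ToRs), with all values invented:

```python
# Rough illustration of the stretched-DVS option above: one VTEP dvPg / one
# VLAN ID across all pods, with per-pod DHCP scopes (e.g. via DHCP relay on the
# pod ToRs) handing out addresses from different subnets. All values invented.
import ipaddress
from itertools import islice

VTEP_VLAN = 150                       # single VLAN ID shared by the VTEP dvPg
DHCP_SCOPES = {                       # per-pod scope behind a DHCP relay
    "pod-a": ipaddress.ip_network("10.20.1.0/24"),
    "pod-b": ipaddress.ip_network("10.20.2.0/24"),
}
HOSTS = {"esx01": "pod-a", "esx02": "pod-a", "esx03": "pod-b"}

def lease_vtep_addresses(hosts, scopes):
    """Hand each host a VTEP address from its pod's scope (first-come, first-served)."""
    # Skip the first 10 addresses of each scope for gateways/infrastructure.
    iterators = {pod: islice(net.hosts(), 10, None) for pod, net in scopes.items()}
    return {h: next(iterators[pod]) for h, pod in hosts.items()}

for host, addr in lease_vtep_addresses(HOSTS, DHCP_SCOPES).items():
    print(f"{host}: VTEP vmk on VLAN {VTEP_VLAN}, IP {addr}")
```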

boomchke
Contributor

OK.  Seems like it's just a weird limitation.  I'm not sure why it would care what VLAN is used if it's a layer 3 interface.  I'm going to quit kicking the dead horse here, though.  Thanks for your continued responses.

admin
Immortal

ESXi's vmk interfaces are not "true" L3, as they are associated with a portgroup, which in turn needs either (a) a VLAN ID, or (b) a physical NIC to use as the dvPg's uplink.

If you're familiar with Cisco equipment, a vmk is much more like an "interface VLAN123" than an "interface FastEthernet 1/0".
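For what it's worth, that dependency is visible in the API as well: a vmk spec has to point at a portgroup (or dvPort connection), which is where the VLAN ID comes from. Below is an untested pyVmomi sketch with placeholder keys and addresses, just to show the shape of the object rather than a recommended procedure:

```python
# Untested pyVmomi sketch showing that a vmk is anchored to a portgroup rather
# than being a free-standing L3 interface: the HostVirtualNic spec has to point
# at a dvPort/dvPortgroup (which carries the VLAN ID). Keys, UUIDs and
# addresses are placeholders.
from pyVmomi import vim

nic_spec = vim.host.VirtualNic.Specification(
    ip=vim.host.IpConfig(
        dhcp=False,
        ipAddress="10.10.1.21",
        subnetMask="255.255.255.0",
    ),
    # The vmk attaches to a dvPortgroup on a specific DVS -- this is where the
    # single VLAN ID of the VTEP dvPg comes from.
    distributedVirtualPort=vim.dvs.PortConnection(
        portgroupKey="dvportgroup-123",          # placeholder key of the VTEP dvPg
        switchUuid="50 0c 29 ...",               # placeholder DVS UUID
    ),
)
# On a connected host this would then be passed to something like:
#   host.configManager.networkSystem.AddVirtualNic(portgroup="", nic=nic_spec)
```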

You're very welcome, and I think this is an excellent discussion. Please keep them questions coming!

rbudavari
Community Manager

It matters because all of the VXLAN VMkernel interfaces (VTEPs) are connected to the same dvPortgroup for a cluster.

boomchke
Contributor

Yep - my question all along was why the VTEPs all need to be in the same dvPortgroup for a cluster.  I understand that's a requirement; I'm just trying to figure out why, since it's L3.

Titanomachia
Enthusiast

I think it will change in the future. It must have to do with multicast, as that is limited to the subnet, and partly with hybrid mode too. Of course, unicast mode is an option for the whole network, which is why I think it'll change. There is also a design perspective: if, for example, you use UCS, having the VTEPs on the same layer 2 network means they only need to traverse the FIs to communicate with one another. However, if they are on different subnets, they will have to traverse the core to be routed.

Also, having a DVS across hosts on different IP subnets means you "cannot" vMotion, as it's not supported, but it will probably work. vMotion is only supported over layer 2 for now, but again, this will change soon.

jpiscaer
Enthusiast

We're currently running into this limitation, but in a multi-site deployment.

We have the following setup:

1 vCenter appliance

2 physical sites

1 vSphere cluster per site

1 dvSwitch per site

We came from a single dvSwitch (spanning all hosts/clusters), but with that setup we had issues with BUM traffic and encryption. We moved to a dvSwitch per site for those reasons.

With a dvSwitch per site, we run into issues with port group names (that need to be unique across dvSwitches in a single vCenter management domain) and VLAN IDs as described above.

What are the design considerations and possible solutions here?

EDIT:

I see two possible solutions (roughly sketched below):

1. Have a single non-stretched VLAN with multiple subnets configured

2. Use different VLANs (each with their own subnet) and configure each VTEP manually to use that VLAN.
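Purely as a thinking aid, the two options could be written down like this (invented names, VLANs and subnets), together with the two constraints discussed in this thread: unique portgroup names per vCenter and a single VTEP VLAN per DVS:

```python
# Hypothetical sketch of the two options above for a two-site, two-dvSwitch
# design under one vCenter. It only encodes the constraints discussed in this
# thread: per-DVS VTEP VLAN consistency and unique portgroup names per vCenter.
OPTION_1 = {  # same (non-stretched) VLAN ID at both sites, different subnets
    "site-a": {"dvs": "DVS-SiteA", "vtep_pg": "vtep-pg-site-a", "vlan": 150, "subnet": "10.30.1.0/24"},
    "site-b": {"dvs": "DVS-SiteB", "vtep_pg": "vtep-pg-site-b", "vlan": 150, "subnet": "10.30.2.0/24"},
}
OPTION_2 = {  # different VLAN per site, VTEPs configured per dvSwitch
    "site-a": {"dvs": "DVS-SiteA", "vtep_pg": "vtep-pg-site-a", "vlan": 151, "subnet": "10.30.1.0/24"},
    "site-b": {"dvs": "DVS-SiteB", "vtep_pg": "vtep-pg-site-b", "vlan": 152, "subnet": "10.30.2.0/24"},
}

def check(design):
    # Portgroup names must be unique within the vCenter management domain.
    names = [s["vtep_pg"] for s in design.values()]
    assert len(names) == len(set(names)), "duplicate dvPortgroup names"
    # Each site has its own DVS, so each DVS must see exactly one VTEP VLAN.
    per_dvs = {}
    for s in design.values():
        per_dvs.setdefault(s["dvs"], set()).add(s["vlan"])
    assert all(len(v) == 1 for v in per_dvs.values()), "more than one VTEP VLAN on a DVS"

for name, design in (("option 1", OPTION_1), ("option 2", OPTION_2)):
    check(design)
    print(f"{name}: OK ->", {site: s["vlan"] for site, s in design.items()})
```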

Cheers, Joep Piscaer
VMware vExpert 2009
Virtual Lifestyle: http://www.virtuallifestyle.nl
Twitter: http://www.twitter.com/jpiscaer
LinkedIn: http://www.linkedin.com/in/jpiscaer