VMware Cloud Community
davequinlisk
Contributor

Does Load Based Teaming Policy override LACP Policy?

Hi All,

I have a question regarding link aggregation policy, specifically setting up the policy on a vSphere 6.5 dvSwitch.

In the following scenario:

  • You have an ESXi 6.5 host with 2 x 10Gb ports configured in an LACP group with (as an example) 'Source and Destination IP Address' set as the policy.
  • You also leverage 'Route Based on Physical NIC Load' on the dvSwitch Port Group.

What policy behaviour is expected?  I ask this because I feel there is conflicting information between the following KB Article and the vSphere 6.5 Web Client UI.

In the UI, it states that the LACP policy overrides the dvSwitch port group policy. However, the following KB states that there is an exception to this rule in the form of 'Route based on Physical NIC Load (LBT)' and/or scenarios in which NIOC is configured.

The article I am referring to is 'Enhanced LACP Support on a vSphere 5.5 Distributed Switch': https://kb.vmware.com/s/article/2051826

The article has a note at the bottom that states:

vSphere 5.5 supports these load balancing types:

  1. Destination IP address 
  2. Destination IP address and TCP/UDP port 
  3. Destination IP address and VLAN 
  4. Destination IP address, TCP/UDP port and VLAN 
  5. Destination MAC address 
  6. Destination TCP/UDP port 
  7. Source IP address 
  8. Source IP address and TCP/UDP port 
  9. Source IP address and VLAN 
  10. Source IP address, TCP/UDP port and VLAN 
  11. Source MAC address 
  12. Source TCP/UDP port 
  13. Source and destination IP address 
  14. Source and destination IP address and TCP/UDP port 
  15. Source and destination IP address and VLAN 
  16. Source and destination IP address, TCP/UDP port and VLAN 
  17. Source and destination MAC address 
  18. Source and destination TCP/UDP port 
  19. Source port ID 
  20. VLAN

Note: These policies are configured on the LAG. The LAG load-balancing policy always overrides the policy of any individual distributed port group that uses the LAG, with the exception of LBT and NetIOC: if configured, these will override the LACP policy.

These two mechanisms load balance based on actual traffic load, which is better than LACP-level load balancing.
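For intuition on how the hash-based policies in the list above behave: a policy such as number 13 ('Source and destination IP address') deterministically pins every flow to one LAG member. The sketch below is purely illustrative — the hash function is hypothetical, not VMware's actual algorithm — but it shows the key property:

```python
# Illustrative only: vSphere's real IP-hash algorithm is internal to the VDS.
# This sketch shows the key property of a "source and destination IP address"
# policy: every packet of a given flow hashes to the same LAG member.
import ipaddress

def lag_member_for_flow(src_ip: str, dst_ip: str, lag_size: int) -> int:
    """Pick a LAG member index by XOR-folding the two IPs (hypothetical hash)."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % lag_size

# Same flow, same uplink -- repeated lookups never move the flow.
first = lag_member_for_flow("10.0.0.5", "10.0.1.9", 2)
assert first == lag_member_for_flow("10.0.0.5", "10.0.1.9", 2)
assert 0 <= first < 2
```

Because the selection depends only on the packet headers, an LACP hash policy balances flows statistically, not by observed link load.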

What I hope happens is that, if I leverage both LACP and LBT, ESXi will use the policy in the LACP configuration unless a link becomes saturated, at which point LBT allocates traffic accordingly. I am aware that without an LACP LAG, LBT defaults back to Route Based on Originating Virtual Port ID.
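For what it's worth, the negotiated LAG state and the hash policy actually in effect can be inspected from the ESXi shell (commands as in ESXi 5.5/6.x — worth verifying against your build; output is host-specific):

```shell
# Show the configured LAGs and the load-balancing (hash) policy in use
esxcli network vswitch dvs vmware lacp config get

# Show LACP negotiation status per uplink (partner state, sync flags, etc.)
esxcli network vswitch dvs vmware lacp status get
```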

In summary, my question is: which is correct, the Web Client GUI or the note in the article above?

Many Thanks

Dave

2 Replies
daphnissov
Immortal

To directly answer your question, I believe the web client is correct in this case: the LACP algorithm overrides the port group policy.

That said, if you are already licensed for vDS, there is almost no point in using LACP when you can use LBT (i.e., route based on physical NIC load). If you do some searches here and elsewhere online, you'll see that in the choice between LACP and LBT, the latter is always preferable. For those who still choose LACP, the misconception is usually two-fold: 1) that the only way to utilize multiple uplinks simultaneously is through a LAG, and 2) that LACP allows a single connection to "spread out" across multiple uplinks, giving it the sum of all teamed NICs' bandwidth. Neither of these statements is true. The added overhead and complexity of LACP is just not worth it in virtually every case out there. My advice: stick with LBT and do not use LAGs of any kind.
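The second misconception above can be demonstrated with a small sketch (the hash is hypothetical, not VMware's): under any LACP hash policy a single flow maps to exactly one LAG member, so one connection can never exceed the bandwidth of one physical NIC — only many distinct flows spread across the team.

```python
# Hypothetical hash (not VMware's) illustrating why a LAG does not let a
# single connection use the aggregate bandwidth of the team.
import ipaddress

def lag_member_for_flow(src_ip: str, dst_ip: str, lag_size: int) -> int:
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % lag_size

# One flow: hashed 100 times, it lands on the same single member every time.
one_flow = {lag_member_for_flow("10.0.0.5", "10.0.1.9", 2) for _ in range(100)}
assert len(one_flow) == 1

# Many flows from different sources: the hash can spread them across members.
many = {lag_member_for_flow(f"10.0.0.{i}", "10.0.1.1", 2) for i in range(1, 9)}
assert many == {0, 1}
```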

davequinlisk
Contributor

Hi daphnissov,

Thanks for your reply and your view on the topic. I agree that, where they are mutually exclusive, the benefits of LBT are not worth losing in most instances for the benefits of LACP. However, as I understand it, an LACP LAG fails over faster and less disruptively than the alternative non-LAG approach. The difference is likely only seconds versus milliseconds, but nevertheless it would be the more optimal configuration if (big IF) LBT worked alongside it.

If they are mutually exclusive as you state, then in normal operation, when all links are up (which is almost always the case), LBT is going to provide the most benefit.
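For anyone else reading: the reason LBT helps in steady state is its documented rebalance behaviour — the distributed switch samples uplink utilisation every 30 seconds and migrates a port when its uplink's mean utilisation exceeds 75%. A minimal sketch of that check (the function and names are mine, purely illustrative; only the 30s/75% figures come from VMware's documentation):

```python
# Hedged sketch of LBT's documented trigger: every 30-second sample window,
# any uplink whose mean utilisation exceeds 75% is a candidate to have ports
# migrated off it. Function name and structure are illustrative.
def uplinks_needing_rebalance(mean_utilisation, threshold=0.75):
    """Return uplinks whose mean utilisation over the window exceeds the threshold."""
    return sorted(u for u, load in mean_utilisation.items() if load > threshold)

# vmnic0 is saturated, so LBT would move one or more ports off it.
assert uplinks_needing_rebalance({"vmnic0": 0.90, "vmnic1": 0.20}) == ["vmnic0"]

# Balanced links: nothing moves, avoiding needless port flapping.
assert uplinks_needing_rebalance({"vmnic0": 0.50, "vmnic1": 0.50}) == []
```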

Having said that, I am no networking expert but do believe the above to be true.

Ideally, I was looking for the perfect situation in which I could leverage the benefits of LACP and the benefits of LBT.  That article gave me hope!

As I am eternally optimistic, I will see if I can get a support call raised that may provide further clarification and technical justification for the statement in the KB. Alternatively, maybe the article will be updated to reflect the truth.

Thanks again for your comments.

Dave
