VMware Cloud Community
irigoyen
Enthusiast

LACP Teaming and failover best practice configuration for VM network

 

I know that this topic has already been discussed several times, but I have never found an exhaustive answer.

I have the VM network configured with LACP on a VDS. The physical switches are also configured with LACP. Everything is working fine.

The load balancing mode on the LAG is already set to “Source and Destination IP address”.
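
For readers unfamiliar with this kind of mode, here is a minimal sketch of how a source-and-destination IP hash can pin a flow to one LAG member. The XOR-and-modulo formula and the function name `pick_uplink` are assumptions made for illustration only; the exact hash used by ESXi or by a given physical switch may differ.

```python
import ipaddress


def pick_uplink(src_ip: str, dst_ip: str, n_uplinks: int) -> int:
    """Return the index of the LAG member a flow would use.

    Illustrative assumption: XOR the two IPv4 addresses as integers
    and take the result modulo the number of LAG members.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_uplinks


# The same pair of addresses always maps to the same member,
# while a different destination may map to another one.
print(pick_uplink("10.0.0.1", "10.0.0.2", 2))
print(pick_uplink("10.0.0.1", "10.0.0.3", 2))
```

The point of the sketch is only that the choice is made per source/destination pair, so a single conversation stays on one link while many conversations spread across the LAG.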

irigoyen_0-1646942489195.png

I don’t want to argue about whether LBT is better than LACP, or about their differences.

 

My question is how the port group Teaming and failover policy should be configured according to best practices.

 

The VMware Knowledge Base is a bit confusing. KB2034277 says:

“All port groups using the LAG Uplink Port Group enabled with LACP must have the load balancing policy set to IP hash load balancing”

 

But at the same time, at the bottom of the document, it says:

“The LAG load balancing policies always override any individual Distributed Port group if it uses the LAG with exception of LBT and netIOC as they will override LACP policy, if configured”

 

So, do I have to set the load balancing to “Route based on IP hash” or not?
Or is this setting ignored when LACP is used on a VDS?

 

 

Port Group

irigoyen_1-1646942541160.png

 

3 Replies
Kinnison
Commander

Hi, IMHO,

 

The load balancing method set on vSphere (vDS / dPG) should match the load balancing method supported by the network equipment, no more, no less.
Your question, an interesting one, was discussed on this forum some years ago, with reference to the very same KB article you cite today.

Have a look at this link: https://communities.vmware.com/t5/VMware-vSphere-Discussions/Does-Load-Based-Teaming-Policy-override...

 

Many respectable opinions, here and elsewhere, are against the use of LACP link aggregation on the grounds of added complexity, some intricacies, and possible network disruption along the way. IMHO all of that is true, but only to some extent.

Teaming and load balancing "unassisted by a switch" does its job very well, but it is slow to react to a failover / failback condition. However brief that delay may be, in some environments it is more than enough to wreak havoc (the traffic flow is indeed impacted), and it raises the legitimate doubt of whether it can result in the dreaded "MAC flapping" condition.

In contrast, and as a matter of fact, LACP reacts so quickly to a link failure that in most cases "one may not even notice", except "in the log" or in some other monitoring tool. The article by Mr. Denneman that you noted deserves to be read (as it is not an issue with LACP LAG).

 

I have the impression that this topic is rarely debated, at least on forums, because LACP link aggregation is available only with the virtual distributed switch and thus requires enterprise licensing, which is beyond the budget of many "small business environments".

 

Regards,
Ferdinando

irigoyen
Enthusiast

Hello Ferdinando,
Thank you for your reply and the link. The discussion in the post you linked is very similar, but sadly there isn’t a definitive answer, and it focuses more on whether the setting is overridden when LBT is used. In my case, I just want to configure LACP, and I would like to know which load balancing setting I have to configure on the port group.

Regards.

Kinnison
Commander

Hi Irigoyen,


TBH, sometimes I do not fully understand what is "written on paper" when the same topic is covered in different places.


For this reason, at the "distributed port group" level I have personally decided to set the load balancing hashing algorithm that matches the one configured for the "LAG group" the port group is configured to use. Obviously, the hashing algorithm set on the "LAG group" on the host side matches the hashing algorithm set (or available) on the physical network equipment. This probably isn't best practice but, IMHO, I don't think it makes much sense to do otherwise.
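
As a rough illustration of that "keep every layer aligned" approach, here is a small sketch that compares the algorithm recorded for each layer. The layer names, the `src-dst-ip` label and the function `find_mismatches` are hypothetical and only stand for values you would note down by hand; this does not query vCenter or any switch.

```python
def find_mismatches(settings: dict[str, str]) -> list[str]:
    """Return the layers whose hashing algorithm differs from the physical switch.

    `settings` maps a hypothetical layer name to a normalized algorithm
    label written down manually by the administrator.
    """
    reference = settings["physical_switch"]
    return [layer for layer, algo in settings.items() if algo != reference]


# Hypothetical values collected by hand from the three places involved.
recorded = {
    "physical_switch": "src-dst-ip",
    "lag_group": "src-dst-ip",
    "distributed_port_group": "src-dst-ip",
}

print(find_mismatches(recorded))  # an empty list means all layers agree
```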


To get an official answer I think the only way is to involve technical support.


In a way I'm "old school": I believe in best practices as a useful reference, but one that is still subordinate to the concrete reality of things.
To avoid any misunderstanding, I am only ever speaking from my own experience in my specific operational context.

 

Regards,
Ferdinando
