VMware Cloud Community
J0S3M
Contributor

LAG not working between ESXi 8 and Netgear M4300 series switch

I've got this setup:

2 Netgear M4300-12X12F switches (stacked using 2 ports)

4 HP DL360 Gen10 servers, each with a 2-port 10Gb NIC and VMware ESXi 8

I created a static LAG for each server, with one port on each switch of the stack.

The LAGs have the following config: (Link Trap: Disable, STP Mode: Enable, Static Mode: Enable, Hash Mode: Src/Dst IP and TCP/UDP Port Fields)

The LAGs are configured as trunks (Switch Port Mode) with one native VLAN (say 140) and several tagged VLANs. The ESXi Management Network uses the native VLAN (untagged).

On the ESXi side, the virtual switches have the 2 interfaces of the LAG set to Active, and the NIC teaming load balancing policy is set to Route Based on IP Hash.

I read VMware KB article 1001938 and I think my setup is compliant.
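As a rough illustration of what "Route Based on IP Hash" does, the sketch below models uplink selection as an XOR of the source and destination IPv4 addresses, taken modulo the number of active uplinks. This is a simplified model, not ESXi's actual implementation; the function name `ip_hash_uplink` and all addresses are hypothetical.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Return the index of the uplink a flow would use in this simplified model.

    Assumption: the hash is the XOR of both IPv4 addresses (as 32-bit
    integers) modulo the number of active uplinks.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# Hypothetical VM on VLAN 140 talking to three different peers over 2 uplinks.
vm_ip = "10.0.140.11"
for peer in ("10.0.140.1", "10.0.140.2", "10.0.140.3"):
    uplink = ip_hash_uplink(vm_ip, peer, uplink_count=2)
    print(f"{vm_ip} -> {peer}: uplink index {uplink}")
```

In this model a given source/destination pair always maps to the same uplink, while the same VM can leave on different uplinks depending on the destination. The physical switch therefore sees the same source MAC on both ports, which is why both ports have to be members of one LAG.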

The problem:

If I remove the ports from the LAGs and leave them as standalone VLAN trunks, everything works fine. When I add the ports to the LAGs, traffic keeps flowing OK for some time but gets interrupted a few minutes later on some random hosts.

Any suggestions would be much appreciated.

Thanks.

3 Replies
Kinnison
Commander

Hi,


My (very) personal approach is that when I aggregate multiple links I avoid mixing different algorithms, and above all I stay away from statically aggregating multiple Ethernet links. Other than that, you should consult the logs of your network devices to rule out any anomalous conditions.


Regards,
Ferdinando

compdigit44
Enthusiast

My understanding is that with any type of port channel, LAG, etc., you can only use IP Hash and have no other option.

Kinnison
Commander

Hi,


That is not entirely accurate. If you aggregate statically, you use "IP hash" and, for consistency, you set your network device the same way, as indeed suggested in the referenced KB article. But if you aggregate via LACP, and therefore have vDS objects, practically all the "load balancing" algorithms I've come across are supported.


If you use some form of aggregation, it makes sense for the settings to be consistent at both the "virtual switch" level and the "portgroup" level. In any case, when faced with a network problem, IMHO consulting the logs of the network devices may help to identify or rule out a possible cause. I have never seen a case (so far; tomorrow, who knows) where all the settings and connections "are correct yet things don't work".
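A minimal sketch of such a consistency check is shown here, assuming the teaming settings have already been copied by hand into plain dictionaries. The data, key names and the `find_mismatches` helper are all hypothetical; nothing here queries a real host.

```python
# Hypothetical teaming settings for one virtual switch (typed in by hand).
vswitch_policy = {
    "load_balancing": "iphash",
    "active": ["vmnic0", "vmnic1"],
    "standby": [],
}

# Hypothetical port groups on that switch; a port group can override the
# teaming policy it would otherwise inherit from the virtual switch.
portgroup_policies = {
    "Management Network": {
        "load_balancing": "iphash",
        "active": ["vmnic0", "vmnic1"],
        "standby": [],
    },
    "VM Network": {
        "load_balancing": "portid",
        "active": ["vmnic0"],
        "standby": ["vmnic1"],
    },
}


def find_mismatches(vswitch, portgroups):
    """List (portgroup, setting, actual, expected) for every override
    that differs from the virtual switch policy."""
    mismatches = []
    for name, policy in portgroups.items():
        for key, expected in vswitch.items():
            actual = policy.get(key)
            if actual != expected:
                mismatches.append((name, key, actual, expected))
    return mismatches


for name, key, actual, expected in find_mismatches(vswitch_policy, portgroup_policies):
    print(f"{name}: {key} is {actual!r}, virtual switch has {expected!r}")
```

With this sample data only the hypothetical "VM Network" port group is reported, because it overrides the load balancing policy and the active/standby uplink order.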


Regards,
Ferdinando
