I am running ESXi 7 and am making my first attempt at setting up LACP/LAG on a distributed switch, and I am failing miserably. I have read various articles and watched videos on this until my eyes bleed, but haven't been able to get it working. My scenario: I have 4 hosts connected to a Cisco 3750 switch. To the best of my knowledge the switch is configured correctly (4 port channels, with 2 NICs assigned to the LAG on each ESXi host). As an example, ESXi host #4 is connected to G1/0/39 and G1/0/43. However, when I disable interface G1/0/43, no traffic passes through G1/0/39, and pings to my VM stop until I enable G1/0/43 again.
Here is my relevant 3750 config for host #4:
port-channel load-balance dst-ip
!
interface Port-channel4
 description ESXI4 Etherchannel
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
 spanning-tree portfast trunk
!
interface GigabitEthernet1/0/39
 description ESXI4 VM NIC1
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
 channel-group 4 mode on
 spanning-tree portfast trunk
!
interface GigabitEthernet1/0/43
 description ESXI4 VM NIC2
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
 channel-group 4 mode on
 spanning-tree portfast trunk
!
server#sh ether sum
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 4
Number of aggregators:           4

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)          -        Gi1/0/3(P)   Gi1/0/7(P)
2      Po2(SU)          -        Gi1/0/15(P)  Gi1/0/19(P)
3      Po3(SU)          -        Gi1/0/27(P)  Gi1/0/31(P)
4      Po4(SU)          -        Gi1/0/39(P)  Gi1/0/43(P)
I am also attaching screenshots of the LAG config and my distributed switch config. I read a suggestion to use Explicit Failover order instead of Route based on IP hash. Since most people recommend Route based on IP hash, I presume that is the correct one?
Any reason why you configured "channel-group 4 mode on" instead of "channel-group 4 mode active"? "mode on" (a static EtherChannel without LACP) is what you need with a standard vSwitch; a LAG on a distributed switch uses LACP.
The virtual port group is not aware of the load balancing, i.e. it only sees the LAG as a single uplink, so you can leave the "Load balancing" setting at the default.
André
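For comparison, here is a rough sketch of what the LACP version of the host #4 member ports could look like, assuming the LAG on the distributed switch is set to Active or Passive LACP mode. Both ports are taken out of the static group first, since IOS generally refuses to mix "on" and LACP members in the same channel group. This is untested against this exact setup, so treat it as a starting point only:

```
! Sketch only: LACP instead of static "mode on" for host #4
! Step 1: remove both ports from the static channel group
interface GigabitEthernet1/0/39
 no channel-group 4
interface GigabitEthernet1/0/43
 no channel-group 4
!
! Step 2: re-add both ports with LACP in active mode
interface GigabitEthernet1/0/39
 channel-group 4 mode active
interface GigabitEthernet1/0/43
 channel-group 4 mode active
!
```

If negotiation succeeds, "show etherchannel summary" should list LACP in the Protocol column for group 4 instead of "-", and "show lacp neighbor" should show the ESXi side as the partner on both ports.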
I initially had it set to "channel-group 4 mode active", but it wasn't working. I saw that someone had set it to "channel-group 4 mode on", so I tried that, and at least I could ping my VM. It turns out only 1 NIC seems to work, though.
What is the output of "esxcli network vswitch dvs vmware lacp status get" on the host?
Have you tried setting your load balancing type on the LAG settings to 'Source and Destination IP' instead of just destination?
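On the switch side, the config posted above hashes on destination IP only ("port-channel load-balance dst-ip"). A minimal sketch of the matching change on the 3750, assuming you want both ends hashing on source and destination IP. Keep in mind this is a global setting, so it applies to all four port channels:

```
! Sketch only: global EtherChannel hash setting on the 3750
configure terminal
 port-channel load-balance src-dst-ip
 end
!
! Check which hash method is active
show etherchannel load-balance
```

Note that this only changes how flows are spread across the member links. It should not by itself decide whether traffic fails over when one member port goes down.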