I couldn't seem to get an EtherChannel working properly on an ESXi 5.5 host. I'm not using vCenter, and I was attempting to aggregate the links to a Cisco 3750.
I referred to the KB article below: I configured IP hash on the ESXi host, set the src-dst-ip load-balancing algorithm on the switch, and disabled LACP. However, when I disabled LACP per the guide, I lost connectivity to the ESXi host altogether, even though all of the ports in the EtherChannel showed '(P)' and correctly established a bundle.
If I re-configured the EtherChannel to enable LACP, I could ping and connect to the host, but all of the ports in the EtherChannel then showed '(I)', meaning they were operating independently (not bundled).
So I presume it's the host configuration rather than the switch configuration that is incorrect, but can anyone advise?
KB Article: VMware KB: Sample configuration of EtherChannel / Link Aggregation Control Protocol (LACP) with ESXi...
Switch config:
Cisco IOS Software, C3750 Software (C3750-IPBASEK9-M), Version 12.2(50)SE1, RELEASE SOFTWARE (fc2)
CM_CORE#sh run int po1
Building configuration...
Current configuration : 122 bytes
!
interface Port-channel1
description PG1 to IBM x3650
switchport trunk encapsulation dot1q
switchport mode trunk
end
CM_CORE#sh run int fa1/0/1
Building configuration...
Current configuration : 125 bytes
!
interface FastEthernet1/0/1
switchport trunk encapsulation dot1q
switchport mode trunk
channel-group 1 mode active
end
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SD)       LACP        Fa1/0/1(I)  Fa1/0/2(I)  Fa1/0/3(I)
                                 Fa1/0/4(I)
Does it work with
channel-group 1 mode on (rather than active; see the KB article)?
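For reference, converting the bundle to a static EtherChannel on the 3750 would look something like this (interface names taken from the config above; the global load-balance command is a sketch and should be verified against your IOS release):

```
CM_CORE(config)# interface range fa1/0/1 - 4
CM_CORE(config-if-range)# no channel-group 1
CM_CORE(config-if-range)# channel-group 1 mode on
CM_CORE(config-if-range)# exit
CM_CORE(config)# port-channel load-balance src-dst-ip
```

Note that on the 3750 the load-balance method is a global setting, not a per-port-channel one.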
André
Hi there,
Thanks for the reply. I have actually tried setting it to on, and it doesn't seem to make a difference, I'm afraid. I'll be back on-site this weekend, so I'll see if I can gather some more information. Is there anything in particular I should collect from the host / switch?
Note that at present, although I'm not noticing any particular connectivity issues, the switch is reporting that the ports in the EtherChannel are flapping.
There's basically not much more to it than configuring the physical switch, setting the policy to IP hash, and making sure all vmnics are active. Anyway, what's the reason you are channeling the ports? IMO there's no real benefit in doing this except for very specific configurations where outgoing traffic from VMs goes to a large number of targets.
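On the host side, the IP-hash policy for a Standard Switch can also be set from the ESXi shell with esxcli; the vSwitch name vSwitch0 below is the default and is an assumption:

```shell
# Set the vSwitch load-balancing policy to IP hash (required for a static EtherChannel)
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --load-balancing=iphash

# Verify the policy and confirm all vmnics appear as active uplinks
esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0
```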
André
Hi,
Unfortunately the customer's core switch is 100 Mb/s, and the host, which runs multiple servers, is directly attached to it. Therefore, if the bond is capable of delivering up to four individual 100 Mb/s sessions, the performance should be acceptable. In addition, there is always the resiliency aspect to consider.
Since on this occasion I am only using their free hypervisor product, is there any way I can raise this as a potential bug with VMware?
Is there perhaps an interoperability matrix so I can verify that the switch's firmware is compatible?
It should be possible to open a per-incident case with VMware. However, before doing so I would consider other solutions. I'm no dedicated network guy, but IMO the bond doesn't really have an advantage over multiple individual trunk ports used with the default round-robin assignment (Route based on the originating virtual switch port ID). The default settings distribute the VMs across the uplinks and thus load balance the traffic to a degree.
Another option you may consider is to get a Gigabit switch, connect the host to this switch (again, using the default settings) and create an active LACP/PAgP bond between the physical switches.
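If you do drop the channel and use individual trunk ports instead, the host side simply needs the default policy; a sketch, assuming the default vSwitch name:

```shell
# Revert to the default "Route based on originating virtual port ID" policy
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --load-balancing=portid
```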
André
Hi
I had this issue because the IP hash load-balancing configuration was not propagated to the Management port group.
Try following the procedure described in this KB; it worked in my case:
VMware KB: NIC teaming using EtherChannel leads to intermittent network connectivity in ESXi
Note: To reduce the chance of losing network connectivity, please change the vSwitch load balancing to Route based on IP hash using the method described there.
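If the Management Network port group carries its own policy override, the same setting can be applied at the port-group level from the ESXi shell; the port-group name below is the default and is an assumption:

```shell
# Align the management port group with the vSwitch's IP-hash policy
esxcli network vswitch standard portgroup policy failover set --portgroup-name="Management Network" --load-balancing=iphash
```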
Thanks,
Bayu
Are you using standard switches or dvSwitches and have you enabled LACP via the web client?
The KB article you referenced is quite old. LACP is configured rather differently in vSphere 5.5. The "IP Hash" approach is from the days of static EtherChannels.
First, make sure you have enabled the LACP feature on the VDS uplink port group, which allows the VDS to respond to LACPDUs.
For more details, check out this post I wrote on LACP in 5.5 for details and a video: Exploring Enhanced LACP Support with vSphere 5.5 | Wahl Network
Hi Chris,
Thanks for your input, but I'm unable to access the web interface since I'm using only the free edition of ESXi for this particular installation. Is there a way to access this setting via the VI Client instead? Thank you.
Using LACP requires a vSphere Distributed Switch (VDS) and Enterprise Plus licensing. The feature is also only available from the vSphere Web Client.
If you are using the free edition (and are thus limited to the Standard Switch) your only choice is a static port channel (Cisco IOS "mode on").
The only GUI where you can make this change is the Web Client, but the alternative is to do it via the CLI.
Just to be clear:
While ESXCLI provides many "list"-type commands for the VDS, it is an ESXi host CLI; as such, it cannot configure the required LACP settings on a VDS, because vCenter (the control plane) is needed.
If the OP wants to use a static LAG, either ESXCLI or the vSphere Client should suffice.
So in summary, the information in the first KB is actually valid and is the only method of enabling an EtherChannel on 5.5 without vCenter. The problem is most likely the one described in the KB article Bayu referenced.
Yup. Every vSwitch port group connected to the port channel must be set to "IP Hash" for a static LAG (EtherChannel). Otherwise the MAC address may end up flapping: the host will pick one uplink while the upstream switch picks another, causing traffic to drop.
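A quick way to audit this from the ESXi shell is to list the port groups and check each one's effective failover policy (the port-group name below is an example):

```shell
# List all port groups on the host's standard switches
esxcli network vswitch standard portgroup list

# Check a specific port group's load-balancing setting
esxcli network vswitch standard portgroup policy failover get --portgroup-name="Management Network"
```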
My comments are only in reference to LACP based on your earlier trial and error.