KemperValve
Contributor

ESX 3.5 + ProCurve + NFS/SAN Bonding + Teaming

I am trying to get load balancing across four 1000 Mb adapters. I do get some load balancing, but not the kind I want: I want all four adapters to act as one, so I have a 4 Gb pipe to my NFS/SAN device, which holds hosts. Currently, if I run a disk performance test from the host against the SAN, it's capped at 100 MB/s, and I'd like to see higher if possible. I know the SAN and array are capable of more I/O; it is a 6 TB RAID 10 with a very good RAID controller. I understand that if I had 4 connections to 4 hosts, each connection would go over a separate interface capped at 1000 Mb, but I want a single connection to be balanced across all 4.

This is my setup:

Switch: HP ProCurve 2824

NFS/SAN: Linux disk array running four 1000 Mb adapters in bonded mode (currently mode 0/round robin, although I've tried 4/LACP and 6 with the same results)

Server/Hosts: ESX 3.5 with all 4 NICs configured as a virtual device with load balancing enabled.

For SAN<-->Switch: In Linux bonding I've tried just about every mode (0/4/6). When I use mode 4 (LACP) and set the switch ports to an LACP grouping, it is recognized on both sides of the connection.

For ESX<-->Switch: I've left ESX at IP Hash, and on the switch side I've tried no grouping, LACP grouping, and FEC grouping, all with the same result.

The hardest part is trying to find where the bottleneck actually is, since everything has to be correct. Has anyone had experience getting this functioning? Also, if this is not possible due to the NFS/IP session, could it be done with iSCSI?
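As a rough sanity check on the 100 MB/s figure, the sketch below estimates the best-case TCP payload rate of one gigabit link. It assumes a 1500-byte MTU, plain IPv4 and TCP headers without options, and standard Ethernet framing overhead; the function name and defaults are illustrative only, and NFS/RPC overhead is not modeled.

```python
# Best-case TCP payload rate on a single Ethernet link (back-of-envelope).
# Assumptions: 1500-byte MTU, IPv4 + TCP headers with no options,
# and no NFS/RPC overhead. MB here means 10**6 bytes.

ETHERNET_OVERHEAD = 8 + 14 + 4 + 12  # preamble, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20             # IPv4 header + TCP header


def tcp_goodput_mb_per_s(link_mbit: float = 1000.0, mtu: int = 1500) -> float:
    """Return the theoretical TCP payload rate in MB/s for one link."""
    raw_mb_per_s = link_mbit / 8.0                 # 1000 Mb/s -> 125 MB/s
    payload_bytes = mtu - IP_TCP_HEADERS           # useful bytes per frame
    wire_bytes = mtu + ETHERNET_OVERHEAD           # bytes on the wire per frame
    return raw_mb_per_s * payload_bytes / wire_bytes


if __name__ == "__main__":
    print(f"one link:   {tcp_goodput_mb_per_s():.1f} MB/s")
    print(f"four links: {4 * tcp_goodput_mb_per_s():.1f} MB/s")
```

Under these assumptions a single gigabit link tops out somewhat below 120 MB/s, so a cap around 100 MB/s is in the range one would expect when a single connection stays on a single link.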

Thanks

5 Replies
KemperValve
Contributor

Also, the switch is running the old firmware, which does have an option for FEC (Fast EtherChannel); this was taken out in later firmware versions.

kukacz
Enthusiast

You can't create a single broad pipe for a single IP<->IP connection. The balancing feature works by assigning different IP address pairs to different NIC ports; the same pair always goes through a single NIC. With the ProCurve 2900 I'm able to check the per-port load through its web interface. Doesn't the 2824 have the same feature for you to check?
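To illustrate why one IP pair stays on one NIC, here is a minimal sketch of an XOR-based hash over the source and destination addresses. It is a simplified model, not the exact algorithm used by ESX or the switch, and the function name and IP addresses are made up for the example.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    """Simplified model: XOR both IPv4 addresses, then take modulo the uplink count.

    Real implementations may hash differently; this only shows the principle
    that the choice depends on the address pair alone.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % num_uplinks


if __name__ == "__main__":
    # Hypothetical addresses: one ESX host talking to one NFS server.
    host, nfs_server = "10.0.0.10", "10.0.0.20"
    for attempt in range(3):
        # Every packet of this pair lands on the same uplink index.
        print(attempt, ip_hash_uplink(host, nfs_server, 4))
```

Because the inputs never change for a given pair, neither does the result, no matter how much traffic the pair generates.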

There might be a solution to this problem, depending on which SAN vendor you use. E.g. LeftHand arrays are able to balance across different IP paths, provided you use a VM-level iSCSI initiator and MPIO driver. I've tested it. On the HP switch I had to set basic trunking (non-protocol) instead of LACP. In ESX the balancing method was "IP hash", as you had.

Let us know what storage you're using.

--

Lukas Kubin

fejf
Expert

To make ESX use more than one NIC for traffic you have three load balancing algorithms, but you need to know how they work. E.g. IP hash only gets you load balancing if you use 4 connections with at least one different IP on one end. This means that to get load balancing here you need 4 different target or source IPs in your NFS client to NFS server connection (e.g. 4 NFS server IPs).
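The effect of giving the NFS server several IPs can be sketched as follows. This assumes a simplified XOR-and-modulo hash as a stand-in for the real IP hash policy, and the addresses and helper name are hypothetical; with a real switch or ESX build the mapping may come out differently, so the chosen IPs should be verified against actual port counters.

```python
import ipaddress
from collections import defaultdict


def spread_over_uplinks(client_ip: str, server_ips: list[str], num_uplinks: int) -> dict[int, list[str]]:
    """Group server IPs by the uplink a simplified XOR hash would select.

    Assumption: uplink = (client XOR server) mod num_uplinks. This is a
    model for illustration, not a vendor-documented formula.
    """
    client = int(ipaddress.IPv4Address(client_ip))
    groups: dict[int, list[str]] = defaultdict(list)
    for ip in server_ips:
        uplink = (client ^ int(ipaddress.IPv4Address(ip))) % num_uplinks
        groups[uplink].append(ip)
    return dict(groups)


if __name__ == "__main__":
    # One NFS client IP, four hypothetical NFS server IPs (one per mount).
    layout = spread_over_uplinks(
        "10.0.0.10",
        ["10.0.0.20", "10.0.0.21", "10.0.0.22", "10.0.0.23"],
        4,
    )
    for uplink in sorted(layout):
        print(f"uplink {uplink}: {layout[uplink]}")
```

In this model four consecutive server addresses land on four different uplinks, whereas a single server IP can only ever occupy one. Each mount still rides a single link, so the gain is aggregate throughput across mounts rather than a faster single connection.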

Btw: This is all explained in the VI3-Install and Configure courseware...

--

There are 10 types of people. Those who understand binary and the rest. And those who understand gray-code.
jaredo
Contributor

This is KemperValve as well. I'm aware of the different hashing methods, but every example I have seen of people trying to get this to work used IP hashing. So, would it be possible if I set the NFS/SAN connection to round robin (hitting each slave one by one) and then set VMware to Port ID or MAC hashing? Basically, my question wasn't how to get it working with IP hashing; I'm just trying to find out whether what I want is possible with any method and, if so, which one.

I'm just running a generic Fedora 8 box with 4 gigabit NICs serving NFS (over 16 drives in RAID 10).

KemperValve
Contributor

Has anyone done this on a Cisco switch with Fast EtherChannel?
