VMware Cloud Community
kellino
Enthusiast

Networking: Etherchannel vs. Fault Tolerance

Here's my question. I'm not a network guru, but my understanding is that if you assign several NIC ports to a virtual switch you gain transmit bandwidth (all ports are used for sending), but the same is not true on the receive side, because you need a single MAC and an arbitration layer that can distribute incoming traffic. My assumption is that there is no way to do this other than having both NICs go to the same switch and using EtherChannel trunking.

So now we have a trade-off. If I choose to alternate NICs to different switches for fault tolerance, I am limited to only 1Gb/s on the receive side no matter what I do.

Having odd-numbered ESX hosts go to switch A and even-numbered hosts go to switch B doesn't help much either; how many shops are building a 50% reserve into their ESX capacity reservations?

So, assuming the above is true, I am curious what strategies and practices other shops use to increase the receive pipe beyond 1Gb/s (without going to 10GbE) while not taking on too much exposure to a switch failure event.

Thanks!

8 Replies
jguidroz
Hot Shot

EtherChannel does not double your throughput; it doubles the available bandwidth. Say you have a host with two NICs channeled together. When Host A talks to Host B, it will only use one of the NICs in the EtherChannel. If Host A needs to talk to Host C, it will use the other NIC. So you have double the available bandwidth, but not double the throughput to any single host.
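To picture why, here is a minimal Python sketch of per-conversation link selection (an illustration only, not Cisco's actual EtherChannel hash; the host names and link count are made up):

```python
# Illustration only: per-conversation link selection in a 2-link bundle.
# Every frame of a given src/dst pair hashes to the same member link, so a
# single conversation never gets more than one link's worth of throughput,
# even though the bundle's aggregate bandwidth is doubled.
import zlib

def select_link(src: str, dst: str, num_links: int) -> int:
    """Pick a member link deterministically from the conversation endpoints."""
    return zlib.crc32(f"{src}->{dst}".encode()) % num_links

NUM_LINKS = 2
print(select_link("HostA", "HostB", NUM_LINKS))  # A->B always uses the same link
print(select_link("HostA", "HostC", NUM_LINKS))  # A->C may hash to the other link
```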

Now with ESX, transmit bandwidth does not increase just by adding more NICs to a virtual switch.

If you want to use multiple switches and EtherChannel together, you will need to invest in Cisco 3750 stackable switches.

kellino
Enthusiast

Now I'm really confused. :) I think I understand your point that a single conversation would not necessarily get more bandwidth, but different conversations could be distributed across different NICs, enabling more aggregate network I/O capacity?

As for the transmit side, I'm confused by the statement that "transmit bandwidth does not increase by adding more NICs to a virtual switch." There is a load balancing policy on the virtual switches (we use "route based on IP hash"), so wouldn't transmit traffic be balanced across the physical NIC ports, and therefore couldn't adding more pNICs, in theory, provide a larger pipe?

jbogardus
Hot Shot

I think there is some confusion in referring to Host A, B, and C in jguidroz's response. The load balancing calculation occurs at a per-VM level, not at an ESX host level, so the example should refer to VM A, B, and C.

If the default 'route based on originating virtual port ID' load balancing scheme is used, then each VM on the host has all its traffic directed over one physical NIC, based on the virtual port number its vNIC gets assigned on the virtual switch. The virtual port numbers in use across all the VMs connected to the vSwitch are distributed evenly, so each vNIC ends up pinned to a specific pNIC. The physical switch will associate the VM's MAC address with that one physical link, so traffic returning to that VM will use the same physical link.
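As a rough picture of that pinning, here is a short sketch (the virtual port numbers are made up, and the real ESX assignment logic may differ):

```python
# Sketch of 'route based on originating virtual port ID': each vNIC is pinned
# to one pNIC based on its virtual port number, so VMs are spread across the
# uplinks but any single VM only ever uses one of them.

def uplink_for_port(virtual_port_id: int, num_pnics: int) -> int:
    """Map a vSwitch virtual port number to an uplink (pNIC) index."""
    return virtual_port_id % num_pnics

NUM_PNICS = 2
for vm, port in [("VM-A", 16), ("VM-B", 17), ("VM-C", 18), ("VM-D", 19)]:
    print(f"{vm} (port {port}) -> vmnic{uplink_for_port(port, NUM_PNICS)}")
```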

If the 'route based on IP hash' load balancing scheme is used in combination with EtherChannel configured on the physical switch, then traffic from a VM is directed over a pNIC based on a hash of the VM's IP and the destination IP. So a VM with traffic going to multiple destination IPs can actively utilize multiple pNICs at the same time, if the hash with each different destination IP results in a different pNIC being assigned. The EtherChannel protocol controls the distribution of traffic coming back to the VM, so incoming traffic may return on a different EtherChannel link than the related outgoing traffic used.
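And a comparable sketch for IP hash (the XOR-of-last-octet formula is modeled on VMware's published example, but treat the exact formula as an assumption here; the IPs are placeholders). The point is that one VM talking to several destinations can land on several pNICs:

```python
# Sketch of 'route based on IP hash': the uplink is chosen from a hash of the
# source and destination IPs, so a single VM can use different pNICs for
# different destinations. The physical switch's EtherChannel hash makes the
# equivalent decision for traffic returning to the VM.

def uplink_for_ip_pair(src_ip: str, dst_ip: str, num_pnics: int) -> int:
    """Choose an uplink from the source/destination IP pair (last-octet XOR)."""
    src_last = int(src_ip.split(".")[-1])
    dst_last = int(dst_ip.split(".")[-1])
    return (src_last ^ dst_last) % num_pnics

NUM_PNICS = 2
vm_ip = "10.0.0.25"
for dst in ("10.0.0.40", "10.0.0.41", "10.0.0.42"):
    print(f"{vm_ip} -> {dst}: vmnic{uplink_for_ip_pair(vm_ip, dst, NUM_PNICS)}")
```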

kellino
Enthusiast

I think I'm catching on. Let me restate it this way and see if this makes sense....

A VM (or vNIC) will be assigned one physical NIC (pNIC), so a single VM can never have more than 1Gb/s available to it.

However, there are multiple VMs, so a host can distribute traffic (on a per-VM basis) across all the pNICs upstream from the vSwitch.

And because each VM's MAC is advertised on a specific physical link, there is a level of receive load balancing taking place (again, on a per-VM basis).

So, taking a step back: unless I need a pipe larger than 1Gb/s within a single VM, I really don't need EtherChannel. And if that's the case, it would probably be a best practice to have the pNICs go to alternate switches for fault tolerance at the switch level.

Am I close? :) Thanks!

jbogardus
Hot Shot

Yes, that's basically it.

I attended a VMworld 2008 session given by a VMware employee specializing in the networking functionality. He stated that 'route based on originating virtual port ID' is the most recommended load balancing method, which is why it is the default. The IP hash method is OK, but it has specific physical switch (EtherChannel) configuration requirements, which make it risky if configured or maintained incorrectly. IP hash would allow a single VM to potentially communicate with multiple hosts over multiple physical links, but in most environments this isn't really necessary.

Yes, when you have a redundant configuration of physical switches, the redundant pNICs for the virtual switch should be distributed across multiple switches.

kellino
Enthusiast

Thanks, that makes sense. BTW, the reason we are using "route based on IP hash" is that about 18 months ago we had a problem where one of our Cisco switches was not updating its ARP table for some reason, such that when a vMotion took place the VM would get "lost" on the network. In working with VMware support at the time, they had us change to IP hash.

Yes, I believe originating port ID is the default, partly, I think, because it has less overhead.

Since our Cisco switches have been replaced since then, I should probably think about moving back to "originating port ID" at some point.

Thanks for the help.

Ken_Cline
Champion

Check out my blog (it's in my signature) - I've got a six-part (so far) series dealing with virtual networking. This topic is covered in one of the early parts...

Ken Cline

VMware vExpert 2009

VMware Communities User Moderator

Blogging at: http://KensVirtualReality.wordpress.com/

kellino
Enthusiast

Ken -- Thank you so much. I really enjoyed your "vSwitch debate" series and found it very informative. Great job, and thanks for the tip!
