VMware Cloud Community
patk008
Contributor

NIC Teaming connected to two Cisco core switches (4506)

Thanks for your help.

Our Virtual Infrastructure 3 environment has the following configuration:

- 1 ESX 3.0.1 host

- 1 pNIC dedicated to the service console

- 1 pNIC dedicated to VMotion

- 6 pNICs dedicated to the VM network

- 2 Cisco core switches connected together by a trunk; all user PCs are connected across both core pSwitches.

I want the ESX host to remain available if one of the Cisco pSwitches fails. Will it work if I make the following configuration on the ESX host and pSwitches?

- Create one vSwitch, add those 6 pNICs to it as Active Adapters, and enable teaming with IP hash (a rough sketch of the ESX-side commands is below).

- Connect 3 of the 6 pNICs to the first Cisco switch, with trunk ports configured on the pSwitch using src-dst-ip hash.

- Then connect the second 3 of the 6 pNICs to the second Cisco switch, again with trunk ports configured on the pSwitch using src-dst-ip hash.
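
For reference, this is roughly the ESX-side setup I have in mind, run from the service console (vSwitch1 and vmnic2-vmnic7 are just example names, check yours first; the IP-hash teaming policy itself is set in the VI Client, not with these commands):

----

esxcfg-nics -l                            # list the pNICs and their link state
esxcfg-vswitch -a vSwitch1                # create the vSwitch for the VM network
esxcfg-vswitch -L vmnic2 vSwitch1         # link the six VM-network pNICs as uplinks
esxcfg-vswitch -L vmnic3 vSwitch1
esxcfg-vswitch -L vmnic4 vSwitch1
esxcfg-vswitch -L vmnic5 vSwitch1
esxcfg-vswitch -L vmnic6 vSwitch1
esxcfg-vswitch -L vmnic7 vSwitch1
esxcfg-vswitch -A "VM Network" vSwitch1   # port group for the guests
esxcfg-vswitch -l                         # verify the uplinks and port groups
# Then, in the VI Client, set the vSwitch NIC teaming policy to
# "Route based on ip hash" with all six adapters Active.

----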

Any problems with this configuration?

Is it true that there are two paths to the ESX host? Any problem with that?

What is the traffic flow path if a user on the first core switch connects to a VM guest on the ESX host?

What is the traffic flow path if a user on the second core switch (switch B) connects to a VM guest on the ESX host?

Any recommendations?

What is the difference if I instead configure the second 3 of the 6 pNICs as Standby Adapters on the vSwitch and connect them to the second Cisco switch? What will happen if one of the active adapters fails and a standby adapter becomes active? Will there then be two paths to the ESX host?

As far as I know, there would be a problem in a purely physical switch environment if I connected all 3 physical switches together with trunks: a spanning-tree loop may occur if no special configuration is set.

i.e. Switch A -- connects to -- Switch B -- connects to -- Switch C -- connects to -- Switch A

I'm not a network guy and I'm also new to ESX, so I got confused with the above configuration. Thanks for your help.

Patrick

12 Replies
jvde
Contributor

Hi,

A single 802.3ad link aggregation (EtherChannel) cannot span two separate Cisco switches.

This means that if one of your Cisco switches goes down, failover is not guaranteed with that setup.

dinny
Expert

Hi Patrick,

I believe there are various ways you could achieve what you want to.

The following presentation gives a pretty good overview:

http://download3.vmware.com/vmworld/2006/tac9689-b.pdf

My understanding is that virtual switches do not cause spanning-tree (or other) loops, so that should not be an issue.

Personally I just use 4 pNICs for my VM networks, each with the load-balancing policy set to route based on the originating port ID.

Two NICs are connected to one Cisco switch, and two to another.

I use VLAN tagging, so I set up 802.1q trunking on the Cisco switches for the relevant VLANs - but I suspect you do not need to bother with trunking if you are not using VLAN tagging?

Setting portfast (or portfast trunk, for trunk ports) on the Cisco ports does seem worthwhile though.
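
If you do go with VLAN tagging (VST), the ESX side is just tagged port groups on the vSwitch - roughly like this (the port group names and VLAN IDs 100/200 are only examples, substitute your own):

----

esxcfg-vswitch -A "VLAN100" vSwitch1          # port group for the first VLAN
esxcfg-vswitch -v 100 -p "VLAN100" vSwitch1   # tag it with VLAN ID 100
esxcfg-vswitch -A "VLAN200" vSwitch1          # port group for the second VLAN
esxcfg-vswitch -v 200 -p "VLAN200" vSwitch1   # tag it with VLAN ID 200
esxcfg-vswitch -l                             # check the port group / VLAN mapping

----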

Hope this helps

Dinny

bertdb
Virtuoso

Indeed, vSwitches don't do spanning tree, and don't need to, because they will _never_ forward a packet that came in from one physical switch back out to another physical switch.

(NB: vSwitch uplink communication is always with physical switches, since virtual switches don't exchange packets with each other directly.)

patk008
Contributor

Thanks all for your help, all your info is very helpful.

I have read through that doc before. If I use 802.1q trunking to connect our two Cisco core switches, the "out-ip" hash is noted as not recommended by that doc.

What is the best hash method on the Cisco switches and the ESX host for my setup?

Does this need to be the same on the Cisco switches and the ESX host?

According to my setup, there are physically four paths from the pSwitches to the vSwitch:

- pSwitch A to vSwitch

- pSwitch B to vSwitch

- pSwitch A to pSwitch B, then vSwitch

- pSwitch B to pSwitch A, then vSwitch

Is that true? Any problem with it?

Thanks

Patrick

meistermn
Expert

Hi, look at the following URL for VMware / physical switch load balancing:

http://virtrix.blogspot.com/

One of the most difficult (and almost undocumented) features of ESX is configuring your switch for assisted load balancing on a VM Network vSwitch with more than 1 pNIC. You should be aware of the fact that ESX supports 802.3ad static mode only (EtherChannel, no LACP).

In essence, you need 2 things:

1. A load-balancing scheme on your vSwitch port group that matches the switch configuration. The trick here is to set up your vSwitch load-balancing policy to be compatible:

src-mac, dst-mac, src-dst-mac = MAC hash

src-ip, dst-ip, src-dst-ip = IP hash

For Cisco Catalyst switches, issue the show etherchannel load-balance command. This should report something like src-dst-ip.

2. A VLAN trunk port on your switch when using different VLANs (VST mode) for your virtual machines.

To achieve this, you need to configure your switch (example for a Catalyst running IOS, creating an EtherChannel for 3 pNICs):

----


interface port-channel1

description VMware ESX - Trunk A

switchport trunk encapsulation dot1q

switchport trunk allowed vlan 100,200 (= VLANs to be assigned)

switchport mode trunk

switchport nonegotiate (=ESX does not support DTP (dynamic trunking protocol). So when you configure a trunk port, set it to nonegotiate)

spanning-tree portfast trunk

!

exit

!

interface GigabitEthernet1/1

description VMware ESX - Trunk A - NIC 0

switchport trunk encapsulation dot1q

switchport trunk allowed vlan 100,200 (= VLANs to be assigned)

switchport mode trunk

switchport nonegotiate (=ESX does not support DTP (dynamic trunking protocol). So when you configure a trunk port, set it to nonegotiate)

spanning-tree portfast trunk

channel-group 1 mode on

!

exit

!

interface GigabitEthernet1/2

description VMware ESX - Trunk A - NIC 1

switchport trunk encapsulation dot1q

switchport trunk allowed vlan 100,200 (= VLANs to be assigned)

switchport mode trunk

switchport nonegotiate (=ESX does not support DTP (dynamic trunking protocol). So when you configure a trunk port, set it to nonegotiate)

spanning-tree portfast trunk

channel-group 1 mode on

exit

!

interface GigabitEthernet1/3

description VMware ESX - Trunk A - NIC 2

switchport trunk encapsulation dot1q

switchport trunk allowed vlan 100,200 (= VLANs to be assigned)

switchport mode trunk

switchport nonegotiate (=ESX does not support DTP (dynamic trunking protocol). So when you configure a trunk port, set it to nonegotiate)

spanning-tree portfast trunk

channel-group 1 mode on

Look at the whitepaper:

http://www.vmware.com/pdf/esx3_vlan_wp.pdf and the very helpful FAQ

patk008
Contributor

Thanks for your detailed info, meistermn. I got it.

And there is a lot of useful info on your site. Thanks again.

Patrick

letoatrads
Expert

Just for future reference: if you find a poster's info helpful, feel free to assign helpful and correct points before marking the thread answered, as you can't assign points once the question is marked answered.

patk008
Contributor

Hi meistermn, all

Thanks for your help.

I have set up the EtherChannel trunk according to your suggestion:

- Both pSwitches and the vSwitch use src-dst-ip hash.

- An EtherChannel trunk port-channel was created on each pSwitch.

But I found that I get packet loss (about 3%) when I ping from my PC to a VM, while no packet loss is seen when pinging from the VM to a physical server.
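
In case it helps narrow things down, these are the basic checks I plan to run on the ESX service console (just the commands, I have not pasted any output):

----

esxcfg-nics -l       # confirm all six pNICs show link up at the expected speed/duplex
esxcfg-vswitch -l    # confirm all six vmnics are listed as uplinks on the VM-network vSwitch
esxtop               # then press 'n' for the network view to watch per-vmnic packet counters

----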

Do you have any ideas?

Thanks

Patrick

mprigge
Enthusiast

Hi Patrick,

I've wrestled with this problem a lot. The short answer is that you cannot use IP Hash load balancing with two different pSwitches attached to the same vSwitch at the same time. The reason for this is that the two pSwitches will both, at some point or another, end up with the same guest's MAC address in their mac address tables at the same time. IP Hash load balancing depends upon both the switch and the host knowing which NIC traffic from a given IP address is going to arrive on. The balancing algorithm depends upon each "side" of a given channel to think it has the same number of members - if they don't the algorithm won't be applied correctly. If traffic arrives at an ESX host on a pNIC that it isn't expecting it on, it will throw it away. That's the loss you're seeing. I'm not really describing this in very much detail, but the bottom line is that you can't do what you're trying to do how you're trying to do it.

*however*

You can do it if you configure half your pNICs (specifically the half going to one pSwitch) as standby adapters. Leave all of your trunking/port-channel configuration as it is and leave IP hash turned on on your vSwitch. What this will do is force all of your traffic down three of the NICs and through one of your switches. I know that's not ideal, but it does still give you the benefit of IP hash load balancing. Should that pSwitch fail, the three NICs attached to the other switch should come up and you should be no worse off.

Hopefully that makes sense. I've arrived at this conclusion after trying to make multi-pSwitch, multi-pNIC configurations work, and this is the best I've been able to come up with that is both high-redundancy and high-throughput. I have it running in a number of different installations and know it works, so you shouldn't have too many problems with it unless you have something else going wrong also. :-)

HTH - Matt

patk008
Contributor

Hi Matt,

Thanks for your help.

I have tried your solution: 3 NICs as Active on pSwitch A and 3 NICs as Standby on pSwitch B. But I found that when one of the Active NICs fails, one of the Standby NICs changes state from Standby to Active, and this behaviour brings back the packet loss problem, because both pSwitches then have an "active connection". Please correct me if I'm wrong.

And I have another question about the pSwitches' EtherChannel configuration.

A VMware support engineer asked me to use "spanning-tree portfast disable" while testing the packet loss issue. That didn't solve the problem when I tested it.

I've found that most users on this forum suggest using "spanning-tree portfast trunk".

Which one is correct? Is there any difference?

Thanks again.

Patrick

mprigge
Enthusiast

That's absolutely true, and I should have mentioned that. That is a serious drawback. If you're trying to find a method of obtaining *switch* redundancy, it shouldn't matter though. You'll rarely, if ever, encounter a single failed port on a switch without the whole thing going bad (especially with a higher-end switch like the 4506). It's much more likely that someone will mistakenly unplug one of the team members or you'll get a bad cable, and there's not much you can do about that.

As far as enabling multi-vlan portfast or disabling it, I can't imagine that that will help (or hurt, really). The problem is not actually a spanning tree problem at all. The problem is that sooner or later both switches will think that they have a local path to the same VMK/SC/VM MAC address and will route inappropriate traffic down one of their own links to it. You cannot have a single etherchannel spread among multiple switches - it just doesn't work. Though the reason it doesn't work doesn't have anything to do with STP. So, you can try it, but I don't imagine it will do anything.

The most loss-resistant way to solve this problem is to simply not use IP Hash load balancing at all. Remove all of the port-channel configuration from both of your pSwitches (leave the trunking configuration) and switch the ESX host over to PortID load balancing. You'll give up the ability to spread traffic from or to a single VM over multiple pNICs, but you won't lose traffic even if you lose a single NIC or all of them except for one. So you'll gain stability at the expense of performance.

It's really a decision for you to make about which you feel more comfortable with. I use the standby/active method because I can be reasonably certain that a single NIC will not experience a failed link in my environment. Use your own judgment on how you feel about that in your own environment. It also so happens that I have a few VMs that really need the performance of more than a single NIC, so I don't have much choice until 10G support is available.

patk008
Contributor

Hi Matt,

Thanks for your detailed explanation, I get your meaning.

I think it would be good if ESX were able to handle a "true" active/standby mode (only making a standby NIC active once all of the active NICs have failed).

Thanks again.

Patrick
