LarryBlanco2
Expert

ESXi 4 U1 & Etherchannel = Trouble

This is long and confusing, but I will try to put it as simply as possible.

I have an ESXi 4 U1 box with 8 pNICs. The pNICs are supposed to be spread out as follows:

5 on one EtherChannel (Port Channel 10) for mgmt and VM data

3 on one EtherChannel (Port Channel 11) for IP storage

The Cisco switches are set up with static port channels (no LACP)

Dummy native VLAN of 4094

Passing VLANs 1, 10, 20 on port channel 10

If I try to add either 1 or all 5 pNICs to the mgmt network for the box, I have no connectivity at all.

If I try adding just 1 pNIC, again there is no connectivity on the mgmt network. I can't ping the box.

I removed 1 pNIC from port channel 10 and made it a standalone trunk port, still passing VLANs 1, 10, 20.

I can now connect to the box over that 1 connection with the same IP settings that I used on the prior bond.

It also allows me to configure a new virtual switch (vSwitch1) with the other 4 pNICs using port channel 10, and I do get connectivity to the 3 VLANs.

I created another virtual switch (vSwitch2), added the NICs on port channel 11, and am able to connect to my IP storage VLAN.

I am able to ping into the box from other VLANs, to test vmkernel interfaces I created for the 3 VLANs. So far this config is allowing me to use EtherChannel. I can even attach to the box with the vSphere Client from addresses on VLANs 10 and 20 that are on the port channels.

So now, since I have connectivity from other subnets, I try to reconfigure the 1 pNIC and place it back into port channel 10. As soon as I remove that 1 pNIC, even though I have other mgmt networks, I lose connectivity to the entire box. I can no longer ping any of the IPs on any of the VLANs the box has access to.

If anyone has any idea as to why this occurs, I would greatly appreciate it.

Thank you,

LB

Please do not forget to award point for helpful answers.
s1xth
VMware Employee

Just a thought: have you changed the load balancing on the vSwitch to "Route based on IP hash" to match your Cisco configuration?

http://www.virtualizationimpact.com http://www.handsonvirtualization.com Twitter: @jfranconi
LarryBlanco2
Expert

Yes, thank you.

The vSwitch on the ESXi box is set to IP hash, as is the Cisco switch, which uses the "src-dst-ip" algorithm. A tad more info on the Cisco side:

The switch is a 4507R with a Sup 2+ engine.
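
As a rough illustration of what an IP-hash policy does, the Python sketch below XORs the source and destination IPv4 addresses and takes the result modulo the number of uplinks. This is a simplified model under stated assumptions, not the exact algorithm that either the vSwitch or the 4507R runs, and `pick_uplink` is a made-up name used only for this example.

```python
import ipaddress


def pick_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Return the index of the uplink a src/dst IP pair would map to.

    Simplified model (assumption): XOR the two 32-bit addresses, then
    take the remainder after dividing by the number of uplinks.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# The same pair always lands on the same link, in either direction.
print(pick_uplink("10.0.0.1", "10.0.0.2", 5))
print(pick_uplink("10.0.0.2", "10.0.0.1", 5))
```

The point of the model is that a given conversation stays pinned to one link, so a single source/destination pair never exceeds the speed of one member NIC, while different pairs can spread across the channel.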

Larry

Rumple
Virtuoso

Can you post one of the EtherChannel configs?

LarryBlanco2
Expert

Below is what is configured on the switch for the EtherChannel to the ESXi box.

port-channel load-balance src-dst-ip

!
interface Port-channel10
 description 802.1Q to ESXi1
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
 spanning-tree portfast trunk
!
interface GigabitEthernet2/20
 description Member Po10
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
 channel-group 10 mode on
 spanning-tree portfast trunk
!
interface GigabitEthernet2/21
 description Member Po10
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
 channel-group 10 mode on
 spanning-tree portfast trunk
!
interface GigabitEthernet3/20
 description Member Po10
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
 channel-group 10 mode on
 spanning-tree portfast trunk
!
interface GigabitEthernet4/20
 description Member Po10
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport mode trunk
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 channel-group 10 mode on
 spanning-tree portfast trunk
!
interface GigabitEthernet5/20
 description Member Po10
 switchport trunk encapsulation dot1q
 switchport nonegotiate
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
 channel-group 10 mode on
 spanning-tree portfast trunk
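
With a static channel, every member port needs the same trunk settings as the port channel itself, so it can be worth checking a pasted config mechanically. Below is a minimal Python sketch that does this for text in the style shown above. The function names and the shortened `SAMPLE` config are hypothetical and exist only for this example; the check only compares `switchport` lines and ignores their order.

```python
def parse_interfaces(config: str) -> dict[str, list[str]]:
    """Group config lines under the 'interface ...' line they follow."""
    blocks: dict[str, list[str]] = {}
    current = None
    for raw in config.splitlines():
        line = raw.strip()
        if not line:
            continue                      # tolerate blank lines from copy/paste
        if line == "!":
            current = None                # '!' closes the current block
        elif line.startswith("interface "):
            current = line.split(None, 1)[1]
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return blocks


def trunk_settings(lines: list[str]) -> frozenset[str]:
    """Keep only the 'switchport ...' lines; a set ignores their order."""
    return frozenset(l for l in lines if l.startswith("switchport "))


def mismatched_members(config: str, group: int) -> list[str]:
    """Names of channel members whose trunk settings differ from the Po."""
    blocks = parse_interfaces(config)
    reference = trunk_settings(blocks[f"Port-channel{group}"])
    marker = f"channel-group {group} mode"
    return sorted(
        name for name, lines in blocks.items()
        if any(l.startswith(marker) for l in lines)
        and trunk_settings(lines) != reference
    )


# Shortened, made-up sample: Gi2/21 is missing the native VLAN line.
SAMPLE = """
interface Port-channel10
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 switchport mode trunk
!
interface GigabitEthernet2/20
 switchport mode trunk
 switchport trunk allowed vlan 1,10,20
 switchport trunk native vlan 4094
 channel-group 10 mode on
!
interface GigabitEthernet2/21
 switchport trunk allowed vlan 1,10,20
 switchport mode trunk
 channel-group 10 mode on
"""

print(mismatched_members(SAMPLE, 10))
```

Run against the full config posted above, a check like this should report no mismatches, since GigabitEthernet4/20 differs from the other members only in line order.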

s1xth
VMware Employee

One thing that jumps out at me is the native VLAN line. I have never used that in my ESXi EtherChannel configs. I would take that out and see what happens. Another thought is to set the ports that are members of that port channel to "channel-group * mode desirable" to see if you have connectivity, then change it to "on" to see if connectivity stays. Make sure all of your port groups, not just the vSwitch, are set to the correct load balancing method. I have seen a lot of people get caught by that; sometimes the vSwitch settings don't get applied to all port groups.

One other thing I noticed is that you are tagging VLAN 1. I don't believe you can use VLAN 1 on ESX, as that is considered an excluded VLAN, but I could be wrong. I never use VLAN 1.

Rumple
Virtuoso

I think you need to set the native VLAN on that port channel 20 trunk to one that will never pass over that trunk so that all packets are tagged; otherwise ESX doesn't tag the packets.

I also don't see the allowed VLANs...

Rumple
Virtuoso

Here is a good site on configuring port channels, etc. I have used it successfully for multiple farm setups.

http://blog.scottlowe.org/2006/12/04/esx-server-nic-teaming-and-vlan-trunking/

LarryBlanco2
Expert

The native VLAN here is used to pass data traffic. I know that ESX and ESXi do not accept untagged traffic when using "Route based on IP hash", and since VLAN 1 natively goes untagged, I needed to change it.
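
To make the reasoning about the dummy native VLAN concrete, here is a small Python sketch of which VLANs leave an 802.1Q trunk tagged. It assumes the usual default where only the trunk's native VLAN is sent untagged and the switch is not configured to tag it; `egress_tagging` is a hypothetical helper name, not a switch or ESXi command.

```python
def egress_tagging(allowed_vlans: set[int], native_vlan: int) -> dict[int, str]:
    """Map each allowed VLAN to 'tagged' or 'untagged' on the trunk.

    Assumption: only frames in the native VLAN are sent without a tag.
    """
    return {
        vlan: "untagged" if vlan == native_vlan else "tagged"
        for vlan in sorted(allowed_vlans)
    }


# Default native VLAN 1: VLAN 1 traffic reaches the host untagged.
print(egress_tagging({1, 10, 20}, native_vlan=1))

# Dummy native VLAN 4094: all three carried VLANs arrive tagged.
print(egress_tagging({1, 10, 20}, native_vlan=4094))
```

Because 4094 is not in the allowed list, nothing on the trunk is ever untagged, which is the effect the dummy native VLAN is meant to have.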

I did see that the port group for mgmt was set to "Route based on the originating virtual port ID". You may be right there. That may be what got me. I will give that a try. Now I need to wait for the Cisco guy to get in. I would change it myself, but I don't hold those keys.

Thanks.

LarryBlanco2
Expert

Yup, I used his site as reference. Great information on there.

Larry B.

s1xth
VMware Employee

Excellent, that could definitely be it. Let us know if you find a solution. Scott's post is very good on this configuration, and I used it on my first configuration also. There are some conflicting comments from others on there, though, about the native VLAN command not being needed in a config; that's why I mentioned it. In your specific configuration, it sounds like you do need it.

Here is another good link on the configuration:

http://www.booches.nl/2008/05/05/port-channel-configuration-for-vmware/

Rumple
Virtuoso

Ya, those packet pushers get pretty upset when you play in their sandbox 🐵

LarryBlanco2
Expert

I believe part of it was the port group not inheriting the vSwitch load-balancing config of "IP hash". I still wasn't able to get the mgmt port group to come up on VLAN 1, tagged or untagged, no matter what combination I tried.

So what did I do? What anyone else would do: use a different VLAN for mgmt, and problem solved.

After the port channel was up and running, I went as far as having the networking team shut down individual links on the port channel, and everything kept working as if nothing had happened. Great failover and a poor man's load balancing. I do wish we would go with Enterprise Plus and get the 1000V on there to have LACP. :)

Wishful thinking. Maybe if VMware lowers their prices, heheheh, we can get it.

Thanks again!

Larry

Rumple
Virtuoso

Well... you actually do not want LACP.

True EtherChannel is highly recommended over LACP because you also get link aggregation without doing any real VMware-specific NIC failover configuration.

LarryBlanco2
Expert

You're right. I actually meant EtherChannel. Link aggregation is what I would prefer. I've got 3750s for IP storage and a 4507 for data, and all are capped to the 1 Gbit link speed because there is no true EtherChannel for ESX. Arggg... :)

Larry

LarryBlanco2
Expert

I wanted to post my resolution for the issue. In the simplest of terms: ESXi 4 has no way of automatically setting itself to "Route based on IP hash", so it must be configured manually beforehand if your switch is already configured for it.

This was my issue: I was not able to connect to the network using EtherChannel because the hash algorithms were not the same. The switches were set to IP source/destination hashing and the ESXi box defaults to "Route based on the originating virtual port ID".

After initially setting up all the IP config and activating all the NICs on the port channel from the ESXi management interface, I proceeded to the unsupported CLI. There I logged on to the box and ran the following commands.

vim-cmd hostsvc/net/vswitch_setpolicy --nicteaming-policy=loadbalance_ip vSwitch0

and

vim-cmd hostsvc/net/portgroup_set --nicteaming-policy=loadbalance_ip vSwitch0 'Management Network'

and for good measure

vim-cmd hostsvc/net/refresh

For some reason, as stated by s1xth, the vSwitch settings do not propagate to the default 'Management Network' port group. Therefore the setting of the nic policy for the port group was required.

As soon as I did that, everything started communicating. I then proceeded to do it 3 more times for my other ESXi 4 servers, and it all worked, and I got my DR documentation down pat.
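
For repeating this across several hosts or port groups, a small Python helper can print the command lines to paste into each host's console. This is only a sketch: it reuses the three `vim-cmd` calls quoted above, assumes the option is spelled with two plain ASCII hyphens, and `ip_hash_commands` is a hypothetical name. It does not connect to a host or run anything.

```python
import shlex


def ip_hash_commands(vswitch: str, port_groups: list[str]) -> list[str]:
    """Build the command lines that set IP hash on a vSwitch and its port groups."""
    policy = "--nicteaming-policy=loadbalance_ip"
    commands = [
        f"vim-cmd hostsvc/net/vswitch_setpolicy {policy} {shlex.quote(vswitch)}"
    ]
    for group in port_groups:
        # Port groups do not reliably inherit the vSwitch policy, so set each one.
        commands.append(
            f"vim-cmd hostsvc/net/portgroup_set {policy} "
            f"{shlex.quote(vswitch)} {shlex.quote(group)}"
        )
    commands.append("vim-cmd hostsvc/net/refresh")
    return commands


for line in ip_hash_commands("vSwitch0", ["Management Network"]):
    print(line)
```

`shlex.quote` adds the single quotes around names with spaces, such as the default management port group, so the printed lines can be pasted as they are.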

Hope this helps someone in the future!

Thanks,

Larry B.

s1xth
VMware Employee

Larry...

Through some of my trial and error, I actually found that you can put the port channel into 'desirable' mode. This way the ports are active and will still pass the VLAN tags, but the port channel won't actually be used. Hop on the server and change the vSwitch, management network and any other port groups to IP hash. Go back to the switch and set the port channel mode to "on", and the port channel will come up and force the IP hash load balancing.

It ended up being faster for me doing it this way, but I have access to my Cisco 4507 switch so I can make those changes.

Just a thought, glad you got it all working!!!
