ewoodworth
Contributor

NIC teaming isn't working

I'm hoping somebody can help me with this, as NIC teaming appears to do nothing for me. Here's my config:

vSwitch0 has:

1 Virtual Machine port group

1 Service console port group

2 vmnics connected to the same VLAN on different Catalyst 6500-class switches

vSwitch0 NIC teaming is set up like this:

Load balancing: Route based on originating virtual port ID

Network failover detection: Link status only

Notify switches: Yes

Failback: Yes

Active adapters: vmnic0, vmnic1

The port group for the virtual machines has exactly the same NIC teaming settings as vSwitch0.

I have 4 Linux boxes running on this server, and all network traffic passes out a single vmnic. According to the performance charts, vmnic1 carries 100% of the traffic and vmnic0 carries none. I can restart the machines, or move them off the box and back on; nothing changes. Only one vmnic is ever used.

What's wrong with my config?

17 Replies
ewoodworth
Contributor

Thanks in advance for any help!

Rumple
Virtuoso

If you have 6500-class switches, are you using them in a virtual stacked configuration so that you are doing EtherChannel across the chassis? I believe the 6500 series does support that configuration.

What is your physical port configuration on the Cisco side? Have you configured the ports as a port group/EtherChannel with mode on/active? (Forgive me if my terms are off; I know just enough about Cisco to be dangerous.)

Check your load-balancing method on the switches:

show etherchannel load-balance

Have you looked at this: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100404...

AWo
Immortal

Can you switch to the other physical NIC? What happens then: does a failover occur? Does everything work as expected?


AWo

VCP 3 & 4

\[:o]===\[o:]

=Would you like to have this posting as a ringtone on your cell phone?=

=Send "Posting" to 911 for only $999999,99!=

vExpert 2009/10/11 [:o]===[o:] [: ]o=o[ :] = Save forests! rent firewood! =
rickardnobel
Champion

If you have 6500-class switches, are you using them in a virtual stacked configuration so that you are doing EtherChannel across the chassis? I believe the 6500 series does support that configuration.

Should he really have to set up an EtherChannel if he is using port-based load balancing on the ESX side?

My VMware blog: www.rickardnobel.se
weinstein5
Immortal

No, port-based load balancing does not require EtherChannel; only IP hash requires EtherChannel.
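A rough illustration of why: with port-based balancing each virtual port is pinned to a single uplink, so a VM's MAC address only ever shows up on one physical switch port. With IP hash the same VM may use different uplinks for different destinations, so the physical switch has to treat the uplinks as one logical link. The sketch below is a simplified model and not ESX's actual code; the modulo and XOR rules, the function names, and the addresses are assumptions made for the example.

```python
import ipaddress


def uplink_by_port_id(port_id, uplinks):
    """Simplified model: pin a virtual port to one uplink by its port ID."""
    return uplinks[port_id % len(uplinks)]


def uplink_by_ip_hash(src, dst, uplinks):
    """Simplified model: pick an uplink per source/destination IP pair."""
    key = int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))
    return uplinks[key % len(uplinks)]


uplinks = ["vmnic0", "vmnic1"]

# Port-based: the choice depends only on the virtual port, never on the destination.
print("port 7 ->", uplink_by_port_id(7, uplinks))

# IP hash: one VM (10.0.0.5) talking to two destinations can use both uplinks.
for dst in ("10.0.0.10", "10.0.0.11"):
    print("10.0.0.5 ->", dst, "via", uplink_by_ip_hash("10.0.0.5", dst, uplinks))
```

In this model the VM at 10.0.0.5 reaches one destination via vmnic1 and the other via vmnic0, which is the situation a physical switch can only handle if the two ports are bundled.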

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

weinstein5
Immortal

What are the 4 VMs doing? Can you start a continuous ping on all 4 VMs and see whether traffic starts going out both NICs?

rickardnobel
Champion

No, port-based load balancing does not require EtherChannel; only IP hash requires EtherChannel.

Yes, so the switch config should be quite easy.

Ewoodworth, do you have access to the command-line interface of the two switches? Could you run, for example,

show interface fa0/1 (or whatever the ports are called in your setup) on the link you believe is passive, and see whether it really shows no frames entering or leaving?

ewoodworth
Contributor

Failover does appear to work. I kept a ping going against the service console and then plugged and unplugged the cables in the vmnics, and I never lost a ping. So failover seems fine. It's only the load balancing that doesn't appear to work.

The 4 machines were all Ubuntu boxes running off the same image, and they all needed a large amount of patching, so I let them all download the same patches as my initial traffic load. I then had them all run the speed test at www.speakeasy.net at the same time. Then I used a tiny script to loop through wgets against a local website. So I tried generating traffic in a couple of quick-and-dirty ways, and it all went out the same vmnic.

rickardnobel
Champion

Failover does appear to work. I kept a ping going against the service console and then plugged and unplugged the cables in the vmnics, and I never lost a ping.

The Service Console on the same ESX host? Is the vswif port on the same vSwitch as your VMs? If so, the frames will never leave the virtual switch and reach the physical network.

If that is the case, could you ping something external and repeat the test of disconnecting the cables?

Rumple
Virtuoso

It's been a while, but with port-based load balancing and both NICs set to active, once the VM starts it picks a pNIC to use. Once that pNIC dies, don't you drop your VM until it's restarted, and then it will start using the other active NIC?

With active/standby I know it will naturally fail over, but with active/active and no EtherChannel I thought that is how it works.

I've been wrong before, though.

rickardnobel
Champion

Hello, I think it works this way with port-based load balancing:

When the VM starts, it is assigned to one pNIC and sends all its traffic through that. The physical switch detects the VM's MAC address on this port and sends all frames directed to this MAC through this port.

If one of the pNICs or links goes down, ESX will just "move" the VM to the other pNIC. The switch should just think "OK, this MAC has moved to another port, nothing strange with that" and begin to forward the frames through the new port.
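The behaviour described above can be sketched as a toy model. This is only an illustration of the idea, not ESX code: the `Team` class, the modulo placement rule, and the choice of the first surviving uplink on failover are all assumptions made for the example.

```python
class Team:
    """Toy model of port-based teaming with link-status failover."""

    def __init__(self, uplinks):
        self.order = list(uplinks)
        self.link_up = {name: True for name in uplinks}
        self.pinned = {}  # VM name -> uplink currently carrying its traffic

    def _active(self):
        return [u for u in self.order if self.link_up[u]]

    def power_on(self, vm, port_id):
        # Assumed rule: the virtual port ID selects one of the active uplinks.
        active = self._active()
        if not active:
            raise RuntimeError("no active uplink")
        self.pinned[vm] = active[port_id % len(active)]

    def link_down(self, uplink):
        # Only VMs pinned to the failed uplink are moved; the physical switch
        # then simply relearns their MAC addresses on the new port.
        self.link_up[uplink] = False
        active = self._active()
        for vm, current in self.pinned.items():
            if current == uplink:
                self.pinned[vm] = active[0] if active else None


team = Team(["vmnic0", "vmnic1"])
for port_id, vm in enumerate(["vm1", "vm2", "vm3", "vm4"]):
    team.power_on(vm, port_id)
print(team.pinned)   # VMs spread over both uplinks

team.link_down("vmnic0")
print(team.pinned)   # every VM now on vmnic1, none restarted
```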

ewoodworth
Contributor

"Hello, I think it works this way with port-based load balancing..."

Yes, I think it works that way as well, but on a per-machine basis. So VM1 might end up on vmnic0 and then stay there so long as vmnic0 doesn't fail. But VM2 could end up on vmnic1 and then stay on vmnic1 so long as that doesn't fail.

So if I have 4 machines on the same box, at least 1 of them should be on vmnic0. Instead I have everything running on vmnic1.
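To put a rough number on how surprising that is: if each VM were placed on an uplink independently and uniformly at random (an assumption made only for this estimate; real placement follows the virtual port ID and is not random), the chance of all four landing on vmnic1 would be small but not negligible. The function name below is hypothetical.

```python
from fractions import Fraction


def p_all_on_given_uplink(vms, uplinks):
    """Chance that every VM lands on one particular uplink,
    assuming independent, uniformly random placement."""
    return Fraction(1, uplinks) ** vms


p_vmnic1 = p_all_on_given_uplink(4, 2)
p_any_single = 2 * p_vmnic1  # all four on vmnic0 or all four on vmnic1

print(p_vmnic1)      # 1/16
print(p_any_single)  # 1/8
```

So under that assumption pure chance would give this result about one time in sixteen, and it would not be expected to survive repeated restarts and migrations.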

AWo
Immortal

Coming from the same image? Have you checked that each guest has a different MAC address?


rickardnobel
Champion

Yes, I think it works that way as well, but on a per-machine basis. So VM1 might end up on vmnic0 and then stay there so long as vmnic0 doesn't fail. But VM2 could end up on vmnic1 and then stay on vmnic1 so long as that doesn't fail.

Yes, that is what I tried to describe, so you are right that your VMs should be distributed across both vmnics.

Did you see my question about your Service Console ping test above?

dan13476
Contributor

Although that shouldn't matter, surely.

Port-based balancing isn't concerned with the actual MAC, just with the virtual adapter itself.

Still, I'd probably try changing to the source MAC based policy to see if it splits the load, just as an experiment. And maybe even try building a fresh VM or two in the same port group.

Dan.

ewoodworth
Contributor

Did you see my question about your Service Console ping test above?

I saw it but I didn't know why you were asking. I reread it and I understand now. All the traffic I generated was with external machines, including my pings. It wouldn't have been much of a test if it wasn't. So I am confident failover works.

ewoodworth
Contributor

Still, I'd probably try changing to the source MAC based policy to see if it splits the load, just as an experiment. And maybe even try building a fresh VM or two in the same port group.

That's probably a good idea, but I'm out of time and I need to deploy. I'm replacing an existing cluster with this new hardware, so I already have excellent metrics on utilization: the existing NIC never goes over 120 Mb/s, with a typical level of ~15 Mb/s during the day. A single gigabit link is more than enough for my needs. If failover works, then I'm happy.
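As a quick sanity check on those numbers, assuming the single link runs at 1000 Mb/s (the function name is just for illustration):

```python
def utilisation(load_mbps, link_mbps=1000.0):
    """Fraction of the link consumed by a given load, both in Mb/s."""
    return load_mbps / link_mbps


print(f"peak:    {utilisation(120):.1%}")  # 12.0%
print(f"typical: {utilisation(15):.1%}")   # 1.5%
```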

I mostly wanted to solve this as an exercise. It should work, I want it to work, but I can't let it stop my rollout because it's honestly just gravy.

But thanks for trying to help, everybody!
