Re: Virtual Switch issue

jpoling · ‎04-28-2008

I have another post about some possible network issues, so I don't want this issue to get confused with that (I believe they are probably separate issues).

After upgrading one ESX host to 3.5 Update 1 (clean install of 3.5 U1), I have a virtual switch with two NIC connections. The NICs are teamed on the back end. One of the NICs is only transmitting packets, not receiving them. In our Cacti historical graphs for the switch port in question, prior to the upgrade to 3.5, the NIC was transmitting and receiving without issue. Has anyone seen this behavior?

I just looked at another vSwitch on the same host. . .it connects to a totally different network. The same behavior is exhibited - one NIC transmits and receives, the other only transmits.

Jeff

Ken_Cline · ‎04-28-2008

The vSwitch has absolutely no control over the packets being received. The pSwitch is responsible for selecting the port it wishes to use for transmitting data to the ESX host.

Ken Cline

Technical Director, Virtualization

Wells Landers

VMware Communities User Moderator

Ken Cline VMware vExpert 2009 VMware Communities User Moderator Blogging at: http://KensVirtualReality.wordpress.com/

weinstein5 · ‎04-28-2008

What load balancing method do you have configured for your virtual switch?

jpoling · ‎04-28-2008

I am using IP hash. . .

On our other ESX hosts, which are still 3.0.x, I am not seeing this. . .so I am curious if something changed in the algorithm for load balancing? The configuration is the same on the vSwitches regardless of the ESX version.

Looking at esxtop shows different behavior between the versions. . .I am uploading two images:

esx1.jpg --> 3.5 Update 1 host. Note the 0.00 for packets transmitted on vmnic3/vswitch2

esx3.jpg --> 3.0.2 host. Note that vmnic2 and vmnic3 both transmit and receive

The problem here is that the vSwitch on esx1 appears not to be transmitting via vmnic3. Does that mean it is transmitting over only one NIC and merely receiving on vmnic3? We're not seeing any noticeable issues related to this, but it is odd compared to our other hosts.

Thanks for any insight

Jeff

kjb007 · ‎04-28-2008

As Ken pointed out, the ESX host has no control over which NIC inbound traffic arrives on. For traffic to be received on both NICs, both pNICs have to be attached to the same switch, and you have to create an EtherChannel between the two ports on the switch. Unless this is done, ESX will send traffic out both NICs, doing its side of the load balancing, but the switch must do its part in order to fully utilize both interfaces.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

jpoling · ‎04-28-2008

Ok, sometimes when posting it is all too easy to leave out important details:

I forgot to include that on all of our ESX hosts the vSwitches consist of two physical NICs which are trunked on the physical switch.

How does that information change things?

Jeff

kjb007 · ‎04-28-2008

If it is one physical switch, then spanning tree may be blocking one of the ports. You should be able to check that and see whether a port is marked as blk (blocking mode). If you want to use both, configure EtherChannel between the two ports to get link aggregation.

-KjB

jpoling · ‎04-28-2008

I'll have to check with our network guy on the spanning tree. . .I do know for certain that the two ports in question are configured as a trunk on the physical switch. We use HP ProCurve switches, which do not specifically use EtherChannel (or at least not that term).

Jeff

kjb007 · ‎04-28-2008

Ahh, the HP lingo is a bit different from the Cisco lingo. The HP trunk is a Cisco EtherChannel (802.3ad), so you're on the right track. What algorithm are you using on the switch for the "trunk"? An HP tagged port is equivalent to a Cisco trunk (802.1q). One of my pet peeves: the same name used differently, but never mind that.

-KjB

Ken_Cline · ‎04-28-2008

One (or more) quick question -

- How many VMs are running on the host?

- How many IP conversations are active? Since you're using IP hash as your load balancing algorithm, the outbound pNIC will be determined by an XOR of the LSB of the source and destination IP address. It's possible that the hash is pointing to the same pNIC for all conversations (not likely if there are many conversations, but with only two pNICs in the vSwitch, it is a possibility).
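
To make the hash point concrete, here is a minimal Python sketch of the selection described above. It is only an illustration under stated assumptions, not VMware's actual implementation: it assumes "LSB" means the least significant byte of each IPv4 address and that the XOR result is reduced modulo the number of uplinks. The function names and the sample addresses are hypothetical.

```python
import ipaddress
from collections import Counter


def ip_hash_uplink(src, dst, num_uplinks):
    """Pick an uplink index from the last byte of each IPv4 address.

    Assumption: XOR of the two least significant bytes, modulo the
    number of uplinks in the team.
    """
    src_lsb = int(ipaddress.IPv4Address(src)) & 0xFF
    dst_lsb = int(ipaddress.IPv4Address(dst)) & 0xFF
    return (src_lsb ^ dst_lsb) % num_uplinks


def tally_uplinks(conversations, num_uplinks):
    """Count how many (src, dst) conversations land on each uplink."""
    return Counter(ip_hash_uplink(s, d, num_uplinks) for s, d in conversations)


# Hypothetical conversations: every address here ends in an even byte.
conversations = [
    ("10.0.0.10", "10.0.0.20"),
    ("10.0.0.12", "10.0.0.20"),
    ("10.0.0.14", "10.0.0.22"),
]
print(tally_uplinks(conversations, 2))
```

With two uplinks, the result under these assumptions depends only on the lowest bit of each address, so the three sample conversations all land on uplink 0. A skewed address plan could therefore leave one pNIC idle on transmit even though the team is healthy.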

Ken Cline

mike_laspina · ‎04-28-2008

Hi,

HP ProCurve can use FEC (Fast EtherChannel) and LACP on trunks, but I would advise you not to set them up that way. The proper trunk setup for HP ProCurve is to use neither FEC nor LACP on the trunk. FEC will work, but newer HP firmware versions will not have FEC available. They are moving to 802.3ad in the future.

Issue the following command, substituting your own interface ports of course. It will behave with IP hash and will still work after a reboot.

trunk 23-24 Trk1 trunk

http://blog.laspina.ca/ vExpert 2009

jpoling · ‎04-29-2008

Ken,

The ESX host in question has 18 VMs on it. . .

I am not sure about the number of IP conversations, but this 3.5 host gets loaded with our heaviest CPU consumers by DRS. . .

I also confirmed with our network guy that the trunking is set up correctly (per another post in this thread). . .I still need to confirm how spanning tree is set up.

Jeff

jpoling · ‎05-22-2008

I never got a real resolution from VMware, but the problem was "fixed." I changed the load balancing method to source port and then back to IP hash. Once I did that, there was traffic in both directions on both NICs.

Fast forward to last night: I did another clean install of a host to 3.5 U1. The host also has teamed NICs and also exhibits the strange behavior where one NIC is only receiving and not transmitting. . .

So, am I missing something? Is this expected based on the load balancing algorithm? Network connectivity to all VMs on the host is just fine. . .

Has anyone else seen this after a clean install of a 3.5 host?

Any insight is greatly appreciated

Jeff

kukacz · ‎05-22-2008

Jeff,

ProCurve switches use something they call a "non-protocol trunk", which you can use as a replacement for EtherChannel. Be careful, however: it only works within a single switch, unlike 802.3ad (unsupported by VMware).

You might find my guide for ESX trunking with ProCurve useful: http://kukacz.blogspot.com/2008/04/teaming-nics-in-esx-configuration.html

--

Lukas Kubin

jpoling · ‎05-22-2008

Thanks. . .on the HP switch we have it set up "properly" (i.e., using the "non-protocol trunk" - trunk). It seems like just an oddity in the installation of 3.5. . .I am still curious if others are seeing the same thing after an install.

Jeff
