Contributor

Linux Bridge Between vSwitches

I am trying to create a network bridge between 2 vSwitches using Linux (CentOS 5.4 / kernel 2.6.18-164.15.1 / bridge-utils 1.1). It is a pretty straightforward setup.

vSwitch0 has the physical NICs in the ESXi box connected and promiscuous mode allowed. vSwitch1 has no physical NICs and promiscuous mode allowed as well. ESXi has 2 VMs, Linux machine with 2 vNICs (one in each vSwitch) and a Windows machine with a single vNIC in vSwitch1.

I've created an Ethernet bridge on the Linux machine enslaving both eth0 and eth1, disabled SELinux, flushed iptables and ensured both vNICs are up.
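For readers who want to reproduce this, the bridge setup described above corresponds roughly to the commands built by the sketch below. It assumes the names br0, eth0 and eth1 from this thread and that `brctl` (bridge-utils) and `ip` (iproute) are installed. The helper name `bridge_setup_commands` is made up for illustration, and it only builds the command list so it can be reviewed before anything is run as root.

```python
def bridge_setup_commands(bridge="br0", ports=("eth0", "eth1")):
    """Return the commands (as argv lists) for a simple transparent bridge.

    Nothing is executed here; the caller decides whether to run them.
    """
    cmds = [["brctl", "addbr", bridge]]            # create the bridge device
    for port in ports:
        cmds.append(["brctl", "addif", bridge, port])          # enslave the vNIC
        cmds.append(["ip", "link", "set", "dev", port, "up"])  # bring the vNIC up
    cmds.append(["ip", "link", "set", "dev", bridge, "up"])    # bring the bridge up
    return cmds


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in bridge_setup_commands():
        print(" ".join(argv))
```

To actually apply them, each argv list could be passed to `subprocess.run(argv, check=True)` from a root shell.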

With this setup, the Windows VM cannot get ARP replies back over the Linux bridge, so it never learns the MAC address of the physical network gateway (a Cisco switch in this case). I can see the ARP broadcast from the Windows VM go over the bridge and get replied to by the switch, but the reply never makes it back over the Linux bridge. The response is never sent out eth1 into vSwitch1. If I set the gateway MAC statically in the ARP table on the Windows machine, everything seems to work, so it's only layer 2 Ethernet broadcast traffic (ARP) that does not seem to make it over the bridge, and only in one direction.

Cisco GW <-> Physical host NIC <-> vSwitch0 <-> Linux Machine vNIC eth0 <-> Linux bridge br0 <-> Linux machine vNIC eth1 <-> vSwitch1 <-> Windows vNIC

I know it sounds like a Linux issue, but this is a very basic bridge which works in a physical environment. (I set it up in the lab on physical hardware just to check that I am not forgetting something basic, and it works as expected.)

I can't be the first person to try this- is host bridging of vSwitches not supported?

(The Linux machine will -if this works- end up being a transparent Snort sensor)

Contributor

I would definitely make sure both eth adapters are in promiscuous mode...

Contributor

They are-

dmesg output:

device eth0 entered promiscuous mode
device eth1 entered promiscuous mode
br0: port 2(eth1) entering learning state
br0: port 1(eth0) entering learning state
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
VMware PVSCSI driver - version 0.0.0.6
VMware memory control driver initialized
vmmemctl: started kernel thread pid=2806
eth0: no IPv6 routers present
br0: no IPv6 routers present
eth1: no IPv6 routers present
br0: topology change detected, propagating
br0: port 2(eth1) entering forwarding state
br0: topology change detected, propagating
br0: port 1(eth0) entering forwarding state
br0: port 2(eth1) entering learning state
br0: topology change detected, propagating
br0: port 2(eth1) entering forwarding state

Immortal

Remember that you ALSO have to enable promiscuous mode on the relevant port groups of your vSwitches.

Andre

Andre | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Contributor

The port groups inherit the settings from the switch, do they not?

I have not explicitly set the port groups, so they should have the same setting as the vSwitch....

Contributor

I think I found out why it's not forwarding ARP replies back to the Windows client machine. Looking at the MAC table in the bridge, the MAC address of the Windows machine is showing up on the wrong bridge port. eth0 is port 1 and eth1 is port 2, and 00:0c:29:b9:2f:e9 is the MAC of the Windows machine.

port no  mac addr           is local?  ageing timer
  1      00:0c:29:a8:1b:ee  yes        0.00
  2      00:0c:29:a8:1b:f8  yes        0.00
  1      00:0c:29:b9:2f:e9  no         1.86

If I disconnect the eth0 vNIC from the Linux host, the MAC moves to the correct port, 2. I'm not sure VMware is completely to blame, but I can't reproduce this issue in a physical environment. I will try a different Linux distro with a newer kernel and see if that works. If not, is it worth taking up with VMware?
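To make this check repeatable, here is a small sketch that parses a MAC table in the format shown above and reports which port a given MAC was learned on. It assumes the four-column layout from this post (port, MAC, local flag, ageing timer). The function names `parse_showmacs` and `port_for_mac` are made up for illustration, and the sample text is just the table from this post.

```python
def parse_showmacs(text):
    """Parse a bridge MAC table (port, MAC, is-local, ageing) into dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and start with a port number;
        # the header line and blank lines are skipped.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, mac, is_local, ageing = fields
        entries.append({
            "port": int(port),
            "mac": mac.lower(),
            "local": is_local == "yes",
            "ageing": float(ageing),
        })
    return entries


def port_for_mac(entries, mac):
    """Return the port a non-local MAC was learned on, or None if absent."""
    for entry in entries:
        if entry["mac"] == mac.lower() and not entry["local"]:
            return entry["port"]
    return None


# The table from this post.
SAMPLE = """\
port no mac addr is local? ageing timer
1 00:0c:29:a8:1b:ee yes 0.00
2 00:0c:29:a8:1b:f8 yes 0.00
1 00:0c:29:b9:2f:e9 no 1.86
"""

if __name__ == "__main__":
    port = port_for_mac(parse_showmacs(SAMPLE), "00:0c:29:b9:2f:e9")
    print("Windows VM MAC learned on port", port)
```

On a live system the text would come from `brctl showmacs br0` instead of the sample string.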

Contributor

Still an issue with Ubuntu 9.10 (kernel 2.6.31-14) and bridge-utils 1.2.

Contributor

I found a post where someone else is experiencing the same issue, without a resolution.

http://archives.free.net.ph/message/20100108.174704.efbb18cc.ja.html

So it looks like an actual bug

Contributor

After speaking with the linux-bridging mailing list, I understand where the issue lies.

The issue is in VMware's vSwitches. When a vSwitch has more than one pNIC in it, the second pNIC (even if it is standby in an active/passive failover) replicates the ARP requests back, causing the Linux bridge to incorrectly update its MAC table.
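The effect can be illustrated with a toy model of a learning bridge. This is only a sketch of the mechanism as described above, not the kernel's bridge code: the class `LearningBridge` and the gateway MAC are made up for illustration, while the Windows MAC and the port numbers (eth0 = port 1, eth1 = port 2) come from earlier in this thread.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"
WIN = "00:0c:29:b9:2f:e9"   # Windows VM MAC, from this thread
GW = "00:00:5e:00:53:01"    # hypothetical gateway MAC, for illustration only


class LearningBridge:
    """Toy model of a MAC-learning bridge (no ageing, no STP)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # forwarding database: MAC -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Learn src on in_port and return the ports the frame goes out of."""
        self.fdb[src] = in_port
        if dst == BROADCAST or dst not in self.fdb:
            # Broadcast or unknown destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        out = self.fdb[dst]
        # A frame whose destination lives on the ingress port is dropped.
        return [] if out == in_port else [out]


def simulate():
    bridge = LearningBridge(ports=[1, 2])
    # 1. ARP request from the Windows VM arrives on eth1 (port 2).
    request_out = bridge.receive(2, WIN, BROADCAST)
    # 2. vSwitch0 hands the same broadcast back on eth0 (port 1),
    #    so the bridge relearns the Windows MAC on the wrong port.
    reflected_out = bridge.receive(1, WIN, BROADCAST)
    # 3. The gateway's ARP reply arrives on eth0 (port 1), addressed to WIN.
    reply_out = bridge.receive(1, GW, WIN)
    return request_out, reflected_out, reply_out, bridge.fdb[WIN]


if __name__ == "__main__":
    print(simulate())
```

In this model the reply in step 3 goes nowhere, because the bridge believes the Windows MAC already lives on the port the reply came in on. That matches the MAC table posted earlier, where the Windows MAC shows up on port 1.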

The recourse is one of a couple things:

  • Remove the second pNIC from the vSwitch, which of course compromises redundancy.

  • Replace the built-in vSwitch with a Cisco Nexus 1000V (unconfirmed, but assumed to work).

  • Replace the Linux bridge with an ARP proxy / IP forwarding setup.

If anyone else has suggestions, I'm all ears. My understanding is that VMware has stated no intention of changing this behavior (hearsay from the mailing list).

Hopefully this saves someone else a week of their professional career.

Contributor

Hi All!

The attached file shows, in graphical form, a way to deploy a bridge for transparent VMs.

Good Luck!

Contributor

A workaround is to turn your Linux bridge into a "hub" by disabling the MAC learning process:

brctl setageing br0 0

where br0 is the bridge name.

With the ageing time set to 0, learned entries expire immediately, so every frame that arrives at the bridge is flooded to all other ports.
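Why this helps can be shown with a toy forwarding function. This is a simplified model of the behavior described above, not the kernel's implementation: the function names and the gateway MAC are made up for illustration, and `learn=False` stands in for the effect of `brctl setageing br0 0`.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"
WIN = "00:0c:29:b9:2f:e9"   # Windows VM MAC, from this thread
GW = "00:00:5e:00:53:01"    # hypothetical gateway MAC, for illustration only


def forward(fdb, ports, in_port, src, dst, learn=True):
    """Return the egress ports for one frame; optionally learn the source MAC."""
    if learn:
        fdb[src] = in_port
    out = None if dst == BROADCAST else fdb.get(dst)
    if out is None:
        # Broadcast or unknown destination: flood to every other port.
        return [p for p in ports if p != in_port]
    # Known destination: drop if it sits on the ingress port, else forward.
    return [] if out == in_port else [out]


def arp_reply_egress(learn):
    """Replay the reflected-ARP sequence and return where the reply is sent."""
    fdb, ports = {}, [1, 2]
    forward(fdb, ports, 2, WIN, BROADCAST, learn)  # request from the Windows VM on eth1
    forward(fdb, ports, 1, WIN, BROADCAST, learn)  # copy reflected back on eth0
    return forward(fdb, ports, 1, GW, WIN, learn)  # gateway reply arriving on eth0


if __name__ == "__main__":
    print("learning bridge:", arp_reply_egress(learn=True))
    print("hub mode:       ", arp_reply_egress(learn=False))
```

With only two ports on the bridge, flooding just means "send it out the other port", so the extra traffic cost of hub mode should be small in this setup.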

I hope this helps other people with the same problem.
