Here's a scenario we recently encountered and what it took to resolve it. I'm posting it here in the hope that it helps future users.
Dell PowerEdge M710HD
Two Broadcom 57712-k NICs
PowerConnect M8024-k 10GbE switches
Each of the ports on the Broadcom card can be partitioned four times, for a total of eight partitions across two switch ports.
Odd-numbered partitions become odd-numbered vmnics, connected to switch-01, port tengigabitethernet 1/0/1
Even-numbered partitions become even-numbered vmnics, connected to switch-02, port tengigabitethernet 1/0/1
ESXi 5.x Enterprise Plus
Host ESXHOST-01 has vSwitch0 (Management Network, vmk0 on VLAN100), assigned vmnic0 and vmnic1.
Port group Management-VLAN100 is created in DVSwitch0, assigned vmnic2 and vmnic3.
Guest VCENTER-01 on ESXHOST-01 has a single interface in port group Management-VLAN100.
At this point, both the host and the guest have working interfaces in VLAN100. However, VCENTER-01 cannot reach ESXHOST-01 when both are bound to vmnics that are different partitions of the same Broadcom NIC (e.g. vmnic1 and vmnic3). When the partitions are on different Broadcom NICs (e.g. vmnic0 and vmnic3), the connection succeeds.
This occurs because the partitions are treated as separate physical links, yet they all connect to a single physical switch port. When the guest issues an ARP request for the ESX host, the switch floods that broadcast to every port EXCEPT the one it arrived on, so the request never comes back down the shared port and the host never sees it. When the partitions are on different physical NICs connected to different physical switch ports, the ARP broadcast is received.
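The flooding behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not anything taken from the switch or ESXi: the `PHYSICAL_PORT` table and both function names are hypothetical, the vmnic-to-port mapping follows the odd/even layout described earlier, and the two switches are assumed to form a single broadcast domain for VLAN100.

```python
# Hypothetical model: which physical switch port each vmnic (NIC partition) shares.
# Odd-numbered vmnics sit behind switch-01, even-numbered ones behind switch-02.
PHYSICAL_PORT = {
    "vmnic0": "switch-02 Te1/0/1",
    "vmnic1": "switch-01 Te1/0/1",
    "vmnic2": "switch-02 Te1/0/1",
    "vmnic3": "switch-01 Te1/0/1",
}


def flood_targets(ingress_port, all_ports):
    """A broadcast is sent out of every port except the one it arrived on."""
    return {port for port in all_ports if port != ingress_port}


def arp_request_reaches(src_vmnic, dst_vmnic):
    """Return True if an ARP broadcast sent via src_vmnic is delivered to dst_vmnic."""
    ingress = PHYSICAL_PORT[src_vmnic]
    egress_ports = flood_targets(ingress, set(PHYSICAL_PORT.values()))
    # The destination only sees the frame if its physical port is an egress port.
    return PHYSICAL_PORT[dst_vmnic] in egress_ports


# Guest on vmnic3, host management on vmnic1: same physical port, request is lost.
print(arp_request_reaches("vmnic3", "vmnic1"))  # False
# Guest on vmnic3, host management on vmnic0: different physical ports, request arrives.
print(arp_request_reaches("vmnic3", "vmnic0"))  # True
```

The model ignores everything except the "never send a frame back out of its ingress port" rule, which is the only behaviour needed to reproduce the failure.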
- Place the vCenter server (and/or any other guest used for monitoring ESX hosts) on a different VLAN from the ESX hosts. This forces an extra routed hop, which directs the traffic to leave the switch port and return.
- Add a VM network to vSwitch0 on each host and place the VMs in it instead of in the distributed switch (or any other vSwitch), which allows vCenter to contact the host within the virtual switch.
We selected the second option. This requires one extra step when defining the management network (adding a VM network alongside the VMkernel port).
This issue is not limited to traffic between guests and ESX hosts. Two guests on a single ESX host would experience the same issue if all of the following are true:
- Both guests are on the same VLAN
- Guests are connected to different vSwitches
- The vmnics for the different vSwitches are partitions on the same physical Broadcom card
It's not likely that anyone would create multiple virtual switches supporting the same VLAN, so this scenario isn't as likely to come up.
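As a rough way to check a planned layout against the three conditions listed above, the test can be written as a small predicate. This is a sketch under stated assumptions: the `Endpoint` class, its field names, and the `nic-port-*` labels are hypothetical and exist only for illustration. The model also assumes each physical NIC port maps to exactly one switch port, as in the setup described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of how a guest (or vmkernel port) reaches the network."""
    vlan: int
    vswitch: str
    physical_port: str  # physical NIC port behind the vmnic partition in use


def is_affected(a, b):
    """True when both endpoints meet all three conditions for the lost-broadcast issue."""
    return (
        a.vlan == b.vlan                          # same VLAN
        and a.vswitch != b.vswitch                # different virtual switches
        and a.physical_port == b.physical_port    # partitions of the same physical port
    )


host_mgmt = Endpoint(vlan=100, vswitch="vSwitch0", physical_port="nic-port-1")
vcenter = Endpoint(vlan=100, vswitch="DVSwitch0", physical_port="nic-port-1")
vcenter_other_nic = Endpoint(vlan=100, vswitch="DVSwitch0", physical_port="nic-port-2")

print(is_affected(host_mgmt, vcenter))            # True
print(is_affected(host_mgmt, vcenter_other_nic))  # False
```

Two endpoints on the same vSwitch are never flagged, which matches the second option above: traffic between them stays inside the virtual switch.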