We created a standard ESXi vSwitch with a single uplink enabled, configured in promiscuous mode, and attached to a VM deployed on the ESXi host. VMs on the ESXi host are deployed in bridge mode (L2).
We have seen that when the system is loaded with traffic, ARP requests coming from the VM side (bridge mode) and going out through the ESXi vSwitch are sent back by the vSwitch through the same port they came from, which is not typical hub or switch behavior. On the VM interface connected to the vSwitch, we see the ARP request leave the interface toward the vSwitch and then come back through the same interface within milliseconds.
It was well verified through troubleshooting that:
To overcome this issue, we had to deploy a Linux VM with interface passthrough enabled that acts as a vSwitch, so that we avoid using the ESXi vSwitch. This workaround is costly because it dedicates the physical interface to that VM, and we can no longer use it for other VMs. Has anyone else seen this, or is this something VMware can investigate for us? We see this as a bug.
Can you confirm that this ARP packet is not seen twice on the uplink as well, first on egress and then again on ingress? "Within milliseconds" suggests to me that the packet makes a longer trip than just the vSwitch before looping back. Otherwise it should be a matter of sub-milliseconds.
This also suggests that the issue could be elsewhere.
Moderator: Moved to vSphere vNetwork Discussions
I can confirm that while troubleshooting this we saw only one ARP request exiting the ESXi host and one reply coming back; the ARP packet is not seen twice on the uplink within that time frame. Also, I was incorrect in the original question: I said milliseconds, but it was supposed to say microseconds.
Please see the capture below showing the timing:
ARP request going out from the VM toward the vSwitch:
12:01:04.087056 00:90:0b:66:5d:29 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Request who-has 192.168.101.1 tell 192.168.101.2, length 46
ARP request coming in at the VM from the vSwitch:
12:01:04.087126 00:90:0b:66:5d:29 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Request who-has 192.168.101.1 tell 192.168.101.2, length 46
The difference is 70 microseconds.
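For anyone who wants to check the same thing across a longer capture, here is a minimal Python sketch that compares two tcpdump text lines and reports the delay when they describe the same frame. It assumes the line layout shown in the capture above (timestamp, source MAC, destination MAC, then the rest). The function names `parse_line` and `rebound_delay_us` are made up for illustration and are not part of any tool.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "HH:MM:SS.uuuuuu src_mac > dst_mac, rest-of-line"
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ([0-9a-f:]{17}) > ([0-9a-f:]{17}), (.*)$"
)


def parse_line(line):
    """Split one capture line into (timestamp, src, dst, rest)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unexpected capture line format: %r" % line)
    ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
    return ts, m.group(2), m.group(3), m.group(4)


def rebound_delay_us(sent_line, received_line):
    """Return the delay in microseconds if both lines describe the
    same frame (same MACs and same decoded content), else None."""
    t1, src1, dst1, rest1 = parse_line(sent_line)
    t2, src2, dst2, rest2 = parse_line(received_line)
    if (src1, dst1, rest1) != (src2, dst2, rest2):
        return None
    # Integer division by one microsecond avoids float rounding.
    return (t2 - t1) // timedelta(microseconds=1)


# The two lines from the capture in this thread.
sent = ("12:01:04.087056 00:90:0b:66:5d:29 > ff:ff:ff:ff:ff:ff, "
        "ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, "
        "Request who-has 192.168.101.1 tell 192.168.101.2, length 46")
received = ("12:01:04.087126 00:90:0b:66:5d:29 > ff:ff:ff:ff:ff:ff, "
            "ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, "
            "Request who-has 192.168.101.1 tell 192.168.101.2, length 46")

print(rebound_delay_us(sent, received))  # prints 70
```

Note that this only compares the decoded text of the two lines, so it cannot tell a rebounded frame from a genuine retransmission with identical content; the very short delay is what points to a rebound here.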
Interesting.
Can you capture the rebound packet on the ESXi host as well (not only on the VM)? Here's a spell for that:
pktcap-uw --switchport <port_ID> --dir 2 --ethtype 0x0806
This seems to be a bug, but is it related specifically to promiscuous mode? If I recall correctly, promiscuous mode also carries some performance penalty. You seem to be talking about VSS, but are you eligible for VDS, and could you try that instead?
Thank you for your advice. Can you tell me how I can find out whether we are eligible for VDS? Is this something we have to pay for?
I'm not that familiar with VMware licenses or which options are included in which edition, but I believe that VDS (vSphere Distributed Switch) needs at least a vCenter license, and hence some paid vSphere licenses for the hosts as well. So this is not possible with the free edition.