VMware Cloud Community
srodenburg
Expert

ESXi 6.5 - Beacon probing causing duplicate packets (on non Etherchannel/LACP)

Hello,

I have an unexpected issue with beacon probing in a setup with two ESXi servers. We are not using LACP, EtherChannel, or "Route based on IP hash" (I know that these do not work with beacon probing).

The two nodes use a DvSwitch with 7 VLANs, and beacon probing is enabled on all of them. Load balancing on all DvPortgroups is "Route based on originating virtual port".

Here is an overview of the setup:

           ----------------------------
           |                          |
           |    DC Switching env.     |
           |                          |
           ----------------------------
               |                  |
            L1 |                  | L2
               |                  |
    -------------------   -------------------
    |                 |   |                 |
    |   My Switch 1   |   |   My Switch 2   |
    |                 |   |                 |
    -------------------   -------------------
         A |        A \   / S        | S
           |           \ /           |
           |            X            |
           |           / \           |
           |          /   \          |
    -------------------   -------------------
    | vmnic0    vmnic1|   |vmnic0    vmnic1 |
    |                 |   |                 |
    |      ESXi A     |   |      ESXi B     |
    |                 |   |                 |
    -------------------   -------------------

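Beacon probing is the only way to protect against link failures on the L1 and L2 connections, because those links sit one hop beyond the switch ports the vmnics plug into, so plain link-state detection on the hosts never sees them go down. My two switches are not linked to each other in any way.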

Please note that I cannot change this setup.

On both ESXi Servers, vmnic0 is the Active uplink (A) for all DvPortgroups, vmnic1 is Standby (S). This means that connection L2 is not used as long as L1 is alive (as per my design).

The switches that I control (the "my switches") are set to "L2 mode". I know the ISP uses Cisco Switches but that is all I know. I just hook up to them with a single cable acting as the uplink to them.

The switch ports to which vmnic0 and vmnic1 are connected are trunks with all the necessary VLANs configured.

Everything works as designed. Pulling an ISP uplink cable (L1 or L2) causes a smooth failover to the other "L" uplink, proving that beacon probing works fine. All failure scenarios, including one of "my switches" going down, work perfectly.

I have one nagging issue though...

The issue:

There are VMs running on "ESXi A" and on "ESXi B". For testing, I have pings running to all the VMs. The pings come from another (remote) datacenter and pass through a site-to-site VPN.

When the local Firewall/VPN VM is running on ESXi A, the pings to the VMs on ESXi A are normal, but the pings to the VMs running on ESXi B all show duplicate replies, two to four per request (it varies).

I then vMotion the Firewall VM to ESXi B.

Now the pings to the VMs on ESXi B are normal, but the pings to the VMs running on ESXi A all show duplicates.

To summarise: pings are only normal for those VMs that happen to be on the same ESXi server as the Firewall VM.
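For anyone wanting to put numbers on the duplicates instead of eyeballing the ping windows, a minimal Python sketch along these lines could help. It assumes the pings are run from a Linux host using iputils ping (which tags repeated replies with "(DUP!)") against IPv4 targets, and that the output has been captured to text. The function name count_duplicates and the sample addresses are made up for illustration.

```python
import re
from collections import Counter

# Matches reply lines in the style printed by Linux iputils ping, e.g.
# "64 bytes from 192.0.2.10: icmp_seq=3 ttl=63 time=12.1 ms (DUP!)"
# Assumption: IPv4 addresses or hostnames (no colons inside the host field).
REPLY_RE = re.compile(r"bytes from (?P<host>[^:\s]+).*?icmp_seq=(?P<seq>\d+)")


def count_duplicates(ping_output):
    """Summarise replies per host.

    Returns {host: {"answered": n, "extra_replies": m}}, where "answered"
    is the number of distinct sequence numbers that got at least one reply
    and "extra_replies" is the number of replies beyond the first.
    """
    replies = Counter()
    for line in ping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            replies[(match.group("host"), int(match.group("seq")))] += 1

    summary = {}
    for (host, _seq), count in replies.items():
        entry = summary.setdefault(host, {"answered": 0, "extra_replies": 0})
        entry["answered"] += 1
        entry["extra_replies"] += count - 1
    return summary


# Hypothetical captured output for one VM (documentation address range).
SAMPLE = """\
64 bytes from 192.0.2.10: icmp_seq=1 ttl=63 time=11.9 ms
64 bytes from 192.0.2.10: icmp_seq=1 ttl=63 time=12.0 ms (DUP!)
64 bytes from 192.0.2.10: icmp_seq=1 ttl=63 time=12.2 ms (DUP!)
64 bytes from 192.0.2.10: icmp_seq=2 ttl=63 time=11.7 ms
"""

if __name__ == "__main__":
    for host, stats in count_duplicates(SAMPLE).items():
        print(host, stats)
```

Running this against the captured output for each VM, once with the Firewall VM on ESXi A and once on ESXi B, would show whether the number of extra replies per request is stable or really varies between two and four.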

If I turn beacon-probing off, the issue goes away and all pings to all vm's are normal (as expected).

As I'm not using "Route based on IP hash" or any other channeling technology anywhere on my side, why is beacon probing still "shotgunning" packets out of both interfaces of the ESXi host?

I cannot find any hints in the ESXi hosts' logfiles either. The logfile /var/log/vobd.log, for example, shows no relevant information as to why ESXi does this.

I've done extensive reading on this, the blog post linked below being one of the better ones, but I can't find an answer. I hope someone can shed some light on this.

https://vswitchzero.com/2017/06/18/beacon-probing-deep-dive/

1 Reply
srodenburg
Expert

Nobody?
