Metalgalle
Contributor

[VMware View] - ARP issue with Vclients

Hi everybody,

I have a strange issue with VMware View clients at a customer site.

There are 3 ESX 4.1.0 (build 502767) Enterprise servers in an HP blade server chassis, which host some servers and some clients (View 4.6.0-366101).

All the VDI desktops run Windows XP Pro SP3, with no antivirus and no firewall; only back-office programs are installed.

Virtual networking consists of 2 vSwitches, each with 2 NICs (I tried both Active/Active and Active/Passive configurations), that connect to the physical networks.

The networks are joined by a NetASQ firewall, which separates the Vclients network, the server farm network and the LAN (devices, thin clients, ...) from one another.

Everything worked fine for about two years, but now there is a strange issue: when I delete an ARP entry on the firewall's NIC connected to the Vclients network, the Vclient does not respond to the ARP request sent by the firewall, so the Vclient gets kicked off the network!

I captured traffic on both the firewall's interface and the Vclient's interface. I can see the ARP request reach the client interface, but the client does not send any ARP reply.
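
For anyone who wants to check a capture the same way, here is a rough sketch of the test I am applying: list every ARP request that never gets a matching reply. The record layout (`time`, `op`, `sender_ip`, `target_ip`), the one-second window and the addresses are assumptions made up for illustration; a real capture would have to be exported into this shape first.

```python
def unanswered_arp_requests(packets, timeout=1.0):
    """Return ARP requests that see no matching reply within `timeout` seconds.

    `packets` is a list of dicts in a hypothetical layout, not the output
    format of any particular capture tool.
    """
    unanswered = []
    for req in packets:
        if req["op"] != "request":
            continue
        # A reply matches when it comes from the requested IP, is addressed
        # to the requester, and arrives shortly after the request.
        answered = any(
            rep["op"] == "reply"
            and rep["sender_ip"] == req["target_ip"]
            and rep["target_ip"] == req["sender_ip"]
            and 0 <= rep["time"] - req["time"] <= timeout
            for rep in packets
        )
        if not answered:
            unanswered.append(req)
    return unanswered


# Made-up example: 10.0.0.1 plays the firewall, 10.0.0.60 the silent Vclient.
capture = [
    {"time": 0.00, "op": "request", "sender_ip": "10.0.0.1", "target_ip": "10.0.0.50"},
    {"time": 0.01, "op": "reply", "sender_ip": "10.0.0.50", "target_ip": "10.0.0.1"},
    {"time": 5.00, "op": "request", "sender_ip": "10.0.0.1", "target_ip": "10.0.0.60"},
]

for req in unanswered_arp_requests(capture):
    print("no reply from", req["target_ip"])
```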

Strangely, the server farm network (on the same hosts) does not have this problem.

Another strange thing is that if I try to replicate the problem with another device on the Vclients network (e.g. a Print server or any other physical device), the problem does not appear.

More to come: if I move the Vclient to the server farm network, the problem disappears!

I also tried installing a brand-new Vclient, but the problem is still the same.

I tried changing the vNIC of the Vclient (moving from Flexible to VMXNET3), but the problem is still there.

I tried clearing the ARP cache on all the physical switches involved in the communication, but the problem is not there (clearing the ARP cache on the switches does not isolate the Vclient).

I noticed that if I clear the ARP table on the Vclient (arp -a -d), the client immediately gets back to work; the same happens when the ARP cache timeout expires (120 sec.).
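
Since clearing the cache on the Vclient brings it back, one check that might be worth doing is comparing the MAC the Vclient has cached for the firewall with the MAC actually seen in the capture. Below is a minimal sketch that parses `arp -a` output, assuming the usual Windows column layout (Internet Address, Physical Address, Type). The sample text and all addresses are invented, and `parse_arp_table` and `mac_matches` are just illustrative helper names.

```python
import re

# One table row: IPv4 address, dash-separated MAC, entry type.
ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9a-fA-F]{2}-){5}[0-9a-fA-F]{2})\s+(\w+)\s*$",
    re.MULTILINE,
)


def parse_arp_table(text):
    """Map each IP in `arp -a` style output to (mac, entry_type)."""
    return {ip: (mac.lower(), kind) for ip, mac, kind in ROW.findall(text)}


def mac_matches(table, ip, observed_mac):
    """True if the cached MAC for `ip` equals the MAC observed on the wire."""
    cached = table.get(ip)
    normalized = observed_mac.lower().replace(":", "-")
    return cached is not None and cached[0] == normalized


# Invented sample output; 10.0.0.1 stands in for the firewall interface.
SAMPLE = """
Interface: 10.0.0.50 --- 0x2
  Internet Address      Physical Address      Type
  10.0.0.1              00-11-22-33-44-55     dynamic
  10.0.0.60             00-aa-bb-cc-dd-ee     dynamic
"""

table = parse_arp_table(SAMPLE)
print(mac_matches(table, "10.0.0.1", "00:11:22:33:44:55"))
```

If the two MACs differ while the Vclient is isolated, that would at least narrow down where to look next.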

One more detail: ARP exchanges with other clients on the same network do not have this problem.

Ideas?

Thanks in advance to everyone!

--- Metalgalle
4 Replies
Metalgalle
Contributor

UP

--- Metalgalle
amandasmith
Enthusiast

The thing that solved it for me was to change the Team settings in BACS from Smart Load Balancing, which seems to be the default when a Team is created to hold VLANs, to Generic Trunking.

Before the change, the ARP entries in my router were all from the MAC address of the physical interface.

Afterwards, they were replaced with the MAC addresses of the virtual NICs in the VM servers.

And everything just started to work.
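
If you want to look for the same symptom on your side, here is a small sketch that flags any MAC address shared by several IPs in a router's ARP table. The (IP, MAC) pairs are made-up examples, not my real entries, and `shared_macs` is just an illustrative name.

```python
from collections import defaultdict


def shared_macs(arp_entries):
    """Return {mac: [ips]} for every MAC that appears with more than one IP."""
    by_mac = defaultdict(list)
    for ip, mac in arp_entries:
        by_mac[mac.lower()].append(ip)
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}


# Invented "before" table: every VM shows up behind the same physical MAC.
before = [
    ("10.0.0.50", "00-11-22-33-44-55"),
    ("10.0.0.51", "00-11-22-33-44-55"),
    ("10.0.0.52", "00-11-22-33-44-55"),
]
# Invented "after" table: each VM shows up with its own virtual NIC MAC.
after = [
    ("10.0.0.50", "00-50-56-00-00-01"),
    ("10.0.0.51", "00-50-56-00-00-02"),
    ("10.0.0.52", "00-50-56-00-00-03"),
]

print(shared_macs(before))
print(shared_macs(after))
```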

Always remember you're unique, just like everyone else
Metalgalle
Contributor

Hi Amanda, thanks for the reply.

Unfortunately I don't have BACS installed; the team of physical NICs is handled by ESX itself (now in active/passive mode). I already tried breaking up the team and using a single NIC, but nothing seems to change...

--- Metalgalle
Metalgalle
Contributor

up

--- Metalgalle