VMware Cloud Community
Bob_Jenkins
Contributor

ESX / CISCO / Windows NLB nightmare

Dear All

We currently have a pretty standard ESX setup comprising the following:

ESX 3.5U3 running on HP BL680 blades within a C7000 Virtual Connect chassis.

We have several Virtual Machines with MS Office Sharepoint installed.

We are attempting to load-balance traffic across two of the servers using Windows NLB.

It seems that load balancing works for around 20 minutes and then fails.

Our networking setup is as follows: Promiscuous Mode: Reject. MAC Address Changes: Accept. Forged Transmits: Accept. Load Balancing: Route Based on IP Hash. Network Failover Detection: Link Status Only. Notify Switches: Yes. Failback: Yes.

We have reproduced the exact same Application setup with physical servers in the same environment and all is OK (with same CISCO 6509 switches etc)...

Does anyone have any ideas around the ESX networking / CISCO setup that may help us?

Any help much appreciated.

emmar
Hot Shot

Hi Bob,

Are you using Unicast or Multicast NLB? Here are my notes on MS NLB - hope they're of use:

Unicast mode - not recommended for use in VMs

Supported in ESX, but requires you to set "Notify Switches" to No on the relevant vSwitch/port group, which basically breaks VMotion for any VM on that vSwitch/port group, as ARP updates are not sent out when the VM moves hosts.

Multicast mode - VMware best practice to use this.

Supported in ESX, but many physical switches don't support it (apparently in multicast mode an ARP response pairs a unicast IP address with a multicast MAC address, which many switches reject), in which case a static ARP entry is required on the physical switch for the NLB multicast MAC and cluster virtual IP. This is the same for both physical and virtual machines.
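
To illustrate the static ARP entry mentioned above, here is a minimal Python sketch that derives the default NLB cluster MAC from the cluster virtual IP and prints a Cisco IOS style `arp` line. It relies on NLB's default MAC scheme (02-BF plus the four IP octets for unicast, 03-BF plus the four IP octets for multicast, 01-00-5E-7F plus the last two octets for IGMP multicast). The address 10.0.0.50 and the function names are hypothetical; check the MAC your cluster actually reports before configuring a switch, and note that some platforms also need a static MAC table entry, which this sketch does not cover.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str = "multicast") -> str:
    """Return the default NLB cluster MAC (12 hex digits) for an IPv4 cluster IP."""
    octets = ipaddress.IPv4Address(cluster_ip).packed  # the four raw IP bytes
    if mode == "unicast":
        mac = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        mac = bytes([0x03, 0xBF]) + octets
    elif mode == "igmp":
        # IGMP multicast mode uses only the last two octets of the IP.
        mac = bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    else:
        raise ValueError(f"unknown NLB mode: {mode}")
    return mac.hex()


def cisco_dotted(mac_hex: str) -> str:
    """Format 12 hex digits in Cisco's xxxx.xxxx.xxxx notation."""
    return ".".join(mac_hex[i:i + 4] for i in range(0, 12, 4))


def static_arp_line(cluster_ip: str, mode: str = "multicast") -> str:
    """Build an IOS-style static ARP line for the cluster IP (illustrative only)."""
    return f"arp {cluster_ip} {cisco_dotted(nlb_cluster_mac(cluster_ip, mode))} ARPA"


if __name__ == "__main__":
    # 10.0.0.50 is a made-up cluster virtual IP.
    print(static_arp_line("10.0.0.50"))  # arp 10.0.0.50 03bf.0a00.0032 ARPA
```

The `unicast` branch is included only for comparison; the static ARP workaround described here applies to multicast mode.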

Here is the VMware KB article - clearly recommending you use multicast:

Here is a MS article about common NLB issues and how to fix them:

Texiwill
Leadership

Hello,

Moved to Virtual Machine and Guest OS forum.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

Blue Gears and SearchVMware Pro Blogs: http://www.astroarch.com/wiki/index.php/Blog_Roll

Top Virtualization Security Links: http://www.astroarch.com/wiki/index.php/Top_Virtualization_Security_Links

Bob_Jenkins
Contributor

Hi Emmar

I THOUGHT we were already using Multicast NLB but after looking at the vmkernel logs it is showing as unicast...oops. Also, in the VMnet vswitch properties, I have "Notify Switches" ticked - and according to your doc this is not advisable.

We are about to re-test the NLB and will let you know the result.

Thanks again.

emmar
Hot Shot

Bear in mind that if you stick with Unicast mode, and therefore have to switch off "Notify Switches", you'll have issues with VMotion: once the VM moves from one host to the other, the vSwitch will not notify the physical switch that the VM has moved, so traffic will not automatically be sent to it on the new host.

Multicast really is the way to go, but it does generally involve making changes in your pSwitch environment.
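
The effect described above can be sketched with a toy model of a physical switch's MAC table. This is a deliberately simplified illustration, not how any real switch or ESX is implemented: the class and function names, the MAC address, and the port numbers are all hypothetical. The notification is modelled as a simple table update (ESX sends a RARP frame on the VM's behalf), and the model ignores table aging and relearning from the VM's own outbound traffic.

```python
class PhysicalSwitch:
    """Toy MAC table: remembers which port each MAC was last seen on."""

    def __init__(self):
        self.mac_table = {}

    def learn(self, mac: str, port: int) -> None:
        # A frame sourced from `mac` arrived on `port`.
        self.mac_table[mac] = port

    def forward_port(self, mac: str):
        # Port the switch would send frames for `mac` to (None = unknown).
        return self.mac_table.get(mac)


def vmotion(switch: PhysicalSwitch, vm_mac: str, new_port: int,
            notify_switches: bool) -> None:
    """Model a VM moving to a host attached to `new_port`."""
    if notify_switches:
        # With notification on, the new host announces the VM's MAC,
        # so the switch relearns it on the new port right away.
        switch.learn(vm_mac, new_port)
    # With notification off, the table is left stale in this model.


if __name__ == "__main__":
    vm_mac = "00:50:56:aa:bb:cc"  # made-up VM MAC
    sw = PhysicalSwitch()
    sw.learn(vm_mac, 1)  # VM starts on the host behind port 1

    vmotion(sw, vm_mac, new_port=2, notify_switches=False)
    print(sw.forward_port(vm_mac))  # 1: traffic still goes to the old host

    vmotion(sw, vm_mac, new_port=2, notify_switches=True)
    print(sw.forward_port(vm_mac))  # 2: traffic follows the VM
```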

happyhammer
Hot Shot

Bob, bear in mind that if your load balancing policy is IP hash, you should have EtherChannel configured on the Cisco switches. If EtherChannel is not configured, change the load balancing policy to Port ID or MAC.
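
A rough way to see why IP hash needs EtherChannel on the physical side: under an IP hash policy the uplink is chosen per source/destination pair, so the same VM can send through different uplinks depending on who it is talking to, and the physical switch sees one MAC on several ports. The sketch below uses a simplified model (XOR of the two addresses, modulo the uplink count); the exact hash ESX uses is not stated in this thread and is an assumption here, as are the function name and the example addresses.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Pick an uplink index from a source/destination IPv4 pair.

    Simplified model: XOR the two addresses as integers, then take the
    result modulo the number of uplinks in the team.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


if __name__ == "__main__":
    vm_ip = "10.0.0.50"  # made-up VM address
    for client in ("10.0.0.101", "10.0.0.102"):
        # Same VM, different clients, potentially different uplinks.
        print(client, "-> uplink", ip_hash_uplink(vm_ip, client, 2))
```

With two uplinks, the two hypothetical clients land on different uplinks in this model, which is exactly the situation a non-EtherChannel switch handles badly.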

Bob_Jenkins
Contributor

Many thanks happyhammer - indeed we have EtherChannel enabled on the CISCOs.

Many thanks.

mike_laspina
Champion

Hi,

If you use Port ID then Windows NLB will not be of any use, because VMware statically assigns each port to only one vmnic adaptor. It alternates VM port assignments across the vmnic team as the VMs initially connect, and they remain on that assigned vmnic.

http://blog.laspina.ca/ vExpert 2009
mike_laspina
Champion

Do you really need NLB in a VM? I find it really does not improve the throughput much over VMware and Cisco EtherChannel, because even if you have it configured correctly the return path may pigeonhole to only one of the interfaces. Small gain, lots of complexity; no good either way.

http://blog.laspina.ca/ vExpert 2009
Bob_Jenkins
Contributor

Thanks, but to make clear - the NLB is across TWO VMs which are MOSS web front-end servers so there is much to be gained by balancing web traffic between the two.

Thanks for everyone's replies on this. Have just got it fixed by adding the NLB team's MAC address to the ARP table on the two CISCO switches and enabling LACP, and hence EtherChannel, on that portion of the network. Balancing well between the two servers now...

Bob_Jenkins
Contributor

Hi Emmar

The "Correct" button only appears next to your latest reply, but I want to make it clear that the most crucial information which helped me get this fixed came from following the recommendations in your initial reply.

Thanks again.

emmar
Hot Shot

Glad to help. Have seen many hiccups getting this sorted in the past (hence I have my notes!!)

P.S. Don't see any correct points :(
