VMware

2,911 Views 8 Replies Last post: Oct 26, 2011 7:43 AM by kamikadz3 RSS
rayvd Enthusiast 51 posts since
Jun 13, 2007

Mar 17, 2011 6:54 PM

vMotion causing Unicast Flooding

I am troubleshooting an environment that generates unicast flooding during vMotion. The environment isn't exactly best practice (the vMotion IP, Management IP and Virtual Machine Networks are all on the same logical subnet). This will be corrected, but I'm trying to understand how and why the unicast flooding is occurring.

 

We have Dell blades (M1000e) in a Dell chassis with multiple blade center switches. Each of these switches uplinks to Cisco gear. When we do a vMotion between a host in the Blade Center and an external host, things work OK for a few minutes, but then unicast flooding begins. If I send out an ARP request for the vMotion IP (target IP) from a non-involved host, the corresponding MAC address seems to be added back to the Cisco's CAM table, at which point the unicast flooding stops.

 

Based on my observations, the MAC address is present in the Dynamic Address List on the Dell switch (equivalent of CAM table).  So why is the Cisco expiring it (presumably after 600 seconds have passed)?

 

I encountered this post which asserts:

 

Make sure you have all your vkernel ports on separate subnets e.g.  separate vmotion/management/iscsi. Failure to do this can cause lots of flooding during vmotion as the  physical switch does not learn the MAC address for the vmotion port  correctly. And continuously broadcasts to find it.

 

Is this true? Is it a known bug? Is there any documentation describing this? My theory was that the ESXi server acting as the vMotion source still has the destination IP in its ARP table. That entry hasn't expired when the Cisco prunes the MAC address from its CAM table, so the host keeps transmitting without sending an ARP request. The Cisco no longer knows where to send the frames, so it floods them as unknown unicast.
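To make the theory concrete, here is a minimal Python sketch of a learning switch whose MAC table ages out entries. It is a toy model, not how any Cisco or Dell switch is actually implemented. The 600-second aging time is taken from the guess above, and the MAC address, port number and timestamps are made up for illustration. The model assumes the vMotion destination never sources a frame through this switch until something (such as the ARP request mentioned above) provokes a reply.

```python
from dataclasses import dataclass, field


@dataclass
class ToySwitch:
    """Toy learning switch with an aging MAC (CAM) table."""
    aging_seconds: int
    table: dict = field(default_factory=dict)  # MAC -> (port, last_seen)

    def learn(self, mac, port, now):
        # A frame *sourced* from `mac` arrived on `port`: learn or refresh it.
        self.table[mac] = (port, now)

    def forward(self, dst_mac, now):
        # Frames to a known, fresh MAC go out one port; otherwise they flood.
        entry = self.table.get(dst_mac)
        if entry is None or now - entry[1] >= self.aging_seconds:
            self.table.pop(dst_mac, None)
            return "flood"
        return f"port {entry[0]}"


VMOTION_MAC = "00:50:56:00:00:01"  # hypothetical vMotion vmknic MAC


def simulate():
    switch = ToySwitch(aging_seconds=600)  # assumed aging time
    switch.learn(VMOTION_MAC, port=7, now=0)  # learned once at t=0
    results = []
    # The source host keeps sending from its cached ARP entry, but the
    # destination MAC never sources a frame, so the entry is never refreshed.
    for now in (0, 300, 600, 900):
        results.append((now, switch.forward(VMOTION_MAC, now)))
    # A third host ARPs for the vMotion IP; the reply re-teaches the switch.
    switch.learn(VMOTION_MAC, port=7, now=900)
    results.append((901, switch.forward(VMOTION_MAC, 901)))
    return results


for when, action in simulate():
    print(when, action)
```

In this model the frames are delivered on one port until the entry ages out, are flooded from then on, and return to a single port as soon as the MAC is relearned, which matches the behaviour described above.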

 

Am I way off in the weeds here?

 

Thanks!

mastrboy Novice 21 posts since
Jun 28, 2007
2. Apr 7, 2011 4:12 AM in response to: rayvd
Re: vMotion causing Unicast Flooding

We are currently experiencing the same problem with vMotion on ESXi 4.1 U1; this problem did not exist on ESX 4.0 Update 2. (We are currently migrating from ESX to ESXi.)

 

I was kind of shocked to find that the vmotion port was the cause of the unicast flooding we experienced.

 

Did you find a "solution" to this, other than having separate dedicated NICs for management and vMotion?

 

It seems like more people are experiencing this with ESXi 4.1: http://serverfault.com/questions/197918/clearing-arp-cache-on-esxi-4-1

mastrboy Novice 21 posts since
Jun 28, 2007
3. Apr 8, 2011 1:29 PM in response to: mastrboy
Re: vMotion causing Unicast Flooding

A workaround that I found is setting "switchport block unicast" on our Cisco switches, but I can't consider this a good solution.

fletch00 Hot Shot vExpert 340 posts since
Nov 1, 2006
4. Apr 13, 2011 12:05 AM in response to: mastrboy
Re: vMotion causing Unicast Flooding

We are also experiencing severe network disruptions during vMotions involving ESXi 4.1 U1 hosts.

Is there a documented bug for this in ESXi 4.1 U1?

 

thanks,

http://vmadmin.info

VCP5 VSP5 VTSP5 vExpert http://vmadmin.info
fletch00 Hot Shot vExpert 340 posts since
Nov 1, 2006
5. Apr 13, 2011 12:53 AM in response to: fletch00
Re: vMotion causing Unicast Flooding
VCP5 VSP5 VTSP5 vExpert http://vmadmin.info
Walfordr Expert 302 posts since
Dec 23, 2008
6. Apr 13, 2011 8:42 AM in response to: fletch00
Re: vMotion causing Unicast Flooding

fletch00,

 

Please let us know the results of the case. I have our migration from ESX 3.5 to ESXi 4.1 U1 coming up soon and did not plan to use a different subnet for vMotion.

 

Thanks,

 

Robert

Robert -- BSIT, VCP3/VCP4, A+, MCP -- Please consider awarding points for "helpful" and/or "correct" answers.
fletch00 Hot Shot vExpert 340 posts since
Nov 1, 2006
7. Apr 15, 2011 10:01 AM in response to: Walfordr
Re: vMotion causing Unicast Flooding

VMware support helped resolve the issue with zero downtime, and our config was brought in line with best practice recommendations.

 

What I thought would be a large network topology change involving downtime turned out to be a live virtual networking config change:

 

http://www.vmadmin.info/2011/04/vmotion-unicast-flood-esxi.html

 

I recommended that they create a KB article for the solution as well.

VCP5 VSP5 VTSP5 vExpert http://vmadmin.info
kamikadz3 Lurker VMware Employees 3 posts since
May 20, 2008
8. Oct 26, 2011 7:43 AM in response to: fletch00
Re: vMotion causing Unicast Flooding

Hi,

 

As someone mentioned before, to solve this problem and avoid similar difficulties in the future you have to redesign your IP space: you have to separate the VMkernel IP range from the Management IP range.

 

 

After that you can check connectivity between the vMotion VMkernel ports with the command

 

 

    # vmkping x.x.x.x

where x.x.x.x is the VMkernel IP address of another ESX host.

 

 

To clarify: this is not a problem with ESX 4.0 or 4.1. The issue can occur after you migrate from ESX Classic to ESXi. In Classic, the management interface is owned by the COS kernel, while the other vmknic ports (for example vMotion and iSCSI) are owned by the VMkernel. In that case management traffic (vswif on ESX Classic) can be in the same IP subnet as any of the VMkernel NICs, since they belong to different kernels.

 

 

Once you have migrated to ESXi, the management network becomes just another vmknic, and it therefore collides with the IP subnet of an existing vmknic.
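A quick way to spot this kind of collision before migrating is to group the planned vmknic addresses by subnet. The following is a minimal Python sketch using only the standard library. The interface names and addresses are hypothetical (documentation ranges), and `find_shared_subnets` is an illustrative helper, not a VMware tool.

```python
import ipaddress
from collections import defaultdict


def find_shared_subnets(vmknics):
    """Return {network: [vmknic names]} for every subnet used by more than one vmknic.

    `vmknics` maps an interface name to its address in "ip/prefix" form.
    """
    by_network = defaultdict(list)
    for name, cidr in vmknics.items():
        network = ipaddress.ip_interface(cidr).network
        by_network[network].append(name)
    return {str(net): names for net, names in by_network.items() if len(names) > 1}


# Hypothetical ESXi host after migration: management is now a vmknic too.
planned = {
    "vmk0 (management)": "192.0.2.10/24",
    "vmk1 (vMotion)": "192.0.2.11/24",
    "vmk2 (iSCSI)": "198.51.100.10/24",
}

print(find_shared_subnets(planned))
# {'192.0.2.0/24': ['vmk0 (management)', 'vmk1 (vMotion)']}
```

This only flags vmknics configured with exactly the same network; overlapping ranges with different prefix lengths would need an additional check.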

"Two things are infinite: the universe and human stupidity; and I'm not sure about the universe." A. Einstein
