12 Replies Latest reply on Jun 14, 2012 1:16 AM by kopper27

    Wrong MAC address used for replies during vMotion (and more?)

    rayvd Enthusiast

      Reference this thread.

       

      We have a vSwitch backed by one physical NIC.  This vSwitch has two "ports" defined on it apart from the VM network devices -- a Management Console port (with MAC address "A") and a vMotion port (with MAC address "B").  No tagging is being done; each of these ports uses a different IP address within the same subnet.

       

      We've noticed that reply traffic with a source IP of the vMotion port is being sent with the Management Console port's MAC address ("A") as its source MAC.  Traffic to the vMotion IP is sent to the correct MAC address ("B").

       

      Since all of the ACKs/responses are coming back with MAC "A", eventually our Cisco switches lose MAC "B" from their CAM tables and unicast flooding begins until we manually inject an ARP request to repopulate the CAM tables with the correct MAC ("B").
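
      The aging effect described above can be illustrated with a toy model. This is only a sketch: the class and function names are made up for illustration, the 300-second aging time is an assumption (check the actual setting on your switch), and a real switch does much more than this. It shows that when the host never sources frames from MAC "B", the entry for "B" ages out and every later frame addressed to "B" is flooded.

```python
# Toy model of a switch CAM (MAC address) table with aging.
# All names and the 300-second aging time are illustrative assumptions.

AGING_SECONDS = 300  # assumed aging time; check your switch's actual setting


class CamTable:
    def __init__(self, aging=AGING_SECONDS):
        self.aging = aging
        self.last_seen = {}  # MAC -> (port, time last seen as a source MAC)

    def learn(self, src_mac, port, now):
        # A switch only learns from the SOURCE MAC of frames it receives.
        self.last_seen[src_mac] = (port, now)

    def lookup(self, dst_mac, now):
        # Return the egress port, or None if the entry is missing or aged out.
        # None means the frame would be flooded to every port in the VLAN.
        entry = self.last_seen.get(dst_mac)
        if entry is None or now - entry[1] > self.aging:
            return None
        return entry[0]


def simulate(duration, reply_src_mac):
    """A peer sends one frame per second to MAC 'B'; the host answers each
    one with a frame whose source MAC is reply_src_mac.
    Returns how many of the peer's frames were flooded."""
    cam = CamTable()
    cam.learn("B", port=1, now=0)  # e.g. learned from an initial ARP reply
    flooded = 0
    for t in range(1, duration + 1):
        if cam.lookup("B", now=t) is None:
            flooded += 1
        cam.learn(reply_src_mac, port=1, now=t)
    return flooded


if __name__ == "__main__":
    print("replies sourced from A:", simulate(600, "A"), "flooded frames")
    print("replies sourced from B:", simulate(600, "B"), "flooded frames")
```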

       

      I understand putting management console traffic along with vMotion traffic on the same physical subnet is not best practice.  But the behavior we're seeing (wrong MAC address being used for the IP on reply packets) seems to be a bug.

       

      Anyone else run into this?

       

      This is with ESXi 4.1.0 260247.

        • 1. Re: Wrong MAC address used for replies during vMotion (and more?)
          bulletprooffool Virtuoso

          As you have said, putting management console traffic along with vMotion traffic on the same physical subnet is not best practice, so I would start looking there.

           

          You also have to be careful to note the actual IP settings, default gateways, etc. for these ports. You are using one physical NIC and, I am guessing, presenting both IPs on the same subnet (or do you have a router-on-a-stick setup?). ESX is simply managing IP traffic the way any other host in a similar scenario would.

           

          Are you able to segregate the traffic for these two ports by tagging VLANs, by any chance?

          Also, is this a production environment, or a lab?

           

          Lastly, are you sure that the 'Management' port does not have VMotion enabled?

          • 2. Re: Wrong MAC address used for replies during vMotion (and more?)
            rayvd Enthusiast

            Thanks for the response.

             

            We're identifying and changing the configuration on these "non-best practices" setups, and yes, I believe this will resolve the issue.  I'll have to check on the default gateway setting.  Perhaps the source MAC chosen is always the one associated with the "port" that has the default gateway assigned.

             

            My main thinking was, if traffic arrives for IP 10.x.x.5, responses sent out with src address 10.x.x.5 should use the source MAC address associated with the port group having that IP, not a port group having a different IP (even if that port group is the one with the default gateway and even if we're dealing with the same physical NIC).  This seems like buggy behavior to me.

             

            May have to do a little testing on a few generic Linux hosts since I presume ESXi's networking stack is still based on the Linux kernel.

            • 3. Re: Wrong MAC address used for replies during vMotion (and more?)
              mcowger Champion

              rayvd wrote:

               

               

              May have to do a little testing on a few generic Linux hosts since I presume ESXi's networking stack is still based on the Linux kernel.

               

              This would be an incorrect assumption that hasn't been true for quite a few years.

              • 4. Re: Wrong MAC address used for replies during vMotion (and more?)
                TMeissner Novice

                We have confirmed that this is true using a WireShark trace.  Even though this configuration is not "best practice" to put these on the same physical subnet, in reality many people do because a 1 Gb link provides more than enough bandwidth to do the job.  Basically, we consider this a bug that should be patched.

                • 5. Re: Wrong MAC address used for replies during vMotion (and more?)
                  rayvd Enthusiast

                  Matt wrote:

                   

                  rayvd wrote:

                   

                   

                  May have to do a little testing on a few generic Linux hosts since I presume ESXi's networking stack is still based on the Linux kernel.

                   

                  This would be an incorrect assumption that hasn't been true for quite a few years.

                   

                  Any idea what kernel or networking components ESXi is based on currently?  I've seen this sort of behavior on Solaris as well.

                  • 6. Re: Wrong MAC address used for replies during vMotion (and more?)
                    rayvd Enthusiast

                    TMeissner wrote:

                     

                    We have confirmed that this is true using a WireShark trace.  Even though this configuration is not "best practice" to put these on the same physical subnet, in reality many people do because a 1 Gb link provides more than enough bandwidth to do the job.  Basically, we consider this a bug that should be patched.

                     

                    Have you engaged VMware Support at all?  As we're in a position to potentially redo how we have things linked up, I wasn't sure this was worth even bothering over -- their manuals do indicate not to wire things up in this manner (even though it's common and practical as you mention).

                    • 7. Re: Wrong MAC address used for replies during vMotion (and more?)
                      mastrboy Novice

                      their manuals do indicate not to wire things up in this manner

                       

                      Could you provide a link to that manual? (I have like ~100 pdf whitepapers from vmware on my drive )

                      • 8. Re: Wrong MAC address used for replies during vMotion (and more?)
                        rayvd Enthusiast

                        mastrboy wrote:

                         

                        their manuals do indicate not to wire things up in this manner

                         

                        Could you provide a link to that manual? (I have like ~100 pdf whitepapers from vmware on my drive )

                         

                        Apparently I didn't save the exact manual either... but try the Performance Best Practices for VMware vSphere 4.1.  Page 12 or so recommends separate logical networks at least for vmknic traffic.

                         

                        However, this reads as a recommendation based more on capacity than based on preventing the issues we're seeing.

                         

                        The iSCSI SAN Configuration Guide for 4.1 may also have some information on this.

                        • 9. Re: Wrong MAC address used for replies during vMotion (and more?)
                          fletch00 Hot Shot

                          I just opened a case on this - we can not have network disruptions due to vMotions.

                          This did not exist in our environment prior to migrating from ESX to ESXi!

                           

                          http://vmadmin.info

                          • 10. Re: Wrong MAC address used for replies during vMotion (and more?)
                            jbajba Lurker

                            Hello,

                             

                            I think I am seeing a similar issue with ESXi 4.1 U1:

                             

                            After a vMotion, the ESXi host still claims to have the MAC of the migrated VM.

                            The migrated VM is consequently unreachable from all VMs hosted on the former host.

                            The migrated VM is reachable from all networks except its former ESX host and the VMs attached to it.

                             

                            Currently, the only two ways to let the former host (and its VMs) reach the migrated VM again are:

                            - reboot

                            - vmkload_mod -u e1000e; sleep 5 ; vmkload_mod e1000e

                             

                             

                            NICs are Intel 82574L (e1000e)

                             

                            2 VMkernel ports (vMotion and Management) on a VLAN (same subnet)

                            1 VMkernel port (Management) on a routed LAN (different subnet)

                             

                             

                             

                             

                            The servers are hosted in a French datacenter (OVH), and they claim their CAM tables are not involved. There is no way for us to get switch monitoring.

                             

                            I opened a case but VMware support was not able to reproduce this problem.

                             

                            Does anyone have an idea?

                             

                             

                            Thank you.

                             

                            JB

                            • 11. Re: Wrong MAC address used for replies during vMotion (and more?)
                              KReagan Novice

                              The VMware site does not mention unicast flooding in any officially published documentation and, per VMware Networking Support, it is not mentioned in internal documentation either.

                               

                              The ESXi Server Config Guide, page 67, second bullet under Networking Best Practices, says:

                              "Keep the vMotion connection on a separate network devoted to vMotion"

                               

                              And that implies not having the management network and the vMotion network in the same network; otherwise it would not be a 'separate network devoted to vMotion'.

                               

                              However, that recommendation reads as if it is being made for security and speed, i.e., machines being sent across the wire are unencrypted, and if you want the best performance you should have a dedicated network.

                               

                              It is a supported configuration; (see last paragraph of solution)

                              http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1013077

                               

                              And the unicast storm is not unexpected;

                              "all outgoing traffic to the IP subnet always transmit on the first vmknic"

                               

                              And it is not a supported configuration; (See note under solution)

                              http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006989

                               

                              And finally, this thread gives the reasons why the physical switch resorts to unicast flooding.

                              http://communities.vmware.com/message/1732987
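
                              To make the KB statement quoted above concrete ("all outgoing traffic to the IP subnet always transmit on the first vmknic"), here is a rough model contrasting that rule with what the original poster expected. It is only a sketch, not ESXi's actual code: the `Vmknic` class, the 10.0.0.x addresses, the /24 prefix, and the assumption that "first" means list order with the management vmknic first are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Vmknic:
    name: str
    ip: str
    mac: str
    prefix: int = 24  # assumed prefix length, not taken from the thread


def first_on_subnet(vmknics, dst_ip):
    """Rule from the KB quote: use the first vmknic whose subnet contains
    the destination IP, regardless of the packet's source IP."""
    dst = ipaddress.ip_address(dst_ip)
    for nic in vmknics:
        net = ipaddress.ip_network(f"{nic.ip}/{nic.prefix}", strict=False)
        if dst in net:
            return nic
    return None


def owner_of_source_ip(vmknics, src_ip):
    """Rule the original poster expected: use the vmknic that owns the
    packet's source IP."""
    for nic in vmknics:
        if nic.ip == src_ip:
            return nic
    return None


# Hypothetical host with management and vMotion on the same subnet.
vmknics = [
    Vmknic("vmk0 (management)", "10.0.0.4", "A"),
    Vmknic("vmk1 (vMotion)", "10.0.0.5", "B"),
]

# A reply from the vMotion IP to a peer on the same subnet.
reply_src_ip, reply_dst_ip = "10.0.0.5", "10.0.0.99"
print("first vmknic on subnet ->", first_on_subnet(vmknics, reply_dst_ip).mac)
print("owner of source IP     ->", owner_of_source_ip(vmknics, reply_src_ip).mac)
```

                              Under the first rule the reply leaves with MAC "A" even though its source IP belongs to the vMotion port, which matches what was reported at the top of this thread.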

                              • 12. Re: Wrong MAC address used for replies during vMotion (and more?)
                                kopper27 Expert

                                Does anyone know whether this affects ESXi 4.1 Update 2?

                                 

                                http://www.vmadmin.info/2011/04/vmotion-unicast-flood-esxi.html

                                 

                                According to this

                                 

                                Right now my Management and vMotion are like this (2 hosts)

                                 

                                Host 1

                                Management - 192.168.23.240

                                vMotion - 192.168.23.241

                                Gateway 192.168.23.1

                                 

                                Host 2

                                Management - 192.168.23.242

                                vMotion - 192.168.23.243

                                Gateway 192.168.23.1

                                 

                                 

                                So should I create a vMotion network on 10.10.10.x, for instance?

                                 

                                vMotion host 1 : 10.10.10.5

                                vMotion host 2 : 10.10.10.6

                                 

                                and same Gateway 192.168.23.1

                                 

                                Or might something like this be enough?

                                 

                                vMotion host 1 : 192.168.30.5

                                vMotion host 2 : 192.168.30.6

                                 

                                Let me know guys

                                thanks a lot
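
                                One way to sanity-check the two options above is to test whether each proposed vMotion address falls in the same subnet as the management address. This is a minimal sketch using Python's standard `ipaddress` module; the /24 prefix (255.255.255.0) is an assumption, since no netmask is given in the post, and `same_subnet` is just an illustrative helper.

```python
import ipaddress

PREFIX = 24  # assumed netmask 255.255.255.0; the post does not state one


def same_subnet(ip_a, ip_b, prefix=PREFIX):
    """Return True if both addresses fall in the same network at this prefix."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix}").network
    return net_a == net_b


management = "192.168.23.240"  # Host 1 management address from the post
for label, vmotion in [
    ("current", "192.168.23.241"),
    ("option 1", "10.10.10.5"),
    ("option 2", "192.168.30.5"),
]:
    shared = same_subnet(management, vmotion)
    print(f"{label}: vMotion {vmotion} shares the management subnet: {shared}")
```

                                Under that /24 assumption, only the current layout puts vMotion and management on the same subnet; either proposed range separates them.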