6 Replies Latest reply on Oct 15, 2015 2:49 AM by bpp

    vDS - Dropped Egress packets

    DredlinE Novice

      Hello everyone!

      I need a bit of help with packet drops and vDS port statistics.

       

       

      The problem is:

      I see millions of Exception Ingress packets on all 4 uplinks of my single vDS, and probably tens of millions of Dropped Egress packets on every vDS port, regardless of the port group it belongs to.

       

       

      What I have looked into:

           - I can see these drops in the VM-level statistics (Performance > Network) as well, but only for RX.

           - I don't experience any connectivity problems at the application level, though (no packet loss in guest OS logs or during ping).

           - Physical switch statistics report no error/discard packets.

           - Nothing meaningful in the vSphere logs

           - Low traffic and low VM load

           - No warnings reported on vSphere, vDS switch health monitoring is enabled and reports normal status

       

       

      This is why I would like to ask several questions:

           - What does "Dropped Egress packets" actually mean in terms of vDS stats? What are these packets, and why are they dropped?

           - How can these packet counters be reset?

           - Are there any low-level network diagnostic tools available for vSphere/ESXi?

       

      Versions: ESXi 5.1u1, vCenter 5.1u1

       

       

      I would highly appreciate it if someone could give me a clue about what's going on, or at least suggest a direction for further investigation.

      Thanks.

        • 1. Re: vDS - Dropped Egress packets
          DredlinE Novice

          Some new updates and findings:

          I've made some changes to my infrastructure, such as migrating some VMs to a different vSwitch, reconfiguring port group policies, and updating and restarting hosts.

          This led to a substantial decrease in dropped packets on the vDS and no more exception packets, but still:

          - there are beacon probing (ethertype 0x8922) broadcasts on the network (despite it being disabled on every host: Net.MaxBeaconsAtOnce is set to 0).

          - the number of dropped packets is comparable to the number of beacon packets being broadcast.

          Can anyone explain this or suggest why this is happening?

          • 2. Re: vDS - Dropped Egress packets
            MKguy Virtuoso

            As you mentioned yourself, you're not actually experiencing issues even though such high numbers of dropped packets are displayed, right?

             

            Incorrect reporting of dropped packets seems to have been a known issue for a while:

            http://kb.vmware.com/kb/2052917

            This issue occurs when packets filtered by the IO chain are incorrectly recorded as dropped packets. This is a reporting issue, the packets are not dropped, hence they cannot be seen using esxtop or other network monitoring tools.

             

            Also see:

            https://communities.vmware.com/message/2272239#2272239

            https://communities.vmware.com/thread/452787

             

             

            - there are beacon probing (ethertype 0x8922) broadcasts on the network (despite it being disabled on every host: Net.MaxBeaconsAtOnce is set to 0).

            The 0x8922 ethertype broadcasts are used not only for beacon probing, but also for the new 5.1 distributed vSwitch network health check feature. Have you enabled that?

            The source MAC of these frames is encoded in the format 00:50:56:5[random value]:[last 2 bytes of the physical vmnic MAC].
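
For anyone who wants to pick these frames out of a capture, here is a minimal Python sketch of a source-MAC check based on the format described above. Treating the "random value" as a single hex digit is an assumption, the function name is made up for illustration, and the sample addresses are hypothetical. A match only means the address fits the layout, not that the frame definitely came from health check.

```python
import re

# Layout described above: 00:50:56:5<x>:<yy>:<zz>
# Assumption: <x> is one hex digit and <yy>:<zz> are the last two bytes
# of the physical vmnic MAC.
HEALTH_CHECK_SRC = re.compile(r"^00:50:56:5[0-9a-f]:[0-9a-f]{2}:[0-9a-f]{2}$")


def looks_like_health_check_source(mac: str) -> bool:
    """Return True if the MAC fits the 00:50:56:5x:yy:zz layout."""
    return HEALTH_CHECK_SRC.match(mac.strip().lower()) is not None


# Hypothetical sample addresses, not taken from a real capture.
for sample in ("00:50:56:5a:12:34", "00:50:56:bf:12:34"):
    print(sample, looks_like_health_check_source(sample))
```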

            • 3. Re: vDS - Dropped Egress packets
              DredlinE Novice

              Incorrect reporting of dropped packets seems to have been a known issue for a while:

              http://kb.vmware.com/kb/2052917

              Yes, I found this KB recently. Thank you!

              As I mentioned in my previous post, the only dropped packets I see at the moment are ethertype 0x8922 packets.

               

              The 0x8922 ethertype broadcasts are used not only for beacon probing, but also for the new 5.1 distributed vSwitch network health check feature. Have you enabled that?

              The source MAC of these frames is encoded in the format 00:50:56:5[random value]:[last 2 bytes of the physical vmnic MAC].

              This is exactly what I see. Many thanks for the lead. I disabled Health Check, and there are no more broadcasts.

               

              The only thing left is that when Health Check is ON again, all my Ubuntu (13.04) VMs report dropped packets, which correspond exactly to the ethertype 0x8922 broadcasts captured by tcpdump from another VM.

               

              user01@ubuntu01:~$ ifconfig eth0

              eth0      Link encap:Ethernet  HWaddr 00:50:56:bf:7c:48

                        inet addr:10.113.0.177  Bcast:10.113.0.255  Mask:255.255.255.0

                        inet6 addr: fe80::250:56ff:febf:7c48/64 Scope:Link

                        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

                        RX packets:3250670 errors:0 dropped:91575 overruns:0 frame:0

                        TX packets:4468580 errors:0 dropped:0 overruns:0 carrier:0

                        collisions:0 txqueuelen:1000

                        RX bytes:2785809569 (2.7 GB)  TX bytes:2847014532 (2.8 GB)

               

              I presume Ubuntu is not able to recognize them. I tried different NIC types, and VMware Tools is up to date. Is there any solution to that?

              • 4. Re: vDS - Dropped Egress packets
                MKguy Virtuoso

                If it really just drops the 0x8922 frames because it doesn't recognize this ethertype, then this can be safely ignored, even though it may look a bit eyebrow-raising.

                 

                I tested a bit with CentOS 6.3 VMs but I wasn't able to reproduce the dropped packet count increase you're seeing on your Ubuntu VMs.

                Did you increase the MTU on the dvSwitch? The size of the MTU Health Check 0x8922 frames should correspond to the MTU you've set on the dvSwitch. I thought maybe your VM drops them because they are too large.

                Does the dropped frame counter increase in a VM if you're running tcpdump on it (running it should "accept" all frames)?
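
One way to answer that question is to sample the guest's RX drop counter with and without tcpdump running. A rough Python sketch, assuming a Linux guest that exposes the kernel counter under /sys/class/net/<iface>/statistics/rx_dropped; the function names and the sysfs_root parameter are only illustrative, and the value may not be identical to what ifconfig prints.

```python
import time
from pathlib import Path


def read_rx_dropped(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the kernel's rx_dropped counter for one interface."""
    path = Path(sysfs_root) / iface / "statistics" / "rx_dropped"
    return int(path.read_text().strip())


def drops_during(iface: str, seconds: float,
                 sysfs_root: str = "/sys/class/net") -> int:
    """Return how much rx_dropped grew over the given interval."""
    before = read_rx_dropped(iface, sysfs_root)
    time.sleep(seconds)
    return read_rx_dropped(iface, sysfs_root) - before
```

Calling drops_during("eth0", 30) once with tcpdump stopped and once with it running would show whether the capture changes the drop rate.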

                 

                I also tried crafting custom layer 2 frames with scapy, using the same or other random ethertypes and larger frame sizes, but I never got the dropped packet counter to increase. I'm not sure when the kernel would decide to drop frames, except maybe for wrong checksums or such.

                 

                With the scapy packet crafting tool I tried stuff like this:

                Source scapy VM:

                # scapy

                >>> a=Ether(dst="FF:FF:FF:FF:FF:FF", src="00:50:56:22:44:55", type=0x8922)

                >>> a

                <Ether  dst=FF:FF:FF:FF:FF:FF src=00:50:56:22:44:55 type=0x8922 |>

                >>> sendp(a)

                .

                Sent 1 packets.


                 

                >>> payload="Some dummy data that is 1400 byte long ....[...]"

                >>> a/payload

                <Ether  dst=FF:FF:FF:FF:FF:FF src=00:50:56:82:35:15 type=0x8922 |<Raw  load='Some dummy data that is 1400 byte long ...........[...]

                >>> sendp(a/payload)

                .

                Sent 1 packets.

                 

                 

                 

                Destination VM (with eth MTU set to 1300):

                # tcpdump -i eth0 -nnvvve not ip and not arp

                11:39:16.448518 00:50:56:22:44:55 > ff:ff:ff:ff:ff:ff, ethertype Unknown (0x8922), length 60:



                11:43:39.923808 00:50:56:82:35:15 > ff:ff:ff:ff:ff:ff, ethertype Unknown (0x8922), length 1414:

                    0x0000:  536f 6d65 2064 756d 6d79 2064 6174 6120  Some.dummy.data.

                    0x0010:  7468 6174 2069 7320 3134 3030 2062 7974  that.is.1400.byt

                    0x0020:  6520 6c6f 6e67 202e 2e2e 2e2e 2e2e 2e2e  e.long..........

                 

                 

                Using unicast MAC of the destination or other exotic/non-existent ethertypes (http://standards.ieee.org/develop/regauth/ethertype/eth.txt) didn't make a difference, I never saw drops in the guest.

                Check if you can reproduce the drops with scapy or some other packet crafting tool like this. But like I said above, it shouldn't be something to worry about if your Ubuntu is just dropping ethertypes it doesn't know.
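
If scapy is not available on a test VM, roughly the same frames can be built with the Python standard library. This is only a sketch: build_frame and send_frame are illustrative names, send_frame assumes Linux (AF_PACKET) and root privileges, and the explicit padding to 60 bytes is a choice made here so that the frame length matches what tcpdump showed above.

```python
import socket
import struct

ETH_MIN_FRAME = 60  # minimum Ethernet frame length, not counting the 4-byte FCS


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_frame(dst: str, src: str, ethertype: int, payload: bytes = b"") -> bytes:
    """Build dst + src + ethertype + payload, zero-padded to the minimum size."""
    header = mac_to_bytes(dst) + mac_to_bytes(src) + struct.pack("!H", ethertype)
    return (header + payload).ljust(ETH_MIN_FRAME, b"\x00")


def send_frame(iface: str, frame: bytes) -> int:
    """Send one raw frame on iface. Linux only, needs root."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((iface, 0))
        return sock.send(frame)


# Same addresses and ethertype as the first scapy example above.
frame = build_frame("ff:ff:ff:ff:ff:ff", "00:50:56:22:44:55", 0x8922)
print(len(frame), frame[12:14].hex())
```

On the sending VM, send_frame("eth0", frame) would put the frame on the wire; the interface name is a placeholder.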

                • 5. Re: vDS - Dropped Egress packets
                  DredlinE Novice

                  I tested a bit with CentOS 6.3

                  I've got CentOS 6.4 in my cloud: no drops at all at the OS level, although vCenter shows a small number of dropped packets on this VM.

                  The MTU everywhere is 1500 bytes.

                  Tried out some tests:

                  1. Disabled vDS healthcheck and this is what I see:

                   

                  user01@ubuntu01:~$ netstat -i

                  Kernel Interface table

                  Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg

                  eth0       1500 0   3278579      0 110388 0       4471703      0      0      0 BMRU

                  lo        65536 0     84937      0      0 0         84937      0      0      0 LRU

                   

                  2. Created and sent 3 packets using scapy from CentOS to FF:FF:FF:FF:FF:FF

                   

                  user01@ubuntu01:~$ sudo tcpdump port not 22

                  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

                  listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

                  12:38:39.113905 00:50:56:bf:2a:7c (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 60:

                          0x0000:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............

                  12:38:39.417839 00:50:56:bf:2a:7c (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 60:

                          0x0000:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............

                  12:38:39.761906 00:50:56:bf:2a:7c (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 60:

                          0x0000:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................

                          0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............

                  ^C

                  3 packets captured

                  3 packets received by filter

                  0 packets dropped by kernel


                  user01@ubuntu01:~$ netstat -i

                  Kernel Interface table

                  Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg

                  eth0       1500 0   3278626      0 110388 0       4471762      0      0      0 BMRU

                  lo        65536 0     84943      0      0 0         84943      0      0      0 LRU

                  user01@ubuntu01:~$

                   

                  As seen, no drops. But wait, there is more!

                   

                  3. Enabled health check.

                   

                  user01@ubuntu01:~$ sudo tcpdump port not 22 | grep -i ethertype

                  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

                  listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

                  12:40:59.568215 00:50:56:51:96:6d (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.568555 00:50:56:5c:79:7d (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.568909 00:50:56:5c:79:7f (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.569112 00:50:56:5c:79:7e (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.569460 00:50:56:51:9a:e0 (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.570060 00:50:56:5c:7a:15 (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.570337 00:50:56:5c:7a:17 (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  12:40:59.570708 00:50:56:5c:7a:16 (oui Unknown) > Broadcast, ethertype Unknown (0x8922), length 1500:

                  ^C8 packets captured

                  8 packets received by filter

                  0 packets dropped by kernel

                   

                   

                  user01@ubuntu01:~$ netstat -i

                  Kernel Interface table

                  Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg

                  eth0       1500 0   3278689      0 110388 0       4471815      0      0      0 BMRU

                  lo        65536 0     84956      0      0 0         84956      0      0      0 LRU


                  Five seconds later:

                  user01@ubuntu01:~$ netstat -i

                  Kernel Interface table

                  Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg

                  eth0       1500 0   3278716      0 110392 0       4471847      0      0      0 BMRU

                  lo        65536 0     84956      0      0 0         84956      0      0      0 LRU

                   

                  RX-DRP increments by 4 or 8 periodically. The number of dropped packets in the vCenter stats increases at the same steady pace but is slightly larger, so probably something else is being dropped as well.

                  I will try to investigate further. Anyway, thanks a lot for the advice!
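
To compare these snapshots without eyeballing them, something like the following Python sketch could diff two "netstat -i" outputs. It assumes the 12-column layout shown above (other net-tools versions may print different columns), and the function names are illustrative.

```python
def parse_netstat_i(text: str) -> dict:
    """Map interface name to selected counters from `netstat -i` output."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 12 columns with a numeric MTU; header rows are skipped.
        if len(fields) != 12 or not fields[1].isdigit():
            continue
        counters[fields[0]] = {
            "rx_ok": int(fields[3]),
            "rx_drp": int(fields[5]),
            "tx_ok": int(fields[7]),
        }
    return counters


def counter_delta(before: str, after: str, iface: str) -> dict:
    """Return per-counter growth for iface between two snapshots."""
    old = parse_netstat_i(before)[iface]
    new = parse_netstat_i(after)[iface]
    return {name: new[name] - old[name] for name in old}


# The two eth0 rows pasted above, taken about five seconds apart.
first = "eth0       1500 0   3278689      0 110388 0       4471815      0      0      0 BMRU"
second = "eth0       1500 0   3278716      0 110392 0       4471847      0      0      0 BMRU"
print(counter_delta(first, second, "eth0"))
```

For the two eth0 snapshots above this gives an RX-DRP increase of 4 against an RX-OK increase of 27.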

                  • 6. Re: vDS - Dropped Egress packets
                    bpp Lurker

                    Hi all,

                     

                    I had the same problem now in Oct-2015...

                    11:27:07.593695 00:50:56:5d:01:90 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:07.593855 00:50:56:5d:01:98 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:14.553732 00:50:56:5d:01:40 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:14.556266 00:50:56:5f:9e:b8 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:15.369990 00:50:56:5f:f0:28 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:15.371247 00:50:56:5f:f0:40 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:17.563098 00:50:56:5f:f0:58 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:17.564822 00:50:56:5f:f0:70 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:19.378221 00:50:56:53:f9:c8 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:19.378844 00:50:56:5f:9e:98 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:26.214389 00:50:56:5f:f0:48 > Broadcast, ethertype Unknown (0x8922), length 1496:

                    11:27:26.214823 00:50:56:5f:9e:a8 > Broadcast, ethertype Unknown (0x8922), length 1496:

                     

                    To cut a long story short: the MAC addresses belong to ESXi server ports named Shadow, which are used for Health Check monitoring.

                    I just disabled health check for VLAN and MTU.

                     

                    All unknown packets vanished.
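
To double-check a before/after capture, a small Python sketch like this can tally 0x8922 frames per source MAC from saved tcpdump text. The regular expression is an assumption based on the output format pasted in this thread, and the names are illustrative.

```python
import re
from collections import Counter

# Matches lines such as:
# 11:27:07.593855 00:50:56:5d:01:98 > Broadcast, ethertype Unknown (0x8922), length 1496:
LINE = re.compile(
    r"^\s*(?P<ts>[\d:.]+) (?P<src>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
    r"(?: \(oui [^)]*\))? > (?P<dst>\S+), "
    r"ethertype \S+ \((?P<etype>0x[0-9a-f]{4})\), length (?P<length>\d+)"
)


def count_by_source(capture: str, ethertype: int = 0x8922) -> Counter:
    """Count frames of the given ethertype per source MAC."""
    counts = Counter()
    for line in capture.splitlines():
        match = LINE.match(line)
        if match and int(match.group("etype"), 16) == ethertype:
            counts[match.group("src")] += 1
    return counts


# Two lines copied from the capture above.
sample = """
11:27:07.593855 00:50:56:5d:01:98 > Broadcast, ethertype Unknown (0x8922), length 1496:
11:27:14.553732 00:50:56:5d:01:40 > Broadcast, ethertype Unknown (0x8922), length 1496:
"""
print(count_by_source(sample))
```

An empty result after disabling health check would match what is described here.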

                     

                    FYI:

                    The vSphere version: VMware ESXi, 5.5.0, 2718055 ((Updated) HP-ESXi-5.5.0-Update2-iso-5.77.3)

                     

                    Regards,

                     

                    Badri