30 Replies  Latest reply on Jul 31, 2019 12:52 PM by JasonNash
      • 15. Re: Write latency and network errors
        rafficvmware Novice
        VMware Employees

        Did anyone get a solution for this?

        • 16. Re: Write latency and network errors
          rphoon Novice

          Just wondering if you have checked the upstream switches and the MTU settings on all the VMkernel NICs. Mismatched MTUs may cause retries and network inconsistencies.
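A quick way to test for a jumbo-frame mismatch along the whole path is vmkping with don't-fragment set. A minimal sketch (the vmkernel interface name and peer IP below are placeholders, not values from this thread):

```shell
# Test end-to-end jumbo MTU from an ESXi host: -d sets don't-fragment, and the
# ICMP payload must be the MTU minus 28 bytes of IP + ICMP headers.
MTU=9000
PAYLOAD=$((MTU - 28))   # 8972 for a 9000-byte MTU
# vmk2 and 192.168.50.12 are placeholders for your vSAN vmkernel port and a peer host
echo "vmkping -I vmk2 -d -s ${PAYLOAD} 192.168.50.12"
```

If this fails while a standard-size ping succeeds, some device in the path is carrying a smaller MTU than the vmkernel ports expect.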

          • 17. Re: Write latency and network errors
            InfiniVirt Novice

            We are having the very same issue, albeit much worse. We are seeing latencies surpassing 1400 ms (!) on a relatively empty 12-node vSAN stretched cluster (SR# 18750505903). The link between sites is less than 30% used with >1ms latency. The issue was discovered when a SQL Server VM with a 1.5 TB database was migrated into the cluster and began having major application issues.

             

            vSAN 6.2, ESXi 6.0.0 build 3620759.

            Cisco UCS C240 M4 hardware with enterprise-grade SAS SSDs/HDDs. The cluster is completely symmetrical. Hosts consist of 2 disk groups of 8 disks: (1) 400 GB enterprise SAS SSD / (7) 1.2 TB 10K SAS HDDs. The vSAN HCL has been validated multiple times for incorrect drivers, firmware, and even hardware. All check out.

             

            I'm not seeing any pause frames on the upstream UCS Fabric Interconnects. Flow Control is not configured either, nor does it appear to be configurable on the VIC 1227:

             

            [root@-------vsan-06:~] esxcli system module parameters list -m enic

            Name               Type  Value  Description

            -----------------  ----  -----  -------------------------------------------------------------------------

            heap_initial       int          Initial heap size allocated for the driver.

            heap_max           int          Maximum attainable heap size for the driver.

            skb_mpool_initial  int          Driver's minimum private socket buffer memory pool size.

            skb_mpool_max      int          Maximum attainable private socket buffer memory pool size for the driver.

             

            [root@-------vsan-06:~] ethtool -a vmnic5

            Pause parameters for vmnic5:

            Cannot get device pause settings: Operation not supported

             

            Per KB 2146267 I tried disabling the dedup scanner, but this did not improve anything. I also updated the pNIC drivers, and that didn't help either.

             

            • 18. Re: Write latency and network errors
              LeslieBNS9 Enthusiast

              We are also seeing a lot of these errors in our all-flash vSAN environment. We've been doing some testing and think we have narrowed down the issue.

               

              We have 6 hosts with the following configuration:

              SuperMicro 1028U-TR4+

              2xIntel E5-2680v4

              512GB RAM

              X710-DA2 10GB Network Adapters (Dedicated for vSAN, not shared)

              Cisco 3548 Switches (Dedicated for vSAN, not shared)

               

              We went through different drivers/firmware on our X710s, but so far none of that has made a difference.

               

              We noticed on our Cisco switches that all of the interfaces connected to our vSAN hosts were seeing discards on a regular basis (multiple times every hour). We opened a support case with Cisco to troubleshoot this and found that ALL of our vSAN ports have bursts of traffic that fill up the output buffers on the switch. During these bursts, once the buffers are full, the switch discards packets.

               

              So I would check on your switches to see if you are having any packet discards.
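If you have `show queuing interface` output saved off-box, a quick filter for non-zero drop counters looks like this. A sketch only; the inlined sample lines mirror the counters pasted in this thread:

```shell
# Scan saved NX-OS 'show queuing interface' output for non-zero drop counters.
# The sample input below mirrors the counters shown earlier in this thread.
cat > /tmp/queuing.txt <<'EOF'
    Mcast pkts dropped                      : 0
    Ucast pkts dropped                      : 180616
EOF
# Print the traffic class and count for any counter greater than zero
RESULT=$(awk '/pkts dropped/ && $NF+0 > 0 {print $1, $NF}' /tmp/queuing.txt)
echo "$RESULT"   # prints: Ucast 180616
```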

               

              At this point Cisco is recommending we move to a deep-buffer switch. I spoke with VMware support to see if there is a specific switch (or buffer size) they recommend, but they said they just require a 10Gb switch. I find this frustrating, as we have 2 expensive switches we are only using 6 ports on and may not be able to add any more hosts to.

               

              Ethernet1/2 queuing information:

                  qos-group  sched-type  oper-bandwidth

                      0       WRR            100

                  Multicast statistics:

                      Mcast pkts dropped                      : 0

                  Unicast statistics:

                  qos-group 0

                  HW MTU: 16356 (16356 configured)

                  drop-type: drop, xon: 0, xoff: 0

                  Statistics:

                      Ucast pkts dropped                      : 180616

               

              Ethernet1/2 is up

              Dedicated Interface

                Hardware: 100/1000/10000 Ethernet, address: 00d7.8faa.cf09 (bia 00d7.8faa.cf09)

                MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec

                reliability 255/255, txload 2/255, rxload 4/255

                Encapsulation ARPA

                Port mode is access

                full-duplex, 10 Gb/s, media type is 10G

                Beacon is turned off

                Input flow-control is off, output flow-control is off

                Rate mode is dedicated

                Switchport monitor is off

                EtherType is 0x8100

                Last link flapped 4d12h

                Last clearing of "show interface" counters 3d23h

                0 interface resets

                Load-Interval #1: 30 seconds

                30 seconds input rate 98177624 bits/sec, 4262 packets/sec

                30 seconds output rate 124356600 bits/sec, 4302 packets/sec

                Load-Interval #2: 5 minute (300 seconds)

                  input rate 163.09 Mbps, 6.20 Kpps; output rate 113.03 Mbps, 6.33 Kpps

                RX

                  2620601947 unicast packets  5716 multicast packets  335 broadcast packets

                  2620612576 input packets  10625804438347 bytes

                  1353181073 jumbo packets  0 storm suppression bytes

                  0 runts  0 giants  0 CRC  0 no buffer

                  0 input error  0 short frame  0 overrun   0 underrun  0 ignored

                  0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop

                  0 input with dribble  0 input discard

                  0 Rx pause

                TX

                  2619585440 unicast packets  0 multicast packets  2452 broadcast packets

                  2619587892 output packets  9072740199246 bytes

                  1162617883 jumbo packets

                  0 output errors  0 collision  0 deferred  0 late collision

                  0 lost carrier  0 no carrier  0 babble 180616 output discard

                  0 Tx pause

              • 19. Re: Write latency and network errors
                Great_White_Tec Expert
                VMware Employees, vExpert

                For NIC issues, here is a typical checklist:

                • Make sure the NICs are on the vSphere VCG
                • Not only make sure that firmware and drivers are up to date (latest), BUT also that there are no mismatches
                  • Mismatches between these two have been known to cause issues, in particular packet drops, based on my experience
                • For the X710 (X71x & X72x), disabling LRO/TSO has resolved a lot of the issues encountered in the past.
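The host-level toggles for hardware TSO and LRO are documented in VMware KB 2055140; a hedged sketch (verify the option names against your ESXi build before applying, and note these settings affect every VM on the host):

```shell
# Hedged sketch: disable hardware TSO and default LRO at the host level on
# ESXi (option names per VMware KB 2055140 -- verify against your build).
# Guarded so it only applies on an actual ESXi shell.
if command -v esxcli >/dev/null 2>&1; then
  esxcli system settings advanced set -o /Net/UseHwTSO -i 0            # hardware TSO off
  esxcli system settings advanced set -o /Net/TcpipDefLROEnabled -i 0  # LRO off
  STATUS="applied"
else
  STATUS="esxcli not found: run this on the ESXi host shell"
fi
echo "$STATUS"
```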
                • 20. Re: Write latency and network errors
                  LeslieBNS9 Enthusiast

                  For the X710 (X71x & 72x) disabling LRO / TSO have resolved a lot of the issues encountered in the past.

                   

                  We are aware of the LRO/TSO errors and the firmware/driver version recommendations for the X710s and have already been through all of those settings.

                  • 21. Re: Write latency and network errors
                    LeslieBNS9 Enthusiast

                    Also all of our hardware is on the HCL and has matching drivers/firmware.

                     

                    I actually posted another thread specific to my issue: All-Flash vSAN Latency & Network Discards (Switching Recommendations)

                     

                    I just wanted to give the poster here some reference in case they are seeing the same thing we are seeing.

                    • 22. Re: Write latency and network errors
                      InfiniVirt Novice

                      Thanks LeslieBNS9. I believe we are experiencing a similar cause.

                       

                      Instead of uplinking our UCS servers directly to switches, they first connect to the 6248 Fabric Interconnects, which then uplink to Nexus 7010s via (2) 40GE vPCs. The Fabric Interconnects are discarding packets, as evidenced by "show queuing interface" on all active vSAN interfaces. Because of the way we have the vmnics situated in VMware (Active/Standby), Fabric B is effectively dedicated to vSAN traffic, and the cluster is idle. So this is not a bandwidth issue or even contention; rather, the FI's scrawny buffer assigned to custom QoS System Classes in UCS is not able to handle bursts. We have QoS configured per the Cisco vSAN reference doc. Platinum CoS is assigned to qos-group 2, which only has a queue/buffer size of 22720! NX-OS on the UCS FIs is read-only, so this is not configurable.

                       

                      I will probably disable the Platinum QoS System Class and assign the vSAN vNICs to Best Effort, so we can at least increase the available queue size to 150720.

                      Ethernet1/1 queuing information:

                        TX Queuing

                          qos-group  sched-type  oper-bandwidth

                              0       WRR              3   (Best Effort)

                              1       WRR             17  (FCoE)

                              2       WRR             31  (VSAN)

                              3       WRR             25  (VM)

                              4       WRR             18  (vMotion)

                              5       WRR              6   (Mgmt)

                       

                      RX Queuing

                          qos-group 0

                          q-size: 150720, HW MTU: 1500 (1500 configured)

                          drop-type: drop, xon: 0, xoff: 150720

                       

                      qos-group 1

                          q-size: 79360, HW MTU: 2158 (2158 configured)

                          drop-type: no-drop, xon: 20480, xoff: 40320

                       

                      qos-group 2

                          q-size: 22720, HW MTU: 1500 (1500 configured)

                          drop-type: drop, xon: 0, xoff: 22720

                          Statistics:

                              Pkts received over the port             : 256270856

                              Ucast pkts sent to the cross-bar        : 187972399

                              Mcast pkts sent to the cross-bar        : 63629024

                              Ucast pkts received from the cross-bar  : 1897117447

                              Pkts sent to the port                   : 2433368432

                              Pkts discarded on ingress               : 4669433

                              Per-priority-pause status               : Rx (Inactive), Tx (Inactive)

                       

                      Egress buffers were verified to be congested during a large file copy. The following command reveals congestion on the egress (reference):

                      nap-FI6248-VSAN-B(nxos)#  show hardware internal carmel asic 0 registers match .*STA.*frh.* | i eg

                      Slot 0 Carmel 0 register contents:

                      Register Name                                          | Offset   | Value

                      car_bm_STA_frh_eg_addr_0                               | 0x50340  | 0x1

                      car_bm_STA_frh_eg_addr_1                               | 0x52340  | 0

                      car_bm_STA_frh_eg_addr_2                               | 0x54340  | 0

                      car_bm_STA_frh_eg_addr_3                               | 0x56340  | 0

                      car_bm_STA_frh_eg_addr_4                               | 0x58340  | 0

                      car_bm_STA_frh_eg_addr_5                               | 0x5a340  | 0

                      car_bm_STA_frh_eg_addr_6                               | 0x5c340  | 0

                      car_bm_STA_frh_eg_addr_7                               | 0x5e340  | 0

                      nap-FI6248-VSAN-B(nxos)#  show hardware internal carmel asic 0 registers match .*STA.*frh.* | i eg

                      Slot 0 Carmel 0 register contents:

                      Register Name                                          | Offset   | Value

                      car_bm_STA_frh_eg_addr_0                               | 0x50340  | 0x2

                      car_bm_STA_frh_eg_addr_1                               | 0x52340  | 0

                      car_bm_STA_frh_eg_addr_2                               | 0x54340  | 0

                      car_bm_STA_frh_eg_addr_3                               | 0x56340  | 0

                      car_bm_STA_frh_eg_addr_4                               | 0x58340  | 0

                      car_bm_STA_frh_eg_addr_5                               | 0x5a340  | 0

                      car_bm_STA_frh_eg_addr_6                               | 0x5c340  | 0

                      car_bm_STA_frh_eg_addr_7                               | 0x5e340  | 0

                      nap-FI6248-VSAN-B(nxos)#  show hardware internal carmel asic 0 registers match .*STA.*frh.* | i eg

                      Slot 0 Carmel 0 register contents:

                      Register Name                                          | Offset   | Value

                      car_bm_STA_frh_eg_addr_0                               | 0x50340  | 0

                      car_bm_STA_frh_eg_addr_1                               | 0x52340  | 0

                      car_bm_STA_frh_eg_addr_2                               | 0x54340  | 0

                      car_bm_STA_frh_eg_addr_3                               | 0x56340  | 0x1

                      car_bm_STA_frh_eg_addr_4                               | 0x58340  | 0

                      car_bm_STA_frh_eg_addr_5                               | 0x5a340  | 0

                      car_bm_STA_frh_eg_addr_6                               | 0x5c340  | 0

                      car_bm_STA_frh_eg_addr_7                               | 0x5e340  | 0

                       

                       

                      I should note we are not seeing discards or drops on any of the 'show interface' counters.
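A back-of-envelope calculation shows why a 22720-byte queue drops bursty storage traffic even on an idle cluster: at an assumed 10 Gb/s line rate (an assumption for illustration, not a value from the FI output), it can absorb only about 18 microseconds of burst before it must discard.

```shell
# How long a line-rate burst can a queue absorb before it must drop?
# time = queue_bytes * 8 / line_rate
QUEUE_BYTES=22720
LINE_RATE_BPS=10000000000   # assumed 10 Gb/s uplink
BURST_US=$(awk -v q="$QUEUE_BYTES" -v r="$LINE_RATE_BPS" \
  'BEGIN { printf "%.1f", q * 8 / r * 1e6 }')
echo "A ${QUEUE_BYTES}-byte queue absorbs ~${BURST_US} us at line rate"
```

By the same arithmetic, the 150720-byte Best Effort queue buys roughly 120 microseconds, which is why reassigning the vSAN vNICs helps, but a genuinely deep-buffer switch helps far more.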

                      • 23. Re: Write latency and network errors
                        wreedMH Enthusiast

                        Subscribing. I have the same issues.

                        • 24. Re: Write latency and network errors
                          JimL1651 Novice

                          We're having the same issue on a new 12-node, all-flash stretched cluster with RAID-5 and encryption. Write latency is very high. We have support tickets open with Dell and VMware. We've done testing with HCIBench and SQLIO using different storage policies. RAID-1 is better, but still below what we consider acceptable.

                           

                          The out-of-order packets were caused by having dual uplinks to two different top-of-rack switches. We resolved that by changing them to active/passive instead of active/active. We'll convert to LACP when we get a chance. Networking is all 10Gb with <1 ms latency between hosts and sites. Top-of-rack switches are Cisco Nexus 5Ks, and all error counters are clean. Using iperf from the host shell shows we can easily push greater than 9 Gbit/s between hosts and sites with 0.5 to 0.6 ms latency.

                          • 25. Re: Write latency and network errors
                            pkonz Lurker

                            LeslieBNS9,

                             

                            Did you end up getting a deep-buffer switch? We are having the same issue.

                            • 26. Re: Write latency and network errors
                              TolgaAsik Enthusiast
                              vExpert

                              Hello All,

                               

                              We are experiencing the same issue. Any update on a solution?

                              My switches are Nexus 5548UPs; a lot of packets are being discarded on the switch ports.

                              • 27. Re: Write latency and network errors
                                sk84 Expert
                                vExpert

                                Meanwhile, you can read more and more about packet discards on the switch side in all-flash vSAN configurations. The cause often seems to be the buffers on the switch side. VMware itself gives little or no information about which switch components to use, because they want to be hardware-independent and don't prefer a vendor. But in my personal opinion, most Nexus switches are crap for use in vSAN all-flash configurations, especially if they're over 5 years old and have a shared buffer.

                                 

                                However, John Nicholson (Technical Marketing, vSAN) recently published a post on Reddit that summarizes some points to keep in mind (but it's his personal opinion, not an official statement):

                                1. Don't use Cisco FEXs. Seriously, just don't. Terrible buffers, no port-to-port capabilities. Even Cisco will tell you not to put storage on them.
                                2. Buffers. For a lab, that 4MB-buffer Marvell $1000 special might work, but really 12MB is the minimum buffer I want to see. If you want to go nuts, I've heard some lovely things about those crazy 6GB-buffer StrataDNX DUNE ASIC switches (even Cisco carries one, the Nexus 36xx I think). Dropped frames/packets/re-transmits rapidly slow down storage. That Cisco Nexus 5500 that's 8 years old and has VoQ stuff? Seriously, don't try running a heavy database on it!
                                3. It's 2019. STOP BUYING 10Gbps stuff. 25Gbps costs very little more, and 10Gbps switches that can't do 25Gbps are likely 4-year-old ASICs at this point.
                                4. Mind your NIC driver/firmware. The vSphere Health team has even started writing online health checks to KBs on a few. Disable the weird PCI-E power saving if using Intel 5xx series NICs. It will cause flapping.
                                5. LACP: if you use it, use the vDS and do an advanced hash (SRC-DST) to get proper bang/buck. Don't use crappy IP HASH only. No shame in active/passive; it's simpler to troubleshoot and the failure behavior is cleaner.
                                6. TURN ON CDP/LLDP in both directions!
                                7. The only Arista issue I've seen (it was another redditor complaining about vSAN performance a while back whom we helped) was someone who mismatched his LAG policies/groups/hashes.

                                 

                                Interfaces: I like TwinAx because, unlike 10GBASE-T, you don't have to worry about interference or termination, they are reasonably priced, and as long as you don't need a long run, the passive ones don't cause a lot of compatibility issues.

                                https://www.reddit.com/r/vmware/comments/aumhvj/vsan_switches/

                                • 28. Re: Write latency and network errors
                                  TolgaAsik Enthusiast
                                  vExpert

                                  Thank you for answer.

                                  Now we are working on the case with Cisco support. My issue is huge ingress packet discarding by the switch. To summarize, they recommend we apply the following steps:

                                  - HOLB Mitigation: Enable VOQ Limit

                                  - HOLB Mitigation: Traffic Classification

                                  https://www.cisco.com/c/en/us/support/docs/switches/nexus-6000-series-switches/200401-Nexus-5600-6000-Understanding-and-Troub.html

                                   

                                  After applying the steps, I will inform you.

                                  • 29. Re: Write latency and network errors
                                    TolgaAsik Enthusiast
                                    vExpert

                                    It still continues; we applied QoS using an ACL, but we couldn't resolve the issue.