<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic HyperFlex + VMware: a lot of retransmissions in vSphere™ vNetwork Discussions</title>
    <link>https://communities.vmware.com/t5/vSphere-vNetwork-Discussions/Hyperflex-VMWare-a-lot-of-restransmissions/m-p/2968381#M14681</link>
    <description>&lt;P&gt;Hi everyone!&lt;/P&gt;&lt;P&gt;Perhaps someone has faced the same issue. I found that certain types of packets cause a huge number of retransmissions in our VMware environment.&lt;/P&gt;&lt;P&gt;1. SMB: we have a tunnel between offices. In one office we have a Cisco HyperFlex cluster with the latest firmware running VMware 6.7. In the other office there is a dedicated server with the latest firmware running free ESXi 8. If I copy a file from a physical device such as a laptop, from the office where the HyperFlex is located, through the tunnel to the remote office, it takes 3-7 seconds with no hangs or speed drops. All captures were taken on the core switch in the office with the HyperFlex (HPX).&lt;/P&gt;&lt;P&gt;laptop (Windows) -&amp;gt; WiFi AP -&amp;gt; access switch -&amp;gt; core switch -&amp;gt; router -&amp;gt; tunnel -&amp;gt; router -&amp;gt; distribution switch -&amp;gt; server -&amp;gt; ESXi 8 -&amp;gt; VM (Windows).&lt;/P&gt;&lt;P&gt;There are no retransmissions, just normal copying.&lt;/P&gt;&lt;P&gt;However, if I do the same thing from a VM in HyperFlex to a VM in ESXi:&lt;/P&gt;&lt;P&gt;VM (Windows) -&amp;gt; HyperFlex server -&amp;gt; UCS (2 UCSs with LACP each) -&amp;gt; core switch -&amp;gt; router -&amp;gt; tunnel -&amp;gt; router -&amp;gt; distribution switch -&amp;gt; server -&amp;gt; ESXi -&amp;gt; VM (Windows).&lt;/P&gt;&lt;P&gt;I get so many retransmissions that the same file can take 30-40 minutes to copy. We have MTU 9000 on all network devices (except the tunnels, of course). We also use TrustSec (+SGT overhead) and IPv4 only.&lt;/P&gt;&lt;P&gt;I tried using VMXNET3 and E1000 adapters, updating VMware Tools in Windows, changing physical connections (leaving only one UCS connected), changing the MTU on the vSwitches, and disabling TSO/LSO in the VM - nothing helped. I also tried setting&amp;nbsp;&lt;SPAN&gt;EnableBandwidthThrottling to 0 on the client VM, which helped a little - fewer retransmissions, but still a lot.
I also checked for errors on the physical ports on the UCS and the core switch - they look clean.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;From the laptop:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;42406 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45218536 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]

42407 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45219836 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]

42408 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45221136 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;From VM:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;  20056 2023-05-04 17:35:30,786043    55.365382      10.3.100.5            10.0.100.55           TCP      66     [TCP Dup ACK 20055#1] 445 → 53430 [ACK] Seq=1715 Ack=4152732 Win=8196 Len=0 SLE=4239832 SRE=4245032

  20057 2023-05-04 17:35:30,786094    55.365433      10.0.100.55           10.3.100.5            TCP      1354   [TCP Retransmission] 53430 → 445 [ACK] Seq=4152732 Ack=1715 Win=6244 Len=1300

  20058 2023-05-04 17:35:30,786114    55.365453      10.0.100.55           10.3.100.5            TCP      1354   [TCP Retransmission] 53430 → 445 [ACK] Seq=4154032 Ack=1715 Win=6244 Len=1300&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I also tested the connection with iperf, using TCP and UDP packets with different MTU values from an HPX VM. If I set the MTU to 1200 in the application, I get maximum speed and no problems. We do not block ICMP, so it shouldn't be a black-hole issue with MTU adjustment.&lt;/P&gt;&lt;P&gt;Copying from a laptop to HPX inside the office, or from VM to VM inside the HPX cluster, works without problems.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. The second issue is a lot of retransmissions in the HyperFlex cluster, both between Cisco VMs (such as ISE, FMC, WLC) and between VMs and network devices.&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;a. The WLC VM (WiFi controller) and the ISE VM are placed on different servers inside one HPX cluster. They connect to each other through the core switch:&lt;/P&gt;&lt;P&gt;VM -&amp;gt; HPX server -&amp;gt; UCS -&amp;gt; Core -&amp;gt; UCS -&amp;gt; HPX server -&amp;gt; VM&lt;/P&gt;&lt;P&gt;I captured traffic on the core and found a huge number of duplicate RADIUS (Access-Request) messages. I suspect this is why our users have to wait some time for dot1x authorization. These messages are no bigger than 1200 bytes. SSH from the WLC to ISE was clean traffic - no retransmissions or duplicates. So I suspect only certain types of packets (probably only UDP traffic) are affected.&amp;nbsp;&lt;/P&gt;&lt;P&gt;b. The same happens when ISE sends CoA/RADIUS/TCP (REST) packets to network devices, for example for SGT propagation: a lot of duplicates and retransmissions. This causes deployment issues when devices start dropping incoming packets of that type and stop responding to ISE.&lt;/P&gt;&lt;P&gt;c. The third issue involves Cisco Firepower Management Center, which uses TCP for device connections.
Again, the same situation: a lot of retransmissions and duplicates. However, SSH from ISE to the Firepower devices shows no issues.&lt;/P&gt;&lt;P&gt;There are no errors on the interfaces. I tried lowering the MTU on the vSwitches and VM interfaces, and changing VMXNET3 to E1000 and vice versa - no result.&lt;/P&gt;&lt;P&gt;I would be grateful for any suggestions and ideas.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Sun, 14 May 2023 10:26:13 GMT</pubDate>
    <dc:creator>SergeyRusak</dc:creator>
    <dc:date>2023-05-14T10:26:13Z</dc:date>
    <item>
      <title>HyperFlex + VMware: a lot of retransmissions</title>
      <link>https://communities.vmware.com/t5/vSphere-vNetwork-Discussions/Hyperflex-VMWare-a-lot-of-restransmissions/m-p/2968381#M14681</link>
      <description>&lt;P&gt;Hi everyone!&lt;/P&gt;&lt;P&gt;Perhaps someone has faced the same issue. I found that certain types of packets cause a huge number of retransmissions in our VMware environment.&lt;/P&gt;&lt;P&gt;1. SMB: we have a tunnel between offices. In one office we have a Cisco HyperFlex cluster with the latest firmware running VMware 6.7. In the other office there is a dedicated server with the latest firmware running free ESXi 8. If I copy a file from a physical device such as a laptop, from the office where the HyperFlex is located, through the tunnel to the remote office, it takes 3-7 seconds with no hangs or speed drops. All captures were taken on the core switch in the office with the HyperFlex (HPX).&lt;/P&gt;&lt;P&gt;laptop (Windows) -&amp;gt; WiFi AP -&amp;gt; access switch -&amp;gt; core switch -&amp;gt; router -&amp;gt; tunnel -&amp;gt; router -&amp;gt; distribution switch -&amp;gt; server -&amp;gt; ESXi 8 -&amp;gt; VM (Windows).&lt;/P&gt;&lt;P&gt;There are no retransmissions, just normal copying.&lt;/P&gt;&lt;P&gt;However, if I do the same thing from a VM in HyperFlex to a VM in ESXi:&lt;/P&gt;&lt;P&gt;VM (Windows) -&amp;gt; HyperFlex server -&amp;gt; UCS (2 UCSs with LACP each) -&amp;gt; core switch -&amp;gt; router -&amp;gt; tunnel -&amp;gt; router -&amp;gt; distribution switch -&amp;gt; server -&amp;gt; ESXi -&amp;gt; VM (Windows).&lt;/P&gt;&lt;P&gt;I get so many retransmissions that the same file can take 30-40 minutes to copy. We have MTU 9000 on all network devices (except the tunnels, of course). We also use TrustSec (+SGT overhead) and IPv4 only.&lt;/P&gt;&lt;P&gt;I tried using VMXNET3 and E1000 adapters, updating VMware Tools in Windows, changing physical connections (leaving only one UCS connected), changing the MTU on the vSwitches, and disabling TSO/LSO in the VM - nothing helped. I also tried setting&amp;nbsp;&lt;SPAN&gt;EnableBandwidthThrottling to 0 on the client VM, which helped a little - fewer retransmissions, but still a lot.
I also checked for errors on the physical ports on the UCS and the core switch - they look clean.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;From the laptop:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;42406 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45218536 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]

42407 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45219836 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]

42408 2023-05-04 16:35:13,813605 58.160514 10.0.242.122 10.3.100.5 TCP 1354 51590 → 445 [ACK] Seq=45221136 Ack=7033 Win=511 Len=1300 [TCP segment of a reassembled PDU]&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;From VM:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;  20056 2023-05-04 17:35:30,786043    55.365382      10.3.100.5            10.0.100.55           TCP      66     [TCP Dup ACK 20055#1] 445 → 53430 [ACK] Seq=1715 Ack=4152732 Win=8196 Len=0 SLE=4239832 SRE=4245032

  20057 2023-05-04 17:35:30,786094    55.365433      10.0.100.55           10.3.100.5            TCP      1354   [TCP Retransmission] 53430 → 445 [ACK] Seq=4152732 Ack=1715 Win=6244 Len=1300

  20058 2023-05-04 17:35:30,786114    55.365453      10.0.100.55           10.3.100.5            TCP      1354   [TCP Retransmission] 53430 → 445 [ACK] Seq=4154032 Ack=1715 Win=6244 Len=1300&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I also tested the connection with iperf, using TCP and UDP packets with different MTU values from an HPX VM. If I set the MTU to 1200 in the application, I get maximum speed and no problems. We do not block ICMP, so it shouldn't be a black-hole issue with MTU adjustment.&lt;/P&gt;&lt;P&gt;Copying from a laptop to HPX inside the office, or from VM to VM inside the HPX cluster, works without problems.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. The second issue is a lot of retransmissions in the HyperFlex cluster, both between Cisco VMs (such as ISE, FMC, WLC) and between VMs and network devices.&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;a. The WLC VM (WiFi controller) and the ISE VM are placed on different servers inside one HPX cluster. They connect to each other through the core switch:&lt;/P&gt;&lt;P&gt;VM -&amp;gt; HPX server -&amp;gt; UCS -&amp;gt; Core -&amp;gt; UCS -&amp;gt; HPX server -&amp;gt; VM&lt;/P&gt;&lt;P&gt;I captured traffic on the core and found a huge number of duplicate RADIUS (Access-Request) messages. I suspect this is why our users have to wait some time for dot1x authorization. These messages are no bigger than 1200 bytes. SSH from the WLC to ISE was clean traffic - no retransmissions or duplicates. So I suspect only certain types of packets (probably only UDP traffic) are affected.&amp;nbsp;&lt;/P&gt;&lt;P&gt;b. The same happens when ISE sends CoA/RADIUS/TCP (REST) packets to network devices, for example for SGT propagation: a lot of duplicates and retransmissions. This causes deployment issues when devices start dropping incoming packets of that type and stop responding to ISE.&lt;/P&gt;&lt;P&gt;c. The third issue involves Cisco Firepower Management Center, which uses TCP for device connections.
Again, the same situation: a lot of retransmissions and duplicates. However, SSH from ISE to the Firepower devices shows no issues.&lt;/P&gt;&lt;P&gt;There are no errors on the interfaces. I tried lowering the MTU on the vSwitches and VM interfaces, and changing VMXNET3 to E1000 and vice versa - no result.&lt;/P&gt;&lt;P&gt;I would be grateful for any suggestions and ideas.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 10:26:13 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/vSphere-vNetwork-Discussions/Hyperflex-VMWare-a-lot-of-restransmissions/m-p/2968381#M14681</guid>
      <dc:creator>SergeyRusak</dc:creator>
      <dc:date>2023-05-14T10:26:13Z</dc:date>
    </item>
  </channel>
</rss>

