MrPowerEdge
Contributor

Dell Mellanox ConnectX-5 not achieving 25GbE with vSphere 7.0


The customer has 5 vSAN Ready Nodes, all Dell PowerEdge R7525 with AMD EPYC processors.

Setup:

2x S5248F-ON switches with firmware 10.5.1.4.

2x Mellanox ConnectX-5 dual-port NICs - firmware 16.27.61.06

ESXi 7.0 - Mellanox driver - Mellanox-nmlx5_4.19.70.1-1OEM.700.1.0.15525992_16253686-package

With iperf between hosts we can only get about 15-16 Gbps. I wasn't expecting full line rate, but at least 20 Gbps+.

Host1:

esxcli network firewall set --enabled false
/usr/lib/vmware/vsan/bin/iperf3.copy -s -B 10.0.0.1

Host2:

esxcli network firewall set --enabled false
/usr/lib/vmware/vsan/bin/iperf3.copy -c 10.0.0.1 -i 1 -t 10 -w 16M

Tuning we have tried; none of it makes any difference:

esxcfg-advcfg -s 4 /Net/TcpipRxDispatchQueues
esxcfg-advcfg -s 65535 /Net/VmxnetLROMaxLength
esxcli system settings kernel set -s netMaxPktsToProcess -v 128
esxcli system settings kernel set -s intrBalancingEnabled -v false
esxcli network nic ring current set -r 4096 -n vmnic2
esxcli network nic coalesce set -a false -n vmnic2
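For anyone retracing these steps, the applied values can be read back on the ESXi host to confirm they actually stuck (a sketch; vmnic2 is assumed, as in the commands above):

```shell
# Read back the tuned values on the ESXi host (vmnic2 as above)
esxcfg-advcfg -g /Net/TcpipRxDispatchQueues
esxcfg-advcfg -g /Net/VmxnetLROMaxLength
esxcli system settings kernel list -o netMaxPktsToProcess
esxcli system settings kernel list -o intrBalancingEnabled
esxcli network nic ring current get -n vmnic2
esxcli network nic coalesce get -n vmnic2
```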

Any thoughts, please? Dell support won't help.


Accepted Solutions
MrPowerEdge
Contributor

It turns out you need to start multiple iperf processes from separate SSH sessions to max out the 25GbE pipe.

esxcli network firewall set --enabled false

SSH session 1 - host 1 # /usr/lib/vmware/vsan/bin/iperf3.copy -s -B 10.10.10.10 -p 5201

SSH session 2 - host 1 # /usr/lib/vmware/vsan/bin/iperf3.copy -s -B 10.10.10.10 -p 5202

SSH session 3 - host 1 # /usr/lib/vmware/vsan/bin/iperf3.copy -s -B 10.10.10.10 -p 5203

SSH session 1 - host 2 # /usr/lib/vmware/vsan/bin/iperf3.copy -P 4 -t 60 -c 10.10.10.10 -p 5201

SSH session 2 - host 2 # /usr/lib/vmware/vsan/bin/iperf3.copy -P 4 -t 60 -c 10.10.10.10 -p 5202

SSH session 3 - host 2 # /usr/lib/vmware/vsan/bin/iperf3.copy -P 4 -t 60 -c 10.10.10.10 -p 5203

esxcli network firewall set --enabled true
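The three client sessions can also be launched from a single SSH session on host 2 by backgrounding them (a sketch; IP 10.10.10.10 and one listening server per port 5201-5203 assumed, as in the commands above):

```shell
# Sketch: run all three client sessions in parallel from one SSH session on host 2.
# Assumes an iperf3 server is already listening on each port (5201-5203) on host 1.
for port in 5201 5202 5203; do
  /usr/lib/vmware/vsan/bin/iperf3.copy -c 10.10.10.10 -p "$port" -P 4 -t 60 &
done
wait   # block until all three clients finish, then read each session's summary
```

Aggregate throughput is then the sum of the three session summaries.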


ZibiM
Enthusiast

I'd guess you've already seen this:

How to Tune vMotion for Lower Migration Times? - VMware vSphere Blog

15 Gbps seems to be consistent with single-stream performance.

As vSAN uses only a single vmkernel interface, it might be a challenge to squeeze out more.

Can you try LACP/LAG? I mean a proper LAG, not IP-hash load balancing.

In my experience this helps achieve better utilization of the pNICs.

MrPowerEdge
Contributor

Hi, thanks for taking the time to respond. The CPU hits 90% during the iperf test, which leads me to think the offload capabilities of the ConnectX-5 aren't working.
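One way to check that suspicion is to ask ESXi for the per-NIC offload status directly (a sketch, run on the host; vmnic2 is assumed):

```shell
# Check per-NIC offload status on the ESXi host
esxcli network nic tso get          # TCP segmentation offload, per vmnic
esxcli network nic cso get          # checksum offload, per vmnic
esxcli network nic get -n vmnic2    # driver, firmware and capability details
```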

ZibiM
Enthusiast

Thanks for the follow-up.

Back to your original scenario, I'd say that makes the stripe width setting a bit more important.

With 25Gb NICs, it looks like we should spread objects across as many nodes as possible to maximize throughput.
