cac3a
Contributor

Dell R710 with Intel 82599 10 Gigabit can't get to 10Gb

I've been trying to get a lab setup to see how close I can get to 10Gb speeds between two guests running on separate hosts.

My setup is this:

2 x Dell R710

2 x Intel 82599 10Gb cards

Cisco Nexus 7k with two top-of-rack switches

All SFPs are branded and compatible with the equipment.

Each host is running one CentOS 6.6 guest.

I've noticed a couple of things depending on the order of the NICs. Here is a simple test set:

       

Host .5               Host .28              iperf3 -w 65536 -P 10     iperf3 -w 65536 -P 10
1st NIC    2nd NIC    1st NIC    2nd NIC    testing from .5 to .28    testing from .28 to .5
vmnic4     vmnic5     vmnic4     vmnic5     9.18 Gb/s                 9.21 Gb/s
vmnic4     vmnic5     vmnic5     vmnic4     2.07 Gb/s                 8.8 Gb/s
vmnic5     vmnic4     vmnic4     vmnic5     8.92 Gb/s                 1.89 Gb/s
vmnic5     vmnic4     vmnic5     vmnic4     8.96 Gb/s                 9.14 Gb/s

Depending on the vmnic order I get either ~9 Gb/s or ~2 Gb/s. Why is this happening?

Also, when I run the test without the -P 10 option, essentially using only one connection, my speed is ~2.5 Gb/s. Shouldn't it be close to 10 as well? If I increase the TCP window to about 500 KB I get 4.5 Gb/s, but never over that.
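For reference, the tests are run roughly like this between the two guests (the client target below is just a placeholder for the IP of the guest on host .28):

    # on the guest on host .28 (server side)
    iperf3 -s

    # on the guest on host .5 (client side; 192.168.1.28 is a placeholder address)
    iperf3 -c 192.168.1.28 -w 65536 -P 10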

Thanks

MKguy
Virtuoso

Can you give some more details about your setup? For example: how exactly the servers and switches are physically cabled together and configured, the flow control settings, the ESXi version, how the vSwitches are configured, which load balancing policy you are using, which vNIC type the VMs are using, and so on.
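Something along these lines run on each host should cover most of that (the vSwitch and vmnic names below are only examples):

    # ESXi version and build
    vmware -vl

    # physical NICs, driver, link speed and vendor description
    esxcli network nic list
    esxcli network nic get -n vmnic4

    # vSwitch layout, uplinks and MTU
    esxcli network vswitch standard list

    # teaming/load balancing policy of a given vSwitch
    esxcli network vswitch standard policy failover get -v vSwitch1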

Also "Intel 82599" is just the name of the processing chip, but not the actual NIC name branded by some vendor.

"Also, when I run the test without the -P 10 option, essentially using only one connection, my speed is ~2.5 Gb/s. Shouldn't it be close to 10 as well? If I increase the TCP window to about 500 KB I get 4.5 Gb/s, but never over that."

These values seem fine for a single TCP stream. You can only really examine the raw 10GbE throughput with either UDP or multiple TCP streams. A window of 500 KB corresponds to roughly 350 non-jumbo frames (1460 bytes of TCP payload each) and is drained in under a millisecond at 4.5 Gbit/s, so the sender spends much of its time waiting for acknowledgements. Processing of the received traffic and the delay of the ACKs are limiting factors as well.
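As a rough back-of-the-envelope, assuming the receive window is the only limiting factor (throughput ≈ window / RTT), your numbers imply an RTT of roughly a millisecond, and at that RTT a single 10 Gbit/s stream would need a window of around 1.1 MB:

    \[ \text{RTT} \approx \frac{W}{T} = \frac{500\,\text{KB} \times 8}{4.5\,\text{Gbit/s}} \approx 0.9\,\text{ms} \]
    \[ W_{10\,\text{Gbit/s}} \approx 10\,\text{Gbit/s} \times 0.9\,\text{ms} \approx 1.1\,\text{MB} \]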

I don't think getting more out of a single TCP connection is really feasible without some OS TCP stack tweaks, a low-latency CPU profile for the VM, SR-IOV/pNIC passthrough or similar.
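For the TCP stack part, on the CentOS guests that would roughly mean raising the socket buffer limits so the window can actually grow that large, e.g. (the values below are only examples, not a recommendation):

    # /etc/sysctl.conf -- example values only
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

    # apply without a reboot
    sysctl -p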

You can find some hints and more details in these whitepapers:

https://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf

http://www.vmware.com/files/pdf/techpaper/latency-sensitive-perf-vsphere55.pdf

http://www.vmware.com/files/pdf/techpaper/VMware-PerfBest-Practices-vSphere6-0.pdf

-- http://alpacapowered.wordpress.com
cac3a
Contributor

Physical connectivity is set up as follows:

2 x Nexus 7k

2 x Nexus 2k (each 2k has a total of four 10Gb uplinks, two going to each 7k)

Host .5 is connected so that eth0 is on port 15 of 2k switch A and eth1 is on port 15 of 2k switch B; for host .28, eth0 is on port 16 of switch A and eth1 is on port 16 of switch B.

The cards I have are Intel/Dell X520-DA2 dual-port 10GbE network adapters (Dell P/N F3VKG / 0F3VKG, Intel E10G42BTDA).

I just loaded the latest ESXi patch, the latest drivers and the latest VMware Tools.

I'm running ESXi 6.

I haven't done anything to the Cisco appliances yet; everything is at its default setting at this time (other than the VLANs and port speed). I'm going to check on flow control, but it would be at the default setting, whatever that is.

My vSwitches are as follows:

Both hosts use "Route based on originating virtual port", with the MTU set to 1500.

Guests have vmxnet3 NICs on both sides.
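To double-check flow control and the teaming policy I'm planning to run something like this on both hosts (assuming the pauseParams namespace is available on this build; the vSwitch name is only an example):

    # flow control (pause frame) state per vmnic
    esxcli network nic pauseParams list

    # teaming / load balancing policy of the vSwitch carrying the 10Gb uplinks
    esxcli network vswitch standard policy failover get -v vSwitch1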

I've run the same setup with different cards and saw the same behavior.

Do you have any idea why the speed drops like that when the card order is switched?

zbester
Contributor

Hi,

I am planning to set up the exact same hardware: same servers, same NICs and same switches.

Are you still experiencing problems?

Thanks

Zaid Bester
