Hosts: ESX 3.5u4; multiple brands of 4-socket, 24-core servers, 128GB of RAM, 2x 10G adapters (Intel, Broadcom, and ServerEngines).
Network: Cisco Nexus 5010
Test client: RHEL 5u2, 32-bit
Test methodology: over an NFS mount, cat /mount/nfsfile > /dev/null. The file is 4GB of text and is 100% cached.
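Roughly, each client does a timed sequential read along these lines (the NFS server, export, and file name are placeholders, and the mount/timing wrapper is just illustrative):

mount -t nfs filer1:/export /mount      # placeholder NFS server and export
time cat /mount/nfsfile > /dev/null     # ~4GB read; throughput is roughly 4GB divided by the elapsed time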
We are unable to drive the network to more than 1.6Gb/s.
If we use 10 clients, the 1.6Gb/s is split evenly among them.
In later tests we used clients on multiple vSwitches, going to multiple NICs and multiple 10G storage controllers, and still only got 1.6Gb/s.
Has anyone actually pushed a 10G adapter? If so, how?
Hello,
We were able to push up to 4.5Gb/s on our HP Flex-10 gear. We will test further once more evaluation gear is onsite. Our test involved NFS-mounted VMs and IOMeter, with 3 VMs running a 100% random workload.
What is CPU0 on your ESX host running at when you run your tests? You could be hitting an IRQ-sharing issue.
Are you hitting bottlenecks on your storage, ESX host, or network equipment?
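For example, you could capture per-CPU and interrupt load during a test run with esxtop in batch mode and review it afterward (the interval, sample count, and output file here are just examples):

esxtop -b -d 5 -n 12 > /tmp/esxtop-10g-test.csv    # 12 samples at 5-second intervals, run from the service console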
Thank you for the response.
We are checking on CPU0. We did not notice anything out of the ordinary, but we were not specifically watching CPU0.
Addressing the bottleneck question: we were able to add a second storage array and point half of our clients at it. There was no change in the amount of data we could push; the total across all clients was still 1.6Gb/s. Based on that, I would assume we are not storage bound. When I was asked to assist with this test, that was one of the first things I checked.
Our network switch (we isolated one switch for this test) is a Cisco Nexus 5010. I don't have the switch's exact upper limits in front of me, but I know it can handle much more than a couple of ESX servers. There is no routing in my environment; all packets are local.
I need to try IOMeter to see how it reacts.
After following article http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100427...
we were able to get 8.6Gb/s on a Fujitsu RX600 S4 with 24 cores, 128GB RAM, and a single Intel 82598EB connected to a NetApp FAS3170 with a 16GB PAM card. Can't wait for the next-gen PAM card; it will blow this one away.
This article seems to apply only to Intel adapters. We still need to find a solution for our IBM x3850 M2s with Broadcom 57710s.
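For reference, if memory serves, the change from the KB boiled down to enabling NetQueue in the VMkernel and setting the queue count on the Intel ixgbe module, roughly like this (option names vary by driver version, so treat this as a sketch and follow the KB for the exact values):

esxcfg-advcfg -k TRUE netNetqueueEnabled    # enable NetQueue in the VMkernel
esxcfg-module -s "VMDQ=16,16" ixgbe         # one value per 82598EB port; reboot the host afterward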
Hey,
What network traffic did you have moving at 8.6Gb/sec? NetQueue would have little impact on just NFS. Were you combining traffic, or just generating network traffic from the virtual machines?
This was a simple file copy over an nfs mount.
cat /mount/filename > /dev/null
All reads, 0 writes, multiple copies.
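"Multiple copies" here just means several of the same cat reads running concurrently; a backgrounded loop is one way to do it (file names are placeholders):

for f in /mount/nfsfile1 /mount/nfsfile2 /mount/nfsfile3; do
    cat "$f" > /dev/null &    # each read runs concurrently
done
wait                          # wait for all reads to finish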
We also reached 5Gb/s using a ServerEngines CNA on an IBM HS22.
One interesting issue we are still trying to work out on the Intel cards: the queues are spread across only the first 16 cores (on a 24-core server). The first 16 guests get 3x the performance of any guest after the first 16. Even after the first 16 guests finish copying their files, the later guests still run at the same reduced rate. We shut down the first 16 guests and the later guests still ran at the reduced rate; only after rebooting those guests did they come back up to full speed.
There are definitely still some issues. I may want to start testing vSphere!
johnhaas,
Do you know if an article exists that shows steps on how to get better performance out of ESX 4.0 with the same Intel NICs?
Thanks!
Have you checked your server to verify that it can handle this load? Is the PCI slot PCIe 2.0?
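One quick sanity check from the service console (or the RHEL client) is to compare the PCIe link capability against what was actually negotiated for the 10G adapter; the device address below is just an example:

lspci -vv -s 0e:00.0 | grep -E "LnkCap|LnkSta"    # check link speed and width (e.g. x8)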
JP
I would take a look at the best practices white paper that I wrote regarding 10G and VMware vSphere 4.
Simplifying Networking using 10G Ethernet
Brian Johnson
Intel Corp -- LAN Access Division
PME - 10G and Virtualization Technologies
Not sure if this helps you, but it looks like NetQueue is only set to 1 queue by default (it should be set to the same number as the number of cores). There is a maximum of 16, though, so it looks like your machine is using the first 16 cores for NetQueue; it will never be able to use all 24.
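A quick way to check what the driver is actually configured for is to query the module options and then bump the queue count (this assumes the Intel ixgbe driver; the Broadcom 57710 uses the bnx2x module with its own option names):

esxcfg-module -g ixgbe                  # show the current option string
esxcfg-module -s "VMDQ=16,16" ixgbe     # request 16 queues per port, then reboot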