A CALL FOR HELP
Jonathan,
What type of switches are you using with your EQL setup?
In production I have Cisco 6509s. For troubleshooting purposes I moved the EQL and the test hosts to a single blade. I also ran tests on an unmanaged Netgear switch and with a direct crossover connection (no switch).
What type of NICs on the ESX hosts? Broadcom or Intel?
The onboard NICs are Intel-based (4-port) and the additional NIC is Broadcom-based (4-port).
Hmm... no idea at the moment. It sounds like an issue on the EqualLogic side to me, but I don't know them very well. Sorry.
I see that you are using 5.1 firmware.
How much free space do you have on the member? On this firmware (see the firmware release notes) you need at least 100 GB of free space!
This issue has everyone at EQL stumped. Replacing the controller did not resolve the issue. Engineering is looking into it.
It's a brand new device with about 11 TB free. Firmware was upgraded to v5.2.1.
I meant: how much free space do you have?
With the new firmware you must leave at least 100 GB of free space.
11 TB free space. They're shipping me a replacement unit.
That is really strange indeed. What kind of switches are you using?
I have Cisco 6509s in production. Also tested with an unmanaged Netgear and a direct connection (using crossover cables).
EqualLogic also sent me a new Dell 6224 for testing.
OK, then the issue definitely isn't in the switches.
Did you get better results with the replacement unit? I just went through pre-production testing and I am currently migrating our infrastructure over to a PS4100X and a PS4100XV. I didn't see any kind of issues like this during any of my testing.
Jake
Have you managed to get any further with your diagnostics? Any chance you can look at my config below and comment on your setup / results?
Current Setup:
- 5 * Dell R710 servers
- 1 * dual-port 10Gb Broadcom 57711 NIC in each server
- 2 * EqualLogic PS6510E (firmware 5.2.2) configured in a single group with all disks running RAID 10.
- ESXi 5.0 Update 1 - 623860
- each NIC is bound to an iSCSI port group as per forum and Dell suggestions (roughly as sketched after this list)
- using the Dell MEM driver for advanced multipathing and load balancing
- 45TB total, 1 x 2TB LUN created and hosting the 'IO ANALYZER' machine only.
- no thin disks currently being used.
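For reference, the port binding was done roughly like this (a sketch only; vmhba35, vmk1 and vmk2 are the names in my environment, yours will differ):
# bind each iSCSI vmkernel port to the software iSCSI adapter, then verify
esxcli iscsi networkportal add --adapter=vmhba35 --nic=vmk1
esxcli iscsi networkportal add --adapter=vmhba35 --nic=vmk2
esxcli iscsi networkportal list --adapter=vmhba35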
Have been using the VMware Labs I/O Analyzer to compare results, the test being 'Max_Throughput.icf'.
Based on everyone's experience, which options are currently preferred for best performance:
- hardware or software iSCSI?
- Jumbo frames on or off? (my current MTU settings are sketched after this list)
- Delayed Ack enabled or disabled?
- Storage IO Control enabled or disabled?
- what range of IOPS and MBps would you expect to see?
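(For what it's worth, when I toggle jumbo frames for these tests I set the MTU on both the vSwitch and the iSCSI vmkernel ports, roughly as below; vSwitch1 and vmk1 are just the names from my setup, and the physical switch ports obviously need jumbo frames enabled end to end as well. Delayed Ack I toggle in the iSCSI adapter's advanced settings in the vSphere Client.)
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network ip interface set --interface-name=vmk1 --mtu=9000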
My test results have been very inconsistent and I'm not 100% sure how to interpret them. Currently I am seeing:
- software iSCSI on, jumbo frames on, delayed Ack on = 952 IOPS / 472 MBReads/S
- software iSCSI on, jumbo frames off, delayed Ack on = 1169 IOPS / 579 MBReads/S
- software iSCSI on, jumbo frames on, delayed Ack off = 1505 IOPS / 747 MBReads/S
- software iSCSI on, jumbo frames off, delayed Ack off = 181 IOPS / 90 MBReads/S
Is my third test the best result I could expect?
How does that compare with the rest of the community?
Are there any other settings I should be using / changing to improve my results?
My worry is that these results are based on only 1 test machine being run. When I run all 5 test machines (1 on each host), the overall performance drops VERY low.
Thanks in advance for any help / replies
Warren Estermann
Having the same issue
I have two groups installed in two different locations; each group has 3 * PS6000 array members with firmware 5.2.1.
I would appreciate it if you could update me with any suggested solution.
thanks,
Ramzy
Update:
The L3 tech I was working with ordered a brand new PS4100X and had it shipped to his lab. In his lab environment he saw full 200MB/s throughput.
I saw the same poor performance when I configured the replacement unit in my environment.
Long story short, I got a replacement unit but the issue was not resolved.
w_estermann,
Have you tried running all four tests?
http://www.mez.co.uk/OpenPerformanceTest.icf
RealLife and Random will give you a better idea of how your system will perform in "Real Life"
What block size are you using for your tests? It doesn't look like the 32K block used in OpenPerformance.
1505 IOPS / 747 MBReads/S
If you are in fact getting 747MB/s, that's not bad at all.
Try searching for similar results in the Open unofficial storage performance thread
That's too bad!!!
I will open a case next week and will keep you posted if I get anywhere.
Many thanks for the update
Ramzy201110141,
What issue are you seeing? Can you post your OpenPerformanceTest results? Did these issues develop after a firmware upgrade?
Hi,
Does anybody have a solution or an update? Same problem with ESXi 5 U1 and Cisco.
I encountered a different problem when upgrading from ESX 4.1 to ESXi 5 U1.
RealLife, Max Throughput-50%, and Random tests all showed expected results. Max Throughput-100%, on the other hand, had horrible results. After a lot of troubleshooting I identified the Intel quad NIC as the culprit.
The following fix resolved the issue:
esxcfg-module -s "InterruptThrottleRate=0,0,0,0" igb
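A note in case it helps anyone: as far as I know the option only takes effect after a host reboot (or a reload of the module), and you can confirm it was applied with:
esxcfg-module -g igb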