Hello everybody,
the old thread seems to be sooooo looooong - therefore I decided (after a discussion with our moderator oreeh - thanks Oliver -) to start a new thread here.
Oliver will make a few links between the old and the new one and then he will close the old thread.
Thanks for joining in.
Reg
Christian
Hi,
the "Real Life 60% Rand - 65% Read" looks to good to me - have here 2 MD3220 / SAS with 24 10k disks and saw ca. 3500 iops on R5/11 disks (maybe I try to configure a R1 set with 8/10 disks and test also)
Should the 15k disks make such a difference? Have you repeated your tests?
Reg
Christian
P.S. And yes, it's true - each controller has 2GB cache included (and a flash card with 8GB).
@ Christian,
These whitebox results I recently posted were obtained using the perftest.iso file available from http://vmktree.org/iometer/. While I have not performed a diff, their script looks to be the same as yours. So, for the Real Life result of iops=12870, I believe the size would have been 8192KB.
I am quite surprised to see these kinds of numbers myself. But I have run this test many times with the same results. All I can say is Openfiler 2.99 File I/O with WB, using dual Realtek 8168B NICs and ESXi 5.0 MPIO, performs very well. Nearly 2 gigs of memory buffer sure helps. FWIW... I could only get 150MB/s reads using Intel NICs, and the IOPS were quite a bit lower too.
I have tried several NAS products (NexentaStor, Open-E, FreeNAS), and while they all have their merits, their iSCSI File I/O performance cannot match Openfiler. They don't even come close. Attached are results from the best run I've achieved:
Whitebox NAS and whitebox ESXi 5.0.
Hi Christian
I don't think the 15k rpm disks make much difference except on the latency side, as I get nearly the same results on the 10k rpm ones. I have also repeated the tests and am getting the same results. The tests were done on 2 different servers with nearly the same specs, both with 4x NICs reserved for iSCSI, and the results are the same.
Also, the tests are being done while another 10 VMs are running, 5 on each host. The last test was done on a running Exchange 2010 server, but at this time of day there is no load at all as the company doesn't operate on weekends.
Oh, and BTW, my IOPS setting is set to 1. Would it be better to set it to 3 for the MD, or shouldn't 1 cause any problems in the long run?
Regards
David
@makruger
Thanks for the feedback. If your test file is only ca. 8192KB - that means ca. 8MB - then all the I/Os come from cache (or RAM in your case).
That explains the high IOPS and the very low latency, I think.
Reg
Christian
@Alexxdavid
Thanks for sharing your experiences here.
>Oh, and BTW, my IOPS setting is set to 1
Do you mean a 1 minute run time here?
If so, the time is too short - you should test for at least 5 minutes or longer.
Check the size of the test file. It should be at least 4GB.
Well, the 15k disks should make a difference, but not such a big one. If your test file is too small and the run time is only 1 minute, then most of the I/Os come from the controller cache, and therefore you can't see any difference between 10k and 15k disks, I think.
Reg
Christian
Hi Christian
Seems I got my wording wrong.
IOPS=1 is a Round Robin setting in VMware that sends each I/O down a different path, instead of the default of switching paths only after every 1000 I/Os.
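For reference, on ESXi 5 this is typically set per LUN with esxcli, along these lines (the naa device ID is just a placeholder for your own LUN):

    # set the path selection policy of the LUN to Round Robin
    esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
    # switch paths after every single I/O instead of the default 1000
    esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxxxxxxxxxx --type iops --iops 1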
The test was run for 5 min and the test file is 4GB.
David
OK, that's clear to me now. You are using RR with the setting of 1 I/O per path (changed from the default of 1000).
I only wonder about the differences between 10k and 15k disks. I'll try to test that as well.
Reg
Christian
Did you try? What are your results?
At least with EFDs (enterprise flash drives), changing the IOPS setting to 5-10 instead of 1 gives a bit more IOPS.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TABLE OF RESULTS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SERVER TYPE: VM on ESXi 5, Win 2003, 512MB RAM, test file on C:\
CPU TYPE / NUMBER: vCPU / 1
HOST TYPE: Dell PE R715, 64GB RAM; 2x AMD Opteron 6220 (8C), 3 GHz
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Dell MD3220 SAS direct / 22+2 SAS 10k / R10 - 10 disks
LD segment size 512KB, dynamic cache read prefetch deactivated
SAN TYPE / HBAs : SAS direct attached / Dell 6Gb SAS HBA
TEST NAME: Win 2003 on ESXi 5, RAID 10, MD3220/SAS
|*TEST NAME*|*Av. Resp. Time ms*|*Av. IOs/sec*|*Av. MB/sec*|
|*Max Throughput-100%Read*|0.6|36874|1150|
|*RealLife-60%Rand-65%Read*|14.4 / 15|3226 / 3191|25|
|*Max Throughput-50%Read*|4.4|13448|420|
|*Random-8k-70%Read*|13.5|3120|24|
(RealLife row: results for 4GB / 8GB test file sizes)
EXCEPTIONS: vCPU util. 98-47-44-54 %
So I decided to go back to R5 (the difference is too small to justify R10, IMHO); the test results will follow.
Hi Inges,
did the number of VMs increase in the last few weeks?
Your storage might be OK, but having just one big LUN is usually not a good idea. 1 LUN means just 1 SCSI stream, which can quickly become a bottleneck. Are you seeing any disk wait times? The best way to check them is with ESXTOP / RESXTOP:
http://www.yellow-bricks.com/esxtop/
This table lists the relevant columns and a brief description of their values:
|*Column*|*Description*|
|CMDS/s|The number of IOPS (I/O operations per second) being sent to or coming from the device or virtual machine being monitored|
|DAVG/cmd|The average response time in milliseconds per command being sent to the device|
|KAVG/cmd|The amount of time the command spends in the VMkernel|
|GAVG/cmd|The response time as perceived by the guest operating system; calculated as GAVG = DAVG + KAVG|
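If you haven't used it before, here is a rough sketch of how to get at these counters (the hostname is just a placeholder):

    # connect to the host (or run esxtop directly in the ESXi shell)
    resxtop --server esx01.example.com
    # inside esxtop/resxtop press:
    #   d = disk adapter view (per HBA)
    #   u = disk device view (per LUN - watch DAVG/cmd and KAVG/cmd)
    #   v = disk VM view (per virtual machine - watch GAVG/cmd)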
Also, ESX 4.1 is in some respects not as good as 5 when it comes to SCSI handling. If you copy a lot of files to the system, so that new blocks have to be allocated, you will always have a short SCSI reservation from one host while the rest have to wait.
So try to get the latency statistics and the queuing information for further investigation.
Regards,
David
Oh, and can you give some more details about the 7110?
FC or iSCSI? 1, 4 or 8 Gb?
Deduplication on?
Snapshots used?
And don't forget to check whether you might have a LUN misalignment issue...
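For Windows guests, one quick way to check this from inside the guest (the offsets in the comments are just the common default values):

    # run inside the Windows guest
    wmic partition get Name, StartingOffset
    # StartingOffset 1048576 (1MB) = aligned (Windows 2008 default)
    # StartingOffset 32256 (31.5KB) = misaligned (old Windows 2003/XP default)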
A CALL FOR HELP
Hi _VR_,
I had to deal with an EqualLogic model with 4x 1Gb per controller a while ago. When I did that setup, it was important to get the network setup right and to use Round Robin in ESX.
So can you confirm the following (a rough esxcli sketch for checking these follows the list):
- Separate vSwitches for every NIC used for iSCSI?
- Jumbo frames enabled everywhere?
- Round Robin used for path selection?
- Tests run on separate volumes? (-> iSCSI reservations etc...)
- TCP and IP offload engines disabled on the NICs?
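A rough sketch of how these points can be checked or set from the ESXi 5 shell (the vSwitch, vmk and vmhba names are placeholders for your setup):

    # jumbo frames on the vSwitch and the VMkernel ports
    esxcli network vswitch standard set --vswitch-name vSwitch1 --mtu 9000
    esxcli network ip interface set --interface-name vmk1 --mtu 9000
    # bind each iSCSI vmk port to the software iSCSI adapter
    esxcli iscsi networkportal add --nic vmk1 --adapter vmhba33
    esxcli iscsi networkportal add --nic vmk2 --adapter vmhba33
    # verify the path selection policy per device
    esxcli storage nmp device list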
As far as I know, the controllers are active/passive. You said there is a limitation of 100MB/s in the controller head. With 1Gbit/s you end up with about 80-100 MB/s usable. So I'm not really seeing the problem here?
Did you see http://www.equallogic.com/WorkArea/DownloadAsset.aspx?id=8453 ?
Regards,
David
Thanks for the reply
- Separate vSwitches for every NIC used for iSCSI?
Yes, I have two vSwitches, each with 1 iSCSI NIC. I also ran a test from a physical host (non-ESX) with 4 iSCSI NICs. Same results.
- Jumbo frames enabled everywhere?
I tried turning jumbo frames on. The max throughput test runs 5% faster, while the RealLife test runs 5% slower.
- Round Robin used for path selection?
Same results with Round Robin and Least Queue Depth.
- Tests run on separate volumes? (-> iSCSI reservations etc...)
Separate volumes and separate hosts, concurrently.
- TCP and IP offload engines disabled on the NICs?
Disabling / enabling offload made no difference.
As far as I know, the controllers are active/passive. You said there is a limitation of 100MB/s in the controller head. With 1Gbit/s you end up with about 80-100 MB/s usable. So I'm not really seeing the problem here?
The PS4100X has 2 active NICs per controller, so the expected throughput is about 200MB/s (2x 1Gbit/s = 2000Mbit/s, /8 = 250MB/s theoretical). I see each NIC pushing 100MB/s one at a time, but when both are active, the throughput per NIC drops to 50MB/s.
First... you might want to open a separate thread on this for continued troubleshooting; we try to keep this thread dedicated to storage performance posts.
What type of switches are you using with your EQL setup?
What type of NICs on the ESX hosts? Broadcom or Intel?
Thanks,
Jonathan
My apologies, moving thread to http://communities.vmware.com/thread/393583
My results on Broadcom 5709 NICs without jumbo frames - NOT software iSCSI, only the dependent hardware iSCSI HBA.
SERVER TYPE: VM Windows 2008 R2 64bit
CPU TYPE / NUMBER: Intel Xeon X5680 / 6 cores
HOST TYPE: ESXi 5.0 patch 3, Dell R710 and Supermicro + 4x Broadcom 5709 NICs with offloading, without jumbo frames
STORAGE TYPE / DISK NUMBER / RAID LEVEL: HP P2000 G3 iSCSI 4x 10Gb/s / 6x 600GB SAS2 15k / RAID10
|*TEST NAME*|*Avg Resp. Time ms*|*Avg IOs/sec*|*Avg MB/sec*|*% cpu load*|
|*Max Throughput-100%Read*|6.57|8844|276|0%|
|*RealLife-60%Rand-65%Read*|8.41|4296|33|1%|
|*Max Throughput-50%Read*|7.20|8125|253|0%|
|*Random-8k-70%Read*|8.04|4353|34|1%|