VMware Cloud Community
sscheller
Contributor

IBM DS3300 very poor write performance

Hi,

we own an IBM DS3300 with two x3650 ESX 4 servers.

Both ESX servers are connected to the tower over 2 NICs (2 vSwitches, with 1 port group and 1 NIC each per server, two IP subnets for iSCSI traffic, multipathing), redundantly connected to two HP 1800 switches.

The IBM DS3300 has dual controllers, connected redundantly to both HP switches.

Tower: 8 x 450GB 15k SAS HDDs, RAID 6

MTU 9000 (jumbo frames) is enabled on both ESX and the tower.

Write performance is the problem.

Write cache is enabled on both controllers.

These performance tests were run from a Windows XP VM.

TEST NAME                      Av. Resp. Time (ms)   Av. IOPS    Av. MB/s

Max Throughput-100%Read                 6.1           9822.97     306.07

RealLife-60%Rand-65%Read                162            368.08       2.88

Max Throughput-50%Read                  39            1377.95      43.06

Random-8k-70%Read                       45            1187.80       9.28
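As a quick sanity check, the MB/s column should equal IOPS times block size. The block sizes below (32 KB for the Max Throughput tests, 8 KB for the RealLife/Random tests) are my assumption based on the common VMware-community Iometer profile; they are not stated in the post itself:

```python
# Sanity-check: throughput (MB/s) should equal IOPS * block size.
# Block sizes are assumed from the common VMware-community Iometer
# profile (32 KB for sequential tests, 8 KB for random tests) --
# an assumption, not something stated in the thread.

def mbps(iops: float, block_kb: int) -> float:
    """Convert an IOPS figure to MB/s for a given block size."""
    return iops * block_kb * 1024 / (1024 * 1024)

tests = [
    ("Max Throughput-100%Read",  9822.97, 32, 306.07),
    ("RealLife-60%Rand-65%Read",  368.08,  8,   2.88),
    ("Max Throughput-50%Read",   1377.95, 32,  43.06),
    ("Random-8k-70%Read",        1187.80,  8,   9.28),
]

for name, iops, block_kb, reported in tests:
    calc = mbps(iops, block_kb)
    print(f"{name}: calculated {calc:.2f} MB/s, reported {reported} MB/s")
```

All four rows agree to within about 1%, which supports the 32 KB/8 KB assumption and shows that it is the random small-block tests, not sequential throughput, where the array struggles.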

Tests I made:

  • direct connection from ESX to the tower: same performance

  • flow control enabled: even worse performance

  • jumbo frames off: worse performance too

I am at a loss.

Maybe you can help me get more performance out of our iSCSI tower.

Thanks

Bye

13 Replies
Wikkie
Contributor

Hi there,

It seems we share the same problem. :-)

My config is like:

1x DS3300 dual controller

2x HP 1800 switches

4x IBM X3550 servers

Same issue after changing to jumbo frames with a size of 9000.

At this moment I have no idea what is causing this, but maybe we can help each other.

For now I suggest lowering the frame size to about 4500. If that doesn't work, lower it further to 1500; then it should work!

The next step is to raise it in increments (1500, 2000, 2500, etc.) until you hit the problem.

At least, that's what I will try.

thx.

sscheller
Contributor

Hi,

we found out from our tests that write performance is not the issue; it's the read performance.

Write performance is really good:

up to 80 MB/s in the VM guest.

We also have to consider the performance of local disks if you use a laptop or desktop PC for Iometer tests.

But the bottleneck is read performance.

Don't know why.

What does your ESX config look like?

Maybe I can help you.

Greetings

JoshMM
Contributor

I'm having very similar problems, with a surprisingly similar configuration!

We recently implemented the following:

2 x IBM x3650s with the following specs:

- 2 x 5160 Xeon processors

- 32GB of RAM

- 2 x quad-port Intel gigabit NICs

1 x IBM DS3300 Dual Controller w/ 1GB cache upgrade

2 x (4 x 300GB 15k SAS drives) in RAID 5

2 x (2 x 300GB 15k SAS drives) in RAID 1

HP Procurve 2900-24G

I have flow control and jumbo frames turned on.

I have noticed fairly high I/O waits and am not getting fantastic read performance.

Previously we were running ESX 3.5 with QLogic iSCSI HBAs, but we found they weren't officially supported, so we pulled the cards when we upgraded to vSphere. Performance is definitely better, but I am not happy.

I will complete some I/O tests and post back to this thread, but does anyone have any ideas?

What is the best way to set up paths within iSCSI? What is the best way to manage the controller assignment of the LUNs on the SAN?

Looking for any help I can get!

Cheers

Wikkie
Contributor

Last week I did several tests, and it seems that in my config the HP ProCurve switches are a bottleneck. I am using a 1800-24G.

See also this thread: http://communities.vmware.com/thread/186569

Tests without the switches prove that I can get more performance: about 25% to 30% more.

I have added PCIe Intel/1000 dual-port network cards, which perform somewhat better than the onboard Broadcoms, but only by a few percent.

A note on the dual controllers:

Each virtual host uses only one active path, never more, so on a 1 Gb network you will never get higher than approx. 64 MB/s random read/write.

In one test I got up to 51 MB/s total under an average read/write load (33% write / 67% read, without the switches).

Another virtual host can or will then use the other controller. This also applies across several LUNs.

With the switches I got about 30 MB/s (33% write / 67% read).

All of the above was with jumbo frames enabled. I will run a test with a standard MTU size of 1500 for reference.
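A rough sketch of why a single active 1 Gb path tops out where it does. The per-frame overhead figures below are textbook approximations (Ethernet framing plus TCP/IPv4 headers); iSCSI PDU headers are ignored, so treat the numbers as upper bounds, not measurements:

```python
# Rough theoretical ceiling for iSCSI over a single 1 Gbit/s path.
# Overhead figures are approximations: 38 bytes of Ethernet framing
# (preamble + header + FCS + inter-frame gap) and 40 bytes of TCP/IPv4
# headers per frame; iSCSI PDU headers are ignored for simplicity.

LINE_RATE_MBPS = 1000  # 1 Gbit/s

def usable_mb_per_s(mtu: int) -> float:
    eth_overhead = 38     # preamble, header, FCS, inter-frame gap
    tcpip_overhead = 40   # TCP + IPv4 headers per frame
    payload = mtu - tcpip_overhead
    efficiency = payload / (mtu + eth_overhead)
    return LINE_RATE_MBPS / 8 * efficiency  # decimal MB/s

for mtu in (1500, 9000):
    print(f"MTU {mtu}: ~{usable_mb_per_s(mtu):.0f} MB/s theoretical payload")
```

So even perfectly tuned, one path carries roughly 119 to 124 MB/s of sequential payload at best; random small-block I/O falls far below that because the disks, not the wire, become the limit, which is consistent with the ~51 MB/s figure above.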

Wikkie
Contributor

Connect the DS3300 directly to the server without the switches and test again. Any difference?

Have you played with jumbo frames?

Check the cache settings in the DS3300; they need to look like this:

Read cache: Enabled

Write cache: Enabled

Write cache without batteries: Disabled

Write cache with mirroring: Enabled

Flush write cache after (in seconds): 10.00

Dynamic cache read prefetch: Disabled

You can find them in your logical drive profile.

If yours differ, I will explain further.

I assume you tested from your RAID-1 and not the RAID-5 volume!

Still, this smells like a switch problem. Everyone on this thread uses HP ProCurves.

Anybody else with a DS3300 in combination with HP ProCurves: what type of switch, and what firmware are you running on the DS3300?

vhavandjian
Contributor

We just set up our VMware implementation and I am currently doing testing. Our setup includes:

IBM DS3300 dual controllers with 12 x 300GB SAS in RAID 1

Juniper EX3200 Switch

3 x IBM X3550 servers

In my first tests, installing Win2k8 VMs, I am seeing 51 MB/s throughput, or 408 Mbit/s.

Jumbo frames are enabled and set to 9000.

So far it doesn't seem that quick; am I expecting too much?
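For reference, the conversion and how close that figure is to gigabit line rate (simple arithmetic, nothing array-specific):

```python
# 51 MB/s expressed in Mbit/s, and as a fraction of a 1 Gbit/s link.
mb_per_s = 51
mbit_per_s = mb_per_s * 8          # 408 Mbit/s
utilisation = mbit_per_s / 1000    # fraction of 1 Gbit/s line rate

print(f"{mb_per_s} MB/s = {mbit_per_s} Mbit/s "
      f"(~{utilisation:.0%} of a 1 Gbit/s link)")
```

Roughly 40% of line rate for a guest-install workload is modest but not broken; whether it is "too much to expect" depends largely on whether that VM's LUN had a single active path, as discussed earlier in the thread.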

marktbreaux
Contributor

I have a question about the DS3300 and vSphere 4.0. I can't get it to create a VMFS partition on the LUNs I have created. I have contacted VMware support and they tell me vSphere 4.0 is not on the HCL. I could have sworn it was when I bought the DS3300, but with no proof I am SOL. Can any of you give me some tips about connecting it? E.g. how many RAID arrays? How many LUNs per array? I had set up 6 LUNs in one big RAID 10 array. I have 12 x 500GB SATA drives and dual controllers.

I have attached an error-message JPEG. After getting this pop-up, I checked the logs and saw path redundancy errors for each LUN on each connection on each controller.

thank you in advance for your help!

JoshMM
Contributor

> Last week I did several tests, and it seems that in my config the HP ProCurve switches are a bottleneck. I am using a 1800-24G.

> See also this thread: http://communities.vmware.com/thread/186569

Like I said, I am using a 2900-24G which has both jumbo frames and flow control enabled.

> Tests without the switches prove that I can get more performance: about 25% to 30% more.

> I have added PCIe Intel/1000 dual-port network cards, which perform somewhat better than the onboard Broadcoms, but only by a few percent.

I am using PCIe quad-port Intel 1000 cards and I have definitely noticed they perform better than the onboard Broadcoms, too.

> A note on the dual controllers:

> Each virtual host uses only one active path, never more, so on a 1 Gb network you will never get higher than approx. 64 MB/s random read/write.

> In one test I got up to 51 MB/s total under an average read/write load (33% write / 67% read, without the switches).

> Another virtual host can or will then use the other controller. This also applies across several LUNs.

How do you best balance the use of the controllers, though? Once the host sees the first available path, it utilises that. I find most of my traffic going over Controller A.

> With the switches I got about 30 MB/s (33% write / 67% read).

> All of the above was with jumbo frames enabled. I will run a test with a standard MTU size of 1500 for reference.

Unfortunately I don't have any other switches I can try. We are a full ProCurve house. I wonder if it's worth contacting HP directly?

> Connect the DS3300 directly to the server without the switches and test again. Any difference?

> Have you played with jumbo frames?

I have jumbos enabled. I have two hosts, so I can't simply "take out the switch" easily. I may try this after hours and see if I get a gain.

> Check the cache settings in the DS3300; they need to look like this:

> Read cache: Enabled

> Write cache: Enabled

> Write cache without batteries: Disabled

> Write cache with mirroring: Enabled

> Flush write cache after (in seconds): 10.00

> Dynamic cache read prefetch: Disabled

The only thing I appear to have different is that my "Dynamic cache read prefetch" is enabled. Is this an issue?

> I assume you tested from your RAID-1 and not the RAID-5 volume!

Right you are!

> Anybody with a DS3300 in combination with HP ProCurves: what type of switch, and what firmware are you running on the DS3300?

DS3300: 07.35.41.00

2900-24G: T.12.51

I have noticed I am fairly behind on the switch firmware (the latest is T.13.63 as per http://www.hp.com/rnd/software/j90491363.htm), so I'm going to check that out now!

Thanks for the comments and suggestions!

JoshMM
Contributor

> I have a question about the DS3300 and vSphere 4.0. I can't get it to create a VMFS partition on the LUNs I have created. I have contacted VMware support and they tell me vSphere 4.0 is not on the HCL. I could have sworn it was when I bought the DS3300, but with no proof I am SOL.

The DS3300 was on the HCL for 3.5u4, so I don't know how it's not on there for vSphere. I just checked and it's definitely not there. I might raise a call with VMware on this one.

3.5u4 HCL: http://www.vmware.com/resources/compatibility/search.php?action=search&deviceCategory=san&productId=1&advancedORbasic=advanced&maxDisplayRows=50&key=&release[]=21&datePosted=-1&partnerId[]=43&arrayTypeId[]=1&rorre=0

4.0 HCL: http://www.vmware.com/resources/compatibility/search.php?action=search&deviceCategory=san&productId=1&advancedORbasic=advanced&maxDisplayRows=50&key=&release[]=13&datePosted=-1&partnerId[]=43&arrayTypeId[]=1&rorre=0

> Can any of you give me some tips about connecting it? E.g. how many RAID arrays? How many LUNs per array? I had set up 6 LUNs in one big RAID 10 array. I have 12 x 500GB SATA drives and dual controllers.

Personally I have 4 LUNs across 4 RAID arrays: 2 x RAID 1 and 2 x RAID 5, but with 300GB SAS drives.

> I have attached an error-message JPEG. After getting this pop-up, I checked the logs and saw path redundancy errors for each LUN on each connection on each controller.

Sorry, I never saw this, so I can't help!

vhavandjian
Contributor

I have not had any issues with vSphere 4.0 and our DS3300.

What does it say in the VMkernel logs?

malaysiavm
Expert

I suggest you try to get a switch with iSCSI optimization. The jumbo frame feature needs to be turned on at the ESX server, the storage, and the switch level. Besides that, you may want to run another test with RAID 10 and RAID 5 to compare the results against the RAID 6 configuration you are using. You might also consider LAG or vmkernel port binding for software iSCSI, which can help improve iSCSI performance too.
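The RAID-level suggestion matters more for writes than it may look, because each random write costs a different number of back-end disk I/Os per RAID level. A back-of-the-envelope sketch; the ~175 IOPS per 15k drive figure is a rule-of-thumb assumption, not a measured value:

```python
# Approximate random-write IOPS for 8 x 15k SAS drives under different
# RAID levels. Write penalty = back-end disk I/Os per host write:
# RAID 10 -> 2, RAID 5 -> 4, RAID 6 -> 6.
# ~175 IOPS per 15k drive is a rule-of-thumb assumption.

DRIVES = 8
IOPS_PER_DRIVE = 175
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

raw_iops = DRIVES * IOPS_PER_DRIVE  # total back-end IOPS available

for level, penalty in WRITE_PENALTY.items():
    host_write_iops = raw_iops / penalty
    print(f"{level}: ~{host_write_iops:.0f} random-write IOPS")
```

By this estimate, moving the same 8 spindles from RAID 6 to RAID 10 roughly triples the random-write ceiling, which is worth measuring before blaming the network.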

Craig

vExpert 2009 & 2010, NetApp NCIE, NCDA 8.0.1

Malaysia VMware Communities - http://www.malaysiavm.com
marktbreaux
Contributor

Here's the deal: I updated my server last night to ESX 4.0 from ESXi 4.0, and I can now create a VMFS partition on the IBM DS3300. If anyone has issues with their 4.0 installation and the IBM DS3300, the first thing I would ask is whether they have ESXi or ESX installed. Thanks for the help, fellas!

joshuatownsend
Enthusiast

Nothing like being a couple of years late, but see this: http://vmtoday.com/2009/06/ibm-ds3300-iscsi-write-performance-solved/

There's a good chance the array is running in simplex mode, even for those of you with dual controllers.

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful". Please visit http://vmtoday.com for News, Views and Virtualization How-To's Follow me on Twitter - @joshuatownsend