VMware Cloud Community
dctaylorit
Contributor

SAN Performance

This is my first time working with a SAN, and also my first time working with an ESX server without DAS, and so far it has been fairly smooth. I've been able to get the SAN up and running, and the ESX box now recognizes it and has access to the LUNs.

The question I have so far is whether the SAN's performance is going to be acceptable. I'm going to be running Win2k3 machines on here and will have some higher-I/O apps running down the road (1 SQL and 1 Exchange box). For the most part, the rest of the machines will not have very high I/O, mostly just higher processor/RAM requirements, or will be test/dev machines.

I've been using a Win2k3 R2 SP2 test box and running IOMeter to try to get some hard numbers for my performance testing, and so far this is what I've come up with.

When running a 32k block size with 100% Read / 0% Random, I get the following numbers (regardless of whether I use the TOE card or VMware's iSCSI software initiator).

Total I/O per Sec 1178.03

Total MB per Sec 36.81

Average I/O Response Time (ms) .8477

When running the IOMeter "All in one" test, I'm getting these numbers:

Total I/O per Sec 939.23

Total MB per Sec 12.04

Average I/O Response Time (ms) 1.0635

Those numbers just don't seem as good as I would expect. Unfortunately I don't really have anyone I can ask for help, so if anyone has any thoughts I would greatly appreciate it.
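As a quick arithmetic sanity check of the figures above, throughput should equal IOPS times the transfer size, and with a single outstanding I/O the IOPS figure is roughly the inverse of the response time. A minimal sketch of that check (it assumes one worker with queue depth 1, and the ~13 KB average transfer size for the "All in one" run is back-calculated from the numbers above, not an IOMeter setting):

```python
# Quick arithmetic check of the IOMeter figures above.
# Assumption: one worker with 1 outstanding I/O, so IOPS ~= 1000 / response time (ms).
# MB is taken as 1024*1024 bytes, which is what matches the reported numbers.

KB = 1024
MB = 1024 * KB

# 32k block size / 100% read / 0% random run
iops = 1178.03
resp_ms = 0.8477
print(iops * 32 * KB / MB)       # ~36.8 MB/s  -> matches the 36.81 reported
print(1000 / resp_ms)            # ~1180 IOPS  -> matches 1178, i.e. queue depth of 1

# "All in one" run: back-calculate the average transfer size
print(12.04 * MB / 939.23 / KB)  # ~13.4 KB average I/O, i.e. a mix of small blocks
```

So the three numbers in each run are internally consistent; the open question is only whether they are high enough for the workload.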

My environment is as follows:

- LeftHand Networks DL320s SAN (3.6 TB raw - 15k SAS drives) - both NICs are in Bridged mode

- Cisco 3750 24-port Gigabit switch (12 ports on a dedicated VLAN for iSCSI traffic only)

- 2x HP DL360 G5 servers (only using 1 right now for testing; the other is in production)

- Each DL360 has a QLogic QLA4050C TOE card (latest BIOS - v1.09)

- All certified Cat6 cable connecting everything

If there is any other piece of information that I missed, please let me know; I would greatly appreciate it. I'm just not sure that my performance is where it needs to be, and I don't want all the time and effort we have put into this to end up delivering sub-par performance.

Thanks for your time,

Joe

9 Replies
esiebert7625
Immortal

You might want to read through this huge storage performance thread...

http://www.vmware.com/community/thread.jspa?messageID=584154&#584154

dctaylorit
Contributor

esiebert,

Thanks for the post. I've looked through that thread before, and it was actually my first sign that I might be in trouble. I'm not seeing a lot of help in that topic on how to resolve any speed issues you might have, just people posting what their transfer rates are.

One other thing I have a question about: I've seen a lot of mention of people using jumbo frames, but I sent an email to VMware asking about it and they said that ESX 3.0.1 does not support jumbo frames.

- With this being said, I know I can configure the frame size within the BIOS of my QLogic card, and the rest of my environment does support the larger frames.

I was just curious whether any of you are having trouble with this, and if you are, what VMware says when you call in with a support question.

christianZ
Champion

Well, your results do seem poor, I think.

Maybe you can run the tests from the suggested thread; then we can compare your results to the results from, e.g., SANmelody.

In addition, jumbo frames are supported with the QLA4050, but not every switch can run jumbo frames and flow control simultaneously (the 3750 can't).

The flow control configuration seems to be the more significant one, I think.

The LeftHand storage isn't supported with the QLA4050 for now (from VMware's side) - I don't know what LeftHand says about that.

I've read here about sluggish performance of LeftHand storage, but haven't tested it myself.

How many disk spindles are you using?

Have you tried testing it with the software iSCSI initiator?

Have you tried running the tests on a physical server?

pauliew1978
Enthusiast

Hi there,

Yes, the performance does seem a bit poor to me. You have good hard drives in the LeftHand unit (I am using 7200rpm SATA drives on a SANmelody SAN and getting 3100 IOPS on 100 percent sequential reads, on a 4-disk RAID 10 array, so your figures should most definitely be better than mine). How much traffic is going through the Cisco switch? Is the switch being used just for iSCSI traffic, or are you sharing it with LAN users on other VLANs? I would absolutely make sure your connections have negotiated correctly at 1000 full duplex. I would also consider changing the block size of your RAID set (have a look at the RAID card guide, as manufacturers sometimes give advice on which block size is best for certain situations).

Remember that more disks = better performance, so if you are just using one disk at the moment then this might explain it.

Try christianZ's storage performance test and post the results here (or have you done that? I couldn't work out whether you used the 4GB file test he set up).

You could also connect your SAN directly to your ESX server and run the tests again to rule out any switch problems. If you are using a large VMFS volume, make sure you follow the guidelines: if it's 2TB, use an 8MB block size for the VMFS partition (you probably know all this!).

What RAID level are you using?

dctaylorit
Contributor

christian,

I tried running your config file instead of just trying to mimic it, and my results were much better (posted at the bottom). I'm not sure why just running those tests made that much of a difference. If these are the results I'm getting, I think this is very acceptable in comparison with the unofficial performance thread esiebert listed.

paulie,

My SAN box is 12x300GB 15k SAS drives in a RAID 5 array. My Volumes are all under 500GB.

All connections are manually set at 1000 full duplex. The switch does have all of our other services connected to it on a separate VLAN.

I have not tried running with jumbo frames yet, but I have tried with and without flow control, and my results were typically faster (.5 - 1 MB/sec) with very little impact on average response time (+.2-5 ms on average) without flow control.

Would these results look respectable to you, or do I still have an issue I need to resolve?

SERVER TYPE: ESX 3.0.1

CPU TYPE / NUMBER: VCPU / 1

HOST TYPE: HP DL360G5 - 6GB - 2x Xeon 5150 2.66 DC

STORAGE TYPE / DISK NUMBER / RAID LEVEL: LeftHand DL320s / 10+2 15k SAS / R5

##################################################################################

TEST NAME......................Av. Resp. Time (ms)....Av. IOs/sec....Av. MB/sec

##################################################################################

Max Throughput-100%Read........17.56..................3355...........104.9

RealLife-60%Rand-65%Read.......24.19..................2103...........16.43

Max Throughput-50%Read.........16.35..................3466.2.........108.32

Random-8k-70%Read..............34.75..................1582.83........12.37

EXCEPTIONS: CPU Util. 27-35-34-26%

##################################################################################
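For what it's worth, a very rough back-of-the-envelope check of the random-I/O figure against the spindle count, using common rules of thumb (the per-disk IOPS and RAID 5 write penalty below are assumptions, not measured values for these drives):

```python
# Very rough sanity check of the Random-8k-70%Read result against the hardware.
# Assumptions (rules of thumb only): ~180 random IOPS per 15k spindle, and RAID 5
# costing ~4 back-end I/Os per random write (read-modify-write).

spindles = 12
per_disk_iops = 180              # assumed rule-of-thumb figure
read_frac, write_frac = 0.7, 0.3
raid5_write_penalty = 4

backend_capacity = spindles * per_disk_iops     # ~2160 back-end IOPS
frontend_estimate = backend_capacity / (read_frac + write_frac * raid5_write_penalty)
print(round(frontend_estimate))                 # ~1137 front-end IOPS for a 70/30 random mix
# The measured 1582 IOPS sits above this naive estimate, which is plausible once
# controller cache and read-ahead are taken into account.
```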

Thanks again for all of your help. I greatly appreciate it.

-Joe

pauliew1978
Enthusiast

Hi again,

Yes, those performance figures are good. You shouldn't have any problems with them unless you are really straining your I/O. You should check perfmon on your current physical servers and compare those numbers to the figures you posted to see whether it will be fast enough. They look pretty darn good to me from here. After all, if you virtualize 4 servers and they are writing 12 MB/s constantly, your SAN is going to fill up pretty quickly! I would just total up the average I/O figures for your physical servers and see whether they come in under your total I/O for the real-life situation (test 2 of christianZ's?). Oh, and another thing: if you can do RAID 10 or something like that, I think it will give you better performance (though less disk space).
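A minimal sketch of that comparison, with made-up per-server numbers standing in for the real perfmon averages (the server names and figures below are purely hypothetical):

```python
# Sketch of the suggestion above: total up the average disk IOPS of the physical
# servers being virtualized and compare against the SAN's measured "RealLife" result.
# The per-server values are placeholders - use the perfmon "PhysicalDisk\Disk Transfers/sec"
# average from each real server instead.

physical_server_iops = {
    "sql01": 450,        # hypothetical perfmon average
    "exchange01": 600,   # hypothetical perfmon average
    "app01": 80,         # hypothetical
    "testdev01": 40,     # hypothetical
}

san_reallife_iops = 2103  # RealLife-60%Rand-65%Read result posted above

total = sum(physical_server_iops.values())
print(f"planned load: {total} IOPS, remaining headroom: {san_reallife_iops - total} IOPS")
```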

pauliew1978
Enthusiast

You can give me some points if you want to ;-)

christianZ
Champion

Yes, that looks better now - the results are much better and now comparable with the other results. With that as a baseline you can start to work.

When you are using an iSCSI HBA, the "Max Throughput-50%Read" could maybe be a little higher (with full duplex there is more than 150 MB/s to be reached) - so it seems that the NICs aren't working at full duplex - just my thought.
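A quick worked version of that point (the ~117 MB/s per-direction figure below is an assumed practical ceiling for gigabit Ethernet, not something measured here):

```python
# Why full duplex matters for the 50% read / 50% write test.
# Assumption: ~117 MB/s is a typical practical ceiling per direction on gigabit
# Ethernet after protocol overhead (raw line rate is ~125 MB/s per direction).

practical_per_direction = 117       # MB/s, assumed figure
measured_total = 108.32             # Max Throughput-50%Read result above

per_direction = measured_total / 2  # ~54 MB/s of reads and ~54 MB/s of writes
print(per_direction)
print(2 * practical_per_direction)  # ~234 MB/s combined is the full-duplex ceiling,
                                    # so 108 MB/s total leaves plenty of headroom
```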

As Paul mentioned, you can mark some answers as "Helpful" or "Correct" - that way the helpers get points. Thanks.

christianZ
Champion

In addition, since you have only one storage server for now, it would be better to configure your LeftHand volumes as "One Way" (the default is "Two Way").

This configuration can be changed online when you get a second storage server later.
