VMware Cloud Community
ngerasim
Contributor

CLARiiON CX4-960 and CX3-80 setup validation

We have a CLARiiON CX4-960 and a CX3-80, and I want to validate that they are set up correctly.

We are using MRU for pathing in VMware, and the HBA queue depth is set to 64. We are currently using HBAs with the qla2xxx and lpfc820 drivers. I also set Disk.SchedNumReqOutstanding to 64.
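As a rough illustration of how these two settings interact: on ESX the number of commands a host keeps in flight to a LUN is generally bounded by the HBA's per-LUN queue depth, and Disk.SchedNumReqOutstanding acts as an additional cap only when more than one VM is actively issuing I/O to that LUN. The sketch below models that rule. The function name and the simplification are illustrative assumptions, so treat it as a mental model rather than a statement of exact scheduler behavior.

```python
def effective_lun_queue_depth(hba_queue_depth: int,
                              sched_num_req_outstanding: int,
                              active_vms_on_lun: int) -> int:
    """Approximate the per-LUN limit on outstanding commands from one host.

    Simplified model (an assumption, not VMware's code): with a single
    active VM the HBA queue depth is the limit; with several active VMs
    the smaller of the two settings applies.
    """
    if hba_queue_depth < 1 or sched_num_req_outstanding < 1:
        raise ValueError("queue depths must be positive")
    if active_vms_on_lun <= 1:
        return hba_queue_depth
    return min(hba_queue_depth, sched_num_req_outstanding)


# Settings from the post: both values are 64.
print(effective_lun_queue_depth(64, 64, active_vms_on_lun=1))
# Hypothetical mismatch: the lower setting wins once several VMs are active.
print(effective_lun_queue_depth(64, 32, active_vms_on_lun=4))
```

With both values at 64, as described here, neither setting throttles the other in this model.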

Performance isn't very good. Things run, but not nearly as well as I would expect them to. ESXTOP shows low averages, and none of the VMs appear to be suffering from disk queuing when reviewed in PERFMON, but the performance of the VMs is poor compared to their physical counterparts running on dedicated storage.

I'm not sure where else to look at this point. The hosts are barely being used, usually no more than 20% utilization for CPU and memory. VMs are all configured with 1 or 2 vCPUs, and memory is not ballooning. If anything, we have ample resources available.

Thanks in advance for your help. I am stumped.

AndreTheGiant
Immortal

You can reset the queue depth on your HBA to its default and test again.

To run some tests, you can use Iometer in a Windows VM.

Run the test on a LUN inside a fast RAID group (RAID 10 on the fastest disks) with a single host and check the values.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Basheer09
Enthusiast

Hi,

Run Iometer in the VM to measure disk performance and post the results.

ngerasim
Contributor

What specific settings should I use in IOMeter?

Basheer09
Enthusiast

Nothing special. It's a tool used to test your disk I/O, so just run a dummy test that copies over a gigabyte of data and check the I/O performance to the disk.

ngerasim
Contributor

The physical system is on the left, the VM is on the right. See the screenshot attachment. I used 512-byte and 32-KB chunks of data, with read frequency assumptions of 50 percent and 25 percent, respectively. Queue depth was 32MB.

ngerasim
Contributor

Any input guys?

Basheer09
Enthusiast

Could you please confirm that both tests were run against the same SAN LUN from the physical machine and the VM?

Also, while running this test, did you get a chance to look at the "esxtop" disk performance output for DAVG and KAVG?

Please confirm that the path policy follows the recommendation from VMware or EMC.

Also, increase the queue depths on the storage and the HBA to 64 and 128 respectively, and test the performance again.
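For anyone reading the DAVG and KAVG counters, a small sketch of how they are usually interpreted may help. DAVG is the latency seen at the device (array and fabric), KAVG is the time spent in the VMkernel, and the latency the guest sees (GAVG) is approximately their sum. The function name and the two threshold defaults below are assumed rules of thumb, not vendor figures, so adjust them to whatever guidance you follow.

```python
def diagnose_latency(davg_ms: float, kavg_ms: float,
                     davg_limit_ms: float = 25.0,
                     kavg_limit_ms: float = 2.0) -> dict:
    """Split esxtop latency into device and kernel parts.

    GAVG (what the guest sees) is approximately DAVG + KAVG.
    The two limits are assumed rule-of-thumb defaults, not vendor figures.
    """
    findings = []
    if davg_ms > davg_limit_ms:
        findings.append("high device latency: look at the array, fabric, or RAID group")
    if kavg_ms > kavg_limit_ms:
        findings.append("high kernel latency: look at host queueing and queue depths")
    if not findings:
        findings.append("latency within the assumed limits")
    return {"gavg_ms": davg_ms + kavg_ms, "findings": findings}


# Hypothetical readings, not values from this thread.
print(diagnose_latency(davg_ms=30.0, kavg_ms=0.5))
```

A high DAVG with a low KAVG points away from the host and toward the array side, which is why the two counters are worth reading together.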

SurfControl
Enthusiast

How is your RAID group configured? You may also turn on the Analyzer in Navisphere to check the performance at the LUN level.

alienjoker
Enthusiast

Hi,

Having looked at the IOMeter outputs, it would appear the physical server has more Workers than the virtual instance.

Can you also confirm all the other parameters you have set (number of targets and number of outstanding I/Os per target)?

Thanks

Andrew

ThompsG
Virtuoso

Hello,

Just wondering if you could provide a few details about your array setup?

How are your LUNs configured:

- number of spindles

- RAID group configuration

- are you using Meta LUNs

- are the LUNs balanced across SPs (ownership)?

- do you have the ESX hosts seeing all four ports on each SP?

Sorry for all the questions.

Thanks and kind regards.

AndreTheGiant
Immortal

- number of spindles

IMHO, fewer than 8 for RAID 5 (this was a suggested practice with older FLARE releases).

- RAID group configuration

This is related to the previous answer.

- are you using Meta LUNs

I do not like them. I prefer to keep things simple, but I know that other people use them.

- are the LUNs balanced across SPs (ownership)?

By default yes, but I suggest checking it during LUN creation and forcing, for example, the LUNs of the same RAID group onto the same SP.

- do you have the ESX hosts seeing all four ports on each SP?

It depends on your SAN configuration.

For LUN design see also:

http://communities.vmware.com/docs/DOC-10990

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
depping
Leadership
Leadership

You are comparing physical vs. virtual. Are both the physical and the virtual machine talking to the same LUN? Are both equipped with the same amount of memory and CPU? Is the number of workers equal? There are many variables in a test like this that need to be equal in order to have a fair comparison.
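One low-effort way to check this is to write down the parameters of each run and compare them mechanically. The sketch below does that for two dictionaries. The keys and values are hypothetical examples, since the thread does not give the real settings.

```python
def differing_variables(physical: dict, virtual: dict) -> dict:
    """Return every test variable whose value differs between the two runs."""
    keys = sorted(set(physical) | set(virtual))
    return {k: (physical.get(k), virtual.get(k))
            for k in keys
            if physical.get(k) != virtual.get(k)}


# Hypothetical example values; the thread does not give the real ones.
physical_run = {"lun": "LUN_A", "workers": 4, "outstanding_ios": 32,
                "memory_gb": 8, "cpus": 2}
virtual_run = {"lun": "LUN_A", "workers": 1, "outstanding_ios": 32,
               "memory_gb": 4, "cpus": 2}

for name, (phys, virt) in differing_variables(physical_run, virtual_run).items():
    print(f"{name}: physical={phys} virtual={virt}")
```

Anything the function reports is a variable to equalize before drawing conclusions from the two Iometer results.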

Duncan (VCDX)

Available now on Amazon: vSphere 4.1 HA and DRS technical deepdive

ThompsG
Virtuoso

Hi Andre,

- number of spindles

IMHO, fewer than 8 for RAID 5 (this was a suggested practice with older FLARE releases).

Agree - we actually used 4+1 for our RG configuration and this actually saved us from losing an awful lot of data after having three simultaneous drive failures with two being in the same DAE but different RGs.

- are you using Meta LUNs

I do not like them. I prefer to keep things simple, but I know that other people use them.

Just wondering about this statement: how do you get any speed out of the disks without using Meta LUNs? I agree Meta LUNs are a pain to create and manage. We actually had a quilt showing which RGs made up the Meta, plus it allowed us to move the Meta head around the disks. In order to get the IOPS we required on a datastore, we had to resort to Metas or reduce the consolidation of virtual machines per datastore. We settled on a growth unit for the array that we thought the business could withstand and then created LUNs across all the disks excluding the vault disks. There was much discussion about whether to use the vault drives as well, but in the end we voted against it.
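The reasoning behind spreading a datastore over several RAID groups can be put into rough numbers. The sketch below uses the common back-of-the-envelope model in which every host write to RAID 5 costs four back-end I/Os (read data, read parity, write data, write parity). The 180 IOPS per spindle figure and the 50 percent read mix are illustrative assumptions, not measurements from this array, and array cache is ignored entirely.

```python
def host_iops_estimate(disks: int, iops_per_disk: float,
                       read_fraction: float, write_penalty: int) -> float:
    """Estimate host-visible IOPS for a RAID group, ignoring array cache.

    Each host write costs `write_penalty` back-end I/Os (4 for RAID 5,
    2 for RAID 10); each host read costs one.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    backend_iops = disks * iops_per_disk
    cost_per_host_io = read_fraction + (1.0 - read_fraction) * write_penalty
    return backend_iops / cost_per_host_io


# Assumed figure: 180 IOPS per spindle (illustrative, not measured here).
one_rg = host_iops_estimate(disks=5, iops_per_disk=180,
                            read_fraction=0.5, write_penalty=4)
four_rgs = host_iops_estimate(disks=20, iops_per_disk=180,
                              read_fraction=0.5, write_penalty=4)
print(round(one_rg), round(four_rgs))
```

Under these assumptions a single 4+1 group gives a datastore only a few hundred random IOPS, which is the gap a Meta LUN striped over several groups is meant to close.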

- are the LUNs balanced across SPs (ownership)?

By default yes, but I suggest checking it during LUN creation and forcing, for example, the LUNs of the same RAID group onto the same SP.

Again, agreed. However, I always check the load on the SPs with the Mark One eyeball to confirm all is good :)

Kind regards.


AndreTheGiant
Immortal

Just wondering about this statement: how do you get any speed out of the disks without using Meta LUNs? I agree Meta LUNs are a pain to create and manage. We actually had a quilt showing which RGs made up the Meta, plus it allowed us to move the Meta head around the disks. In order to get the IOPS we required on a datastore, we had to resort to Metas or reduce the consolidation of virtual machines per datastore. We settled on a growth unit for the array that we thought the business could withstand and then created LUNs across all the disks excluding the vault disks. There was much discussion about whether to use the vault drives as well, but in the end we voted against it.

Meta LUNs are one way, but I prefer to keep datastores "small".

For a medium environment I prefer this approach.

But in a large environment this creates too many LUNs, so Meta LUNs can be a choice (or you can have more clusters with different LUNs).

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
ngerasim
Contributor

400 spindles

RAID 5

Not using Meta LUNs

LUNs are balanced across the SPs

Not sure what you mean by LUNs seeing all four ports on each SP?

QLogic 4 Gb HBAs are being used. Queue depth is 64.

According to ESXTOP, the maximum throughput I have been able to push to a single LUN on a single VMDK, for a single VM, during testing was 376 MB/s.
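To put the 376 MB/s figure in context, it can be compared with what a single 4 Gb Fibre Channel link can carry. 4GFC signals at 4.25 Gbaud with 8b/10b encoding, which works out to roughly 425 MB/s of payload per direction before frame overhead and is usually quoted as about 400 MB/s. The sketch below does that arithmetic. It assumes the test ran over one path, which is what MRU would normally give a single LUN.

```python
def fc_payload_mb_per_s(line_rate_gbaud: float,
                        encoding_efficiency: float = 0.8) -> float:
    """Convert a Fibre Channel line rate to payload MB/s per direction.

    8b/10b encoding carries 8 data bits per 10 line bits (efficiency 0.8).
    Frame overhead is ignored, so real throughput is somewhat lower.
    """
    bits_per_s = line_rate_gbaud * 1e9 * encoding_efficiency
    return bits_per_s / 8 / 1e6


ceiling = fc_payload_mb_per_s(4.25)   # 4GFC signals at 4.25 Gbaud
measured = 376.0                       # MB/s reported from ESXTOP in the post
print(f"ceiling ~{ceiling:.0f} MB/s, measured is {measured / ceiling:.0%} of it")
```

If that assumption holds, large sequential transfers are already close to the limit of one link, so the poor results are more likely to be about small-block latency than about raw bandwidth.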

Glen wrote:

Hello,

Just wondering if you could provide a few details about your array setup?

How are your LUNs configured:

- number of spindles

- RAID group configuration

- are you using Meta LUNs

- are the LUNs balanced across SPs (ownership)?

- do you have the ESX hosts seeing all four ports on each SP?

Sorry for all the questions.

Thanks and kind regards.

ngerasim
Contributor

AVG I/O RESPONSE TIME = 126 ms

TOTAL I/O PER SECOND = 454

TRANSACTIONS PER SECOND = 454

MAX I/O RESPONSE TIME = 1906 ms

TOTAL MB/S = 3.54

MAX TRANSACTION TIME = 1906 ms
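Two quick sanity checks can be run on these figures. Dividing throughput by IOPS gives the average transfer size, and Little's Law (requests in flight = throughput x response time) gives the number of I/Os outstanding during the test. The sketch below assumes Iometer's "MB" means 2^20 bytes; the function names are illustrative only.

```python
def average_io_size_kib(throughput_mib_s: float, iops: float) -> float:
    """Average transfer size implied by throughput and IOPS."""
    return throughput_mib_s * 1024 / iops


def outstanding_ios(iops: float, avg_response_ms: float) -> float:
    """Little's Law: requests in flight = arrival rate x time in system."""
    return iops * avg_response_ms / 1000.0


# Figures from the Iometer summary in this post.
print(f"average I/O size: {average_io_size_kib(3.54, 454):.1f} KiB")
print(f"I/Os in flight:   {outstanding_ios(454, 126):.1f}")
```

This works out to roughly 8 KiB per I/O with about 57 I/Os in flight. With that many requests outstanding, a 126 ms average response time suggests most of it is time spent waiting in a queue rather than being serviced, so the same test at a lower outstanding I/O count would be a useful comparison.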
