VMware Cloud Community
TB_Rich
Enthusiast

HP MSA1500 Active/Active (MRU or Fixed) on ESX 3.5?

Hi,

I have an MSA1500 with 2 controllers, 1 FC switch and 8 hosts, all running ESX 3.5.

I have only 2 LUNs. On the MSA, controller 1 is set as preferred for LUN 1 and controller 2 is set as preferred for LUN 2.

Then in ESX, I have Fixed policies, with the preferred path to LUN 1 via target 0 (controller 1?) and the preferred path to LUN 2 via target 1 (controller 2?).

That makes sense to me as a way to balance LUN access across the 2 controllers. With MRU, could it potentially all go across 1 controller, or end up as an unequal split?

If someone could confirm this or suggest the best way, that would be great.

(Also, I'm going to upgrade the HBAs to dual (we have a second switch already). Am I right in saying controller 1 goes to switch 1, controller 2 goes to switch 2, and each host goes to each switch? Would I leave the policies at Fixed as above?)

Many Thanks

Rich.


Accepted Solutions
kastlr
Expert

Hi KjB,

it's the other way round, an A/A array should be used with fixed while an A/P array should be used with MRU.

The reason to do so is the following.

Load Balancing

Active/Active arrays

Uses fixed path policy -

Many customers implement a load balancing methodology from the ESX using the preferred path where LUNs are accessed by the ESX via different Storage Processors/Controllers.

Active/Passive arrays

Uses mru path policy -

Since there is no preferred path option available for MRU, customers have to manually load balance the LUN presentations from the array side, presenting LUNs via different Storage Processors/Controllers.

This is perfectly acceptable, but in the event of a failover, all the load balancing is lost.
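As a rough illustration of the load-balancing point, here is a minimal Python sketch of how Fixed and MRU differ once a failed path comes back. It is a toy model, not ESX code: the `Lun` class, the `simulate` function and the path names `ctrl1`/`ctrl2` are all hypothetical, and for MRU the `preferred` field only stands in for the starting path chosen by array-side balancing.

```python
from dataclasses import dataclass, field


@dataclass
class Lun:
    """Toy model of one LUN's path selection on a host (hypothetical)."""
    name: str
    policy: str        # "fixed" or "mru"
    preferred: str     # preferred path; for "mru" it is only the starting path
    paths: dict = field(default_factory=dict)  # path name -> is the path up?
    active: str = ""

    def __post_init__(self):
        # Every LUN starts on its preferred (or initial) path.
        self.active = self.preferred

    def path_down(self, path):
        """Mark a path as failed and fail over if it was the active one."""
        self.paths[path] = False
        if self.active == path:
            self.active = next(p for p, up in self.paths.items() if up)

    def path_up(self, path):
        """Mark a path as restored; only Fixed fails back to the preferred path."""
        self.paths[path] = True
        if self.policy == "fixed" and path == self.preferred:
            self.active = path


def simulate(policy):
    """Fail and then restore the ctrl1 path; return the active path per LUN."""
    luns = [
        Lun("LUN1", policy, "ctrl1", {"ctrl1": True, "ctrl2": True}),
        Lun("LUN2", policy, "ctrl2", {"ctrl1": True, "ctrl2": True}),
    ]
    for lun in luns:
        lun.path_down("ctrl1")
    for lun in luns:
        lun.path_up("ctrl1")
    return {lun.name: lun.active for lun in luns}


if __name__ == "__main__":
    print("fixed:", simulate("fixed"))
    print("mru:  ", simulate("mru"))
```

In this model, Fixed returns LUN1 to `ctrl1` once the path is restored, while MRU leaves both LUNs on `ctrl2`, which is the lost load balancing described above.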

Why not just set fixed path policy on an A/P array?

On A/P arrays, there are three SCSI return codes / check conditions which will initiate a failover:

NO CONNECT

NOT READY

ILLEGAL REQUEST

On A/A arrays, there is only one SCSI return code / check condition which will initiate a failover:

NO CONNECT

If we set the Fixed path policy on an A/P array, we may get a check condition which should initiate a failover, but it will not do so because the path policy is not set to MRU.
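The argument above can be sketched as a small lookup in Python. This is only a model of the reasoning in this post: the trigger lists are copied from the text, while the `POLICY_HANDLES_AS` mapping and both function names are hypothetical and simply encode the claim that Fixed reacts like the A/A case and MRU like the A/P case.

```python
# Conditions that trigger a path failover, as listed in the post above.
FAILOVER_TRIGGERS = {
    "active/active": {"NO CONNECT"},
    "active/passive": {"NO CONNECT", "NOT READY", "ILLEGAL REQUEST"},
}

# Assumption for illustration: which trigger set each path policy acts on.
POLICY_HANDLES_AS = {"fixed": "active/active", "mru": "active/passive"}


def will_fail_over(policy, condition):
    """Return True if the given policy would fail over on this condition."""
    return condition in FAILOVER_TRIGGERS[POLICY_HANDLES_AS[policy]]


def missed_failovers(array_type, policy):
    """Conditions the array type expects to cause a failover but the policy ignores."""
    expected = FAILOVER_TRIGGERS[array_type]
    return sorted(c for c in expected if not will_fail_over(policy, c))


if __name__ == "__main__":
    print(missed_failovers("active/passive", "fixed"))
    print(missed_failovers("active/passive", "mru"))
```

Under these assumptions, Fixed on an A/P array misses the NOT READY and ILLEGAL REQUEST cases, while MRU on the same array misses none.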


Hope this helps a bit.
Greetings from Germany. (CEST)

22 Replies
kastlr
Expert

Hi,

you should never change the failover policy, because ESX automatically selects the best one.

The selected failover policy doesn't depend on whether you're using one or two SAN fabrics; it depends only on the capabilities of the storage array in use.

Hope this helps.

Ralf


Hope this helps a bit.
Greetings from Germany. (CEST)

depping
Leadership

I'm not 100% sure about this, but as far as I know the MSA1500 isn't an active/active SAN in VMware terms. The LUN needs to be published through both controllers at the same time, and this isn't the case for an MSA1500. In other words, your ESX setup needs to be set to MRU.

Duncan

williambishop
Expert

Depping is correct on this as far as I know: the MSA 1500 is not true active/active. Few (very, VERY few) midrange devices are. Usually, active/active is for the higher end.

--"Non Temetis Messor."

java_cat33
Virtuoso

Hi Duncan - I've got several customers using an MSA 1500 and some of them are running it as active/active. I haven't changed the multipathing within ESX and it's set to Fixed. This shows that ESX detects the active/active controllers on the MSA.

How do you mean that a MSA 1500 isn't an active/active SAN in VMware terms?

Appreciate your feedback.

TB_Rich
Enthusiast

I think you are only meant to use MRU on firmware v6.86; with v7 it is Fixed. At least that's how I've read it.

It seems to work well the way I've done it. Is there any way I can run some throughput tests, and what would be some 'good' values to aim for?

depping
Leadership

It could be that this is fixed in the new firmware, and I don't have the resources to test it, but for VMware active/active means that all of the LUNs are active ("on") on both controllers. As far as I know this was never the case for an HP MSA1500, but they may have changed it in the newer firmware.

Duncan

kjb007
Immortal

Ok, there appears to be some confusion here. For an active/active SAN array, the best policy is MRU, not fixed. Fixed is used for an active/passive array so as not to cause disk thrashing.

So, for active/active, use MRU.

For active/passive, use Fixed, and fix the path to the disk whose controller is owner of the disk.

That is the big difference, in active/passive, there is a disk owner, so ownership transfers back and forth, and performance hits are seen. So, when a failback occurs on the SAN, you want the path to change back to the disk owner, which is done via the Fixed policy. MRU will leave the path to the failover path, which is ok with active/active, but not with active/passive.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

TB_Rich
Enthusiast

That makes sense then: if A/A is normally MRU, and the MSA is not true A/A in the eyes of VMware, then I should use Fixed.

So I have it set up OK and have tried to distribute access to each LUN via a different controller. Is there a way to test this, or do I just assume and hope? (As above, it would be great if I could run some benchmarking somehow.)

Thanks 007.

kjb007
Immortal

You can create a VM and run Iometer. That will let you test access to the LUN and see what kind of throughput and latency you get.
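To show what the two numbers mean, here is a minimal Python sketch that times a sequential read inside a guest and derives throughput and average time per read. It is not a substitute for Iometer: the function names and the 64 KB block size are arbitrary choices for illustration, and because the test file has just been written, the guest's page cache will usually inflate the result.

```python
import os
import tempfile
import time


def sequential_read_test(path, block_size=64 * 1024):
    """Read a file sequentially; return (MB per second, average ms per read)."""
    total_bytes = 0
    reads = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total_bytes += len(chunk)
            reads += 1
    elapsed = time.perf_counter() - start
    mb_per_s = (total_bytes / (1024 * 1024)) / elapsed
    avg_ms = (elapsed / reads) * 1000
    return mb_per_s, avg_ms


def make_test_file(size_mb=16):
    """Create a temporary file of random data and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))
    return path


if __name__ == "__main__":
    path = make_test_file()
    try:
        mbps, ms = sequential_read_test(path)
        print(f"{mbps:.1f} MB/s, {ms:.3f} ms per read")
    finally:
        os.remove(path)
```

Running the same test from VMs on each LUN gives a crude side-by-side comparison, but a tool like Iometer, with a test file much larger than the guest's memory, is the better way to get numbers you can trust.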

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

depping
Leadership

kjb007,

I think what you are stating here is wrong.

During boot up or a rescan operation, VMware ESX automatically assigns a path policy of Fixed for all active/active storage array types. With a Fixed path policy, the preferred path is selected if that path is in the on state. For active/active storage array types, VMware ESX performs a path failover only if a SCSI I/O request fails with a FC driver status of NO_CONNECT, which indicates a loss of FC connectivity. Commands that fail with check conditions are returned to the guest operating system. When a path failover is completed, VMware ESX issues the command to the next path that is in the on state.

For active/passive storage array types, VMware ESX automatically assigns a path policy of MRU (Most Recently Used). A device response to TEST_UNIT_READY of NO_CONNECT and specific SCSI check conditions triggers VMware ESX to test all available paths to see if they are in the on state.

This is taken from the VMware SAN System Design and Deployment Guide.

Duncan

kjb007
Immortal

Hmm, then we have read different things here. Given the excerpt from the SAN guide, I will have to defer, although the logic still remains the same. I will try to find my reference.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

depping
Leadership

Check this explanation of path thrashing, which makes sense in my opinion:

http://pubs.vmware.com/vi301/san_cfg/wwhelp/wwhimpl/common/html/wwhelp.htm?context=san_cfg&file=esx_...

Duncan

kjb007
Immortal

I suppose the definition of active/active is the problem here. If both controllers are active, but the LUN is published through one controller, then to prevent thrashing, you will need to use MRU, the difference being replication of the SP cache to both controllers.

If you are balancing the paths to your LUNs, then you need to use Fixed anyway. The SAN guide does state to use Fixed for the MSA, which it considers active/active.

So, for 90% of the time, use MRU for active/passive, and fixed for active/active.

Sorry if I'm the one adding to the confusion.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

TB_Rich
Enthusiast

At one point I had 1 SP going to 1 FC switch, with 4 hosts on that switch, and the other SP going to another FC switch with 4 more hosts.

All 8 hosts could see both LUNs, so v7 A/A must provide simultaneous access to each LUN. IIRC, ESX had set itself up as MRU.

That setup was bad and not properly fault tolerant, so I've moved everything onto 1 FC switch and it's configured as in my first post. ESX has set itself up as Fixed; beyond that, I have only chosen a preferred controller on the MSA and set the matching preferred path in ESX. I think I'm happy that I have it set up OK now.

Hopefully I can get the budget for dual HBAs per host and have some proper redundancy again. I presume I would leave it at Fixed and not switch back to MRU, since, as ghost said, VMware shouldn't care about having another fabric?

Thanks

kjb007
Immortal

The redundancy should be all around. Both SPs should not be attached to only one switch; each should have a connection to both switches, as should the hosts.

True redundancy should have at least 1 port per server connecting to each switch, and the SPs in turn should have at least one port on each switch. This will lead to the same LUN being seen 4 or 8 times, depending on active/active or active/passive LUN visibility.
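To make the path counts concrete, here is a small Python sketch that counts how many HBA-to-SP-port paths one host gets for a given cabling. The names `count_paths`, `hba1`, `sw1`, `sp1-p1` and so on are made up for illustration. It only counts ports that share a switch, so it reproduces the 4-path case; the 8-path case depends on LUN visibility on the array and is not modelled here.

```python
def count_paths(hba_links, sp_links):
    """Count host-to-SP-port paths that share a switch.

    hba_links: list of (hba_name, switch) pairs for one host.
    sp_links:  list of (sp_port_name, switch) pairs for the array.
    """
    return sum(
        1
        for _hba, hba_switch in hba_links
        for _port, sp_switch in sp_links
        if hba_switch == sp_switch
    )


# One host with two HBAs, one per switch.
host = [("hba1", "sw1"), ("hba2", "sw2")]

# Each SP cabled to a single switch (the layout asked about in the first post).
split_sps = [("sp1-p1", "sw1"), ("sp2-p1", "sw2")]

# Each SP cabled to both switches (the layout recommended above).
meshed_sps = [
    ("sp1-p1", "sw1"), ("sp1-p2", "sw2"),
    ("sp2-p1", "sw1"), ("sp2-p2", "sw2"),
]

if __name__ == "__main__":
    print("split: ", count_paths(host, split_sps))
    print("meshed:", count_paths(host, meshed_sps))
```

With one port per SP, each host sees 2 paths per LUN and reaches each controller through only one switch; with each SP on both switches it sees 4, and either switch can fail without cutting off a controller.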

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

TB_Rich
Enthusiast

I had assumed that having dual controllers in A/A would negate the need to connect each controller to each switch (obviously each host would need connecting to each switch).

I presume then that I need to buy additional 'cards' for the MSA, as currently I only have one 2Gb port per controller on the back of it.

kastlr
Expert

That's true; for an A/A array you don't need to connect more than one port to a switch for redundancy.

Performance might be a reason to do so, but even with one SP port per switch you have redundancy.


Hope this helps a bit.
Greetings from Germany. (CEST)

TB_Rich
Enthusiast

Thanks, I guess kjb007's way would in effect get me 4Gb per SP. But I think the SCSI link to the SCSI disks of the MSA30 would be holding me up well before I needed 4Gb per SP!

I'll get some more HBAs and leave the preferred paths that I've already set up. I can't see how that could be improved?

Thanks.
