When using ESX 3.0.2 with an IBM SVC SAN, it defaults to MRU mode and treats the SVC as an Active/Passive array. Per a support request I opened, it appears this is correct for the SVC.
For others who have this configuration (IBM SVC), I have the following questions:
1) Do you leave yours in MRU mode? If not, have you had any path thrashing issues?
2) We are still on 3.1 SVC code, which had many performance issues. I think SVC 4.2 code is what is certified. Are any of you on that code with ESX 3.0.2, and is performance good?
3) Can I change the path to the storage processors on the ESX side? (I know where to make the change, but I'm not sure I should, especially with MRU mode set.)
Thanks for your feedback.
> 1) Do you leave yours in MRU mode? If not, have you had any path thrashing issues?
Interestingly, we were directed to set ours to Fixed mode. We were originally set to MRU until we got a recommendation from IBM to use Fixed. This way ESX won't treat the SVC any differently from our 8300. Anyway, no, we have not had any path thrashing issues.
> 2) We are still on 3.1 SVC code, which had many performance issues. I think SVC 4.2 code is what is certified. Are any of you on that code with ESX 3.0.2, and is performance good?
We just upgraded our SVC to the 4.2 code level a couple of Sundays ago. Performance is great. However, we were not experiencing any performance issues before the upgrade, either.
> 3) Can I change the path to the storage processors on the ESX side? (I know where to make the change, but I'm not sure I should, especially with MRU mode set.)
Yes. You can manually set the path to a specific SP whether you're using MRU or Fixed. As you probably know, this can be done in the Configuration | Storage Adapters view.
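For reference, the same change can also be scripted from the service console. The flags below are my recollection of the ESX 3.x `esxcfg-mpath` syntax, and `vmhba1:0:1` is a placeholder path, so verify against `esxcfg-mpath -h` on your host first. This is a dry run that only prints the commands:

```shell
# Sketch (assumed ESX 3.x esxcfg-mpath flags; verify with esxcfg-mpath -h).
# vmhba1:0:1 = HBA 1, target 0, LUN 1 -- a placeholder, not a real path from this thread.
LUN="vmhba1:0:1"

list_cmd="esxcfg-mpath -l"                                    # list all paths and their state
policy_cmd="esxcfg-mpath --policy=fixed --lun=$LUN"           # set this LUN's policy to Fixed
prefer_cmd="esxcfg-mpath --preferred --path=$LUN --lun=$LUN"  # pin the preferred path

# Dry run: print what would be executed on the service console.
echo "$list_cmd"
echo "$policy_cmd"
echo "$prefer_cmd"
```

Dropping the `echo`s would actually run the commands; I'd try it on a test LUN first.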
As a general rule, LUNs on active/passive arrays connected directly to an ESX server should be set to MRU.
You should not see path thrashing issues with MRU, as the LUNs will 'stick' to the most recently used path. Path thrashing occurs when a Fixed policy tries to 'pull' the path back to the preferred path while an MRU policy tries to 'pull' it back to the most recently used path.
I believe we'd treat the SVC as active/active (you may want to confirm this with IBM).
Normally we recommend Fixed for active/active arrays. Just be sure that it's set to Fixed across all hosts.
The VMware ESX Server VMkernel has a built-in proprietary implementation of multipathing that establishes and manages SAN fabric path connections. Once paths are detected, the VMkernel establishes which path is active, in standby, or no longer connected (a dead path). When multipathing detects a dead path, it provides failover to alternate paths. Because of this multipathing mechanism, you can leverage all available paths to manually load balance I/O workload.
By default, ESX Server only uses one path, regardless of available alternate paths.
For active/active type arrays, you must manually assign a LUN to a specific path. For active/passive type arrays, ESX Server automatically picks one path for I/O workloads; you cannot load balance to an alternate path. This is a limitation of active/passive array types, as only one path can be active at a time.
For best performance, each virtual machine should reside on its own LUN. That way, each volume can be assigned its own preferred path to the SAN fabric. Before doing this, however, you must determine which virtual machine or volume (if multiple virtual machines share one volume or LUN) has the most intensive I/O workload. Note that the level of I/O traffic varies for different applications at different times of the day. To determine the maximum level of I/O traffic for a virtual machine (or for a specific volume), monitor actual performance at intervals a few hours apart (or shorter) during a typical day. First, measure the IOPS generated by the application. Second, measure the bandwidth available between the associated volume and the SAN fabric path.
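One way to take those samples is `esxtop` in batch mode from the service console. The `-b`, `-d`, and `-n` flags below are from memory of ESX 3.x and the output paths are made up, so treat this as a sketch and check `esxtop -h`; it only prints the commands it would run:

```shell
# Sketch: sample disk counters at intervals across a working day using
# esxtop batch mode (-b = batch, -d = delay in seconds, -n = iterations).
# Filenames and sampling hours are placeholders.
for hour in 08 10 12 14 16; do
  cmd="esxtop -b -d 10 -n 2"          # two samples, 10 seconds apart
  out="/tmp/esxtop-${hour}00.csv"
  echo "$cmd > $out"                  # dry run: print, don't execute
done
```

You would then compare the disk adapter and VM disk counters across the CSV files to see which LUNs and VMs peak, and when.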
Balancing loads among available paths also improves performance. You can set up your virtual machines to use different paths by changing the preferred path for the different HBAs. This is possible only for active/active storage processors (SPs) and requires that the path policy be set to Fixed.
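As a sketch of what that spread looks like, the loop below alternates preferred paths for four LUNs across two HBAs. The HBA names, targets, and LUN numbers are placeholders, and it assumes Fixed policy on an active/active array; it only prints the commands:

```shell
# Sketch: spread preferred paths for LUNs 0-3 across two HBAs (vmhba1/vmhba2).
# Assumes Fixed policy and an active/active array; all paths are placeholders.
cmds=""
for lun in 0 1 2 3; do
  hba=$(( lun % 2 + 1 ))               # even LUNs -> vmhba1, odd LUNs -> vmhba2
  path="vmhba${hba}:0:${lun}"
  cmds="$cmds esxcfg-mpath --preferred --path=$path --lun=$path\n"
done
printf "%b" "$cmds"                    # dry run: show the commands
```

The same alternation would be applied identically on every host, since the preferred path is a per-host setting.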
If a path fails, the surviving paths carry all the traffic. Path failover should occur within a minute (or within two minutes for a very busy fabric), as the fabric might converge with a new topology to try to restore service. A delay is necessary to allow the SAN fabric to stabilize its configuration after topology changes or other fabric events.
Thanks for the input so far. I WILL award points; I just want to see what other feedback comes in first while I confirm with my IBM contact.
Thanks again for any comments....
Possibly the most informative post on the whole SVC scenario I've read. The site I'm working for is undergoing an SVC 3.1 to 4.2 upgrade; we are currently connected directly to the storage but about to undergo config and testing for use with SVC 4.2.
As I understand it, the SVC has multiple nodes (a cluster) with the equivalent of storage processors in each. The whole active/active vs. active/passive thing comes about because there is normally a client driver that manages the multipathing (like round-robin DNS), which obviously does not exist within the VM infrastructure. With the SVC, you will have <something like> 8 paths, 4 'active' and 4 'passive', but normally all the traffic goes via the active paths, and hence the configuration is active/active.
My question is: how do you 'set up your virtual machines to use different paths by changing the preferred path for the different HBAs'? I thought you can only specify one preferred path, regardless of how many of your HBAs can see the path to the disk? Currently, I can see 4 paths to the HDS (active/active array), 2 via each HBA, but there is no way I can see (read: doesn't mean it doesn't exist!) to configure an active path for each HBA. This would be sensational, because we want to load balance between the SVC ports, but also want to be assured that in the event of SVC node failure (or scheduled maintenance), the preferred pathing will fail over to the designated port.
Using VC 2.0.2, ESX 3.0.1
To load balance at a VM level, each VM would have to be on a different LUN. You would then load balance your preferred paths across your HBAs and SPs. If you have multiple VMs on a LUN, you can still spread the load over the HBAs and SPs at the LUN level. There will only ever be one active path per LUN; this will be the preferred path, if it is available.
VMware's knowledge of the IBM SVC is very limited, and they constantly refer you to IBM. It took almost a year for ESX 3 to even be certified on the IBM SVC. A big question we have moving forward: do we want to wait a year for certification of future VMware products that deal heavily with the SAN (like Site Recovery Manager)?
Our answer is no!
Support states ESX will pick the proper pathing (MRU or Fixed) per what it sees and what they have been told by the vendor. We see this sending most traffic to a single storage processor path.
We can't even get a simple answer to whether it is all right to switch to Fixed and manually load balance.
Yes, but when I call VMware support and mention wanting to change it to Fixed, they claim the SVC should stay at what it picked (MRU).
I have already started trying to set some test LUNs to Fixed and adjusting the path. The SAN guides warn this could cause path thrashing. Thanks for your help.
Would you know how I would find which LUNs are busiest and which VMs are causing the traffic? This one might be a little involved, I understand.
Thanks for your help, though.
We treat the SVC as an active/active array; please use Fixed policy. You will not get path thrashing if all paths are set to Fixed. Path thrashing occurs when some servers have their paths set to MRU and some to Fixed: the Fixed servers try to use the preferred path, the MRU servers pull the path back to the one most recently used, and then the Fixed servers pull it back again. If everything is set to Fixed, this won't happen.
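Given that the failure mode is a policy mismatch between hosts, it's worth checking every host before relying on Fixed. A minimal sketch, assuming hypothetical host names (`esx01`..`esx03`) and a placeholder LUN; it only prints the per-host checks you would run:

```shell
# Sketch: print a check to run against each host so you can confirm the
# pathing policy matches everywhere. Host names and the LUN are placeholders.
LUN="vmhba1:0:1"
for host in esx01 esx02 esx03; do
  # Inspect the path list (and policy) for the LUN on each host:
  echo "ssh root@$host esxcfg-mpath -l   # check policy for $LUN on $host"
done
```

If any host still reports MRU for that LUN, fix it before cutting the others over, or you recreate exactly the mixed-policy tug-of-war described above.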
Again, thanks. So I have 3 ESX 3.0.2 servers being used for the conversion. They have about 100 VMs on them, across 5 LUNs with 1 LUN empty. I can VMotion all VMs off one server and make the changes, then go to the next server and make the same changes (with the policy set to Fixed).
Can I do this without interruption to the VMs (using VMotion to clear the server I am working on), or will I have issues while I go around to each server until all of them are done?
Today, as of 2-5-08, the supported and recommended pathing policy for SVC code 4.1+ (which is all we support on 3.x) is MRU; we tested and treat the SVC as Active/Passive. We are, however, going through certification to change that to Fixed policy, which would allow you to use the path load balancing mentioned above.
By the reference to "we" in your post, I assume you work for VMware or IBM.
As you can see, it is no wonder people complain about VMware support, as the guys with the most knowledge are not providing updated information to the guys working the cases. Obviously things change with code revisions, hardware revisions, etc., but a consistent answer sure would be nice.