Solved: FC ALUA Storage, LUN trespass and some doubts

hwasp · ‎04-10-2015

Hi there!

First, thanks to all for this great community!

I am configuring a small vSphere 5.5 cluster, and everything seems to be OK, but I have some doubts that I would like to share. The configuration is the following:

vSphere 5.5u2 hosts, each with two single-port Emulex 8G HBAs (Host_HBA1 and Host_HBA2).

NetApp E2724, dual controller, FC 8G (two ports per controller), in a theoretically active-active configuration (Controller1_Port1, Controller1_Port2, Controller2_Port1, Controller2_Port2).

Two Brocade s300 switches with single initiator, multiple target zoning (Switch1: Host_HBA1 to Controller1_Port1 and Controller2_Port1 and Switch2: Host_HBA2 to Controller1_Port2 and Controller2_Port2)

Cabling seems to be correct: vSphere detects 4 paths per LUN, automatically sets the multipath policy to Round Robin, and shows two paths as Active (I/O) and the other two only as Active.
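To make the path count concrete, here is a minimal sketch that models the zoning described above and lists the paths a host would see to one LUN. It is only an illustration: the function name `paths_for_lun` and the data layout are made up, and it assumes that paths landing on the owning controller are the ones vSphere shows as Active (I/O).

```python
# Zoning as described in the post: one initiator per zone, several targets.
ZONES = {
    "Switch1": {"initiator": "Host_HBA1",
                "targets": ["Controller1_Port1", "Controller2_Port1"]},
    "Switch2": {"initiator": "Host_HBA2",
                "targets": ["Controller1_Port2", "Controller2_Port2"]},
}


def paths_for_lun(owner):
    """Return (initiator, target, state) for every zoned path to one LUN.

    `owner` is the controller that currently owns the LUN, e.g. "Controller1".
    Assumption: paths to the owner are ALUA active/optimized and appear as
    "Active (I/O)" under Round Robin; paths to the other controller appear
    as plain "Active".
    """
    paths = []
    for zone in ZONES.values():
        for target in zone["targets"]:
            optimized = target.startswith(owner + "_")
            state = "Active (I/O)" if optimized else "Active"
            paths.append((zone["initiator"], target, state))
    return paths


if __name__ == "__main__":
    for path in paths_for_lun("Controller1"):
        print(path)
```

With this zoning each LUN gets four paths, two through each controller, which matches what the host reports.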

The LUNs on the NetApp are configured with the VMware host type (they are Dynamic Disk Pool volumes), and AVT seems to be enabled for this kind of host.

Well, I tried different tests (Storage vMotion between different LUNs, VMware I/O Analyzer, creating and removing virtual machines...) and everything seems to run smoothly. The only strange behavior I have noticed is this: when volumes are created in the storage system, each one is assigned to one controller or the other as its "Preferred Owner" (ALUA). When, for example, I run a Storage vMotion between two datastores (I have one datastore per LUN) and the LUNs have different preferred controllers (LUN1 on Controller1 and LUN2 on Controller2), LUN2 is automatically moved to Controller1.

I am not sure whether this is normal behavior meant to improve performance or whether I have some misconfiguration. The NetApp storage log shows the message "I/O shipping implicit volume transfer", followed by an alarm that the LUN is not on its preferred controller. I guess the multipath driver is using AVT to move the LUN to the same controller, but as I said, I am not sure if this is normal. It would be nice if somebody could clarify this. The Storage vMotion finishes fine and no disk problems appear.

Thanks again for your time!

vfk · ‎04-10-2015

Hi, I have found the Express Guide to be good when configuring NetApp E-Series. Is SANtricity reporting a non-optimal configuration? If you are getting alerts, check your zoning and confirm that all initiators are connecting to the targets correctly.

Here is the link to the Express Guide: FC Configuration and Provisioning for VMware® Express Guide

--- If you found this or any other answer helpful, please consider the use of the Helpful or Correct buttons to award points. vfk Systems Manager / Technical Architect VCP5-DCV, VCAP5-DCA, vExpert, ITILv3, CCNA, MCP

hwasp · ‎04-13-2015

First, thanks for the answer. Yes, I have also used the Express Guide. For vSphere 5.5 and the E2700 series, it seems that no command has to be run on the hosts to customize the NMP or VAAI claim rules. I also configured the zoning on the switches following its recommendation (single initiator, multiple target). As far as I can see, the zoning is OK. The storage seems to work well too, and it only reports an alarm when a LUN is automatically moved to the non-preferred controller. I have only seen this when I copy data between two LUNs that have different preferred controllers. It seems that AVT is playing a role in this behavior, but I still don't understand how.

vfk · ‎04-13-2015

I have noticed that some of the documentation for the E-Series is a little bit dated, so make sure you are looking at the right version. I recently configured an E2700 with iSCSI and didn't make any changes to the ESXi hosts; they picked up the correct configuration. I suspect that not all the initiators are logged in to the SAN. Try rebooting the ESXi host to get the initiators to log in again. If you are still getting the non-optimal configuration alert, contact NetApp. Their support is very good, and they should be able to check the AutoSupport logs and point you in the right direction.

hwasp · ‎04-14-2015

Well, I did some additional research before opening a case with NetApp, and my problem seems to be like this one: Unexplained LUN trespasses on EMC VNX explained

I temporarily disabled VAAI as a test, and the problem disappeared, so I guess it could be the same thing.
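The post does not say how VAAI was disabled. On ESXi 5.x the block primitives are controlled by three host advanced settings, so one way to repeat the test is sketched below. The script only builds the `esxcli` command lines and does not run them; the helper name `vaai_commands` is made up for illustration. Disabling only `HardwareAcceleratedMove` (XCOPY) may be enough for a copy test, but that is an assumption and is not confirmed in this thread.

```python
# Host-wide advanced options for the VAAI block primitives on ESXi 5.x.
# A value of 1 enables the primitive, 0 disables it.
VAAI_OPTIONS = {
    "/DataMover/HardwareAcceleratedMove": "XCOPY (full copy / copy offload)",
    "/DataMover/HardwareAcceleratedInit": "WRITE SAME (block zeroing)",
    "/VMFS3/HardwareAcceleratedLocking": "ATS (atomic test and set locking)",
}


def vaai_commands(enable):
    """Build one esxcli command (as an argument list) per VAAI option."""
    value = "1" if enable else "0"
    return [
        ["esxcli", "system", "settings", "advanced", "set",
         "--int-value", value, "--option", option]
        for option in VAAI_OPTIONS
    ]


if __name__ == "__main__":
    # Print the commands that would disable all three primitives.
    for cmd in vaai_commands(enable=False):
        print(" ".join(cmd))
```

Calling `vaai_commands(enable=True)` gives the commands to turn the primitives back on after the test.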

In theory, the LUN is moved off its preferred controller to improve response time, and in a Round Robin configuration it should be moved back to the preferred one afterwards (https://community.emc.com/thread/132762), but this does not seem to happen.

Anyway, I think I have an explanation for this behavior, and I will contact NetApp to confirm it or find out whether something else is going on.
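As a way to picture the "I/O shipping implicit volume transfer" message, here is a toy model of the idea. It is not NetApp's actual algorithm: the class `Lun`, the `threshold` value and the reset rule are all hypothetical. It only assumes that I/O arriving at the non-owning controller is shipped to the owner, and that after enough shipped I/O the array moves ownership to the controller receiving the traffic.

```python
class Lun:
    """Toy model of a LUN with a preferred and a current owning controller."""

    def __init__(self, name, preferred):
        self.name = name
        self.preferred = preferred  # controller chosen at volume creation
        self.owner = preferred      # controller that currently owns the LUN
        self.shipped = 0            # consecutive I/Os seen by the non-owner

    def io(self, via_controller, threshold=3):
        """Record one I/O arriving at `via_controller`.

        `threshold` is a made-up number; the real trigger is not documented
        in this thread.
        """
        if via_controller == self.owner:
            self.shipped = 0
            return "direct"
        self.shipped += 1
        if self.shipped >= threshold:
            # Implicit transfer: ownership follows the traffic.
            self.owner = via_controller
            self.shipped = 0
            return "implicit transfer"
        return "shipped"

    def on_preferred(self):
        return self.owner == self.preferred


if __name__ == "__main__":
    lun2 = Lun("LUN2", preferred="Controller2")
    # Hypothesis: an offloaded copy handled by Controller1 touches LUN2 there.
    for _ in range(3):
        print(lun2.io("Controller1"))
    print(lun2.owner, lun2.on_preferred())
```

In this model nothing moves the LUN back once the copy ends, which would match the alarm about the LUN staying off its preferred controller.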

hwasp · ‎04-16-2015

The official answer from NetApp is that they are aware of the problem and it will be fixed in a future firmware release.
