VMware Cloud Community
KrishnaR
Enthusiast

DS4000 SRM issues

I'm starting this thread to hear back from users and the field on any experiences with DS4000 and SRM. I'm particularly interested in any issues or problems encountered. I've been working with SRM and DS4000 since beta and can try to help resolve any problems that have come up. I'm also working on an SRM guide but can't give a date on it yet.

106 Replies
Iuridae
Contributor

Hi FG,

You will need an empty, unassigned disk for each LUN you replicate. A single disk is enough if it has at least 20% of the capacity of the replicated LUN.

For example, if you replicate a 1 TB LUN and run a test failover, the SRA will take one disk (or as many as needed) to create a new array 0 and a LUN with 200 GB of space. This is done for each LUN that is part of the test, so if the test includes more LUNs, a new array/LUN is created for each.

Hope it makes sense.

admin
Immortal

Hi Felix,

Could you attach new SRM logs?

-Masha

TheGallowMan
Contributor

The latest SRA (May 09) removes the requirement for unconfigured disks. It now just needs enough free space (25%) for the FlashCopy LUN to be created. In my experience with DS4700s and SRM, the biggest problem was getting the SRA to recognise the replicated LUNs. Once we had enabled the Enhanced Remote Mirroring (both sites) and FlashCopy (recovery site) features and configured the replicated LUNs, we were OK.

cj1223
Contributor

This reference proved to be very helpful with our installation of SRM using the DS 4700.

HTH.

dex_1234
Contributor

As mentioned already, the latest SRA (May) removes the unconfigured disk requirement for FlashCopies taken during test failover. Are you not seeing the mirror pair during Array Manager configuration? Mirror pairs with no virtual machine objects on them will not be reported. If you just configured mirroring and haven't yet configured a datastore with VMs on it, nothing will be reported during Array Manager configuration. Thought I'd just throw that out.

KrishnaR
Enthusiast

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101443

This IBM DS4k/5k SRA guide should be helpful. It's also present on the SRM site I think.

FG0711
Contributor

Everybody, thank you all for your feedback. It's very useful. I'll follow up on many of the suggestions provided here some time next week.

FG0711
Contributor

,

Thanks for the feedback. Which version of the DS4700 did you use, -70 or -72? If you used the -70, how was the FC fabric configured? We have port #2 of both controllers in the protected site and the recovery site used for Enh. Remote Mirroring, so it leaves only one port per controller for host connectivity. I am using the May 09 version of the IBM DS4000/5000 SRA. I have 500 GB of unconfigured free space; however, this space is within a configured RAID5 array. All my VMs are on the same LUN (600 GB). The other LUN is an RDM data disk (1.6 TB) which is only used by one VM.

It looks like my free space is just below the 25% you mentioned, but well over the 20% other people mentioned.
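The comparison above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the percentage applies to the combined size of the two LUNs mentioned in the post (600 GB and 1.6 TB, taken as 1600 GB), which the thread does not confirm, and the function name is made up for illustration.

```python
def free_space_check(free_gb, lun_sizes_gb, percent):
    """Return (required_gb, is_enough) for a given free-space percentage.

    Assumption: the rule of thumb is applied to the sum of all replicated LUNs.
    """
    required_gb = sum(lun_sizes_gb) * percent / 100
    return required_gb, free_gb >= required_gb


if __name__ == "__main__":
    luns_gb = [600, 1600]   # VM datastore LUN and RDM data LUN from the post
    free_gb = 500           # free capacity reported in the post
    for percent in (20, 25):
        required, enough = free_space_check(free_gb, luns_gb, percent)
        print(f"{percent}% rule: need {required} GB, have {free_gb} GB, enough={enough}")
```

Under that assumption the 20% figure needs 440 GB and the 25% figure needs 550 GB, which matches the "just below 25%, well over 20%" observation for 500 GB of free space.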

dex_1234
Contributor

The current SRA algorithm looks in the array group that the logical drive is in to place the repository. If not enough space is available there, it scans the entire subsystem and places the repository in the first available free capacity it finds. In your case, the repository will be placed in the RAID5 array. (Note: FlashCopies are only taken of the secondary logical drive in a mirror pair at the "recovery" site during an SRM test failover.)
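The placement order described above can be sketched roughly as follows. This is not the SRA's code: the `ArrayGroup` type, the function name, and the sizes in the usage lines are hypothetical, and free capacity is modelled simply as a number per array group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArrayGroup:
    name: str
    free_gb: float


def place_repository(home: ArrayGroup,
                     all_groups: List[ArrayGroup],
                     needed_gb: float) -> Optional[ArrayGroup]:
    """Pick where a FlashCopy repository would go, per the description above.

    1. Prefer the array group that already holds the logical drive.
    2. Otherwise take the first group in the subsystem with enough free space.
    3. Return None if nothing fits.
    """
    if home.free_gb >= needed_gb:
        return home
    for group in all_groups:
        if group.free_gb >= needed_gb:
            return group
    return None


if __name__ == "__main__":
    raid5 = ArrayGroup("RAID5", free_gb=500)    # hypothetical layout
    spare = ArrayGroup("SPARE", free_gb=1000)
    print(place_repository(raid5, [raid5, spare], needed_gb=120).name)
    print(place_repository(raid5, [raid5, spare], needed_gb=800).name)
```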

SRM will work with both the DS4700 model 70 and the model 72. With the model 70, as you mentioned, when ERM is enabled you lose a host port on each controller to mirroring, leaving you with only one port on each controller for host communication. It's recommended that you design the fabric so that each HBA initiator port can see both controllers in your subsystem. To do this with the model 70 you will have to use a single fibre switch or redundant switches with an ISL. This will provide your ESX server(s) with redundant paths to your storage system (4 paths).

The reason this is recommended is to avoid logical drive failover in the event of a single path failure. The storage system is designed to perform logical drive failover, but it's something you want to avoid if possible. If a logical drive failover does occur, the storage subsystem has to change its ownership, synchronize cache, and re-route I/Os through the alternate controller, and if the logical drive is part of a mirror, all of the back-end mirroring operations have to change controller ownership as well. If the primary logical drive in a mirror pair changes controller ownership, so does the secondary logical drive. The recommended SAN design is an attempt to minimize the need for this. Yes, the system is designed to be fault tolerant, but it's something you really want to try to avoid.
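To see where the four paths come from, here is a small sketch that enumerates them. It assumes two HBA ports per ESX host and one remaining host port per controller (the model 70 case with ERM enabled); the port names are invented for illustration.

```python
from itertools import product


def enumerate_paths(hba_ports, controller_ports):
    """Every HBA port sees every controller host port (single fabric or ISL)."""
    return list(product(hba_ports, controller_ports))


def controllers_reachable(paths, failed_hba):
    """Controller ports still reachable after one HBA port fails."""
    return sorted({ctrl for hba, ctrl in paths if hba != failed_hba})


if __name__ == "__main__":
    hba_ports = ["hba1", "hba2"]                       # assumed: two per host
    controller_ports = ["ctrlA-port1", "ctrlB-port1"]  # one host port each
    paths = enumerate_paths(hba_ports, controller_ports)
    print(len(paths), "paths:", paths)
    # Losing one HBA port still leaves a path to both controllers,
    # so no logical drive failover is needed.
    print(controllers_reachable(paths, failed_hba="hba1"))
```

With fully separate fabrics, each HBA port would see only one controller, so a single path failure would force the logical drives over to the alternate controller.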

FG0711
Contributor

I switched to a single fabric: both ESX hosts and port 1 of both controllers are in the same zone on the same VSAN. I also created a new LUN #6, SRM_TEST (60 GB), with one VM and its datastore on it. I set up replication to the protected side's storage (same name) and enabled FlashCopy on both sites. >>> I still cannot see the LUNs, protected or recovery. I also opened support cases with IBM and VMware. We had a couple of WebEx sessions, but no resolution so far.

The log file does not indicate any errors (attached).

Any ideas? Thank you

dex_1234
Contributor

I took a look at your vmware-dr log and have a few questions:

1. The DiscoverLun request has your controller IPs as 192.168.10.31 and 192.168.100.32. Is this correct?

2. It looks like you have two "hosts" configured on the storage system. Are the LUNs you're trying to detect currently mapped to the "default group"? The IBM SRA does not detect LUNs mapped to the default group. You have to map your LUNs to a host or a hostgroup with members (hostgroups with no members are not supported).
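Both questions lend themselves to a quick pre-flight check before running Array Manager configuration. The sketch below is an illustration only: the /24 prefix is an assumption (the two controllers are not required to share a subnet in general), the default group label is a placeholder, and the function names and sample mappings are made up.

```python
import ipaddress

DEFAULT_GROUP = "Default Group"  # placeholder label for the default group


def on_same_network(addresses, prefix_len=24):
    """True if all controller addresses fall in one network of the given size."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


def undetectable_luns(lun_mappings, group_members):
    """LUNs the SRA would skip according to the rules in the post above.

    lun_mappings:  LUN name -> host or hostgroup it is mapped to
    group_members: hostgroup name -> list of member hosts
    """
    skipped = []
    for lun, target in lun_mappings.items():
        if target == DEFAULT_GROUP:
            skipped.append(lun)      # default group mappings are not detected
        elif target in group_members and not group_members[target]:
            skipped.append(lun)      # hostgroup without members
    return skipped


if __name__ == "__main__":
    # Addresses as they appear in the DiscoverLun request
    print(on_same_network(["192.168.10.31", "192.168.100.32"]))
    print(undetectable_luns(
        {"SRM_TEST": DEFAULT_GROUP, "VMs": "Hosts"},
        {"Hosts": ["ESX1-1", "ESX1-2"]},
    ))
```

A mismatch in the first check does not prove a typo, but it is a cheap hint that one of the addresses deserves a second look.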

FG0711
Contributor

Dex_1234

Thank you very much for your post. I did make a mistake in entering the IP address (missed one "0"). I do have my hosts (ESX1-1 and ESX1-2 on one side, ESX2-1 and ESX2-2 on the DR site) mapped to the default group, as well as all LUNs. I had a hostgroup "Hosts" in each site, but got rid of it and mapped everything to "default" when SRM did not find the LUNs at first. I will recreate the hostgroups and fix the IP problem on Monday. I hope it'll work. Thanks again for your help.

FG0711
Contributor

Dex_1234,

I recreated the hostgroups and confirmed all IP addresses of the DS4700 controllers. Still no good. I may try to uninstall and re-install everything. Thanks.

Attached are screenshots (Storage Manager and SRM).

FG0711
Contributor

Just noticed something:

>> It looks like the SRA thinks that my storage is a DS4800. It's a DS4700-70; could that be a problem? Also, the 4th line from the bottom says "NULL". Is that a problem?

I appreciate any feedback.

Thank you

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Manager/scripts/SAN/IBM/discoverLuns.pl" temp956

:QUITE:main:Logging data for discover luns

:INFO:parseDiscoverLunInput:Parsing the lun discovery input.....

192.168.100.31192.168.100.32"C:/Program Files/VMware/VMware Site Recovery Manager/scripts/SAN/IBM/SMsra.exe" discoverLuns arrayName=DS4800 arrayAddress=192.168.100.31 arrayAddress=192.168.100.32 arrayPassword= arrayId=600a0b80002ab4b00000000049704bef logLevel=trivia outputFile=cliDLun1208[2009-07-06:: 10:06:43]:INFO:MAIN:Entry

ArrayId=600a0b80002ab4b00000000049704bef

bindToController succesfull

Current CGN=2336

No. of Arrays found=1

ArrayId in Cmd =600a0b80002ab4b00000000049704bef

ArrayId in List=600a0b80002ab4b00000000049704bef

Cluster Len=1

Host Len=2

Got a Cluster

Cluster Ref=...

Cluster Ref=85000000600a0b80002ab45a0036090e4a51abc9

Pushing in an Initiator

InitiatorID=2100001b32042bdf

Pushing in an Initiator

InitiatorID=2100001b3204f9df

Pushing in an Initiator

InitiatorID=2100001b320454e0

Pushing in an Initiator

InitiatorID=2100001b32042cdf

Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9

ObjBundle Host Ref=84000000600a0b80002ab45a003007404a3cc1d9

hostInResponse=1

Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9

ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a

Added host Ref=84000000600a0b80002ab45a003007434a3cc21a

ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a

hostInResponse=1

Got a Mirrored LUN

LunMapping.mapRef is NULL

peerArrayId=600a0b80002ab45c0000000049704cb3

Result has=23

:INFO:main:Done with discoverLuns

dex_1234
Contributor

Completely uninstall the SRA and then re-install it (both sides). After that, restart SRM (I'd do it for both the protected and recovery sides) and run through the Array Manager configuration again. Post your vmware-dr log after this run.

dex

FG0711
Contributor

dex_1234:

Re-installed the SRAs on both sides and restarted the SRM services on both sites. Still the same problem. See attached.

thanks

dex_1234
Contributor

As you stated earlier, the line "LunMapping.mapRef is NULL" is troubling.

192.168.100.31192.168.100.32"C:/Program Files/VMware/VMware Site Recovery Manager/scripts/SAN/IBM/SMsra.exe" discoverLuns arrayName=DS4800 arrayAddress=192.168.100.31 arrayAddress=192.168.100.32 arrayPassword= arrayId=600a0b80002ab4b00000000049704bef logLevel=trivia outputFile=cliDLun3396[2009-07-06:: 15:07:43]:INFO:MAIN:Entry

ArrayId=600a0b80002ab4b00000000049704bef

bindToController succesfull

Current CGN=2361

No. of Arrays found=1

ArrayId in Cmd =600a0b80002ab4b00000000049704bef

ArrayId in List=600a0b80002ab4b00000000049704bef

Cluster Len=1

Host Len=2

Got a Cluster

Cluster Ref=…

Cluster Ref=85000000600a0b80002ab45a0036090e4a51abc9

Pushing in an Initiator

InitiatorID=2100001b32042bdf

Pushing in an Initiator

InitiatorID=2100001b3204f9df

Pushing in an Initiator

InitiatorID=2100001b320454e0

Pushing in an Initiator

InitiatorID=2100001b32042cdf

Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9

ObjBundle Host Ref=84000000600a0b80002ab45a003007404a3cc1d9

hostInResponse=1

Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9

ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a

Added host Ref=84000000600a0b80002ab45a003007434a3cc21a

ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a

hostInResponse=1

Got a Mirrored LUN

LunMapping.mapRef is NULL

peerArrayId=600a0b80002ab45c0000000049704cb3

Result has=23

:INFO:main:Done with discoverLuns

The SRA connects to the controllers and can pull the host/hostgroup info just fine. It detects that a mirrored LUN exists, but we get the "LunMapping.mapRef is NULL" error. What you should see after the "Got a Mirrored LUN" line is something similar to the following (example is from my setup):

Got a Mirrored LUN

(2009-04-27 10:21:02) ::VERBOSE::SMsra::StorageArrayCommandExecutor::getClusterRef::Got a Host's clusterRef

InitiatorGroupID=85000000600a0b8000293ade003659ae49e3246e

peerArrayId=600a0b80001133e800000000481eb21a
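When comparing several runs, it can help to scan the SRA output for this pattern automatically. The following is a minimal sketch, assuming the log has been saved as plain text and that each mirrored LUN entry ends at its "peerArrayId=" line, as in the excerpts in this thread; the function name is made up.

```python
def mirrored_luns_without_mapping(log_lines):
    """Return line numbers of 'Got a Mirrored LUN' entries whose mapRef is NULL."""
    flagged = []
    current = None       # line number of the mirrored LUN entry being read
    null_seen = False
    for number, raw in enumerate(log_lines, start=1):
        line = raw.strip()
        if line.startswith("Got a Mirrored LUN"):
            current, null_seen = number, False
        elif current is not None and "LunMapping.mapRef is NULL" in line:
            null_seen = True
        elif current is not None and line.startswith("peerArrayId="):
            if null_seen:
                flagged.append(current)
            current = None
    return flagged


if __name__ == "__main__":
    failing = [
        "hostInResponse=1",
        "Got a Mirrored LUN",
        "LunMapping.mapRef is NULL",
        "peerArrayId=600a0b80002ab45c0000000049704cb3",
    ]
    print(mirrored_luns_without_mapping(failing))
```

An empty result means every mirrored LUN entry in the log resolved its mapping; a non-empty one lists the entries worth attaching to the support case.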

I know you already have a support ticket in, and this looks to be an issue between the SRA and the storage system, so it'll have to go down the IBM side. The example from my log above was based on SRA build 01.00.35.12. One thing you might want to try while you work through the support process is to go back to the previous SRA build (01.00.35.12), the one before the current build that's available for download (01.01.35.01), and give it another attempt to see if we get the same "LunMapping.mapRef is NULL" error. In parallel I can continue to help you out and look into things. Would it be possible to get a copy of your storage subsystem profiles for both sites?

FG0711
Contributor

Dex_1234,

Attached are some screenshots of the storage. Which log files should I get from the DS4700? Thank you.

FG0711
Contributor

attachment
