I'm starting this thread to hear back from users and the field about any experiences with DS4000 and SRM. I'm particularly interested in any issues or problems encountered. I've been working with SRM and DS4000 since beta and can try to help resolve any problems that have come up. I'm also working on an SRM guide but can't give a date for it yet.
Hi FG,
You will need an empty, unassigned disk for each LUN you replicate. One disk is enough if its capacity is at least 20% of the replicated LUN.
For example, if you replicate a 1TB LUN and run a test failover, the SRA will take one disk (or as many as needed) to create a new array 0 and a LUN with 200GB of space. This is done for each LUN that is part of the test, so if the test includes more LUNs, a new array/LUN is created for each of them.
Hope it makes sense.
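The sizing rule above can be put into numbers. This is a minimal sketch, assuming the 20% figure quoted in this post; the function name and its default are illustrative and are not read from the SRA itself.

```python
def repository_size_gb(lun_size_gb, percent=20):
    """Space needed for the test-failover copy of one replicated LUN.

    percent is an assumption taken from this thread (20%), not a value
    queried from the SRA or the storage subsystem.
    """
    return lun_size_gb * percent / 100


# One array/LUN is created per replicated LUN in the test,
# so the size is computed per LUN rather than for the whole test.
luns_gb = {"replicated_lun": 1000}  # hypothetical name, 1TB taken as 1000GB
for name, size_gb in luns_gb.items():
    print(name, "needs about", repository_size_gb(size_gb), "GB")
```

Passing a different `percent` shows how quickly the requirement grows if the real figure is higher than 20%.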
Hi Felix,
Could you attach new SRM logs?
-Masha
The latest SRA (May 09) removes the requirement for unconfigured disks. It now just needs enough free space (25%) for the FlashCopy LUN to be created. In my experience with DS4700s and SRM, the biggest problem was getting the SRA to recognise the replicated LUNs. Once the Enhanced Remote Mirroring (both sites) and FlashCopy (recovery site) features were enabled and the replicated LUNs were configured, we were OK.
In response to the previous post, the full link is http://daveveness.wordpress.com/2009/01/09/ibm-ds4700-model-70-vmware-esx-35-and-site-recovery-manag...
As mentioned already, the latest SRA (May) removes the unconfigured disk requirement for FlashCopies taken during a test failover. Are you not seeing the mirror pair during Array Manager configuration? Mirror pairs with no virtual machine objects on them will not be reported. If you have just configured mirroring and haven't yet configured a datastore with VMs on it, nothing will be reported during Array Manager configuration. Thought I'd just throw that out.
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101443
This IBM DS4k/5k SRA guide should be helpful. It's also present on the SRM site I think.
Everybody, thank you all for your feedback. It's very useful. I'll follow up on many of the suggestions provided here some time next week.
Thanks for the feedback. Which version of DS4700 did you use: -70 or -72? If you used -70, how was the FC fabric configured? We have port #2 of both controllers in the protected site and the recovery site used for Enhanced Remote Mirroring, so that leaves only one port per controller for host connectivity. I do use the May 09 version of the IBM DS4000/5000 SRA. I have 500GB of unconfigured free space; however, this space is within a configured RAID5 array. All my VMs are on the same LUN (600GB). The other LUN is an RDM data disk (1.6TB) which is only used by one VM.
It looks like my free space is just below the 25% you mentioned, but well over the 20% other people mentioned.
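The comparison can be checked with the figures from this post: 500GB free, a 600GB LUN and a 1.6TB LUN. This is a rough sketch, assuming the percentage applies to the combined size of both LUNs (the thread does not say how the SRA totals it) and treating 1.6TB as 1600GB.

```python
FREE_GB = 500               # free space reported in the post
LUN_SIZES_GB = [600, 1600]  # VM datastore LUN and the 1.6TB data LUN


def needed_gb(lun_sizes_gb, percent):
    """Total space needed if each LUN requires `percent` of its own size."""
    return sum(size * percent / 100 for size in lun_sizes_gb)


for percent in (20, 25):
    need = needed_gb(LUN_SIZES_GB, percent)
    verdict = "enough" if FREE_GB >= need else "short"
    print(f"{percent}%: need {need:.0f} GB, have {FREE_GB} GB -> {verdict}")
```

Under these assumptions the 20% rule needs 440GB and the 25% rule needs 550GB, which matches the observation that 500GB sits between the two.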
The current SRA algorithm first looks in the array group that the logical drive is in to place the repository. If not enough space is available there, it scans the entire subsystem and places the repository in the first available free capacity it finds. In your case, the repository will be placed in the RAID5 array. (Note: FlashCopies are only taken of the secondary logical drive in a mirror pair, at the "recovery" site, during an SRM test failover.)
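The placement rule described above can be sketched as follows. This only illustrates the two-step search as described in this post; it is not the SRA's code, and the array names, the scan order and the data layout are all hypothetical.

```python
def choose_repository_array(arrays, home_array, needed_gb):
    """Return the name of the array that would hold the repository, or None.

    arrays: list of (name, free_gb) tuples in the assumed scan order.
    home_array: name of the array that contains the logical drive.
    """
    free = dict(arrays)
    # Step 1: prefer the array group the logical drive already lives in.
    if free.get(home_array, 0) >= needed_gb:
        return home_array
    # Step 2: otherwise take the first array with enough free capacity.
    for name, free_gb in arrays:
        if free_gb >= needed_gb:
            return name
    return None


# Hypothetical subsystem: the drive's own array is nearly full.
arrays = [("array_1", 50), ("raid5_array", 500)]
print(choose_repository_array(arrays, "array_1", 120))
```

Returning `None` stands in for the case where no array has enough free capacity; how the real SRA reports that case is not covered in the thread.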
SRM will work with both the DS4700 model 70 and model 72. With the model 70, as you mentioned, when ERM is enabled you lose a host port on each controller to mirroring, leaving you with only one port on each controller for host communication. It's recommended that you design the fabric so that each HBA initiator port can see both controllers in your subsystem. To do this with the model 70 you will have to use a single fibre switch or redundant switches with an ISL. This will provide your ESX server(s) with redundant paths to your storage system (4 paths). The reason for this recommendation is to avoid logical drive failover in the event of a single path failure. The storage system is designed to perform logical drive failover, but it's something you want to avoid if possible. If a logical drive failover does occur, the storage subsystem has to change the drive's ownership, synchronize cache, and re-route I/Os through the alternate controller, and if the logical drive is part of a mirror, all of the back-end mirroring operations have to change controller ownership as well. If the primary logical drive in a mirror pair changes controller ownership, so does the secondary logical drive. The recommended SAN design is an attempt to minimize the need for this.
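To see why this layout avoids a logical drive failover on a single path failure, the paths can be enumerated. This is a small sketch, assuming two HBA initiator ports per ESX host (the post does not state the count) and one host port per controller as on the model 70 with ERM enabled; all port names are made up.

```python
from itertools import product

hba_ports = ["hba0", "hba1"]                 # assumed: two initiator ports per host
controller_ports = ["ctrlA_p1", "ctrlB_p1"]  # one host port left per controller

# With a single fabric (or ISL-linked switches) every initiator sees every target.
all_paths = list(product(hba_ports, controller_ports))


def surviving_owner_paths(paths, failed_hba, owner_port):
    """Paths that still reach the owning controller after one HBA port fails."""
    return [p for p in paths if p[0] != failed_hba and p[1] == owner_port]


print(len(all_paths), "paths:", all_paths)
print(surviving_owner_paths(all_paths, "hba0", "ctrlA_p1"))
```

Because a path to the owning controller survives the loss of either HBA port, the host can keep using the same controller and the logical drive does not have to change ownership.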
I switched to a single fabric: both ESX hosts and port 1 of both controllers are in the same zone on the same VSAN. I also created a new LUN #6, SRM_TEST (60GB), with one VM and its datastore on it. Set up replication to the protected side's storage (same name). Enabled FlashCopy on both sites. >>> Still cannot see the LUNs, protected or recovery. I also opened support cases with IBM and VMware. We had a couple of WebEx sessions, but no resolution so far.
The log file does not indicate any errors. (attached)
Any ideas? Thank you
I took a look at your vmware-dr log and have a few questions:
1. The DiscoverLun request has your controller IPs as 192.168.10.31 and 192.168.100.32. Is this correct?
2. It looks like you have two "hosts" configured on the storage system. Are the LUNs you're trying to detect currently mapped to the "default group"? The IBM SRA does not detect LUNs mapped to the default group. You have to map your LUNs to a host or a hostgroup with members (hostgroups with no members are not supported).
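Both questions can be turned into a quick pre-check before re-running the Array Manager configuration. This is a sketch only: the /24 prefix, the mapping dictionaries and the LUN name are assumptions made for illustration, and nothing here queries the storage system or the SRA.

```python
import ipaddress

DEFAULT_GROUP = "default group"


def same_subnet(addresses, prefix=24):
    """True if all controller addresses fall in one network (assumed /24)."""
    nets = {ipaddress.ip_interface(f"{a}/{prefix}").network for a in addresses}
    return len(nets) == 1


def unsupported_mappings(lun_mappings, host_group_members):
    """Return {lun: reason} for mappings the post says the SRA will not detect."""
    problems = {}
    for lun, target in lun_mappings.items():
        if target == DEFAULT_GROUP:
            problems[lun] = "mapped to the default group"
        elif target in host_group_members and not host_group_members[target]:
            problems[lun] = "mapped to a host group with no members"
    return problems


# Addresses as they appear in the DiscoverLun request quoted above.
print(same_subnet(["192.168.10.31", "192.168.100.32"]))
# Hypothetical mapping data entered by hand from Storage Manager.
print(unsupported_mappings({"example_lun": DEFAULT_GROUP}, {"Hosts": []}))
```

Controllers on different subnets are not necessarily wrong, so a `False` result is only a hint that an address may have been mistyped.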
Dex_1234
Thank you very much for your post. I did make a mistake when entering the IP address (missed one "0"). I do have my hosts (ESX1-1 and ESX1-2 on one side, ESX2-1 and ESX2-2 on the DR site) mapped to the default group, as well as all LUNs. I had a hostgroup "Hosts" in each site, but got rid of it and mapped everything to "default" when SRM did not find the LUNs at first. I will recreate the hostgroups and fix the IP problem on Monday. I hope it'll work. Thanks again for your help.
Dex_1234,
I recreated the hostgroups and confirmed all IP addresses of the DS4700 controllers. Still no good. I may try to uninstall and reinstall everything. Thanks
attached are screenshots (storage manager and SRM)
Just noticed something:
>> It looks like the SRA thinks that my storage is a DS4800. It's a DS4700-70. Could that be a problem? Also, the 4th line from the bottom says "NULL". Is that a problem?
I appreciate any feedback.
Thank you
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Manager/scripts/SAN/IBM/discoverLuns.pl" temp956
:QUITE:main:Logging data for discover luns
:INFO:parseDiscoverLunInput:Parsing the lun discovery input.....
192.168.100.31192.168.100.32"C:/Program Files/VMware/VMware Site Recovery Manager/scripts/SAN/IBM/SMsra.exe" discoverLuns arrayName=DS4800 arrayAddress=192.168.100.31 arrayAddress=192.168.100.32 arrayPassword= arrayId=600a0b80002ab4b00000000049704bef logLevel=trivia outputFile=cliDLun1208[2009-07-06:: 10:06:43]:INFO:MAIN:Entry
ArrayId=600a0b80002ab4b00000000049704bef
ArrayId in Cmd =600a0b80002ab4b00000000049704bef
ArrayId in List=600a0b80002ab4b00000000049704bef
Cluster Ref=85000000600a0b80002ab45a0036090e4a51abc9
Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9
ObjBundle Host Ref=84000000600a0b80002ab45a003007404a3cc1d9
Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9
ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a
Added host Ref=84000000600a0b80002ab45a003007434a3cc21a
ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a
Completely uninstall the SRA and then reinstall it (on both sides). After that, restart SRM (I'd do it for both the protected and recovery sides) and run through the Array Manager configuration again. Post your vmware-dr log after this run. dex
As you noted earlier, the line "LunMapping.mapRef is NULL" is troubling.
192.168.100.31192.168.100.32"C:/Program Files/VMware/VMware Site Recovery Manager/scripts/SAN/IBM/SMsra.exe" discoverLuns arrayName=DS4800 arrayAddress=192.168.100.31 arrayAddress=192.168.100.32 arrayPassword= arrayId=600a0b80002ab4b00000000049704bef logLevel=trivia outputFile=cliDLun3396[2009-07-06:: 15:07:43]:INFO:MAIN:Entry
ArrayId=600a0b80002ab4b00000000049704bef
ArrayId in Cmd =600a0b80002ab4b00000000049704bef
ArrayId in List=600a0b80002ab4b00000000049704bef
Cluster Ref=85000000600a0b80002ab45a0036090e4a51abc9
Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9
ObjBundle Host Ref=84000000600a0b80002ab45a003007404a3cc1d9
Added host Ref=84000000600a0b80002ab45a003007404a3cc1d9
ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a
Added host Ref=84000000600a0b80002ab45a003007434a3cc21a
ObjBundle Host Ref=84000000600a0b80002ab45a003007434a3cc21a
peerArrayId=600a0b80002ab45c0000000049704cb3
:INFO:main:Done with discoverLuns
The SRA connects to the controllers and can pull the host/hostgroup info just fine. It detects that a mirrored LUN exists, but we get the "LunMapping.mapRef is NULL" error. What you should see after the "Got a Mirrored LUN" line is something similar to the following (the example is from my setup):
Got a Mirrored LUN
(2009-04-27 10:21:02) ::VERBOSE::SMsra::StorageArrayCommandExecutor::getClusterRef::Got a Host's clusterRef
InitiatorGroupID=85000000600a0b8000293ade003659ae49e3246e
peerArrayId=600a0b80001133e800000000481eb21a
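A quick way to compare a log against the healthy example above is to scan it for the markers mentioned in this thread. This is a hedged sketch: it only matches strings quoted in these posts, the helper name is made up, and the failing sample is illustrative because the exact text around the error line is not shown in the thread.

```python
import re


def check_discover_log(text):
    """Classify a discoverLuns log excerpt using markers quoted in this thread."""
    has_mirror = "Got a Mirrored LUN" in text
    # The thread spells the error both as mapRef and mapref, so ignore case.
    null_map = re.search(r"LunMapping\.mapRef is NULL", text, re.IGNORECASE)
    group_ids = re.findall(r"InitiatorGroupID=([0-9a-fA-F]+)", text)
    if null_map:
        return "mapping error: LunMapping.mapRef is NULL"
    if has_mirror and group_ids:
        return "ok: mirrored LUN with initiator group " + group_ids[0]
    return "inconclusive"


# Lines copied from the healthy example in this post.
good_sample = """Got a Mirrored LUN
InitiatorGroupID=85000000600a0b8000293ade003659ae49e3246e
peerArrayId=600a0b80001133e800000000481eb21a
"""
# Illustrative only: the real layout of the failing log is an assumption.
bad_sample = """Got a Mirrored LUN
LunMapping.mapref is NULL
"""

print(check_discover_log(good_sample))
print(check_discover_log(bad_sample))
```

To use it on a real file, read the vmware-dr log into a string and pass it to `check_discover_log`.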
I know you already have a support ticket in, and this looks to be an issue between the SRA and the storage system, so it'll have to go down the IBM side. The example from my log above was based on SRA build 01.00.35.12. One thing you might want to try while you work through the support process is to roll back to that previous SRA build (01.00.35.12) and make another attempt, to see if we get the same "LunMapping.mapRef is NULL" error. This is the build before the current SRA build that's available for download (01.01.35.01). In parallel I can continue to help you out and look into things. Would it be possible to get a copy of your storage subsystem profiles for both sites?
Dex_1234,
Attached are some screenshots of the storage. Which log files should I get from the DS4700? Thank you