Greetings to all.
Has anyone run into an error with SRM on HP EVA storage?
If I start "test recovery plan", everything completes properly. When
I start "run recovery plan", I see an error at step 5:
5. Recover Normal Priority Virtual Machines | Error: Failed to recover datastore: | 00:00:00 |
5.1. Recover VM "2000" | Error: Failed to recover datastore: | 00:00:00 |
SRM 1.0.1
ESX 3.5 U4
VC 2.5 U4
HP SRA 1.0.1
Site1 - EVA4000 (HSV200) firmware 6110
Site2 - EVA4100 (HSV200-B) firmware 6110
Management software - CV EVA 8.00.02
PS: On the same system, with NetApp storage connected, everything works normally.
This might have changed, but I believe that with the EVA you have to do two rescans before the snapshot and resignatures happen.
In the main XML or .ini file for SRM there is an option where you can tell SRM to do two rescans of the HBAs...
Might be in the release notes or PDFs somewhere...
It is EVA specific, hence it works fine with NetApp but not with the EVA.
Regards
Mike Laverick
RTFM Education
Author of the SRM Book: http://www.lulu.com/content/4343147
That's probably the problem: http://www.yellow-bricks.com/2009/02/19/srm-and-rescanning-your-storage-twice/
I've experienced this every single time I work with an EVA, that's why I wrote it down.
Duncan
VMware Communities User Moderator
-
Blogging: http://www.yellow-bricks.com
If you find this information useful, please award points for "correct" or "helpful".
I set this option before running the recovery plan.
In the SRM log I found the line: dr.san.fault.RecoveredDatastoreNotFound
Later I will try changing the QLogic HBA to an Emulex one.
Huh? I don't think this has anything to do with the type of HBA you are using.
Did you already present the recovery site's LUNs to the recovery site hosts? This is a prerequisite for a full failover!
Duncan
Yes, I have presented the recovery LUN to the recovery ESX host.
Can I send my log file to you?
Hi,
It looks like a configuration issue on the array side. The ESX hosts at the recovery site report LUNs on target 50:0A:09:82:86:27:B9:99, which is not reported by the SRA.
Added LUN '50:01:10:A0:00:18:3E:18;0;50:0A:09:82:86:27:B9:99' with keys 'host-321;vmhba1:4:0' and 'host-321;020000000060a980004334623668344d52415873424c554e202020'
Added LUN '50:01:10:A0:00:18:3E:18;1;50:0A:09:82:86:27:B9:99' with keys 'host-321;vmhba1:4:1' and 'host-321;020001000060a98000433462385a4a4a43726179644c554e202020'
Could you check your array configuration against the SRA installation guide?
-Masha
What type of access do I need to set for the remote site: no access or read-only?
Sorry, I mean the access mode for the LUN on the second site, which is set when creating the replication in HP Command View.
I think it is the dual rescan issue, and I agree with Duncan's assessment to set the hostRescanRepeatCnt value to 2. However, you need to restart the SRM server on the recovery side for this change to take effect, as with any change to vmware-dr.xml, since the file is only read when the SRM service starts. To restart the service, locate "VMware Site Recovery Manager" in the Windows services GUI, right-click it and choose Restart.
The logs show the value of the rescan repeat count at startup time, and both logs have the same message:
Setting number of repeated host rescans during recovery to 1
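A quick way to confirm whether the new value was picked up after the service restart is to search the SRM log for that startup message. The following is a minimal Python sketch, assuming only the message text quoted above; the function names are hypothetical, and reading the actual log file is left to the caller because the log location is not given in this thread.

```python
import re

# Message text copied from the log line quoted in this thread.
RESCAN_RE = re.compile(
    r"Setting number of repeated host rescans during recovery to (\d+)"
)


def rescan_counts(log_lines):
    """Return every rescan repeat count reported in the log, in order."""
    counts = []
    for line in log_lines:
        match = RESCAN_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def last_rescan_count(log_lines):
    """Return the most recently logged count, or None if never logged."""
    counts = rescan_counts(log_lines)
    return counts[-1] if counts else None


if __name__ == "__main__":
    # Sample input: the single message quoted in this thread.
    sample = ["Setting number of repeated host rescans during recovery to 1"]
    count = last_rescan_count(sample)
    if count is None:
        print("No rescan count message found in the log")
    elif count < 2:
        print(f"Rescan count is {count}; the change has not taken effect")
    else:
        print(f"Rescan count is {count}")
```

If the last reported value is still 1 after editing vmware-dr.xml, the service has most likely not been restarted since the change.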
The reason it is most likely the rescan issue is that the HP adapter claims it successfully failed over the target device and assigned it LUN number 1:
<Number initiatorGroupId="\Hosts\hp08">1</Number>
However, on rescan, the ESX host only sees LUN 0 (presumably over two paths, which is why it is listed twice). It should also see a LUN 1, i.e. vmhba1:0:1:
Added LUN '50:01:10:A0:00:18:3E:18;0;50:00:1F:E1:50:11:26:48' with keys 'host-321;vmhba1:0:0' and 'host-321;020c00000050001fe150112640485356323030'
Added LUN '50:01:10:A0:00:18:3E:18;0;50:00:1F:E1:50:0A:FF:48' with keys 'host-321;vmhba1:2:0' and 'host-321;020c00000050001fe1500aff40485356323030'
The lines:
Added LUN '50:01:10:A0:00:18:3E:18;0;50:0A:09:82:86:27:B9:99' with keys 'host-321;vmhba1:4:0' and 'host-321;020000000060a980004334623668344d52415873424c554e202020'
Added LUN '50:01:10:A0:00:18:3E:18;1;50:0A:09:82:86:27:B9:99' with keys 'host-321;vmhba1:4:1' and 'host-321;020001000060a98000433462385a4a4a43726179644c554e202020'
are most likely for LUNs on the NetApp array which you indicated is also attached to the ESX host. The third value in the semicolon-separated string immediately after "Added LUN" is the WWPN of the FC port on the array presenting the LUN, and the prefix 50:0A:09 is NetApp. These LUNs are ignored as part of this failover because SRM is looking for EVA LUNs, i.e. any LUN on a WWPN returned by discoverArrays.
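To sort out which "Added LUN" lines belong to which array, the semicolon-separated string can be split and the target WWPN matched against a known prefix. This is a minimal Python sketch under stated assumptions: the line format is taken only from the log lines quoted in this thread, the second field is assumed to be the LUN number, the first field is not interpreted, and the prefix table is a hypothetical lookup built from the two prefixes that appear here (50:0A:09 for NetApp, 50:00:1F:E1 for the EVA ports).

```python
import re
from collections import defaultdict

# Matches: Added LUN '<first field>;<LUN number>;<array port WWPN>'
ADDED_LUN_RE = re.compile(r"Added LUN '([^;']+);(\d+);([^;']+)'")

# Illustrative table only, built from the WWPN prefixes seen in this thread.
VENDOR_PREFIXES = {
    "50:0A:09": "NetApp",
    "50:00:1F:E1": "HP EVA",
}


def parse_added_lun(line):
    """Return (first_field, lun_number, target_wwpn) or None if no match."""
    match = ADDED_LUN_RE.search(line)
    if match is None:
        return None
    first_field, lun, target = match.groups()
    return first_field, int(lun), target


def vendor_of(target_wwpn):
    """Classify an array port WWPN by prefix; 'unknown' if not in the table."""
    wwpn = target_wwpn.upper()
    for prefix, vendor in VENDOR_PREFIXES.items():
        if wwpn.startswith(prefix):
            return vendor
    return "unknown"


def luns_by_vendor(log_lines):
    """Map each vendor to the sorted LUN numbers the host reported for it."""
    seen = defaultdict(set)
    for line in log_lines:
        parsed = parse_added_lun(line)
        if parsed is not None:
            _, lun, target = parsed
            seen[vendor_of(target)].add(lun)
    return {vendor: sorted(luns) for vendor, luns in seen.items()}


if __name__ == "__main__":
    # LUN identifiers copied from the log lines quoted in this thread.
    sample = [
        "Added LUN '50:01:10:A0:00:18:3E:18;0;50:00:1F:E1:50:11:26:48' with keys ...",
        "Added LUN '50:01:10:A0:00:18:3E:18;0;50:00:1F:E1:50:0A:FF:48' with keys ...",
        "Added LUN '50:01:10:A0:00:18:3E:18;0;50:0A:09:82:86:27:B9:99' with keys ...",
        "Added LUN '50:01:10:A0:00:18:3E:18;1;50:0A:09:82:86:27:B9:99' with keys ...",
    ]
    for vendor, luns in luns_by_vendor(sample).items():
        print(vendor, luns)
```

For the lines quoted in this thread, the EVA ports show only LUN 0, which matches the observation that the failed-over LUN 1 is not yet visible to the host.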
Duncan, I suggest updating your very useful blog post to remind users to restart the SRM service after changing vmware-dr.xml, since many might not be aware of this requirement.