VMware Cloud Community
wallabyfan2
Contributor

vCenter SRM: "failed to create LUN snapshot" error

Hello Community -

Just getting to grips with a new SRM installation, but I am getting a "failed to create LUN snapshot" error.

"recovery plan encountered errors"

"non-fatal error information reported during execution of array integration script: Failed to create Lun snapshots."

Progress to date:

0. Configured EVA 6100 and 4400 for CA replication groups, nailed down vDisks and members.

1. Setup two vCenters, (2.5 U4) with hosts at 3.5 U4.

2. Setup two SRM databases: Protection Site, Recovery Site

3. In production site, paired sites, configured arrays, setup inventory mappings and then Protection Groups.

4. At recovery site, setup recovery plan.

All seems OK, but when running through a Test, the above error is produced.

Questions:

(A) Why does SRM take a snapshot, and what does it snapshot? This makes the SAN team very nervous.

(B) Is this an error worth taking seriously, given that it says non-fatal?

I have checked the SAN and all replication looks good.

8 Replies
Itzikr
Enthusiast

Hi,

I suggest reading the SRM installation guide and the evaluator guide; they are a great resource for understanding the product.

That being said, SRM uses clones/snaps/BCVs of the replicated LUN at the recovery site to bring up the VMs on an isolated network and an isolated LUN, so you can "test" your DR recovery plans without disturbing replication between the production and recovery LUNs.

You get this error because you didn't assign the snapshot LUNs to the recovery LUNs at the remote site.

Itzik Reich

Solutions Architect

VCP,VTSP,MCTS,MCITP,MCSE,CCA,CCNA

EMC²

where Information Lives

If you find this information useful, please award points for "correct" or "helpful".

depping
Leadership

I think the storage team needs to take a look at this.

SRM gives the command "testfailover". Depending on the array you are using the array creates a snapshot or utilizes a clone of a LUN. In the case of the EVA it will try to create a snapshot of your original LUN. This LUN will be presented to the ESX hosts on the recovery site and it will be "resignatured" and the hosts will be rescanned. When the vmfs volumes are recognized the VMs on the snapshots will be registered and booted up.
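The sequence described above can be sketched as a small Python model. This is only an illustration of the order of operations, not the SRA's actual code: the `RecoveryArray` class, the `run_test_failover` function, the `-snap` naming, and the host name `DRESX01` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryArray:
    """Toy stand-in for the recovery-site array (hypothetical, not an HP API)."""
    vdisks: set
    snapshot_capable: bool = True  # False models any reason the array refuses
    snapshots: list = field(default_factory=list)

    def create_snapshot(self, vdisk: str) -> str:
        # Either condition stops the test before anything reaches the hosts.
        if not self.snapshot_capable or vdisk not in self.vdisks:
            raise RuntimeError(f"Failed to create lun snapshots: {vdisk}")
        name = f"{vdisk}-snap"
        self.snapshots.append(name)
        return name


def run_test_failover(array, replica_vdisks, hosts):
    """Return the ordered steps a test failover performs in this model."""
    steps = []
    for vdisk in replica_vdisks:
        snap = array.create_snapshot(vdisk)  # replication itself is untouched
        steps.append(f"snapshot {vdisk} -> {snap}")
        for host in hosts:
            steps.append(f"present {snap} to {host}")
    steps.append("rescan hosts and resignature VMFS volumes")
    steps.append("register and power on VMs from the snapshots")
    return steps


if __name__ == "__main__":
    array = RecoveryArray(vdisks={"Datastore001-Mirror"})
    for step in run_test_failover(array, ["Datastore001-Mirror"], ["DRESX01"]):
        print(step)
```

In this model the snapshot is the first step, so if it fails nothing is presented to the hosts and nothing can be rescanned or registered.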

So far it looks like replication is okay, otherwise it wouldn't have passed the setup of the SRA, but the snapshotting isn't successful. In other words, SRM doesn't get the response from the SRA that it's expecting; it needs confirmation that the snapshot succeeded, otherwise it can't continue.

Duncan

VMware Communities User Moderator | VCP | VCDX


If you find this information useful, please award points for "correct" or "helpful".

wallabyfan2
Contributor

Can you please explain what you mean by assigning snapshot LUNs to recovery LUNs? This is my vDisk (or LUN) setup:

Prod (6 hosts) (London, UK)

Datastore001 - 500gb

Datastore002 - 500gb

Datastore003 - 500gb

DR (6 hosts) (Milton Keynes, UK)

Datastore001-Mirror - 500gb

Datastore002-Mirror - 500gb

Datastore003-Mirror - 500gb

Datastore000-PlaceholderDS - 50gb *VM config files

So there are no snapshot LUNs anywhere, mate. Can you explain? The recovery LUNs are the LUNs replicated with HP CA from Prod to DR, right? But where do the snapshot LUNs live?

jbloo2
Enthusiast

The error actually is fatal, since the EVA adapter is not able to snapshot the replication target LUN in order to present to the recovery ESX hosts for the test.

The adapter is attempting to snapshot the following Vdisk at the target (look for "Input for testFailover" in the recovery SRM logs):

<ReplicaLunKey>\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE</ReplicaLunKey>

and cannot seem to find this Vdisk; you should make sure there is a Vdisk named "Prod SRM Test" in the "ESX Datastores" folder on array DREVA1; if not, this is your issue.
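As a rough aid for that check, the key can be split into its folder and Vdisk name and compared against a list of Vdisk paths taken from the target array. This is a minimal sketch: the function names are hypothetical, the list of target paths is assumed to be gathered by hand from Command View, and the assumed layout (root, folders, Vdisk name, trailing state such as ACTIVE) is inferred only from the single key shown above.

```python
def split_replica_key(key: str) -> dict:
    """Split an EVA-style path such as
    \\Virtual Disks\\ESX Datastores\\Prod SRM Test\\ACTIVE into its parts."""
    parts = [p for p in key.split("\\") if p]
    if len(parts) < 3:
        raise ValueError(f"unexpected key layout: {key!r}")
    return {
        "root": parts[0],         # e.g. "Virtual Disks"
        "folders": parts[1:-2],   # e.g. ["ESX Datastores"]
        "vdisk": parts[-2],       # e.g. "Prod SRM Test"
        "state": parts[-1],       # e.g. "ACTIVE"
    }


def vdisk_present(key: str, target_paths: list) -> bool:
    """True if a Vdisk with the same folder and name exists in target_paths."""
    wanted = split_replica_key(key)
    for path in target_paths:
        have = split_replica_key(path)
        if (have["folders"], have["vdisk"]) == (wanted["folders"], wanted["vdisk"]):
            return True
    return False


if __name__ == "__main__":
    key = r"\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE"
    # Hypothetical inventory where the target Vdisk was renamed.
    renamed = [r"\Virtual Disks\ESX Datastores\Prod SRM Test DR\ACTIVE"]
    print(split_replica_key(key)["vdisk"])  # Prod SRM Test
    print(vdisk_present(key, renamed))      # False
```

A result of False for the key SRM passes to testFailover would point at a renamed or moved Vdisk on the target.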

SRM knows to use that Vdisk because at the protection side it called the EVA adapter's 'discoverLuns' logic to return the list of replicated devices, among which (from the protection SRM logs) was:

<Lun id="\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE">
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX03">6</Number>
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX01">4</Number>
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX02">4</Number>
  <Peer>
    <ArrayKey>DREVA1</ArrayKey>
    <ReplicaLunKey>\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE</ReplicaLunKey>
  </Peer>
</Lun>

The value of the "id" attribute of the <Lun> element is the primary Vdisk, and the value of <ReplicaLunKey> is the name of the Vdisk on the target array (which SRM passed to the testFailover script).
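To pull those two values out of a log excerpt without reading the XML by eye, something like the following Python sketch may help. It assumes the <Lun> fragment has been copied out of the log as a standalone, namespace-free element, exactly as quoted in this post; `summarize_lun` is a hypothetical helper name, not part of SRM or the EVA adapter.

```python
import xml.etree.ElementTree as ET

# The <Lun> fragment quoted in this post, pasted as a raw string so the
# backslashes in the EVA paths are kept literally.
SAMPLE = r"""<Lun id="\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE">
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX03">6</Number>
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX01">4</Number>
  <Number initiatorGroupId="\Hosts\ESX Hosts\ProdESX02">4</Number>
  <Peer>
    <ArrayKey>DREVA1</ArrayKey>
    <ReplicaLunKey>\Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE</ReplicaLunKey>
  </Peer>
</Lun>"""


def summarize_lun(xml_text: str) -> dict:
    """Extract the primary Vdisk, target array, and replica Vdisk from one <Lun>."""
    lun = ET.fromstring(xml_text)
    peer = lun.find("Peer")
    if peer is None:
        raise ValueError("no <Peer> element: this LUN has no replica listed")
    return {
        "primary": lun.get("id"),
        "target_array": peer.findtext("ArrayKey"),
        "replica": peer.findtext("ReplicaLunKey"),
        # LUN number per initiator group (host folder) on the protected side
        "lun_numbers": {
            n.get("initiatorGroupId"): int(n.text) for n in lun.findall("Number")
        },
    }


if __name__ == "__main__":
    info = summarize_lun(SAMPLE)
    print(info["target_array"])  # DREVA1
    print(info["replica"])       # \Virtual Disks\ESX Datastores\Prod SRM Test\ACTIVE
```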

In my EVA environment I've found somewhat strange behavior: if I rename the target Vdisk through Command View, the GUI reflects the change, but when I run discoverLuns again it still returns the old name for the target in <ReplicaLunKey>. Did you by any chance rename the Vdisk at the target or put it in a different folder?

Also strange about your environment is that 'testFailover' is returning pretty much nothing. Even if the adapter can't find the virtual disk it is supposed to snapshot, it will still list all of the Initiator Groups (in Command View these are the Host folders). If you don't have any host folders configured, then even if the EVA adapter can snapshot the target Vdisk it won't be able to present it to any hosts, so you should make sure that is configured properly as well.

Another issue might arise when your SRM server runs on a localized (e.g. non-English) version of Windows, since EVA disk paths use the backslash character, which unfortunately has other meanings or representations on non-English OSes.

It would also help if you attached the hpsrmeva.log file from the "logs" directory of the EVA adapter (usually "c:\program files\vmware\vmware site recovery manager\scripts\SAN\HP StorageWorks EVA Virtualization Adapter"); the file is a mountain of XML but usually has some useful information about what the adapter thinks it sees on the array.

depping
Leadership

Snapshot LUNs do not live on the array permanently; they get created when a test failover occurs. In other words, SRM tells the SRA that a test failover needs to take place, and the SRA tells Command View EVA that it needs a snapshot of a given LUN or set of LUNs, depending on which protection group you fail over.

Duncan

wallabyfan2
Contributor

Attach Disks for Protection Group "Protection Group 001" Error: Non-fatal error information reported during execution of array integration script: Failed to create lun snapshots. 00:00:25

Wow... I am so thankful for everyone's comments here. But I am wondering about something, guys. Just to confirm:

Do I need a Business Copy (replication or snapshot licensing) license, as well as the CA license, present on both the PROD EVA array and the DR EVA array? I am just thinking out loud here (and writing, but anyway!), but the snapshotting process has to happen in DR, right, before the VMs' vDisks are presented to the DR hosts?

Production Licensing:

Licensed Capacity Summary for Storage System:

Feature           | Licensed Capacity | Used Capacity | Status
CV General        | Unlimited GB      | 25331.97 GB   | Valid
Business Copy     | Unlimited GB      | 0.00 GB       | Valid
Continuous Access | Unlimited GB      | 6923.00 GB    | Valid

License Key Expiration Status for Storage System:

Status  | Feature           | Type       | Capacity  | Expiration  | Description
Expired | All               | Instant-On | Unlimited | 09-May-2008 | HP StorageWorks Command View EVA Instant-On
Valid   | Business Copy     | Basic      | Unlimited | Permanent   | HP Bus Copy EVA6K Ser Unlim LTU
Valid   | CV General        | Basic      | Unlimited | Permanent   | HP CV EVA 6k Series Unlimited Lic
Valid   | Continuous Access | Basic      | Unlimited | Permanent   | HP Cont Access EVA6K Ser Unlimited LTU

DR EVA Licensing:

Licensed Capacity Summary for Storage System:

Feature           | Licensed Capacity | Used Capacity | Status
CV General        | Unlimited GB      | 15995.02 GB   | Valid
Business Copy     | 0 GB              | 0.00 GB       | Not licensed
Continuous Access | Unlimited GB      | 6923.00 GB    | Valid

License Key Expiration Status for Storage System:

Status  | Feature           | Type       | Capacity  | Expiration  | Description
Expired | All               | Instant-On | Unlimited | 09-May-2008 | HP StorageWorks Command View EVA Instant-On
Valid   | Continuous Access | Basic      | Unlimited | Permanent   | HP Cont Access EVA4K Ser Unlimited LTU
Valid   | CV General        | Basic      | Unlimited | Permanent   | HP CV EVA 4k Series Unlimited Lic

jbloo2
Enthusiast

Yes, you would need a Business Copy license at the target because, as you say, the bulk of the operations (snapshotting, failover) will occur there. Lack of this license would explain why you are not getting any information from the target.
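Applied to the capacity summaries posted above, the check comes down to the status of the Business Copy row on each array. Here is a small Python sketch of that comparison; the rows are transcribed from the post, while the function name and the tuple layout are hypothetical and have nothing to do with how Command View exposes licensing.

```python
# Rows are (feature, licensed capacity, used capacity, status), copied from
# the "Licensed Capacity Summary" of each array as posted in this thread.
PROD_EVA = [
    ("CV General", "Unlimited GB", "25331.97 GB", "Valid"),
    ("Business Copy", "Unlimited GB", "0.00 GB", "Valid"),
    ("Continuous Access", "Unlimited GB", "6923.00 GB", "Valid"),
]

DR_EVA = [
    ("CV General", "Unlimited GB", "15995.02 GB", "Valid"),
    ("Business Copy", "0 GB", "0.00 GB", "Not licensed"),
    ("Continuous Access", "Unlimited GB", "6923.00 GB", "Valid"),
]


def feature_is_valid(rows, feature: str) -> bool:
    """True if the named feature appears in the summary with status 'Valid'."""
    for name, _licensed, _used, status in rows:
        if name == feature:
            return status == "Valid"
    return False  # a feature missing from the summary counts as unlicensed


if __name__ == "__main__":
    for label, rows in (("PROD", PROD_EVA), ("DR", DR_EVA)):
        print(label, "Business Copy valid:", feature_is_valid(rows, "Business Copy"))
    # PROD Business Copy valid: True
    # DR Business Copy valid: False
```

Only the DR array fails the check, and that is the array where the test snapshot has to be created.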

depping
Leadership

You might want to read the excellent document that HP provides with the SRA; it contains this kind of info. I usually give it to the storage admins so they know what to expect and what I expect of them.

Duncan
