VMware Cloud Community
ruddy001
Contributor

NetApp and SRM issue

Hi

We have set up a primary and a remote site with SRM and 2 NetApp 6080s, using SnapMirror to replicate 2 volumes.

We're using SRM ver 1.0.0

ESX 3.5u1

Everything seems to work fine, but during the Array Manager wizard, when we get to the last screen and rescan for arrays, it does not find any replicated arrays.

Our NetApp admin has verified that the 2 LUNs are being replicated.

We have a running VM on each replicated LUN.

A look at the log shows that it sees 1 of the 2 replicated arrays, but it does not show up.

Here's what I think is the relevant section of the log. Any help would be appreciated.

The full log file is attached.

Found 1 replicated LUN pairs
Found replicated lun:
(dr.san.Lun) {
   dynamicType = <unset>,
   info = (dr.san.Lun.Info) {
      dynamicType = <unset>,
      arrayId = "BRI-NETAPP6080-1",
      id = "/vol/bri_vm_vol200/lun01",
      wwn = <unset>,
      number = (dr.san.Lun.Number) [
         (dr.san.Lun.Number) {
            dynamicType = <unset>,
            value = 200,
            initiatorGroupId = "bri-srm01.corp.int",
            targetId = <unset>,
         }
      ],
      consistencyGroupId = <unset>,
   },
   peerInfo = (dr.san.Lun.PeerInfo) {
      dynamicType = <unset>,
      arrayKey = "NB-NETAPP6080-1",
      lunKey = "/vol/nbdr_vm_vol200/lun01",
   },
}
No lun groups created since there are no replicated datastores

jbloo2
Enthusiast

Do you have your primary 6080 configured in an HA/clustered environment, i.e. two physical NetApp controllers providing access to the same storage? If so, can you try to reconfigure the Array Manager (or configure a second one) using the IP address of the other controller and see if that solves your problem? i.e. you entered IP address 10.10.34.20 which returned information about array "BRI-NETAPP6080-1", perhaps there is another IP address, like 10.10.32.21 that manages "BRI-NETAPP6080-2" or something (at the primary side, not the recovery array). If this works I'll try to provide more information as to the issue.

Michelle_Laveri
Virtuoso

Do the volumes being replicated have VMFS on them - populated with VMs?

Regards

Mike

Regards
Michelle Laverick
@m_laverick
http://www.michellelaverick.com

ruddy001
Contributor

We do have a 2nd active head. When we enter the ip address for it we get this error:

"LUNs with different peer array keys recieved from san integration scripts"

The LUNs have VMs on them and are formatted as VMFS.

jbloo2
Enthusiast

Probably SRM does not accept an array having multiple replication targets for the various LUNs; i.e. if the primary replicates LUN1 to ArrayB but LUN2 to ArrayC you might see an error like this. The two target arrays in question are probably the V-series pair at your destination.

I think the only way to work around this is to make sure all of the SnapMirrored volumes on this array (the second controller at the primary) are replicating to a single controller at the destination...hopefully that will not be a lot of work for you to change.

ruddy001
Contributor

This won't be a big deal to change, as we are not in production yet. Will try in the morning and let you know.

Thx

ruddy001
Contributor

Update:

We've now created a new pristine environment using all the correct software versions.

VC 2.5

ESX 3.5 u1 (5 required patches)

NetApp SRA 1.0.0

SRM 1.0

We worked with NetApp on Friday and confirmed that all software is on the supported matrix and working correctly.

My only worry is that NetApp firmware version 73p5 is brand new, though it is supposedly confirmed to work.

After all this we still get the same issue. Working with VMware this afternoon. Very frustrated.

ruddy001
Contributor

Another Update.

We've been made aware that SRA 1.0.0 and 1.0.1 have the same issue when it comes to NetApp HA, which is what we have on the 6080.

Basically the SRA does not correctly identify the WWN.

"wwn = <unset>"

VMware and NetApp are aware of this issue and hopefully a new SRA will be available shortly.

What sucks is that VMware lists this in the matrix as a known issue with U1 and says that rolling back to 1.0 will fix it, when they know that it's a crapshoot.

Love looking like a tool in front of my client.

admin
Immortal

Hi,

"LUNs with different peer array keys recieved from san integration scripts" message indicates that there are two replicated devices on primary array with different target arrays. This configuration is not supported by SRM 1.0.

-Masha
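
For anyone wanting to check this condition before running the wizard, here is a minimal sketch of the rule described above, not SRM's actual code. It assumes you have collected the replicated LUNs by hand into a list of dicts; the field names mirror the log (`arrayId`, peer `arrayKey`) but the `peerArrayKey` key and the function name are my own. Only the first LUN entry comes from the log in this thread; the `BRI-NETAPP6080-2` and `NB-NETAPP6080-2` entries are hypothetical.

```python
from collections import defaultdict


def arrays_with_mixed_targets(luns):
    """Return {source array: sorted peer array keys} for every source
    array whose replicated LUNs point at more than one peer array."""
    peers = defaultdict(set)
    for lun in luns:
        peers[lun["arrayId"]].add(lun["peerArrayKey"])
    return {src: sorted(keys) for src, keys in peers.items() if len(keys) > 1}


# First entry is taken from the log in this thread; the other two are hypothetical.
luns = [
    {"arrayId": "BRI-NETAPP6080-1", "id": "/vol/bri_vm_vol200/lun01",
     "peerArrayKey": "NB-NETAPP6080-1"},
    {"arrayId": "BRI-NETAPP6080-2", "id": "/vol/example_a/lun01",
     "peerArrayKey": "NB-NETAPP6080-1"},
    {"arrayId": "BRI-NETAPP6080-2", "id": "/vol/example_b/lun01",
     "peerArrayKey": "NB-NETAPP6080-2"},
]

mixed = arrays_with_mixed_targets(luns)
print(mixed)
```

Any source array that shows up in the result has LUNs replicating to more than one target, which is the layout the message complains about.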

admin
Immortal

Hi,

NetApp HA issue is specific to SRM 1.0 U1. This issue doesn't exist in SRM 1.0. You should be able to successfully configure SRM 1.0 as long as all replicated devices have same target array. If you still can't get SRM 1.0 configured and running, post SRM logs so that we could investigate what is wrong.

-Masha

ruddy001
Contributor

Marilab,

We changed our environment so that only 1 LUN was replicated but still couldn't get the SRA to work.

The log files are attached at the beginning of this thread. Although VMware states that the issue with SRM U1 and NetApp HA can be fixed by rolling back, we have found that this is not the case. The issue the SRA has with SRM U1 is also present in SRM 1.0.

We have worked with VMware support and are at the required patches, build numbers, etc.

Here is the issue from the logs:

Found replicated lun:
(dr.san.Lun) {
   dynamicType = <unset>,
   info = (dr.san.Lun.Info) {
      dynamicType = <unset>,
      arrayId = "BRI-NETAPP6080-1",
      id = "/vol/bri_vm_vol200/lun01",
      wwn = <unset>,
      number = (dr.san.Lun.Number) [
         (dr.san.Lun.Number) {
            dynamicType = <unset>,
            value = 200,
            initiatorGroupId = "bri-srm01.corp.int",
            targetId = <unset>,
         }
      ],
      consistencyGroupId = <unset>,
   },
   peerInfo = (dr.san.Lun.PeerInfo) {
      dynamicType = <unset>,
      arrayKey = "NB-NETAPP6080-1",
      lunKey = "/vol/nbdr_vm_vol200/lun01",
   },
}
No lun groups created since there are no replicated datastores

Notice that although the SRA sees a replicated LUN, it does not see a replicated datastore. We are under the impression that this is because the SRA has trouble reading the wwn, which causes consistencyGroupId, targetId and dynamicType to be unset. Hence no replicated datastore.

Our issue is exactly the same as the one listed in KB1008068, except we are at the supposedly working levels of ESX 3.5 U1 + 5 patches, VC 2.5, SRM 1.0, etc.

If you have a working SRM with NetApp HA and could show me how that section of your SRA log reads, I would appreciate it.
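
To make that comparison easier, here is a rough sketch that lists which fields in a pasted log excerpt are `<unset>`. It only assumes the `name = <unset>,` line shape seen in the excerpt above; the function name and the choice to skip `dynamicType` (which is unset on every object in the excerpt) are my own. It says nothing about whether an unset field is actually the cause of a failure, it just lists them so two logs can be compared.

```python
import re

# Matches lines such as:  wwn = <unset>,
UNSET_FIELD = re.compile(r"^\s*(\w+)\s*=\s*<unset>,?\s*$")


def unset_fields(log_text, ignore=("dynamicType",)):
    """Return the names of fields reported as <unset>, in order of appearance."""
    found = []
    for line in log_text.splitlines():
        match = UNSET_FIELD.match(line)
        if match and match.group(1) not in ignore:
            found.append(match.group(1))
    return found


# Lines copied from the log excerpt in this thread.
excerpt = """
info = (dr.san.Lun.Info) {
dynamicType = <unset>,
arrayId = "BRI-NETAPP6080-1",
id = "/vol/bri_vm_vol200/lun01",
wwn = <unset>,
value = 200,
initiatorGroupId = "bri-srm01.corp.int",
targetId = <unset>,
consistencyGroupId = <unset>,
"""

print(unset_fields(excerpt))
```

Running the same function over the matching section from a working setup would show at a glance which fields differ.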

admin
Immortal

Hi,

In SRM log attached in the beginning of the thread I see datastore PROD-SRM01 with one extent vmhba1:0:200 corresponding to device 200 with

initiator ID: 50:06:0b:00:00:6b:1b:54

number: 200

target ID: 50:0a:09:88:87:69:82:a1

Target ID matches one of storage ports returned by the SRA:

Could you double-check the volume access definitions on the array? Which initiator groups is the volume actually presented to?

-Masha

ruddy001
Contributor

Masha-

Thanks for your help, we have resolved the issue.

We had 3 ESX hosts:

bri-srm01, bri-srm02, and bri-srm03.

While troubleshooting, we removed 01 and 03 from VC but left them in the igroup.

What's strange is that SRM stops scanning on the first error. Although srm01 was in the igroup and could see the replicated LUN, it wasn't in VC, so the scan failed.

We removed 01 and 03 from the igroup and success!

Strange. Thanks for all your help
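
The check that would have caught this can be sketched in a few lines. This is only an illustration: both host lists are typed in by hand here (no NetApp or VC API is called), and the function name is made up. The idea is simply to flag any host that is still in the igroup but no longer managed by VC.

```python
def stale_igroup_members(igroup_hosts, vc_hosts):
    """Return igroup members that are not present in VC (case-insensitive)."""
    managed = {host.lower() for host in vc_hosts}
    return [host for host in igroup_hosts if host.lower() not in managed]


# Host names from this thread: 01 and 03 were removed from VC
# during troubleshooting but left in the igroup.
igroup_hosts = ["bri-srm01", "bri-srm02", "bri-srm03"]
vc_hosts = ["bri-srm02"]

stale = stale_igroup_members(igroup_hosts, vc_hosts)
print(stale)  # ['bri-srm01', 'bri-srm03']
```

An empty result means every host in the igroup is also in VC, which is the state that finally worked here.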

uushaggy
Enthusiast

I was going through the same experience right about this time too. Of course, a lot of my POC prior to this was on the NetApp SIM, so I did not encounter the SRM 1.0 U1 issue there!

I had heard it was an issue before getting started on the 6080, but wanted to verify and observe it first hand. Indeed, SRM 1.0 U1 with the 6080 did not allow me to configure SRM. A rollback to SRM 1.0 did work, but along with it came interface issues and annoyances that were fixed in U1. Still, we got it working in a fairly complex layout: two different local filers replicating to one DR filer. RDMs are going to be an issue, though.

Any idea when the NetApp/VMware fix will be out? When we go to production I'd like to have this resolved.
