VMware Cloud Community
shrig
Contributor

Hitachi TrueCopy SRA fails to find replicated Datastores

I am trying to configure SRM with the Hitachi SRA. Replication has been set up and is working fine. I have also created a datastore on the PVOL and a VM on it. When I try to configure the Array Manager (RMSRA), it detects the array pairs but fails to find the replicated datastores.

HORCM is running as a service on the Windows host, which sees only the command device.

Here is what the log says:

Can't be attached to HORC manager

:Couldn't connect with the HORC manager.

:Please check if HORC manager is running or if HORCMINST is set correctly.

raidqry: Can't be attached to HORC manager

[Mon Nov 08 17:09:08 2010]: : 01.00.07

[Mon Nov 08 17:09:08 2010]: : discoverArrays

[Mon Nov 08 17:09:08 2010]: :

[Mon Nov 08 17:09:08 2010]: : RAID Manager Storage Replication Adapter

[Mon Nov 08 17:09:08 2010]: : HORCMINST=4

[Mon Nov 08 17:09:08 2010]: :

[Mon Nov 08 17:09:08 2010]: : true

[Mon Nov 08 17:09:08 2010]: : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3428-0

[Mon Nov 08 17:09:08 2010]: : trivia

[Mon Nov 08 17:09:08 2010]: : 60 sec

[Mon Nov 08 17:09:08 2010]: : 1

[Mon Nov 08 17:09:08 2010]: : true

[Mon Nov 08 17:09:08 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l 1>NUL' returned with RC=0 on HORCMINST=4.

[Mon Nov 08 17:09:15 2010]: : exit with

discoverArrays exited with exit code 0

'discoverArrays' returned <?xml version="1.0" encoding="UTF-8"?>

<StoragePort id="50060e80004a45e3" type="FC">

</StoragePort>

</StoragePortList>

</Array>

</ArrayList>

</Response>

...

Starting process: "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\perl-5.8.8\bin\perl.exe" "C:/Program Files/VMware/VMware vCenter Site Recovery Manager/scripts/SAN/RMHTC/command.pl"

discoverLuns's errors:

COMMAND ERROR : EUserId for HORC[5] : SYSTEM (0) Mon Nov 08 17:09:15 2010

CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l

*********** SYSTEM ERROR ***********

P.P. : RAID Manager for WindowsNT

Model : RAID-Manager/WindowsNT

Ver&Rev: 01-23-03/07

Release: Production(GA)

Host: vcslx087vm1

EUserId: SYSTEM (0)

Process: 5544

SysCall: CreateFile

LastErr: 2 (The system cannot find the file specified.)

ErrInfo: Internal Error

ErrTime: Mon Nov 08 17:09:15 2010

SrcFile: shorcmc.c

SrcLine: 1137

17:09:15-d4288-05544- 5544:HORCM death detected or access is denied.

17:09:18-d7d20-05544- ERROR:cm_open[scmclcon() timeout_err]

17:09:18-d7d20-05544- ERROR:horcm_lep_create

17:09:18-d7d20-05544- [exit(251)]

Can't be attached to HORC manager

:Couldn't connect with the HORC manager.

:Please check if HORC manager is running or if HORCMINST is set correctly.

raidqry: Can't be attached to HORC manager

[Mon Nov 08 17:09:15 2010]: : 01.00.07

[Mon Nov 08 17:09:15 2010]: : discoverLuns

[Mon Nov 08 17:09:15 2010]: :

[Mon Nov 08 17:09:15 2010]: : RAID Manager Storage Replication Adapter

[Mon Nov 08 17:09:15 2010]: : HORCMINST=4

[Mon Nov 08 17:09:15 2010]: : 5214

[Mon Nov 08 17:09:15 2010]: : true

[Mon Nov 08 17:09:15 2010]: : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3428-1

[Mon Nov 08 17:09:15 2010]: : trivia

[Mon Nov 08 17:09:15 2010]: : 60 sec

[Mon Nov 08 17:09:15 2010]: : 1

[Mon Nov 08 17:09:15 2010]: : true

[Mon Nov 08 17:09:15 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l 1>NUL' returned with RC=0 on HORCMINST=4.

[Mon Nov 08 17:09:21 2010]: : exit with

discoverLuns exited with exit code 0

'discoverLuns' returned <?xml version="1.0" encoding="UTF-8"?>

<LunList arrayId="5214">

<Lun id="151"

wwn="UN:KN:OW:N">

8147</ArrayKey>

175</ReplicaLunKey>

</Peer>

</Lun>

</LunList>

</Response>

...

Recomputing LUN groups for array 'array-8288' with ID '5214'

Recomputing LUN groups for array pair '5214' --> '8147'

Found 1 replicated LUN pairs

WWN '44:46:36:30:30:2D:30:30:42:20:20:20:20:20:20:20' of device 'key-vim.host.ScsiDisk-01000b000044463630302d30304220202020202020444636303046' doesn't match

Skipped access path '21:00:00:e0:8b:9c:00:b1;11;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

Access paths for device 'key-vim.host.ScsiDisk-01000b000044463630302d30304220202020202020444636303046' don't match:

21:00:00:e0:8b:9c:00:b1;11;50:06:0e:80:00:43:b6:e3

NFS share '10.209.73.150//vx/vmwsfs' doesn't match: unknown NFS server '10.209.73.150'

WWN '60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97' of device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' doesn't match

Skipped access path '21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

Access paths for device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' don't match:

21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3

...

No lun groups created since there are no replicated datastores

No datastores match replicated devices on array 'array-8288'

Now I am facing the following questions that may be related to my original problem:

1. I think that my horcm configuration as a Windows service is fine. Then why is the SRA not able to contact HORC manager sometimes?

2. Why is pairdisplay -fw showing an arbitrary storage port WWN (50060e80004a45e3), when the array port that is actually masked to the host is 50060e800043b6e3 (fc.200000e08b9cf9a7:210000e08b9cf9a7-fc.50060e800043b6e3:50060e800043b6e3-naa.60060e800043b6e03532313400000097)?

I checked the commands that the SRA is firing by monitoring the Task Manager.

raidqry shows the following output:

C:\HORCM\etc>"c:\HORCM\etc\raidqry.exe" -IH -l

No Group Hostname    HORCM_ver   Uid Serial# Micro_ver   Cache(MB)
1  ---   vcslx087vm1 01-23-03/07 0   5214    06-5F-00/00 2048

3. Why is the SRA starting a new instance (5 in this case) when my service is running instance 4?
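On question 3, one way to make the discrepancy visible is to compare the instance numbers the SRA reports (HORCMINST=4) with the ones that appear on the failing command lines (-IH5). The Python sketch below does only that. The function name is illustrative, the sample lines are copied from the log above, and it does not explain why the adapter picks another instance.

```python
import re

def instance_numbers(log_text):
    """Collect HORCM instance numbers mentioned in an SRA log excerpt.

    Returns (configured, used): instances reported as HORCMINST=<n>, and
    instances passed as -IH<n> on logged CMDLINE entries.
    """
    configured = {int(n) for n in re.findall(r"HORCMINST=(\d+)", log_text)}
    used = {int(n) for n in re.findall(r"CMDLINE\s*:.*-IH(\d+)", log_text)}
    return configured, used

# Lines copied from the log excerpt in this post.
sample = r"""CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l
[Mon Nov 08 17:09:15 2010]: : HORCMINST=4
[Mon Nov 08 17:09:15 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l 1>NUL' returned with RC=0 on HORCMINST=4.
"""

configured, used = instance_numbers(sample)
# Instances that were invoked but never reported as the configured instance.
print(sorted(used - configured))  # [5]
```

Running it over the full log would show whether instance 5 is the only unexpected one.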

thomps01
Enthusiast

One thing I found when running the service on Windows 2008 was that I needed to start it under a specific account, which happened to be a local admin, though I'm not sure this is essential. When I used the default Local System account to start the service, I had various HORCM-related issues.

The other thing to check is that your services file has a blank line at the end.

The horcm.conf file shouldn't have a blank line at the end.
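A quick way to check both files is sketched below in Python. "Blank line at the end" can be read two ways (the file simply ends with a line break, or its last line is empty), so this hypothetical helper reports both and leaves the interpretation to you. It works on the file contents as a string; reading the actual files is up to the caller.

```python
def trailing_line_info(text):
    """Describe how a config file's contents end.

    Returns (ends_with_newline, last_line_blank):
      ends_with_newline - the text finishes with a line break
      last_line_blank   - the final line holds only whitespace
    """
    ends_with_newline = text.endswith(("\n", "\r"))
    lines = text.splitlines()
    last_line_blank = bool(lines) and lines[-1].strip() == ""
    return ends_with_newline, last_line_blank

# Illustrative contents, not taken from a real services or horcm file.
print(trailing_line_info("entry\n\n"))  # (True, True)
print(trailing_line_info("entry\n"))    # (True, False)
print(trailing_line_info("entry"))      # (False, False)
```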

shrig
Contributor

I checked that there is a blank line at the end of the services file and no blank line at the end of horcm4.conf.

When I open a cmd window and set HORCMINST=4, I can see the pairdisplay output correctly:

C:\HORCM\etc>pairdisplay -g htc1 -fw -CLI

Group PairVol  L/R WWN              LU  Seq# LDEV# P/S   Status Fence Seq# P-LDEV# M
htc1  htc1_000 L   50060e80004a45e3 151 5214 151   P-VOL PAIR   NEVER 8147 175     -
htc1  htc1_000 R   50060e80004afd33 175 8147 175   S-VOL PAIR   NEVER -    151     -

That means that my service is configured correctly, isn't it?
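For comparing this output with what the SRA later reports, it can help to turn the -CLI table into records. The Python sketch below assumes the columns are separated by whitespace and that no value contains a space, which holds for the output pasted above. The function name and the "_2" suffix for the repeated Seq# column are my own choices, not anything RAID Manager defines.

```python
def parse_pairdisplay_cli(output):
    """Parse whitespace-separated `pairdisplay -CLI` output into dicts."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    header = lines[0].split()

    # The header repeats "Seq#"; suffix later occurrences so keys stay unique.
    seen = {}
    keys = []
    for name in header:
        seen[name] = seen.get(name, 0) + 1
        keys.append(name if seen[name] == 1 else f"{name}_{seen[name]}")

    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(keys):
            raise ValueError(f"unexpected column count in line: {ln!r}")
        rows.append(dict(zip(keys, fields)))
    return rows

# Output copied from this post.
sample = """Group PairVol L/R WWN LU Seq# LDEV# P/S Status Fence Seq# P-LDEV# M
htc1 htc1_000 L 50060e80004a45e3 151 5214 151 P-VOL PAIR NEVER 8147 175 -
htc1 htc1_000 R 50060e80004afd33 175 8147 175 S-VOL PAIR NEVER - 151 -
"""

for row in parse_pairdisplay_cli(sample):
    print(row["L/R"], row["WWN"], row["Seq#"], row["LDEV#"], row["P/S"], row["Status"])
```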

My concern is that pairdisplay is showing some WWNs that I cannot find anywhere else, neither in the VC GUI nor in the Storage Navigator GUI.

--shrig

thomps01
Enthusiast

You don't need to worry about those WWNs. I believe they are simply the port IDs of the arrays.

If you run the following command without -CLI, you'll see a human-friendly name.

pairdisplay -g htc

shrig
Contributor

Okay.

But why is the RMSRA starting another instance of HORCM? Why is it not contacting the running instance every time?

Which logon user should the HORCM service run as on Windows so that the RMSRA can connect to it?

I tried both Administrator and LocalSystem, but I get the following message with both settings:

Starting process: "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\perl-5.8.8\bin\perl.exe" "C:/Program Files/VMware/VMware vCenter Site Recovery Manager/scripts/SAN/RMHTC/command.pl"

discoverLuns's errors:

COMMAND ERROR : EUserId for HORC[5] : SYSTEM (0) Tue Nov 09 11:38:08 2010

CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l

*********** SYSTEM ERROR ***********

P.P. : RAID Manager for WindowsNT

Model : RAID-Manager/WindowsNT

Ver&Rev: 01-23-03/07

Release: Production(GA)

Host: vcslx087vm1

EUserId: SYSTEM (0)

Process: 5472

SysCall: CreateFile

LastErr: 2 (The system cannot find the file specified.)

ErrInfo: Internal Error

ErrTime: Tue Nov 09 11:38:08 2010

SrcFile: shorcmc.c

SrcLine: 1137

11:38:08-061a8-05472- 5472:HORCM death detected or access is denied.

11:38:11-061a8-05472- ERROR:cm_open[scmclcon() timeout_err]

11:38:11-061a8-05472- ERROR:horcm_lep_create

11:38:11-061a8-05472- [exit(251)]

Can't be attached to HORC manager

:Couldn't connect with the HORC manager.

:Please check if HORC manager is running or if HORCMINST is set correctly.

raidqry: Can't be attached to HORC manager

[Tue Nov 09 11:38:07 2010]: : 01.00.07

[Tue Nov 09 11:38:07 2010]: : discoverLuns

[Tue Nov 09 11:38:07 2010]: :

[Tue Nov 09 11:38:07 2010]: : RAID Manager Storage Replication Adapter

[Tue Nov 09 11:38:07 2010]: : HORCMINST=4

[Tue Nov 09 11:38:07 2010]: : 5214

[Tue Nov 09 11:38:07 2010]: : true

[Tue Nov 09 11:38:07 2010]: : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3004-0

[Tue Nov 09 11:38:07 2010]: : trivia

[Tue Nov 09 11:38:07 2010]: : 60 sec

[Tue Nov 09 11:38:07 2010]: : 1

[Tue Nov 09 11:38:07 2010]: : true

[Tue Nov 09 11:38:07 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l 1>NUL' returned with RC=0 on HORCMINST=4.

[Tue Nov 09 11:38:13 2010]: : exit with

discoverLuns exited with exit code 0

'discoverLuns' returned <?xml version="1.0" encoding="UTF-8"?>

<Lun id="151"

wwn="UN:KN:OW:N">

</Peer>

</Lun>

</LunList>

</Response>

--shrig

shrig
Contributor

I think that the unknown WWN (wwn="UN:KN:OW:N") in the discoverLuns response is causing the issue.

Later in the log I also see the following messages:

WWN '60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97' of device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' doesn't match

Skipped access path '21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

Access paths for device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' don't match:

21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3

The LUN ID 60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97 really is the LUN ID (SCSI page 83) of LUN 151, which is the PVOL.
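As a sanity check on that identifier, the Python sketch below decodes its last eight bytes. The byte layout used here (bytes 8 to 11 as an ASCII serial number, bytes 12 to 15 as the LDEV number) is only an inference from this one value lining up with Serial# 5214 in the raidqry output and LDEV# 151 in pairdisplay; I have not confirmed it against any Hitachi documentation, and the function name is made up.

```python
def decode_device_id(colon_hex):
    """Split a 16-byte device identifier into (serial, ldev).

    Assumed layout (inferred from this thread, not from vendor docs):
      bytes 8..11  - array serial number as ASCII digits
      bytes 12..15 - LDEV number, big-endian
    """
    raw = bytes.fromhex(colon_hex.replace(":", ""))
    if len(raw) != 16:
        raise ValueError("expected a 16-byte identifier")
    serial = raw[8:12].decode("ascii")
    ldev = int.from_bytes(raw[12:16], "big")
    return serial, ldev

# Identifier copied from the SRM log in this post.
print(decode_device_id("60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97"))
# ('5214', 151)
```

So the device the host sees does correspond to the PVOL on array 5214, even though the SRA returned an unknown WWN for that LUN.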

Another strange thing that I see in the logs is that the WWNs are messed up.

The array port WWN masked to the ESX host is 50060e800043b6e3:50060e800043b6e3, but the log shows another WWN prepended to it.

It looks like that WWN is the HBA's port WWN.
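To make the mismatch concrete, here is a small Python sketch that splits one of the logged access paths and checks its target against the storage port the SRA returned from discoverArrays. Reading the path as "initiator;LUN;target" is my assumption from the log lines (the first field matches the HBA port WWN 210000e08b9cf9a7); the helper names are hypothetical.

```python
def normalize_wwn(wwn):
    """Lower-case a WWN and drop the colon separators."""
    return wwn.replace(":", "").lower()

def split_access_path(path):
    """Split 'initiator;lun;target' (assumed format) into its three parts."""
    initiator, lun, target = path.split(";")
    return normalize_wwn(initiator), int(lun), normalize_wwn(target)

# StoragePort id from the discoverArrays response in the log.
sra_ports = {"50060e80004a45e3"}

# Access path copied from the "Skipped access path" log line.
path = "21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3"
initiator, lun, target = split_access_path(path)

print(initiator)            # 210000e08b9cf9a7
print(target)               # 50060e800043b6e3
print(target in sra_ports)  # False
```

The target the host actually uses is not among the ports the SRA reported, which is consistent with the "unknown target" messages.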

This is really driving me crazy.

--shrig

jgreeninsight
Contributor

shrig,

Did you ever find a solution to this? I am seeing the same errors when configuring a system with the NetApp SRA. It's a one-to-one controller relationship, and all SnapMirror replication policies are configured and working between them.

[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] WWN '60:A9:80:00:48:6E:57:34:64:5A:61:72:61:6E:4F:6D' of device 'key-vim.host.ScsiDisk-020006000060a98000486e5734645a6172616e4f6d4c554e202020' doesn't match
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Skipped access path 'iqn.1998-01.com.vmware:x-3ea989da;6;iqn.1992-08.com.netapp:sn.118051176' because of unknown target 'iqn.1992-08.com.netapp:sn.x'
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Access paths for device 'key-vim.host.ScsiDisk-020006000060a98000486e5734645a6172616e4f6d4c554e202020' don't match:
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager']     iqn.1998-01.com.vmware:x-3ea989da;6;iqn.1992-08.com.netapp:sn.x
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] WWN '60:A9:80:00:48:6E:57:34:64:5A:61:72:62:76:50:67' of device 'key-vim.host.ScsiDisk-020001000060a98000486e5734645a6172627650674c554e202020' doesn't match
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Skipped access path 'iqn.1998-01.com.vmware:x-3ea989da;1;iqn.1992-08.com.netapp:sn.118051176' because of unknown target 'iqn.1992-08.com.netapp:sn.x'
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Access paths for device 'key-vim.host.ScsiDisk-020001000060a98000486e5734645a6172627650674c554e202020' don't match:
[2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager']     iqn.1998-01.com.vmware:x-3ea989da;1;iqn.1992-08.com.netapp:sn.x
[2011-02-04 13:01:10.839 03716 verbose 'SanConfigManager'] No lun groups created since there are no replicated datastores
[2011-02-04 13:01:10.839 03716 info 'SanConfigManager'] No datastores match replicated devices on array 'array-7060'

Thanks
