6 Replies Latest reply on Feb 4, 2011 5:48 PM by jgreeninsight

    Hitachi TrueCopy SRA fails to find replicated Datastores

    shrig Lurker

      I am trying to configure SRM with the Hitachi SRA. Replication has been set up and is working fine. I have also created a datastore on the PVOL and a VM on it. When I try to configure the Array Manager (RMSRA), it is able to detect the array pairs, but fails to find replicated datastores.

      HORCM is running as a service on the Windows host, to which only the command device is presented.

      Here is what the log says:

      Can't be attached to HORC manager

      :Couldn't connect with the HORC manager.

      :Please check if HORC manager is running or if HORCMINST is set correctly.

      raidqry: Can't be attached to HORC manager

      [Mon Nov 08 17:09:08 2010]:      : 01.00.07

      [Mon Nov 08 17:09:08 2010]:      : discoverArrays

      [Mon Nov 08 17:09:08 2010]:    :

      [Mon Nov 08 17:09:08 2010]: : RAID Manager Storage Replication Adapter

      [Mon Nov 08 17:09:08 2010]: : HORCMINST=4

      [Mon Nov 08 17:09:08 2010]:   :

      [Mon Nov 08 17:09:08 2010]:     : true

      [Mon Nov 08 17:09:08 2010]:   : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3428-0

      [Mon Nov 08 17:09:08 2010]:    : trivia

      [Mon Nov 08 17:09:08 2010]:      : 60 sec

      [Mon Nov 08 17:09:08 2010]:      : 1

      [Mon Nov 08 17:09:08 2010]:    : true

      [Mon Nov 08 17:09:08 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l  1>NUL' returned with RC=0 on HORCMINST=4.

      [Mon Nov 08 17:09:15 2010]:   : exit with

       

      discoverArrays exited with exit code 0

      'discoverArrays' returned <?xml version="1.0" encoding="UTF-8"?>

        

          

            

            

            

               <StoragePort id="50060e80004a45e3" type="FC">

               </StoragePort>

             </StoragePortList>

           </Array>

         </ArrayList>

        

      </Response>

      ...

      Starting process: "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\perl-5.8.8\bin\perl.exe" "C:/Program Files/VMware/VMware vCenter Site Recovery Manager/scripts/SAN/RMHTC/command.pl"

      discoverLuns's errors:

      COMMAND ERROR  : EUserId for HORC[5] : SYSTEM (0)  Mon Nov 08 17:09:15 2010

      CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l

      *********** SYSTEM ERROR ***********

      P.P.   : RAID Manager for WindowsNT

      Model  : RAID-Manager/WindowsNT

      Ver&Rev: 01-23-03/07

      Release: Production(GA)

          Host: vcslx087vm1

      EUserId: SYSTEM (0)

      Process: 5544

      SysCall: CreateFile

      LastErr: 2 (The system cannot find the file specified.

      )

      ErrInfo: Internal Error

      ErrTime: Mon Nov 08 17:09:15 2010

      SrcFile: shorcmc.c

      SrcLine: 1137

      17:09:15-d4288-05544- 5544:HORCM death detected or access is denied.

      17:09:18-d7d20-05544- ERROR:cm_open[scmclcon() timeout_err]

      17:09:18-d7d20-05544- ERROR:horcm_lep_create

      17:09:18-d7d20-05544- [exit(251)]

      Can't be attached to HORC manager

      :Couldn't connect with the HORC manager.

      :Please check if HORC manager is running or if HORCMINST is set correctly.

      raidqry: Can't be attached to HORC manager

      [Mon Nov 08 17:09:15 2010]:      : 01.00.07

      [Mon Nov 08 17:09:15 2010]:      : discoverLuns

      [Mon Nov 08 17:09:15 2010]:    :

      [Mon Nov 08 17:09:15 2010]: : RAID Manager Storage Replication Adapter

      [Mon Nov 08 17:09:15 2010]: : HORCMINST=4

      [Mon Nov 08 17:09:15 2010]:   : 5214

      [Mon Nov 08 17:09:15 2010]:     : true

      [Mon Nov 08 17:09:15 2010]:   : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3428-1

      [Mon Nov 08 17:09:15 2010]:    : trivia

      [Mon Nov 08 17:09:15 2010]:      : 60 sec

      [Mon Nov 08 17:09:15 2010]:      : 1

      [Mon Nov 08 17:09:15 2010]:    : true

      [Mon Nov 08 17:09:15 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l  1>NUL' returned with RC=0 on HORCMINST=4.

      [Mon Nov 08 17:09:21 2010]:   : exit with

       

      discoverLuns exited with exit code 0

      'discoverLuns' returned <?xml version="1.0" encoding="UTF-8"?>

         <LunList arrayId="5214">

           <Lun id="151"

                wwn="UN:KN:OW:N">

                <Peer>

                  <ArrayKey>8147</ArrayKey>

                  <ReplicaLunKey>175</ReplicaLunKey>

                </Peer>

           </Lun>

         </LunList>

        

      </Response>

       

      ...

       

      Recomputing LUN groups for array 'array-8288' with ID '5214'

      Recomputing LUN groups for array pair '5214' --> '8147'

      Found 1 replicated LUN pairs

      WWN '44:46:36:30:30:2D:30:30:42:20:20:20:20:20:20:20' of device 'key-vim.host.ScsiDisk-01000b000044463630302d30304220202020202020444636303046' doesn't match

      Skipped access path '21:00:00:e0:8b:9c:00:b1;11;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

      Access paths for device 'key-vim.host.ScsiDisk-01000b000044463630302d30304220202020202020444636303046' don't match:

           21:00:00:e0:8b:9c:00:b1;11;50:06:0e:80:00:43:b6:e3

      NFS share '10.209.73.150//vx/vmwsfs' doesn't match: unknown NFS server '10.209.73.150'

      WWN '60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97' of device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' doesn't match

      Skipped access path '21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

      Access paths for device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' don't match:

           21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3

      ...

      No lun groups created since there are no replicated datastores

      No datastores match replicated devices on array 'array-8288'

       

       

       

      Now I am facing the following questions that may be related to my original problem:

      1. I think that my horcm configuration as a Windows service is fine. Then why is the SRA not able to contact HORC manager sometimes?

      2. Why does pairdisplay -fw show a seemingly arbitrary storage port WWN (50060e80004a45e3), when the array port that is actually masked to the host is 50060e800043b6e3 (fc.200000e08b9cf9a7:210000e08b9cf9a7-fc.50060e800043b6e3:50060e800043b6e3-naa.60060e800043b6e03532313400000097)?

       

      I checked which commands the SRA runs by watching Task Manager.

      raidqry shows the following output:

      C:\HORCM\etc>"c:\HORCM\etc\raidqry.exe" -IH -l

      No Group    Hostname            HORCM_ver  Uid  Serial#    Micro_ver Cache(MB)

      1  ---     vcslx087vm1       01-23-03/07    0     5214  06-5F-00/00      2048

       

      3. Why is the SRA starting a new instance (5 in this case) when my service is running instance 4?
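      The mismatch behind question 3 can be confirmed by comparing the instance the SRA announces (`HORCMINST=4`) with the instance in the failing `CMDLINE` (`-IH5`). The sketch below is a hypothetical helper, not part of the SRA; it relies only on the two line shapes that appear in the log above.

```python
import re

def horcm_instances(log_text):
    """Return (configured, invoked) sets of HORCM instance numbers found in an SRA log."""
    # Lines containing "HORCMINST=4" show the instance the SRA was configured with.
    configured = set(int(n) for n in re.findall(r"HORCMINST=(\d+)", log_text))
    # "CMDLINE : ... -IH5 -l" lines show the instance a failing command actually used.
    invoked = set(int(n) for n in re.findall(r"CMDLINE\s*:.*?-IH(\d+)", log_text))
    return configured, invoked

# Three lines copied from the log in this post.
sample = r"""
CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l
[Mon Nov 08 17:09:15 2010]: : HORCMINST=4
[Mon Nov 08 17:09:15 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l  1>NUL' returned with RC=0 on HORCMINST=4.
"""

configured, invoked = horcm_instances(sample)
unexpected = invoked - configured  # instances called but never configured
print("configured:", configured, "invoked:", invoked, "unexpected:", unexpected)
```

      Any number left in `unexpected` is an instance that some command tried to reach even though the SRA was configured for a different one.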

        • 1. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
          thomps01 Hot Shot

          One thing I found when running the service on Windows 2008 was that I needed to start the service under a specific account, which happened to be a local admin, though I'm not sure this is essential. When I used the default Local System account to start the service, I had various HORCM-related issues.

           

          The other thing to check is that your services file has a blank line at the end.

           

          The horcm.conf file shouldn't have a blank line at the end.
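          Both checks can be scripted. This is only a sketch, and it assumes that "a blank line at the end" means the text ends with a line terminator (optionally followed by whitespace), so that an editor shows an empty final line. The function name and the sample strings are made up for illustration.

```python
def ends_with_blank_line(text):
    """True if the final line of the text is empty or whitespace-only.

    Assumption: this is what "a blank line at the end" means in practice.
    """
    lines = text.replace("\r\n", "\n").split("\n")
    # With no line terminator at all there is only one element, so no blank last line.
    return len(lines) > 1 and lines[-1].strip() == ""

# Placeholder contents, not real configuration entries.
services_text = "last entry of the services file\r\n"
horcm_conf_text = "last entry of horcm4.conf"

services_ok = ends_with_blank_line(services_text)          # should be True
horcm_conf_ok = not ends_with_blank_line(horcm_conf_text)  # should be True
print("services:", services_ok, "horcm conf:", horcm_conf_ok)
```

          To test real files, read them with `open(path, newline="").read()` so that Python does not translate the line endings before the check.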

          • 2. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
            shrig Lurker

            I checked that there is a blank line at the end of the services file and no blank line at the end of horcm4.conf.

            When I open a cmd window and set HORCMINST=4, pairdisplay produces the correct output:

            C:\HORCM\etc>pairdisplay -g htc1 -fw -CLI

            Group   PairVol L/R  WWN                LU  Seq# LDEV# P/S Status Fence Seq# P-LDEV# M

            htc1    htc1_000 L   50060e80004a45e3  151  5214   151 P-VOL PAIR NEVER   8147   175 -

            htc1    htc1_000 R   50060e80004afd33  175  8147   175 S-VOL PAIR NEVER      -   151 -
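            For comparing these values against what vCenter reports, the -CLI output is easy to split into fields. A minimal sketch, assuming the 13-column layout shown above; the header prints "Seq#" twice, so the second one gets the made-up label "PeerSeq#" here.

```python
# Positional column names; "PeerSeq#" is this sketch's own label for the second "Seq#".
COLUMNS = ["Group", "PairVol", "L/R", "WWN", "LU", "Seq#", "LDEV#",
           "P/S", "Status", "Fence", "PeerSeq#", "P-LDEV#", "M"]

def parse_pairdisplay(output):
    """Turn `pairdisplay -fw -CLI` text into a list of dicts, one per volume row."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything that is not a 13-field row.
        if len(fields) != len(COLUMNS) or fields[0] == "Group":
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows

sample = """\
Group   PairVol L/R  WWN                LU  Seq# LDEV# P/S Status Fence Seq# P-LDEV# M
htc1    htc1_000 L   50060e80004a45e3  151  5214   151 P-VOL PAIR NEVER   8147   175 -
htc1    htc1_000 R   50060e80004afd33  175  8147   175 S-VOL PAIR NEVER      -   151 -
"""

rows = parse_pairdisplay(sample)
local = next(r for r in rows if r["L/R"] == "L")
print(local["WWN"], local["Seq#"], local["LDEV#"], local["P/S"])
```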

             

            That means that my service is configured correctly, isn't it?

             

            My concern is that pairdisplay is showing WWNs that I cannot find anywhere else, neither in the VC GUI nor in the Storage Navigator GUI.

             

            --shrig

            • 3. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
              thomps01 Hot Shot

              You don't need to worry about those WWNs. I believe they are simply the port IDs of the arrays.

               

              If you run the following command without -CLI, you'll see a human-friendly name.

               

              pairdisplay -g htc1

              • 4. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
                shrig Lurker

                Okay.

                 

                But why is the RMSRA starting another instance of HORCM? Why is it not contacting the running instance every time?

                 

                Which logon user should the HORCM service on Windows run as so that the RMSRA can connect to it?

                 

                I tried both Administrator and LocalSystem, but I get the following message with both settings:

                Starting process: "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\perl-5.8.8\bin\perl.exe" "C:/Program Files/VMware/VMware vCenter Site Recovery Manager/scripts/SAN/RMHTC/command.pl"

                discoverLuns's errors:

                COMMAND ERROR  : EUserId for HORC[5] : SYSTEM (0)  Tue Nov 09 11:38:08 2010

                CMDLINE : c:\HORCM\etc\raidqry.exe -IH5 -l

                *********** SYSTEM ERROR ***********

                P.P.   : RAID Manager for WindowsNT

                Model  : RAID-Manager/WindowsNT

                Ver&Rev: 01-23-03/07

                Release: Production(GA)

                    Host: vcslx087vm1

                EUserId: SYSTEM (0)

                Process: 5472

                SysCall: CreateFile

                LastErr: 2 (The system cannot find the file specified.

                )

                ErrInfo: Internal Error

                ErrTime: Tue Nov 09 11:38:08 2010

                SrcFile: shorcmc.c

                SrcLine: 1137

                11:38:08-061a8-05472- 5472:HORCM death detected or access is denied.

                11:38:11-061a8-05472- ERROR:cm_open[scmclcon() timeout_err]

                11:38:11-061a8-05472- ERROR:horcm_lep_create

                11:38:11-061a8-05472- [exit(251)]

                Can't be attached to HORC manager

                :Couldn't connect with the HORC manager.

                :Please check if HORC manager is running or if HORCMINST is set correctly.

                raidqry: Can't be attached to HORC manager

                [Tue Nov 09 11:38:07 2010]:      : 01.00.07

                [Tue Nov 09 11:38:07 2010]:      : discoverLuns

                [Tue Nov 09 11:38:07 2010]:    :

                [Tue Nov 09 11:38:07 2010]: : RAID Manager Storage Replication Adapter

                [Tue Nov 09 11:38:07 2010]: : HORCMINST=4

                [Tue Nov 09 11:38:07 2010]:   : 5214

                [Tue Nov 09 11:38:07 2010]:     : true

                [Tue Nov 09 11:38:07 2010]:   : C:\WINDOWS\TEMP\vmware-SYSTEM-3136348361\dr-sanprovider3004-0

                [Tue Nov 09 11:38:07 2010]:    : trivia

                [Tue Nov 09 11:38:07 2010]:      : 60 sec

                [Tue Nov 09 11:38:07 2010]:      : 1

                [Tue Nov 09 11:38:07 2010]:    : true

                [Tue Nov 09 11:38:07 2010]: : '"c:\HORCM\etc\raidqry.exe" -IH -l  1>NUL' returned with RC=0 on HORCMINST=4.

                [Tue Nov 09 11:38:13 2010]:   : exit with

                 

                discoverLuns exited with exit code 0

                'discoverLuns' returned <?xml version="1.0" encoding="UTF-8"?>

                  

                     <Lun id="151"

                          wwn="UN:KN:OW:N">

                         

                           

                           

                          </Peer>

                     </Lun>

                   </LunList>

                  

                </Response>

                 

                --shrig

                • 5. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
                  shrig Lurker

                  I think that the unknown WWN (wwn="UN:KN:OW:N") in the discoverLuns response is causing the issue.

                  Later in the log I also see the following messages:

                  WWN '60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97' of device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' doesn't match

                  Skipped access path '21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3' because of unknown target '50:06:0E:80:00:43:B6:E3'

                  Access paths for device 'key-vim.host.ScsiDisk-02000f000060060e800043b6e03532313400000097444636303046' don't match:

                       21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3

                   

                  The ID 60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97 really is the LUN ID (SCSI page 83) of LUN 151, which is the PVOL.
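                  The ID itself backs this up: bytes 8 to 11 are ASCII for "5214" (the array serial shown by raidqry) and the last four bytes are 0x97, which is 151. The sketch below decodes it. The byte layout is inferred from this one sample only, not taken from Hitachi documentation, and the function name is made up.

```python
def decode_lun_wwn(wwn):
    """Split a 16-byte LUN ID into (array serial, LDEV number).

    Assumption: bytes 8-11 hold the serial as ASCII digits and bytes 12-15
    hold the LDEV number as a big-endian integer, as this sample suggests.
    """
    raw = bytes.fromhex(wwn.replace(":", ""))
    serial = raw[8:12].decode("ascii")
    ldev = int.from_bytes(raw[12:16], "big")
    return serial, ldev

serial, ldev = decode_lun_wwn("60:06:0E:80:00:43:B6:E0:35:32:31:34:00:00:00:97")
print(serial, ldev)  # prints: 5214 151
```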

                   

                  Another strange thing that I see in the logs is that the WWNs are messed up.

                  The array port WWN masked to the ESX host is 50060e800043b6e3:50060e800043b6e3. But the log contains some strange WWN prepended to it.

                  Looks like the strange WWN is the HBA's Port WWN.
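                  Reading the access path as three fields separated by semicolons makes this clearer. The sketch below assumes the format is initiator;LUN number;target, which fits the log (the middle value 15 matches the 0f in the device key), and it assumes SRM compares the target against the StoragePort ids returned by discoverArrays. Neither assumption is confirmed by any documentation I have; the function names are made up.

```python
def normalize_wwn(wwn):
    """Strip colons and lowercase so WWNs from different sources compare equal."""
    return wwn.replace(":", "").lower()

def parse_access_path(path):
    """Split 'initiator;lun;target' (assumed format) into its three parts."""
    initiator, lun, target = path.split(";")
    return normalize_wwn(initiator), int(lun), normalize_wwn(target)

# StoragePort id from the discoverArrays response in the first post.
reported_ports = {normalize_wwn("50060e80004a45e3")}

initiator, lun, target = parse_access_path(
    "21:00:00:e0:8b:9c:f9:a7;15;50:06:0e:80:00:43:b6:e3"
)
target_known = target in reported_ports
print(initiator, lun, target, target_known)
```

                  With these inputs `target_known` is False: the target the ESX host sees (50060e800043b6e3) is not the port the SRA reported (50060e80004a45e3), which is consistent with the "unknown target" message.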

                   

                  This is really driving me crazy.

                   

                  --shrig

                  • 6. Re: Hitachi TrueCopy SRA fails to find replicated Datastores
                    jgreeninsight Lurker

                    Shrig,

                     

                    Did you ever find a solution to this? I am seeing the same errors when configuring a system with the NetApp SRA. It's a one-to-one controller relationship, and all SnapMirror replication policies are configured and working between them.

                     

                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] WWN '60:A9:80:00:48:6E:57:34:64:5A:61:72:61:6E:4F:6D' of device 'key-vim.host.ScsiDisk-020006000060a98000486e5734645a6172616e4f6d4c554e202020' doesn't match
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Skipped access path 'iqn.1998-01.com.vmware:x-3ea989da;6;iqn.1992-08.com.netapp:sn.118051176' because of unknown target 'iqn.1992-08.com.netapp:sn.x'
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Access paths for device 'key-vim.host.ScsiDisk-020006000060a98000486e5734645a6172616e4f6d4c554e202020' don't match:
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager']     iqn.1998-01.com.vmware:x-3ea989da;6;iqn.1992-08.com.netapp:sn.x
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] WWN '60:A9:80:00:48:6E:57:34:64:5A:61:72:62:76:50:67' of device 'key-vim.host.ScsiDisk-020001000060a98000486e5734645a6172627650674c554e202020' doesn't match
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Skipped access path 'iqn.1998-01.com.vmware:x-3ea989da;1;iqn.1992-08.com.netapp:sn.118051176' because of unknown target 'iqn.1992-08.com.netapp:sn.x'
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager'] Access paths for device 'key-vim.host.ScsiDisk-020001000060a98000486e5734645a6172627650674c554e202020' don't match:
                    [2011-02-04 13:01:10.839 03716 trivia 'SanConfigManager']     iqn.1998-01.com.vmware:x-3ea989da;1;iqn.1992-08.com.netapp:sn.x
                    [2011-02-04 13:01:10.839 03716 verbose 'SanConfigManager'] No lun groups created since there are no replicated datastores
                    [2011-02-04 13:01:10.839 03716 info 'SanConfigManager'] No datastores match replicated devices on array 'array-7060'

                     

                    Thanks