VMware Cloud Community
makethevapor
Contributor

SRM 4.1 Error: Failed to copy when failing over a VM with an RDM

I have been trying to find a resolution to this issue for 2 weeks now and am at my wits' end. We have an SRM 4.1 environment with vCenter 4.1, ESXi 4.1, and CLARiiON arrays replicated using MirrorView.

The problem is somewhat intermittent, but there is one critical VM that I now cannot recover no matter what I do.

When I fail over the protection group, all VMs fail over fine except the one with an RDM, which fails with "Error: Failed to copy". This does not happen with every VM that has an RDM, only particular ones. I am getting the following error in the SRM logs on the recovery server. On the array, you can also see that it activates the snapshot and creates a session on the recovery RDM LUN.

[2012-06-17 09:59:44.324 03916 trivia 'SecondarySanProvider'] Datastore 'datastore-7161' not found in the cache
[2012-06-17 09:59:44.329 03916 trivia 'SecondarySanProvider'] Added datastore 'datastore-7161:snap-1a6518ed-SRM-CFS-VMware-Data49 - ThinLUN 213' to the cache
[2012-06-17 09:59:44.329 03916 warning 'Libs'] [NFC ERROR] NfcNetTcpWrite: bWritten: -1 [#1]
[2012-06-17 09:59:44.329 03916 warning 'Libs'] [NFC ERROR] NfcNet_Send: requested 264, sent only -1 bytes [#1]
[2012-06-17 09:59:44.329 03916 warning 'Libs'] [NFC ERROR] NfcSendMessage: send failed:
[#1]
[2012-06-17 09:59:44.329 03916 warning 'Libs'] [NFC ERROR] Nfc_GetFile: GET_FILE msg failed [#1]
[2012-06-17 09:59:44.329 03916 error 'SecondarySanProvider'] Failed to copy file '[snap-1a6518ed-mydatastore] VIRTUALMACHINE/VIRTUALMACHINE.vmdk' to 'C:\Windows\TEMP\vmware-SYSTEM46' with error code 3: Network error -- Failed to send complete message: An attempt was made to access a socket in a way forbidden by its access permissions
[2012-06-17 09:59:44.330 03916 warning 'Libs'] [NFC ERROR] NfcNetTcpWrite: bWritten: -1 [#1]
[2012-06-17 09:59:44.330 03916 warning 'Libs'] [NFC ERROR] NfcNet_Send: requested 264, sent only -1 bytes [#1]
[2012-06-17 09:59:44.330 03916 warning 'Libs'] [NFC ERROR] NfcSendMessage: send failed:
[#1]
[2012-06-17 09:59:44.330 03916 trivia 'SecondarySanProvider'] 'Get extent path for disk '[snap-1a6518ed-mydatastore] VIRTUALMACHINE/VIRTUALMACHINE_2.vmdk'' took 61.688 seconds
[2012-06-17 09:59:44.330 03916 trivia 'SecondarySanProvider'] 'Fix RDM mapping file '[snap-1a6518ed-mydatastore] VIRTUALMACHINE/VIRTUALMACHINE_2.vmdk' to point to LUN '0200810000600601600d0023001a36783724b7e111565241494420'' took 75.822 seconds
[2012-06-17 09:59:44.330 03916 error 'SecondarySanProvider'] Failed to resolve 1 device locators for shadow VM 'shadow-vm-397778':
(vmodl.MethodFault) [
[#1]    (dr.fault.NfcCopyFault) {
[#1]       dynamicType = <unset>,
[#1]       faultCause = (vmodl.MethodFault) null,
[#1]       errorCode = 3,
[#1]       filePath = "[snap-1a6518ed-mydatastore] VIRTUALMACHINE/VIRTUALMACHINE_2.vmdk",
[#1]       newFilePath = "C:\Windows\TEMP\vmware-SYSTEM46",
[#1]       msg = "",
[#1]    }
[#1] ]
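
In case it helps anyone triage a similar log, here is a minimal Python sketch that counts entries by level and lists the files that failed to copy. The field layout (timestamp, thread ID, level, quoted source) is only inferred from the lines above, and the names `parse_line` and `summarize` are my own; none of this is part of SRM.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt above:
# [2012-06-17 09:59:44.329 03916 warning 'Libs'] message text
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<thread>\d+) (?P<level>\w+) '(?P<source>[^']+)'\] (?P<msg>.*)$"
)

# Pulls the source path out of a "Failed to copy file '...'" message.
COPY_RE = re.compile(r"Failed to copy file '([^']+)'")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\r\n"))
    return match.groupdict() if match else None


def summarize(lines):
    """Count entries per level and collect the paths that failed to copy."""
    levels = Counter()
    failed = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # continuation lines such as "[#1] ..." are skipped
        levels[entry["level"]] += 1
        copy_match = COPY_RE.search(entry["msg"])
        if copy_match:
            failed.append(copy_match.group(1))
    return levels, failed
```

To use it, open the SRM log file from the recovery server (the location depends on your install) and pass the file object to `summarize`; it returns the per-level counts and the list of failed paths.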

Attempted Fixes:

We were seeing some SSL errors, so we recreated the SSL certificates on every ESX host at the recovery side. We thought this had fixed the problem, but it came back, and it appears we are still getting SSL errors.

Moved both the recovery and protected VMs to different ESX hosts to see whether the problem depended on which host the VM was running on.

Added resources to both SRM servers to rule out intermittent timeouts caused by resource constraints.

Deleted and recreated the inactive snapshots on the array.

Thanks in advance for any help that you can give!
