VMware Cloud Community
chris_dd
Contributor

SRM Recovery Plan Scripting

Hello,

We have a PowerShell script that we use to configure virtual machines after an SRM recovery. The SRM recovery plan contains a Command step of type "Command on SRM Server". The command is configured as follows:

"/bin/sh /home/admin/1_SRM_DisjoinAD.sh"

The script's permissions were changed as required by the documentation:

https://docs.vmware.com/en/Site-Recovery-Manager/8.5/com.vmware.srm.admin.doc/GUID-4F084B4F-DE9C-4A7...

When the recovery plan runs and the command step starts, the recovery plan reports the following error:

"Command: EXECUTE DisjoinAD Script Warning - The command '/bin/sh' returned a non-zero value: 255"

We opened a support request with VMware but were directed here for assistance. To give more detail: we have two scripts in /home/admin/ on the SRM appliance that are called from the recovery plan. If we run the script directly from the SRM appliance command line, it runs without issue. The script calls a PowerShell script on a Windows Server, which runs the actual DisjoinAD script. Again, if we run /home/admin/1_SRM_DisjoinAD.sh from the SRM appliance itself, it works.

The contents of /home/admin/1_SRM_DisjoinAD.sh, based on the SRM scripting guidelines, are as follows:

clear
echo "$(date "+%Y-%m-%d %H:%M:%S") : Recovery Plan $VMware_RecoveryName ran in $VMware_RecoveryMode mode"
# some more custom actions
ssh dr_restoreaccount@adminscriptvm.ourdomain.com "PowerShell.exe D:\SRM_Scripts\Trial-Error\1_SRM_DisjoinAD.ps1 -RPname RP_SRM_Test_VMs"

  1. SSH is enabled and working on the Windows Server
  2. SSH connectivity works from the SRM appliance to the Windows Server
  3. Logins from the SRM appliance are verified in the Windows Event logs for the account dr_restoreaccount@adminscriptvm.ourdomain.com
  4. When executing the above script from the SRM recovery plan, the aforementioned error message is received: "Command: EXECUTE DisjoinAD Script Warning - The command '/bin/sh' returned a non-zero value: 255"

It appears that when the SRM command recovery step runs, the SSH session is never established, as no login events are registered on the Windows server. An .ssh folder was created in the srm user's directory on the SRM appliance with the SSH key for dr_restoreaccount@adminscriptvm.ourdomain.com, but that does not appear to help.
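One way to see why the step fails only under SRM is to capture what ssh prints when the script runs without an interactive terminal. The following is a rough debugging sketch, not something taken from the SRM documentation: the function name `run_logged` and the log path `/tmp/srm_disjoin_debug.log` are hypothetical. It relies on two documented OpenSSH behaviours: `ssh` exits with 255 when ssh itself hits an error (as opposed to returning the remote command's exit status), which matches the value in the warning, and `-o BatchMode=yes` makes it fail straight away instead of waiting on a prompt nobody can answer.

```shell
#!/bin/sh
# Hypothetical log location; use any path the account running the step can write to.
LOG_FILE="${LOG_FILE:-/tmp/srm_disjoin_debug.log}"

# Run a command, append its stdout and stderr to LOG_FILE, and return its exit code.
run_logged() {
    {
        # Record which account and HOME the step really runs with.
        echo "$(date '+%Y-%m-%d %H:%M:%S') : running as $(id -un), HOME=$HOME"
        echo "command: $*"
    } >> "$LOG_FILE"
    "$@" >> "$LOG_FILE" 2>&1
    rc=$?
    echo "exit code: $rc" >> "$LOG_FILE"
    return "$rc"
}

# Example call (commented out so the sketch has no side effects when sourced):
# run_logged ssh -v -o BatchMode=yes dr_restoreaccount@adminscriptvm.ourdomain.com \
#     "PowerShell.exe D:\SRM_Scripts\Trial-Error\1_SRM_DisjoinAD.ps1 -RPname RP_SRM_Test_VMs"
```

After a test run of the recovery plan, the log should show the account and home directory the step used, along with ssh's own explanation of the failure (for example a host key or key file problem).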

Any insight or help in this matter would be greatly appreciated.

4 Replies
Arnon
Contributor

Hi Chris,

I'm experiencing exactly the same issue.

Did you ever manage to solve it?

chris_dd
Contributor

Hello Arnon,

Yes, we were able to resolve the issue with a hint from VMware:

"The fix was to start an ssh session via the srm directory structure to add the Windows Server fingerprint to the known_hosts file. We had previously added the ssh key to the id_rsa file; however, we hadn’t updated known_hosts."

It's been working wonderfully ever since.

Good luck!
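For anyone who would rather not open an interactive session, the known_hosts update described above can be approximated with `ssh-keyscan`. This is only a sketch under assumptions: `add_host_keys` is a hypothetical helper, and the known_hosts path has to be the one belonging to whichever account SRM uses to run the step. Keys fetched this way should be checked against the Windows Server's real fingerprint before they are trusted.

```shell
# Append host-key lines read from stdin to a known_hosts file, skipping duplicates.
add_host_keys() {
    known_hosts="$1"
    mkdir -p "$(dirname "$known_hosts")"
    touch "$known_hosts"
    while IFS= read -r line; do
        # Ignore empty lines.
        [ -n "$line" ] || continue
        # Only append lines that are not already present verbatim.
        grep -qxF -- "$line" "$known_hosts" || printf '%s\n' "$line" >> "$known_hosts"
    done
}

# Example (commented out; needs network access and should be run as the
# account that executes the recovery step):
# ssh-keyscan adminscriptvm.ourdomain.com | add_host_keys "$HOME/.ssh/known_hosts"
```

Running it twice is harmless, since a line that is already in the file is not added again.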

Arnon
Contributor

Hi Chris,

That's great! We actually tried to copy everything to the SRM user folder using sudo, but it didn't work.

Only after granting sufficient permissions on the SRM folder did it actually work.

Thanks a lot!
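The exact permissions that were granted are not stated in this thread. As a general note, files copied with sudo usually end up owned by root, and OpenSSH refuses a private key that other users can read, so both ownership and mode matter. The sketch below shows a conventional layout for an .ssh directory; `fix_ssh_perms` is a hypothetical helper, and the owner and group in the commented chown line are assumptions that need to match the account SRM actually uses.

```shell
# Apply conventional OpenSSH permissions to an .ssh directory and its key files.
fix_ssh_perms() {
    ssh_dir="$1"
    # Directory: accessible by the owner only.
    chmod 700 "$ssh_dir" || return 1
    # Private key: readable and writable by the owner only.
    if [ -f "$ssh_dir/id_rsa" ]; then
        chmod 600 "$ssh_dir/id_rsa" || return 1
    fi
    # known_hosts holds no secrets; owner-writable, world-readable is enough.
    if [ -f "$ssh_dir/known_hosts" ]; then
        chmod 644 "$ssh_dir/known_hosts" || return 1
    fi
    return 0
}

# Ownership usually needs fixing too after a copy with sudo (run as root;
# the user and group names here are assumptions):
# chown -R srm:srm /path/to/srm/home/.ssh
```

It would be called with the path of the .ssh folder, for example `fix_ssh_perms "$HOME/.ssh"` when run as the account in question.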

scott28tt
VMware Employee

As your post needs moving to the area for ESXi, I have reported it to the moderators.

 


-------------------------------------------------------------------------------------------------------------------------------------------------------------

Although I am a VMware employee I contribute to VMware Communities voluntarily (ie. not in any official capacity)