hawks76
Enthusiast

vRA Software Components script error

So, I have a software component that installs SQL and all the tools, and then sets the TCP port to a random number.  The code runs fine and everything works as expected, but when the port-change code runs in the configure phase of the software component, it always exits with exit code 1.  I have looked through all the logs and can't find anything that explains why it's exiting with an error.

I have attached a screenshot.  It's also worth noting that this only happens on SQL 2017 installs; it works fine on SQL 2016 installs.  I'm using the same vCenter template (Windows 2016 Standard).  The only difference is the version of SQL.  I can run the code locally under the same user account, and it runs perfectly fine.

Any help is greatly appreciated.

43 Replies
hawks76
Enthusiast

HA!!  Far from it.  I'm still tinkering with it.  Maybe I'll learn something new in the process....   ;)

hawks76
Enthusiast

OK.  I have narrowed my issue down to the following PowerShell code:

if (($sc -ne "SQL_2016") -and ($sc -ne "SQL_2014") -and ($sc -ne "SQL_2017"))
{
    Write-Output "Not an SQL Server. Skipping Task."
    exit 0
}

#region generate and set random port number
Write-Output "Generating Random SQL Port Number....."
    $epoch_date = Get-Date("01/01/1970")
    $now = Get-Date
    $seed = [math]::Floor((New-TimeSpan -Start $epoch_date -End $now).TotalSeconds)
    $random_port = Get-Random -Minimum 10000 -Maximum 50000 -SetSeed $seed

    Write-Output "Randomly generated port number: $($random_port)"
    Write-Output "Setting Random SQL Port....."
    # script derived from https://blog.dbi-services.com/sql-server-2012-configuring-your-tcp-port-via-powershell/
    #Import-Module "SQLPS" -ErrorAction SilentlyContinue
    # -ErrorVariable expects a variable name without the leading $
    Import-Module "E:\Program Files (x86)\Microsoft SQL Server\140\Tools\PowerShell\Modules\sqlps\sqlps.psd1" -ErrorAction SilentlyContinue -ErrorVariable stdout_module

    #$smo = 'Microsoft.SqlServer.Management.Smo.'
    #$wmi = new-object ($smo + 'Wmi.ManagedComputer')
    $wmi = new-object ('Microsoft.SqlServer.Management.Smo.Wmi.ManagedComputer')
    $uri = "ManagedComputer[@Name='" + $env:ComputerName + "']/ServerInstance[@Name='MSSQLSERVER']/ServerProtocol[@Name='Tcp']"
    $tcp = $wmi.GetSmoObject($uri)

    # Enable the TCP protocol if it is not enabled yet
    if ($tcp.IsEnabled -ne $true)
    {
        $tcp.IsEnabled = $true
        $tcp.Alter()
    }

    $currentTcpPort = $wmi.GetSmoObject($uri + "/IPAddress[@Name='IPAll']").IPAddressProperties[1].value
    Write-Output "Current TCP Listening Port: $($currentTcpPort)"

    $wmi.GetSmoObject($uri + "/IPAddress[@Name='IPAll']").IPAddressProperties[1].Value=$random_port.ToString()
    Write-Output "Finalizing New TCP Port....."
    $tcp.Alter()

    $newTcpPort = $wmi.GetSmoObject($uri + "/IPAddress[@Name='IPAll']").IPAddressProperties[1].value
    Write-Output "New TCP Listening Port: $($newTcpPort)"

    Write-Output "Restarting Services MSSQLFDLauncher, MSSQLSERVER, SQLBrowser, SQLSERVERAGENT, and SQLWriter....."
    Get-Service MSSQLFDLauncher, MSSQLSERVER, SQLBrowser, SQLSERVERAGENT, SQLWriter | Restart-Service -Force -Confirm:$false
#endregion

I've run this code manually with the same user account that software components uses, and it doesn't throw any errors.  The code also completes successfully in software components, even though it exits with exit code 1.  As I mentioned before, this exact code runs without an error on SQL 2016 and SQL 2014 installs.  I use the same blueprint for all three installs.

daphnissov
Immortal

Did you not try using the SQL cmdlet mentioned by Luc? That's what I would do rather than writing WMI queries. Makes troubleshooting easier as well.

hawks76
Enthusiast

I have been trying to get it to work, but haven't done so successfully yet.  I'm working both paths right now.

LucD
Leadership

Here is something that might explain the exit code of 1.
WMI methods return an exit code of 1 if there is an Informational message.

PowerShell doesn't know this and just propagates the exit code as the exit code of the script.

Can you check the eventlog of the server to verify that there are no Informational events for the WBEM service at the time you call the method?
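
A minimal sketch of that event log check, under a few assumptions: the log name `Microsoft-Windows-WMI-Activity/Operational` is a guess at where the relevant entries would land (it may differ on your server), Level 4 is taken to mean Informational, and the helper name `New-EventWindowFilter` and the timestamp in the usage comment are purely illustrative.

```powershell
# Hypothetical helper: builds a Get-WinEvent filter for a time window around the failure.
function New-EventWindowFilter {
    param(
        [string]   $LogName = 'Microsoft-Windows-WMI-Activity/Operational', # assumed log name
        [datetime] $Around  = (Get-Date),
        [int]      $Minutes = 5,
        [int]      $Level   = 4   # 4 = Informational
    )
    @{
        LogName   = $LogName
        Level     = $Level
        StartTime = $Around.AddMinutes(-$Minutes)
        EndTime   = $Around.AddMinutes($Minutes)
    }
}

# Usage on the target server (Windows only); the timestamp is an example value:
# Get-WinEvent -FilterHashtable (New-EventWindowFilter -Around '2018-06-01 14:30') -ErrorAction SilentlyContinue |
#     Select-Object TimeCreated, Id, ProviderName, Message
```

Keeping the filter in a hashtable lets the event log service do the filtering, which is usually faster than piping every event through Where-Object.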


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

hawks76
Enthusiast

I don't see any WBEM Informational entries, but I do see some DCOM errors related to the service account that is being used to make the WMI call.  I've attached a screenshot.

LucD
Leadership

DCOM error 10016 is a known "feature", and something you can ignore.


hawks76
Enthusiast

So, I think I have narrowed this down to the Import-Module SQLPS line of code.  If I comment out everything before and after it, it still returns exit code 1.  If I comment out that line, it does not return an error or exit code 1.  I'm totally baffled.  I also tried calling the .psd1 file directly with the full path and got the same result.

LucD
Leadership

Did you already try adding the -Verbose switch on the Import-Module cmdlet?
Does the Import-Module work when you do it from a PS prompt on the target server?

Are you sure the Execution Policy is set to RemoteSigned on that server before attempting the Import-Module?


hawks76
Enthusiast

I have tried -Verbose, but it doesn't output anything to the software components console.

It works with no issue from a PS prompt on the target server with no errors or warnings.

I haven't checked the execution policy, but I will.  I will also output it during the install to make sure it's set to RemoteSigned.
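
One way to get both pieces of information when the console shows nothing is to write them to a file. This is only a sketch: the log path is an assumption, and `Microsoft.PowerShell.Management` stands in for the SQLPS module path so the snippet runs anywhere.

```powershell
# Log file path is an assumption; pick any location the agent's account can write to.
$log = Join-Path ([System.IO.Path]::GetTempPath()) 'import-module-debug.log'

# Record the effective execution policy and the per-scope values.
"Effective policy: $(Get-ExecutionPolicy)" | Out-File -FilePath $log
Get-ExecutionPolicy -List | Out-String | Out-File -FilePath $log -Append

# Stand-in module name; replace it with the SQLPS module or the full .psd1 path.
$module = 'Microsoft.PowerShell.Management'

# Redirect the verbose (4), warning (3) and error (2) streams into the
# success stream so they end up in the file instead of the console.
Import-Module $module -Verbose 4>&1 3>&1 2>&1 |
    Out-String | Out-File -FilePath $log -Append
```

Because the streams are merged before they reach Out-File, the log should not depend on what the calling wrapper does with console output.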

I keep coming back to the fact that this works fine on the same vCenter template and blueprint when installing SQL 2014 or 2016, but for some reason it craps out when installing 2017.

I have opened a case with SDK support in hopes that they can help shed some light on what is going on as well.  If I get anything useful from that, I'll post it here.

LucD
Leadership

You might also want to include a Start-Transcript (at the start) and a Stop-Transcript at the end of your script.

That should normally capture what appears on the console.
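
A minimal sketch of that pattern, assuming a log path of your choosing (the one below is hypothetical). The try/finally is an addition so the transcript is closed even when the body throws.

```powershell
# Assumed log location; any path the agent's account can write to will do.
$transcript = Join-Path ([System.IO.Path]::GetTempPath()) 'software-component.log'

Start-Transcript -Path $transcript -Append | Out-Null
try {
    # ... the component's real work goes here ...
    Write-Output 'Configure phase started'
}
finally {
    # Runs even if the body throws, so the log is always closed.
    Stop-Transcript | Out-Null
}
```

How much ends up in the transcript depends on how the host runs the script, so treat an empty transcript as a hint about the host rather than proof that nothing happened.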


hawks76
Enthusiast

I tried that when I first started troubleshooting this, and all it would output was the lines where the transcript started and ended.  There was no output for any of the other commands in the file.  I also verified that when the scripts run in Software Components, the PS execution policy is RemoteSigned.

hawks76
Enthusiast

NOTE:  This is just a workaround that fixes the symptoms of the main issue.  It still does not solve the problem I was getting with the exit code of 1.  There is continued discussion around that and how to troubleshoot it effectively.

So, it would seem the fix for this is easier than expected.

Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQLServer\SuperSocketNetLib\Tcp\' -Name "Enabled" -Value 1

Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQLServer\SuperSocketNetLib\Tcp\IPAll\' -Name "TcpPort" -Value "random_port_number"

(For versions other than SQL 2017, replace MSSQL14.MSSQLSERVER with MSSQL##.MSSQLSERVER, where ## corresponds to the version of SQL you are installing.)

Restart the services, and presto!!  I had the SQL guys verify that this works the way they expect it to, so if it's wrong, I blame them.....   🙂
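
For anyone who wants one script for all three versions, here is a rough sketch of building the registry path from the version. The function names `Get-SqlTcpRegistryPath` and `Set-SqlTcpPort` are hypothetical, the version keys reuse the `SQL_2014`/`SQL_2016`/`SQL_2017` strings from the earlier script, and a default instance named MSSQLSERVER is assumed.

```powershell
# Major version numbers used in the MSSQL##.<instance> registry key.
$SqlMajorVersion = @{ 'SQL_2014' = 12; 'SQL_2016' = 13; 'SQL_2017' = 14 }

function Get-SqlTcpRegistryPath {
    param(
        [Parameter(Mandatory)] [string] $SqlVersion,   # e.g. 'SQL_2017'
        [string] $Instance = 'MSSQLSERVER'             # assumed default instance
    )
    if (-not $SqlMajorVersion.ContainsKey($SqlVersion)) {
        throw "Unknown SQL version '$SqlVersion'"
    }
    $id = 'MSSQL{0}.{1}' -f $SqlMajorVersion[$SqlVersion], $Instance
    "HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\$id\MSSQLServer\SuperSocketNetLib\Tcp"
}

# Hypothetical wrapper around the two Set-ItemProperty calls (Windows only).
function Set-SqlTcpPort {
    param([string] $SqlVersion, [int] $Port)
    $tcp = Get-SqlTcpRegistryPath -SqlVersion $SqlVersion
    Set-ItemProperty -Path $tcp -Name 'Enabled' -Value 1
    Set-ItemProperty -Path "$tcp\IPAll" -Name 'TcpPort' -Value "$Port"
}

# Usage (Windows only, followed by a restart of the SQL services):
# Set-SqlTcpPort -SqlVersion 'SQL_2017' -Port $random_port
```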

daphnissov
Immortal

I am actually running into this more frequently, LucD. In this latest case, for example, I've used Start-Transcript and it shows no errors, and I have written out both $? and $LASTEXITCODE after almost every line, and all of those return true. Yet at the end of the transcript I'm getting a $global:false and I cannot figure out how to determine where this is coming from. So two questions:

  1. What's the easiest way to troubleshoot this going forward?
  2. How can I simply reset the exit code to 0/true at the end of a script regardless of what happened anywhere? Is this code sufficient for most all cases?

if ($?) {
    exit 0
} else {
    exit 0
}

LucD
Leadership

I would need more details on how these scripts are started.
Is it calling powershell.exe with the code as an argument? Or the code in a file?

Also under which account do these scripts then run?

The exit 0 will indeed return the code 0 to the caller.

But again, it depends on how the script was called.

It might be that return codes from a binary called in the script are propagated.
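
To see what the caller actually receives, you can start a child PowerShell process and read its exit code. This is just a sketch: the helper name `Get-ChildExitCode` is made up, and it uses the `-Command` form, which may not match how the vRA agent launches scripts.

```powershell
# Path of the PowerShell executable running this script (powershell.exe or pwsh).
$exe = (Get-Process -Id $PID).Path

function Get-ChildExitCode {
    param([string] $Command)
    # Run the command in a fresh process and discard its output and errors.
    & $exe -NoProfile -NonInteractive -Command $Command 2>$null | Out-Null
    # $LASTEXITCODE now holds the exit code of the child process.
    $LASTEXITCODE
}

Get-ChildExitCode 'exit 0'    # explicit success
Get-ChildExitCode 'exit 3'    # explicit exit code is passed to the caller
```

Swapping in the real command line used by the wrapper (for example `-File` with the script path) shows whether an explicit exit in the script survives that particular way of calling it.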


daphnissov
Immortal

It's a combination of native PowerShell in a script and a call to another script. All of this is "wrapped" and called by another script which gets called by the vRA software agent installed on a template. When the template comes up as a new VM, the agent (a Windows service) downloads the work item locally (which comes down as a script) and uses another script to call that script. So script =calls=> script (has native PS plus a separate .ps1 script). And even if I try the code I pasted above, I cannot get it to reset the exit code back to zero. Here's what that user script looks like:

Start-Transcript -Path C:\DevCitrixDDCUtils.txt -Append

Write-Output "Mounting PS Drive"
IF ($File_share_username)
{
    $SecurePassword = ConvertTo-SecureString "$File_share_password" -AsPlainText -Force
    $FileShareCredential = New-Object System.Management.Automation.PSCredential ("$File_share_username", $SecurePassword)
    New-PSDrive -Name Z -PSProvider FileSystem -Root "\\$File_server\$File_share" -Persist -Credential $FileShareCredential
}
Else
{
    New-PSDrive -Name Z -PSProvider FileSystem -Root "\\$File_server\$File_share"
}
Write-Output "Exit state: $?"
Write-Output $LASTEXITCODE

$certs_location = 'Z:\wildcard'
$citrix_storefront_psdir = 'C:\Program Files\Citrix\Receiver StoreFront\scripts'
$admindomain = $Script_domain
$fqdn = (Get-WmiObject win32_computersystem).DNSHostName+"."+(Get-WmiObject win32_computersystem).Domain
$certpwd = ConvertTo-SecureString -String 'my_password' -AsPlainText -Force
$cert = Import-PfxCertificate -FilePath "$certs_location\${admindomain}wildcard.pfx" -CertStoreLocation Cert:\LocalMachine\My -Password $certpwd
Write-Output "Exit state from Import-PfxCertificate: $?"
Write-Output $LASTEXITCODE

Write-Output "Setting web binding."
New-WebBinding -Name "Default Web Site" -IP "*" -Port 443 -Protocol https
$cert | New-Item -path "IIS:\SslBindings\0.0.0.0!443"
Write-Output "Exit state from New-WebBinding and New-Item: $?"
Write-Output $LASTEXITCODE

Write-Output "Executing SetHostBaseUrl script."
& $citrix_storefront_psdir\SetHostBaseUrl.ps1 https://$fqdn
Write-Output "Exit state from script: $?"
Write-Output $LASTEXITCODE

Write-Output "Cleaning up PSDrive."
Remove-PSDrive -Name Z -Force
Write-Output "Exit state from remove psdrive: $?"
Write-Output $LASTEXITCODE

Write-Output "Resetting exit code"
#?global:true
$LASTEXITCODE = 0
exit 0

if($?) {
    exit 0
} else {
    exit 0
}

The variables you don't see being defined are set through the software component, and they do expand, so that's not a problem. The wrapper script which is invoked by vRA software components (which you can't change) is here:  https://pastebin.com/zGaHwBdj

And here is the output being returned to vRA after the execution of the user code shown above. As you can see, all the return codes show true/success.

Transcript started, output file is C:\DevCitrixDDCUtils.txt
Mounting PS Drive

Name           Used (GB)     Free (GB) Provider      Root
----           ---------     --------- --------      ----
Z                 108.57       2766.23 FileSystem    \\fileshare.domain.c...

Exit state: True
Exit state from Import-PfxCertificate: True
Setting web binding.

IPAddress : 0.0.0.0
Port      : 443
Host      :
Store     : My
Sites     : Microsoft.IIs.PowerShell.Framework.ConfigurationAttribute

Exit state from New-WebBinding and New-Item: True
Executing SetHostBaseUrl script.
Exit state from script: True
Cleaning up PSDrive.
Exit state from remove psdrive: True
Resetting exit code
ABORT. Encountered error in Powershell.
Error while executing script: Process exited with an error: 1 (Exit value: 1)

I *think* that the non-zero exit code is being returned by the PS1 script towards the bottom.

LucD
Leadership

I think I'll need to install vRA in my lab :D


daphnissov
Immortal

It probably wouldn't be a bad thing, but that's a lot of work just to do some troubleshooting for someone else.

Update:  In looking more carefully at the wrapper script and the output, it seems clear that an error is being logged even though the transcript has nothing. I'm trying with $error.clear() at the bottom of the user script to see if I can clear them from within.

daphnissov
Immortal

Ok, it seems that after calling $error.clear() I was actually able to squash that error code. Also for good measure (even though it shouldn't be needed), I set the ExecutionPolicy to bypass and unblocked the script that was being called. That looks like it took care of it. Still, though, I wish I knew how to troubleshoot these types of failures better. Any time vRA executes a software component, any non-zero exit code signifies failure (regardless of whether anything is sent to stderr or not), and tracking down exactly where those are coming from is what I'd like to really understand.
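
A rough sketch of one way to see where such hidden errors come from before clearing them. It assumes, based only on the behavior described in this thread, that the wrapper reacts to whatever has accumulated in `$Error`; the function name `Write-ErrorSummary` is hypothetical.

```powershell
function Write-ErrorSummary {
    # $Error holds the session's error records, newest first.
    $i = 0
    foreach ($record in $global:Error) {
        # Not every entry carries invocation details, so fall back to a placeholder.
        $position = if ($record.InvocationInfo) { $record.InvocationInfo.PositionMessage } else { '(no position)' }
        Write-Output ("[{0}] {1}`n{2}" -f $i, $record, $position)
        $i++
    }
}

# Simulate a non-terminating error that never reaches the console.
Write-Error 'hidden failure' -ErrorAction SilentlyContinue

Write-ErrorSummary          # log what accumulated and where it came from
$global:Error.Clear()       # then clear it, as described above
# exit 0                    # finally, return an explicit exit code to the caller
```

Writing the summary before clearing keeps a record of the offending line, which is the part the transcript did not show.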

LucD
Leadership

Is the mechanism by which vRA runs scripts via the agent documented somewhere?

There are many ways one can start a PowerShell script, just wondering how it is done in practice.

