Hi,
I have a script that, in a foreach loop, runs Get-EsxCli against a host, collects data on that host, adds it to a report, and moves on to the next host.
The script works fine, but each host takes 10-60 seconds to collect & process the data. With 600-1000 hosts, depending on scope, the script can take >2 hours, which is fine for an automated run, but not so great when running it ad hoc, e.g. for version checking when updating firmware.
So my question is whether there's any way to optimize Get-EsxCli, i.e. run it against multiple hosts at once, or something else to streamline it.
Rough script:
# connect to the vCenters (Connect-VIServer)
$hosts = get-vmhost
foreach ($host in $hosts){
    $esxcli = get-esxcli -vmhost $host -v2
    $storage1 = $esxcli.storage.san.fc.list.Invoke()
    $network1 = $esxcli.network.nic.list.Invoke()
    $software1 = $esxcli.software.vib.list.Invoke()
    # extract data out of those arrays, write to data.array
}
# write data.array to csv
# disconnect from the vCenters (Disconnect-VIServer)
Extracted data is usual things - ESXi version/build, hardware specs, serial, model, NIC version, model, HBA version, model, network info, etc. to give a pretty good low level inventory that can be sorted & analyzed.
The objects returned by Get-EsxCli have no provisions to run in parallel.
But you can run multiple instances of your script in parallel.
There are a couple of options:
- run multiple instances of the script via Start-Job
- if you are using PSv7, you can use ForEach-Object -Parallel
There are some other possibilities, but these have some drawbacks or are being phased out.
There is the option of using Runspaces, or the option of using PS Workflows.
Blog: lucd.info Twitter: @LucD22 Co-author PowerCLI Reference
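For anyone on PowerShell 7, here is a minimal sketch of the ForEach-Object -Parallel option mentioned above. It is not a PowerCLI script: the host names are placeholders and the Start-Sleep only stands in for the slow Get-EsxCli calls, so it shows the fan-out shape and nothing more. In a real run each parallel block would presumably need its own Connect-VIServer (or a passed-in session), which is left out here.

```powershell
# Requires PowerShell 7 or later (ForEach-Object -Parallel does not exist in 5.1).
# Placeholder host names; in a real script these would come from Get-VMHost.
$hostNames = 'esx01', 'esx02', 'esx03', 'esx04'

# Hypothetical delay that stands in for the 10-60 seconds of esxcli calls per host.
$delayMs = 200

$report = $hostNames | ForEach-Object -Parallel {
    # $using: copies a variable from the calling scope into the parallel block.
    Start-Sleep -Milliseconds $using:delayMs

    # Emit one object per host; the pipeline gathers them in the caller.
    [pscustomobject]@{
        Host   = $_
        Status = 'collected'
    }
} -ThrottleLimit 4 | Sort-Object Host   # completion order is not guaranteed, so sort

$report | Format-Table
```

-ThrottleLimit caps how many blocks run at once, so it is the knob to turn if the vCenter or the hosts start to struggle.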
So, that's a good start.
Something like having a Start-Job per vCenter, so that the vCenters run in parallel.
What would be the consequences of them writing to the csv at the same time? Or is that a few dozen milliseconds that would almost never collide in the real world?
Also, what about optimizing the calling of the $storage1,$network1, $software1?
Any better way to call those? I broke out these parts of my script & timed them. These calls are the vast majority (>95%) of the time spent in the $host loop. I thought that the $esxcli call would take the time & then the retrieving of the data from it would take almost no time. But that doesn't seem to be the case.
Writing to the same file from several threads always has the risk of conflicts.
What I normally do is write the data to the console, and then do a sequential Receive-Job to collect the output.
From there you can write it from a single thread to a single file.
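A minimal sketch of that collect-then-write pattern, assuming one job per vCenter. The vCenter and host names are placeholders and the job body fakes the collection, so no PowerCLI is needed to try it. The Select-Object is there because objects coming back through Receive-Job carry extra properties such as RunspaceId and PSComputerName that would otherwise end up as CSV columns.

```powershell
# Placeholder vCenter names; a real script would take these from its own list.
$vcs = 'vc01', 'vc02', 'vc03'

$jobs = foreach ($vc in $vcs) {
    Start-Job -ArgumentList $vc -ScriptBlock {
        param($vcName)
        # A real job would Connect-VIServer and query hosts here.
        # This stand-in just emits two rows per vCenter on the output stream.
        foreach ($i in 1..2) {
            [pscustomobject]@{
                vCenter = $vcName
                Host    = '{0}-esx{1:d2}' -f $vcName, $i
            }
        }
    }
}

# Jobs only write to their own output; nothing touches the file yet.
$rows = $jobs | Wait-Job | Receive-Job | Select-Object vCenter, Host
$jobs | Remove-Job

# Single writer: the CSV is produced once, from the calling thread.
$csv = Join-Path ([System.IO.Path]::GetTempPath()) 'host-report.csv'
$rows | Export-Csv -Path $csv -NoTypeInformation
```

Since only the parent session writes the file, there is no window in which two jobs can collide on it.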
Not sure what you mean by "optimizing the calling of the $storage1,$network1, $software1".
The esxcli commands do indeed take time; fetching the required data from their output should take proportionally less.
Appreciate the help!
This helped me tremendously.
Rough simple version of the script:
$processParams = $vc,$cred
$processCode = {
    $output = @()
    $connect = connect-viserver -server $using:vc -Credential $using:cred
    $hosts = Get-VMHost
    $count = $hosts.count
    $output = $using:vc,$count
    write-output $output
    disconnect-viserver -server $using:vc -confirm:$false
}
foreach ($vc in $vcs){
    $jobs += start-job -ScriptBlock $processCode -ArgumentList $processParams
}
Wait-Job -Job $jobs | Out-Null
foreach ($job in $jobs){
    $data = receive-job -Job $job
    $report = [ordered] @{
        vCenter = $data[0]
        HostCount = $data[1]
    }
    $hostinfo += New-Object PSObject -Property $report
    remove-job $job
}
$hostinfo | export-csv $file -append -NoTypeInformation
Hopefully that helps someone else looking to get parallel vcenter reports for speed increases.
Hey @LucD
Wondering if I can get a little more help on this.
The above code works fine - 2 min to log into bunch of vcenters & get host counts, no problem.
But if I add a $hosts loop, it hangs & never gets any info. And since it's inside a job, I can't see what's failing. And the debug doesn't seem to help show anything either.
Here is all I added to the above script, right after the $count = $hosts.count line:
foreach ($host in $hosts){
    $output = @()
    write-host $host
    $dc = $host | Get-Datacenter
    $cluster = $host | Get-Cluster
    #$esxcli = Get-EsxCli -VMHost $host -V2
    #$software1 = $esxcli.software.vib.list.Invoke()
    #$network1 = $esxcli.network.nic.list.Invoke()
    #$storage1 = $esxcli.storage.san.fc.list.Invoke()
    $output = $using:vc,$dc,$cluster,$host
    write-output $output
}
If I comment out the $output = line, it continues but obviously doesn't get any results & I get a null array error for the receive-jobs loop.
I feel like I'm missing something about putting data into an array inside a scriptblock, and because it's a job, I can't see the error.
Any help, much appreciated!
You can let each job write to a log with the Start-Transcript cmdlet.
If you combine that with setting the $VerbosePreference variable, you will get some info from inside the job in that log.
I'm not sure how you pass the parameters to each job.
Instead of passing the vCenter name and the credentials, you can also pass the SessionId, and then do a Connect-VIServer (inside the job) with the Session parameter, instead of the credentials.
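As a rough sketch of the logging suggestion: each job starts its own transcript and turns on verbose output. The log folder, the vCenter name and the host count are made-up placeholders, and the job body does no PowerCLI work. How much of each stream actually lands in the transcript can differ between PowerShell versions, so treat this as a starting point to verify rather than a guarantee.

```powershell
# Hypothetical log folder under the temp directory, one log file per job.
$logDir = Join-Path ([System.IO.Path]::GetTempPath()) 'joblogs'
New-Item -ItemType Directory -Path $logDir -Force | Out-Null

$job = Start-Job -ArgumentList 'vc01', $logDir -ScriptBlock {
    param($vcName, $logFolder)

    # Show Write-Verbose messages inside this job.
    $VerbosePreference = 'Continue'
    Start-Transcript -Path (Join-Path $logFolder "$vcName.log") | Out-Null
    try {
        Write-Verbose "Collecting from $vcName"
        # Stand-in for the real collection; the count is a placeholder value.
        [pscustomobject]@{ vCenter = $vcName; HostCount = 3 }
    }
    finally {
        # Close the log even when the collection throws.
        Stop-Transcript | Out-Null
    }
}

$result = $job | Wait-Job | Receive-Job
$state = $job.State
$job | Remove-Job
```

If a job hangs instead of failing, the transcript at least shows the last statement that produced output before it stalled.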
So, if I comment out my $output = $vc, $dc,$cluster,$host line, it completes quickly, and everything is written to the transcript. host count, list of hosts *due to the write-host commands*, but nothing to $output because it's obviously null.
If I leave it in, the background jobs never seem to get results & thus never write anything much to the transcript. Which is basically what I got from the debug breaks. Which is very weird considering the only thing I'm removing is the results array after the data is supposed to have been gotten. No idea why putting it inside a foreach $host loop causes this & without the loop, it doesn't. If I force quit the script, the transcript only reports "Terminating Error(): The pipeline has stopped".
If I put the start/stop-transcript inside the scriptblock or at the start/end of the entire script, I still get the same information: the first loop seems to write-host the count & the *first* host name, and then the script hangs forever once it gets to the $output line (I waited 6+ hours for it over the weekend).
Only parameters passed are in the first script - $vc, $cred. I figured I only used them at the start of the script block so no big deal to just use $using:vc, $using:cred.
For clarity, here's the whole script in one piece:
$processParams = $vc,$cred
$processCode = {
    $connect = connect-viserver -server $using:vc -Credential $using:cred
    $hosts = Get-VMHost
    $count = $hosts.count
    foreach ($host in $hosts){
        $output = @()
        write-host $host
        $dc = $host | Get-Datacenter
        $cluster = $host | Get-Cluster
        $output = $using:vc,$dc,$cluster,$host
        write-output $output
    }
    disconnect-viserver -server $using:vc -confirm:$false
}
foreach ($vc in $vcs){
    $jobs += start-job -ScriptBlock $processCode -ArgumentList $processParams
}
Wait-Job -Job $jobs | Out-Null
foreach ($job in $jobs){
    $data = receive-job -Job $job
    $report = [ordered] @{
        vCenter = $data[0]
        DataCenter = $data[1]
        Cluster = $data[2]
        Host = $data[3]
    }
    $hostinfo += New-Object PSObject -Property $report
    remove-job $job
}
$hostinfo | export-csv $file -append -NoTypeInformation
There are a couple of flaws in that script:
- you can't use $host as a loop variable; it is an automatic variable created by PowerShell and it is read-only
- you are placing objects (DC, Cluster and VMHost) in the output. You will need to specify the property (Name, I guess) you want in the result
- you have to declare a variable as an array before adding elements to it ($hostinfo)
$processParams = $vc, $cred
$processCode = {
    $connect = Connect-VIServer -Server $using:vc -Credential $using:cred
    $hosts = Get-VMHost
    $count = $hosts.count
    foreach ($vmhost in $hosts) {
        $output = @()
        Write-Host $vmhost
        $dc = $vmhost | Get-Datacenter
        $cluster = $vmhost | Get-Cluster
        $output = $using:vc, $dc, $cluster, $vmhost
        Write-Output $output
    }
    Disconnect-VIServer -Server $using:vc -Confirm:$false
}
$jobs = @()
foreach ($vc in $vcs) {
    $jobs += Start-Job -ScriptBlock $processCode -ArgumentList $processParams
}
Wait-Job -Job $jobs | Out-Null
$hostinfo = @()
foreach ($job in $jobs) {
    $data = Receive-Job -Job $job
    $obj = [ordered] @{
        vCenter = $data[0]
        DataCenter = $data[1].Name
        Cluster = $data[2].Name
        Host = $data[3].Name
    }
    $hostinfo += New-Object PSObject -Property $obj
    Remove-Job $job
}
$hostinfo | Export-Csv $file -Append -NoTypeInformation
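The three flaws listed above can be tried in isolation, without PowerCLI. This is only an illustrative sketch: the cluster object is a hand-made stand-in for what Get-Cluster returns, and all names and IDs are placeholders.

```powershell
# Flaw 1: $Host is an automatic variable and cannot be reassigned.
$hostError = $null
try {
    $host = 'esx01'      # placeholder name; the assignment itself is what fails
}
catch {
    $hostError = $_.Exception.Message
}

# Flaw 2: pass a plain property along instead of the whole object.
# Hand-made stand-in for a PowerCLI cluster object.
$cluster = [pscustomobject]@{ Name = 'cluster-a'; Id = 'placeholder-id' }
$clusterName = $cluster.Name

# Flaw 3: declare the array first, then += works for every element.
$hostinfo = @()
foreach ($vmhost in 'esx01', 'esx02') {
    $hostinfo += [pscustomobject]@{
        Cluster = $clusterName
        Host    = $vmhost
    }
}

"Assignment to `$host failed with: $hostError"
$hostinfo | Format-Table
```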
Just to clarify, I didn't actually have $host as the name, was just (wrongly) genericizing it for the internet.
So, I did a write-host for each variable, including $output.
And it has all the information, vcenter, datacenter, cluster, hostname. But it never goes to host#2. It never gets to the receive-job loop.
If I put $output = $using:vc+","+$dc+","+$cluster+","+$vmhost, then the script finishes, but only has a line for the first host, and the array row is a single item: a combination of all those with the commas, i.e. not actually separated.
It seems like the $output array doesn't know to be more than one item wide & hangs trying to insert the rest of the items.
I tried my version of your script, and it seems to be working for me.
With $dc, $cluster and $vmhost you are placing objects on the output, not single strings.
In my run, these objects are received, but I had to specify which property I wanted in the resulting CSV.
So, copy & paste didn't work for me. But reading your script again, and knowing the problem was inside the script block, I set the variables to the explicit name there, instead of at the $report part, i.e.:
$dc = $dc.Name
$cluster = $cluster.Name
$vmhost = $vmhost.Name
And then it finally worked. For the last host. Guess I gotta go add a += somewhere.
Thanks again for all the help & troubleshooting!
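For anyone landing here with the same "only one host" result, one possible way around it, as a sketch: Write-Output on a four-item array puts four separate items on the pipeline, so after Receive-Job there is just one flat list and $data[0] to $data[3] only ever describe a single host. Emitting one object per host keeps the rows intact. The inventory table and the datacenter and cluster names below are placeholders that stand in for the PowerCLI calls.

```powershell
# Placeholder inventory: vCenter name -> host names (stands in for Get-VMHost).
$inventory = @{
    vc01 = 'esx01', 'esx02'
    vc02 = 'esx03'
}

$jobs = foreach ($vc in $inventory.Keys) {
    Start-Job -ArgumentList $vc, $inventory[$vc] -ScriptBlock {
        param($vcName, $hostNames)
        foreach ($name in $hostNames) {
            # One object per host, holding plain strings only.
            [pscustomobject]@{
                vCenter    = $vcName
                DataCenter = 'dc-placeholder'
                Cluster    = 'cluster-placeholder'
                Host       = $name
            }
        }
    }
}

# Every received item is already a complete row, so no index juggling is needed.
$hostinfo = @($jobs | Wait-Job | Receive-Job |
    Select-Object vCenter, DataCenter, Cluster, Host |
    Sort-Object Host)
$jobs | Remove-Job

$hostinfo | Format-Table
```

From here $hostinfo can go straight to Export-Csv, and there is no += to forget.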