VMware Cloud Community
halr9000
Commander

performance of VimDatastore PSDrive

$ds = Get-Datastore
$DSDrive = $ds | ForEach-Object {
	New-PSDrive -name ($_.Name) -psProvider VimDatastore -root / -datastore $_ 
}
$VMXFileInfo = $DSDrive | ForEach-Object {
	Write-Progress "Scanning datastores" "$_..."
	dir "$($_):" -recurse -include *.vmx
}

The above takes 9 minutes to run. This is with two servers with three datastores, all internal SCSI disks. Is that consistent with what you guys are seeing?

@vmware will this be improved in future versions?






Author of the upcoming book: Managing VMware Infrastructure with PowerShell

Co-Host, PowerScripting Podcast (http://powerscripting.net)

My signature used to be pretty, but then the forum software broked it. vExpert. Microsoft MVP (Windows PowerShell). Author, Podcaster, Speaker. I'm @halr9000
5 Replies
Sirry
Enthusiast

It's been 3 years and I am experiencing similar issues...

The script below takes nearly half an hour to run on a relatively small set of folders and datastores (70ish folders over 5ish datastores).

I noticed that -recurse is slower, so I rewrote it slightly to go only two folders deep (which improves performance ever so slightly), but it still stinks.

Are there any improvements I accidentally skipped over in my quest to optimize speed for this script?

$dss = Get-Datastore
foreach ($drive in $dss)
{
    $nd = New-PSDrive -Name ds -PSProvider VimDatastore -Root '/' -Location $drive
    Write-Host "Datastore: "$drive
    $vmFolders = get-childitem ds:\ | Where { $_.ItemType -eq "Folder" }
    foreach ($vmFolder in $vmFolders)
    {
        $path = "ds:\" + $vmFolder.Name
        $result = get-childitem $path | Where { $_.ItemType -eq "VmSnapshotFile" }
        Write-Host $result
    }
    Remove-PSDrive -Name ds
}

And yes, I know I can get snapshots another way, but we are using a 3rd party product that sometimes deletes snapshots ungracefully, so that they no longer appear in vSphere but remain on disk...

RvdNieuwendijk
Leadership

I tried to solve your problem and wrote a new version of your script to make it run faster. It uses PowerCLI 4.1 features, so you need that version to run it. It uses pipelines instead of foreach statements, which should be faster. It also uses the DatastoreBrowserPath property, so you don't have to use the New-PSDrive and Remove-PSDrive cmdlets. The output is delivered as objects rather than strings, which is more in line with the PowerShell way of doing things.

Get-Datastore | ForEach-Object {
  $Datastore = $_
  Get-ChildItem $Datastore.DatastoreBrowserPath | `
    Where-Object { $_.ItemType -eq "Folder" } | ForEach-Object {
      $Folder = $_
      Get-ChildItem "$($Datastore.DatastoreBrowserpath)\$($Folder.Name)" | `
        Where-Object { $_.ItemType -eq "VmSnapshotFile" } | ForEach-Object {
          $Report = "" | Select-Object -Property Datastore,Folder,SnapshotFile
          $Report.Datastore = $Datastore.Name
          $Report.Folder = $Folder.Name
          $Report.SnapshotFile = $_.Name
          $Report
        }
    }
}

The script is still very slow. Line 6, the second Get-ChildItem cmdlet, causes the slowness.

I ran a timing test with the old and the new version and there was almost no difference. The old version was even a few seconds faster. The results:

Old version 3 hours 59 minutes 21 seconds

New version 3 hours 59 minutes 53 seconds.
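Timings like these can be collected with the built-in Measure-Command cmdlet. The following is a minimal sketch, not the method actually used for the numbers above; Measure-ScriptDuration is a hypothetical helper name, and the Start-Sleep call is only a stand-in for the datastore scan being measured.

```powershell
# Hypothetical helper: run a script block and format the elapsed time
# in the same "hours minutes seconds" style as the results above.
function Measure-ScriptDuration {
  param([Parameter(Mandatory=$true)][scriptblock] $ScriptBlock)

  # Measure-Command returns a TimeSpan for the given script block
  $elapsed = Measure-Command -Expression $ScriptBlock
  $hours = [int][math]::Floor($elapsed.TotalHours)
  "{0} hours {1} minutes {2} seconds" -f $hours, $elapsed.Minutes, $elapsed.Seconds
}

# Stand-in workload; replace the block with the script version to be timed.
Measure-ScriptDuration { Start-Sleep -Seconds 1 }
```

Running each version through the same helper against the same environment keeps the comparison consistent.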

Sometimes you have to think outside the box to solve a problem. Yesterday I wrote a script in another thread that uses plink.exe to call a console command from PowerCLI. In a console you can use the "find /vmfs/volumes -iname *.vmsn" command to get all the snapshot files in all datastores attached to the ESX server. So I adapted my script from yesterday to solve your problem. It runs much faster than the native PowerCLI script, which I regret, because I would like to be able to do everything in native PowerCLI. The script shows only snapshot files for datastores connected to ESX servers in clusters. It will not be difficult to change the script for standalone ESX servers. The script uses PowerShell v2 features.

$User = "root"
$Password = "password"
$Command = "find /vmfs/volumes -iname *.vmsn"

function Invoke-VMhostCommand {
  # This function assumes that plink.exe is in your path
  param([parameter(Mandatory=$true)][string] $VMHost,
        [parameter(Mandatory=$true)][string] $User,
        [parameter(Mandatory=$true)][string] $Password,
        [parameter(Mandatory=$true)][string] $Command)

  $plink = "plink.exe"
  $plinkoptions = " -v -batch -pw $Password"
  $remoteCommand = '"' + $Command + '"'
  $PlinkCommand = $plink + " " + $plinkoptions + " " + $User + "@" + $VMHost + " " + $remoteCommand
  $msg = Invoke-Expression -command $PlinkCommand
  $Report = "" | Select-Object Host,Output
  $Report.Host = $VMHost
  $Report.Output = $msg
  $Report
}

$Report = Get-Cluster | ForEach-Object {
  Get-VMHost -Location $_ | Select-Object -First 1 | ForEach-Object {
    Invoke-VMHostCommand -VMHost $_.Name -User $User -Password $Password -Command $Command
  }
}
$Report | Select-Object -ExpandProperty Output

It will also be easy to change this script to return all the .vmx files, as Hal would like to do in the first post of this thread.
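A minimal sketch of that change, assuming the Invoke-VMHostCommand function and the $Report pipeline from the script above are kept as they are. Only the pattern in $Command changes; quoting the pattern is an addition here, so that the remote shell does not expand it. ConvertFrom-VmfsPath is a hypothetical helper, not part of PowerCLI, that splits each returned path into its parts.

```powershell
# Same find command as above, with the pattern changed to .vmx files.
# Assumption: the single quotes are passed through plink to the remote shell.
$Command = "find /vmfs/volumes -iname '*.vmx'"

# Hypothetical helper: split /vmfs/volumes/<volume>/<folder>/<file> into an object.
# Note: <volume> may be a volume UUID rather than the friendly datastore name.
function ConvertFrom-VmfsPath {
  param([Parameter(ValueFromPipeline=$true)][string] $Path)
  process {
    if ($Path -match '^/vmfs/volumes/([^/]+)/(?:(.+)/)?([^/]+)$') {
      $row = "" | Select-Object Datastore, Folder, File
      $row.Datastore = $Matches[1]
      $row.Folder = $Matches[2]   # $null when the file sits in the volume root
      $row.File = $Matches[3]
      $row
    }
  }
}

# Sample path for illustration only
'/vmfs/volumes/datastore1/vm01/vm01.vmx' | ConvertFrom-VmfsPath
```

With the original script, the helper would be used as: $Report | Select-Object -ExpandProperty Output | ConvertFrom-VmfsPath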

Timing test of this script against the same environment: 0 hours 0 minutes 53 seconds.

Regards, Robert

Message was edited by: RvdNieuwendijk

Blog: https://rvdnieuwendijk.com/ | Twitter: @rvdnieuwendijk | Author of: https://www.packtpub.com/virtualization-and-cloud/learning-powercli-second-edition
Sirry
Enthusiast

Thanks Robert. Although the PowerCLI browsing issue remains, this is definitely an acceptable workaround for now. Thank you for your time!

ykalchev
VMware Employee

Hi,

A direct request to the vCenter Server is always faster than a PS provider command, and in your case preferable, since you can optimize your file search query and get the data in a single server call.

PowerShell providers, on the other hand, simplify usage through standard cmdlets and are very powerful in interactive mode, but they introduce an additional abstraction layer that does not allow us to optimize requests to the server file system in the best way.

In PowerCLI 4.1 we've continued to improve performance by resolving some issues with the Get-ChildItem cmdlet in PS 2.0, but it seems you've hit another issue, which we'll try to resolve.

So can you give us some details about your environment: the versions of Powershell, PowerCLI and vCenter server?

Here is a script that uses a single server call for each datastore file query. It runs in 9 seconds in our test environment with 7 datastores, compared to 3 minutes 45 seconds for your script using the datastore provider:

    $dsList = Get-View -ViewType Datastore -Property "summary.name", "browser"
    $report = @()

    foreach ($ds in $dsList)
    {
        $fileQueryFlags = New-Object VMware.Vim.FileQueryFlags
        $fileQueryFlags.FileSize = $true
        $fileQueryFlags.FileType = $true
        $fileQueryFlags.Modification = $true
        $searchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec
        $searchSpec.query += New-Object VMware.Vim.VmSnapshotFileQuery
        $searchSpec.details = $fileQueryFlags
        $dsBrowser = Get-View $ds.browser
        $rootPath = ""

        $searchResult = $dsBrowser.SearchDatastoreSubFolders($rootPath, $searchSpec)

        foreach ($folder in $searchResult)
        {
            foreach ($fileResult in $folder.File)
            {
                if ($fileResult.Path)
                {
                    $row = "" | Select DS, Path, File, Size, ModDate
                    $row.DS = $ds.summary.Name
                    $row.Path = $folder.FolderPath
                    $row.File = $fileResult.Path
                    $row.Size = $fileResult.FileSize
                    $row.ModDate = $fileResult.Modification
                    $report += $row
                }
            }
        }
    }

    $report
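
The same single-call pattern could probably be adapted to the .vmx search from the first post of this thread. The following is an untested sketch, assuming a PowerCLI session that is already connected with Connect-VIServer. It swaps in VMware.Vim.VmConfigFileQuery, the vSphere API file query type for virtual machine configuration files, and passes the datastore path in the "[name]" form, where the script above passes an empty string.

```powershell
# Sketch only: needs a connected PowerCLI session (Connect-VIServer).
$dsList = Get-View -ViewType Datastore -Property "summary.name", "browser"

$report = foreach ($ds in $dsList) {
    $searchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec
    # VmConfigFileQuery selects VM configuration (.vmx) files
    $searchSpec.Query += New-Object VMware.Vim.VmConfigFileQuery

    $dsBrowser = Get-View $ds.Browser
    # Assumption: "[datastoreName]" addresses the root of the datastore
    $rootPath = "[$($ds.Summary.Name)]"

    # One server call per datastore returns every matching file in all subfolders
    foreach ($folder in $dsBrowser.SearchDatastoreSubFolders($rootPath, $searchSpec)) {
        foreach ($fileResult in $folder.File) {
            $row = "" | Select-Object DS, Path, File
            $row.DS = $ds.Summary.Name
            $row.Path = $folder.FolderPath
            $row.File = $fileResult.Path
            $row
        }
    }
}

$report
```

Collecting the loop output directly into $report avoids growing an array with += on every row, which matters little for a handful of datastores but more for large inventories.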

Regards,

Yasen Kalchev

PowerCLI Dev Team

Yasen Kalchev, vSM Dev Team
EddieShen
Contributor

Edit: Sirry == EddieShen. Sorry about that :) I'll try to stick to Sirry from now on.

Thank you Yasen - this script is infinitely faster! Looks like I will be looking into constructing more searchSpec-style queries rather than using ps provider commands.

"So can you give us some details about your environment: the versions of Powershell, PowerCLI and vCenter server?"

PoSH: 2.0

PowerCLI: VMware vSphere PowerCLI 4.0 U1 build 208462

vCenter: 4.0.0 Build 208111

I will try PowerCLI 4.1 soon and see whether the old script's performance has improved, although your current solution is superior.

Thanks.
