VMware Cloud Community
rbglhec
Contributor

Orphaned Files/Folders

I am trying to edit LucD's spring cleaning datastore script (below) to do a few things and need some help.

These are the things I'd like to do:

1. Report on all files and folders on the datastore, not just vmdks.

2. Support NFS datastores. The existing check, if($ds.ExtensionData.Summary.MultipleHostAccess), should work.

3. Filter out .snapshot NetApp folders in addition to other "." prefixed system folders like HA, StorageIO, etc. (Would be nice to support regex filters.)

4. Exclude SRM and CBT folders/files.

function Remove-OrphanedData {

<#

.SYNOPSIS   Remove orphaned folders and VMDK files

.DESCRIPTION   The function searches for orphaned folders and VMDK files

   on one or more datastores and reports its findings.

   Optionally the function removes the orphaned folders and VMDK files

.NOTES   Author:  Luc Dekens

.PARAMETER Datastore

   One or more datastores.

   The default is to investigate all shared VMFS datastores

.PARAMETER Delete

   A switch that indicates if you want to remove the folders

   and VMDK files

.EXAMPLE

   PS> Remove-OrphanedData -Datastore ds1

.EXAMPLE

  PS> Get-Datastore ds* | Remove-OrphanedData

.EXAMPLE

  PS> Remove-OrphanedData -Datastore $ds -Delete

#>

  [CmdletBinding(SupportsShouldProcess=$true)]

  param(

  [parameter(Mandatory=$true,ValueFromPipeline=$true)]

  [PSObject[]]$Datastore,

  [switch]$Delete

  )

  begin{

    $fldList = @{}

    $hdList = @{}

    $fileMgr = Get-View FileManager

  }

  process{

    foreach($ds in $Datastore){

      if($ds.GetType().Name -eq "String"){

        $ds = Get-Datastore -Name $ds

      }

      if($ds.ExtensionData.Summary.MultipleHostAccess){

        Get-VM -Datastore $ds | %{

          $_.Extensiondata.LayoutEx.File | where{"diskDescriptor","diskExtent" -contains $_.Type} | %{

            $fldList[$_.Name.Split('/')[0]] = $_.Name

            $hdList[$_.Name] = $_.Name

          }

        }

        Get-Template | where {$_.DatastoreIdList -contains $ds.Id} | %{

          $_.Extensiondata.LayoutEx.File | where{"diskDescriptor","diskExtent" -contains $_.Type} | %{

            $fldList[$_.Name.Split('/')[0]] = $_.Name

            $hdList[$_.Name] = $_.Name

          }

        }

        $dc = $ds.Datacenter.Extensiondata

        $flags = New-Object VMware.Vim.FileQueryFlags

        $flags.FileSize = $true

        $flags.FileType = $true

        $disk = New-Object VMware.Vim.VmDiskFileQuery

        $disk.details = New-Object VMware.Vim.VmDiskFileQueryFlags

        $disk.details.capacityKb = $true

        $disk.details.diskExtents = $true

        $disk.details.diskType = $true

        $disk.details.thin = $true

        $searchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec

        $searchSpec.details = $flags

        $searchSpec.Query += $disk

        $searchSpec.sortFoldersFirst = $true

        $dsBrowser = Get-View $ds.ExtensionData.browser

        $rootPath = "[" + $ds.Name + "]"

        $searchResult = $dsBrowser.SearchDatastoreSubFolders($rootPath, $searchSpec)

        foreach($folder in $searchResult){

          if($fldList.ContainsKey($folder.FolderPath.TrimEnd('/'))){

            foreach ($file in $folder.File){

              if(!$hdList.ContainsKey($folder.FolderPath + $file.Path)){

                New-Object PSObject -Property @{

                  Folder = $folder.FolderPath

                  Name = $file.Path

                  Size = $file.FileSize

                  CapacityKB = $file.CapacityKb

                  Thin = $file.Thin

                  Extents = [string]::Join(',',($file.DiskExtents))

                }

                if($Delete){

                  If ($PSCmdlet.ShouldProcess(($folder.FolderPath + " " + $file.Path),"Remove VMDK")){

                    $dsBrowser.DeleteFile($folder.FolderPath + $file.Path)

                  }

                }

              }

            }

          }

          elseif($folder.File | where {"cos.vmdk","esxconsole.vmdk" -notcontains $_.Path}){

            $folder.File | %{

              New-Object PSObject -Property @{

                Folder = $folder.FolderPath

                Name = $_.Path

                Size = $_.FileSize

                CapacityKB = $_.CapacityKB

                Thin = $_.Thin

                Extents = [String]::Join(',',($_.DiskExtents))

              }

            }

            if($Delete){

              if($folder.FolderPath -eq $rootPath){

                $folder.File | %{

                  If ($PSCmdlet.ShouldProcess(($folder.FolderPath + " " + $_.Path),"Remove VMDK")){

                    $dsBrowser.DeleteFile($folder.FolderPath + $_.Path)

                  }

                }

              }

              else{

                If ($PSCmdlet.ShouldProcess($folder.FolderPath,"Remove Folder")){

                  $fileMgr.DeleteDatastoreFile($folder.FolderPath,$dc.MoRef)

                }

              }

            }

          }

        }

      }

    }

  }

}

LucD
Leadership

Nice additions to the function, and it is the right time of the year (at least in the Northern Hemisphere).

1) You will need to add the other file types to the Query property of the HostDatastoreBrowserSearchSpec object.

The FileQuery object shows what other file types are available for the search.

But note that you would also need to adapt the Where-clause that looks at the LayoutEx.File entries for each VM.

Right now it only includes VMDK-related files.

2) That is correct; this should capture both VMFS and NFS datastores.

3) The current function already filters out console files.

That Where-clause can be extended to include other patterns.

And yes, a RegEx is a good idea.

4) For those you need to give the characteristics by which these files can be identified.
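
As a rough illustration of the RegEx idea in point 3, here is a minimal sketch of an exclusion filter. Test-ExcludedPath is a hypothetical helper name and the default patterns are assumptions: the first skips any path component that starts with a period (.snapshot, .vSphere-HA and the like), the second skips CBT files, which end in -ctk.vmdk. SRM files are not covered, since the characteristics that identify them have not been given here.

```powershell
# Hypothetical helper: returns $true when a datastore path should be skipped.
function Test-ExcludedPath {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path,

        # Assumed default patterns; adjust them to your environment.
        [string[]]$ExcludePattern = @(
            '(^|[ /])\.[^/]',   # a path component that starts with "."
            '-ctk\.vmdk$'       # Changed Block Tracking (CBT) files
        )
    )
    foreach ($pattern in $ExcludePattern) {
        if ($Path -match $pattern) { return $true }
    }
    return $false
}

# Sample paths in the "[datastore] folder/file" form returned by the datastore browser
$paths = @(
    '[ds1] .snapshot/',
    '[ds1] .vSphere-HA/',
    '[ds1] vm1/vm1.vmdk',
    '[ds1] vm1/vm1-ctk.vmdk',
    '[ds1] oldvm/oldvm.vmdk'
)

# Keep only the paths that no pattern excludes
$kept = @($paths | Where-Object { -not (Test-ExcludedPath -Path $_) })
$kept
```

Anchoring the period to the start of a path component avoids throwing away every file that merely has an extension, which a bare '[.]' pattern would do.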


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

rbglhec
Contributor

Just a thought: wouldn't an easier approach be to list the folders referenced by the VMs and then do a folder search on the datastore? Any folder that isn't in the list of referenced folders can be cleaned. The exclusions would be ctk files, SRM files, and system folders. I have the base script, but I can't figure out how to compare the 2 arrays.

$Datastores = "Datastore1"

foreach($Datastore in Get-Datastore $Datastores) {

   $ds = Get-Datastore -Name $Datastore | %{Get-View $_.Id}

   $SearchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec

   $SearchSpec.matchpattern = "*"

   $dsBrowser = Get-View $ds.browser

   $DatastorePath = "[" + $ds.Summary.Name + "]"

   # Find all folder paths in the datastore, excluding any path that contains a period (filters out system and NetApp snapshot folders).

   $arrSearchResult = $dsBrowser.SearchDatastoreSubFolders($DatastorePath, $SearchSpec) | where {$_.FolderPath -notmatch '[.]'} | %{$_.FolderPath }

}

# First LayoutEx file entry of each VM on the datastore

$arrDSVMs = Get-VM -Datastore "Datastore1" | %{

          $_.Extensiondata.LayoutEx.File | Select-object -first 1 | select Name }

How do I compare these 2 arrays and only list the DS folders that are not referenced by a VM? This also doesn't look at templates, but I am guessing you could do Get-Template -Datastore... to get that array as well.

LucD
Leadership

You could only look at folders, but then you will only find orphaned folders.

That will not find orphaned VMDKs in a VM's folder.

When I need to compare arrays, I usually place at least one of them in a hash table.

Then you can loop through the 2nd array, and use the ContainsKey method on the hash table to verify if the element is in that array as well.

With a hash table you can also remove entries.

For example:

$vmFolders | %{

     if($allFolders.ContainsKey($_.Name)){

        $allFolders.Remove($_.Name)

    }

}

What is left in $allFolders are then the folders that are not used by any VM.
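
To make the hash table approach concrete, here is a small self-contained sketch that runs without a vCenter connection. The folder names in $dsFolders and $vmFolders are made-up sample data standing in for the real search results and VM folder list.

```powershell
# Sample data (assumed): folders found on the datastore and folders referenced by VMs
$dsFolders = '[ds1] vm1', '[ds1] vm2', '[ds1] oldvm', '[ds1] testvm'
$vmFolders = '[ds1] vm1', '[ds1] vm2'

# Load every datastore folder into a hash table keyed by its path
$allFolders = @{}
foreach ($folder in $dsFolders) {
    $allFolders[$folder] = $true
}

# Remove each folder that is referenced by a VM
foreach ($folder in $vmFolders) {
    if ($allFolders.ContainsKey($folder)) {
        $allFolders.Remove($folder)
    }
}

# Whatever is left was not referenced by any VM
$orphaned = @($allFolders.Keys | Sort-Object)
$orphaned
```

A hash table created with @{} compares string keys case-insensitively, so differences in letter case between the two lists will not cause false orphans.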


rbglhec
Contributor

Here is what I have so far. I am still having a problem comparing the lists. I also started exporting them to CSV files so I can compare them in Excel. It would be nice to end up with an Excel-based folder tree of the selected datastore, with the orphaned files/folders highlighted using conditional formatting. I think it would give you a nice picture of your datastore topology. Any suggestions on comparing the lists and getting them into a datastore tree listing in Excel would be appreciated.

Clear

$arrVMfiles = @{}

$arrDSfiles = @{}

#-------------------------------------

# Get datastore folder and file list

#-------------------------------------

Function getDSInfo {

Param($Datastores)

$excludePathRegex ="[.]" # RegEx Search Path Exclusion

foreach($Datastore in Get-Datastore $Datastores)

{

   $ds = Get-Datastore -Name $Datastore | %{Get-View $_.Id}

   $SearchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec

   $SearchSpec.matchpattern = "*"

   $dsBrowser = Get-View $ds.browser

   $DatastorePath = "[" + $ds.Summary.Name + "]"

   # Search inside the loop so that every datastore is scanned, not only the last one

   $dsSearchResult = $dsBrowser.SearchDatastoreSubFolders($DatastorePath, $SearchSpec) | where {$_.FolderPath -notmatch $excludePathRegex}

   foreach($folder in $dsSearchResult){

     foreach ($file in $folder.File){

       $fileFullPath = $folder.FolderPath + $file.Path

       If ($fileFullPath -match '[.]') {

         # Key = path after "[datastore] ", value = "[datastore]" (assumes no spaces in the path)

         $arrDSfiles[($fileFullPath.Split(" ")[1])] = $fileFullPath.Split(" ")[0]

       }

       Else {

         $arrDSfiles[$folder.FolderPath] = $folder.FolderPath

       }

     }

   }

}

}

#-------------------------------------

# Get VM file list

#-------------------------------------

Function getVMFiles {

Param ($Datastore)

Remove-VIProperty -Name * -ObjectType *

New-VIProperty -ObjectType VirtualMachine -Name DSFileList -ValueFromExtensionProperty 'LayoutEx.File.Name'

$vmDSInfo = Get-VM -Datastore $Datastore

$vmDSInfo.DSFileList | % {

$arrVMfiles[($_.Split(" ")[1])] = ($_.Split(" ")[0])

}

}

#-------------------------------------

# Main

#-------------------------------------

getVMFiles "datastore1"

$arrVMfiles.GetEnumerator() | Sort-Object -Property Name -Descending |

Select-Object Name, Value | Export-Csv -Path C:\temp\arrVMFiles.csv -NoTypeInformation

getDSInfo "datastore1"

$arrDSfiles.GetEnumerator() | Sort-Object -Property Name -Descending |

Select-Object Name, Value | Export-Csv -Path c:\temp\arrDSFiles.csv -NoTypeInformation
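
One possible way to get both lists into Excel side by side is a single CSV with an Orphaned column that conditional formatting can key off. The sketch below is only an illustration: $dsFiles and $vmFiles are made-up sample tables in the same shape as $arrDSfiles and $arrVMfiles (path as key, datastore as value), and the column names are my own choice.

```powershell
# Sample data (assumed), same shape as $arrDSfiles / $arrVMfiles: path -> datastore
$dsFiles = @{
    'vm1/vm1.vmdk'     = '[datastore1]'
    'vm1/vm1.vmx'      = '[datastore1]'
    'oldvm/oldvm.vmdk' = '[datastore1]'
}
$vmFiles = @{
    'vm1/vm1.vmdk' = '[datastore1]'
    'vm1/vm1.vmx'  = '[datastore1]'
}

# One row per datastore file, flagged when no VM references it
$report = foreach ($path in ($dsFiles.Keys | Sort-Object)) {
    [pscustomobject]@{
        Datastore = $dsFiles[$path]
        Folder    = ($path -split '/')[0]
        File      = ($path -split '/')[-1]
        Orphaned  = -not $vmFiles.ContainsKey($path)
    }
}

# Convert to CSV text; pipe $report to Export-Csv instead to write a file
$csv = $report | ConvertTo-Csv -NoTypeInformation
$csv
```

Sorting by path keeps the files of each folder together, so the sheet reads roughly like a flattened folder tree.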

rbglhec
Contributor

Ok, roadblock! It seems that searching with a wildcard ($SearchSpec.matchpattern = "*") on a datastore with a lot of VMs (NFS) gives you an error:

"A general system error occurred: Directory too large to search."

This seems like a huge limitation of SearchDatastoreSubFolders. I've also found that it times out if the search takes too long. Are there any other ways to search the datastore for all files?

LucD, have you had this occur using your orphaned VMDK script with large NFS datastores? It seems like this would happen with any large datastore containing many files. Is there a way to get just the root folder list of a datastore and not the entire directory structure?

rbglhec
Contributor

Figured it out. My code is messy but it works. If someone with better coding skills can help optimize this that would be great.

<#

.SYNOPSIS   Lists root folders in datastores not referenced by a VM.

.DESCRIPTION   The function searches orphaned folders

   on one or more datastores and reports its findings.

  (** Please verify folder contents before deleting! **)

.NOTES   Author:  Rick Bankers

.PARAMETER Datastore

   One or more datastores.

   The default is to investigate all datastores

  ($excludePathRegex = RegEx path exclusions applied below.)

  ($excludeDSRegex = RegEx datastore exclusions applied below.)

.EXAMPLE

   PS> Get-DSOrphanedFlds datastore1, datastore2

.EXAMPLE

   PS> Get-DSOrphanedFlds

#>

  param(

  [parameter(Mandatory=$false,ValueFromPipeline=$true)]

  [PSObject[]]$Datastores

  )

#-------------------------------------

# Get datastore root folder list

#-------------------------------------

Function getDSFolders {

Param($Datastores)

Foreach($Datastore in Get-Datastore $Datastores)

{

   $ds = Get-Datastore -Name $Datastore | %{Get-View $_.Id}

   $SearchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec

   $folderquery = New-Object VMware.Vim.FolderFileQuery

   $searchSpec.Query = $folderquery

   $dsBrowser = Get-View $ds.browser

   $DatastorePath = "[" + $ds.Summary.Name + "]"

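   # Search only the datastore root (SearchDatastore is not recursive), inside the loop so every datastore is processed

   $dsSearchResult = $dsBrowser.SearchDatastore($DatastorePath, $SearchSpec)

   $dsSearchResult | % { $_.File.Path | % {

     If ($_ -notmatch $excludePathRegex)

     {

       # Key is "[datastore] folder", matching the keys built in getVMFolders

       $arrDSfolders[([string]::join(' ', $DatastorePath,$_))] = $DatastorePath

     }

   }}

}

}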

#-------------------------------------

# Get VM file list

#-------------------------------------

Function getVMFolders {

Remove-VIProperty -Name * -ObjectType *

New-VIProperty -ObjectType VirtualMachine -Name DSFileList -ValueFromExtensionProperty 'LayoutEx.File.Name'

$vmDSInfo = Get-VM

  $vmDSInfo.DSFileList | % {

  $arrVMfolders[($_.Split("/")[0])] = ($_.Split(" ")[0]) | Select-Object -First 1

  }

New-VIProperty -ObjectType Template -Name DSFileList -ValueFromExtensionProperty 'LayoutEx.File.Name'

$vmDSInfo = Get-Template

  If ($vmDSInfo) {

  $vmDSInfo.DSFileList | % {

  $arrVMfolders[($_.Split("/")[0])] = ($_.Split(" ")[0]) | Select-Object -First 1

  }}

}

#-------------------------------------

# Set variables

#-------------------------------------

$excludePathRegex = "[\.]" # RegEx Search Path Exclusion

$excludeDSRegex = "local" # RegEx Search DS Exclusion

$arrVMfolders = @{}

$arrDSfolders = @{}

#-------------------------------------

# Main

#-------------------------------------

  If ($Datastores)

  {

  $Datastores = Get-Datacenter | Get-Datastore -Name $Datastores

  }

  Else

  {

  $Datastores = Get-Datacenter | Get-Datastore | Where-Object { $_.Name -notmatch $excludeDSRegex} | Select Name

  }

  $Datastores | % {

  getDSFolders $_.Name

  }

  getVMFolders

  Foreach ($key in $arrVMfolders.Keys) {

  If ($arrDSfolders.ContainsKey($key)) {

        $arrDSfolders.Remove($key)

  }

  }

$arrDSfolders | Format-table -AutoSize
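
For anyone who wants to tidy this up, the comparison logic can be pulled into two small functions that are testable without a vCenter connection. This is only a sketch under assumptions: Get-RootFolderKey and Get-OrphanedFolder are hypothetical names, and the sample strings imitate the "[datastore] folder/file" form of the LayoutEx file names.

```powershell
# Hypothetical helper: "[ds] folder/sub/file" -> "[ds] folder"
function Get-RootFolderKey {
    param([Parameter(Mandatory = $true)][string]$FilePath)
    ($FilePath -split '/', 2)[0]
}

# Hypothetical helper: returns the datastore root folders that no VM file lives in
function Get-OrphanedFolder {
    param(
        [string[]]$DatastoreFolder,
        [string[]]$VMFile
    )
    # Collect the root folder of every file that belongs to a VM or template
    $referenced = @{}
    foreach ($file in $VMFile) {
        $referenced[(Get-RootFolderKey -FilePath $file)] = $true
    }
    $DatastoreFolder | Where-Object { -not $referenced.ContainsKey($_) }
}

# Sample data (assumed)
$dsFolders = '[datastore1] vm1', '[datastore1] vm2', '[datastore1] oldvm'
$vmFiles   = '[datastore1] vm1/vm1.vmx',
             '[datastore1] vm1/vm1.vmdk',
             '[datastore1] vm2/vm2.vmx'

$orphans = @(Get-OrphanedFolder -DatastoreFolder $dsFolders -VMFile $vmFiles)
$orphans
```

Splitting on the first "/" only keeps folder names that contain spaces intact, which the Split(" ") calls above do not.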
