Solved: APD down and PowerCLI report

zenivox · ‎01-24-2018

Hello, during a storage firmware upgrade I used the following script to keep track of path status. I ran it before, during and after the upgrade. Frankly, I expected to see paths go down, but I didn't. However, MS SQL clusters with shared RDMs lost their disks.

#Checks a group of hosts for potential zoning or san presentation inconsistencies. Not suitable for environments where HBAs are mapped in couples.
$dcs = Get-Datacenter myDC1,myDC2,myDC3
$vClusters =  Get-Cluster -Location $dcs | ?{$_.Name -notmatch "Maintenance"} | sort
$outReport = @()
foreach ($thisCluster in $vClusters)
{
        $scsiLunCount = 0
        foreach ($thisHost in ($thisCluster | get-vmhost))
        {
                write-host "Working on $($thisHost.name)..."
                foreach ($thisHBA in ($thisHost | get-vmhosthba | ? {$_.type -eq "FibreChannel"}))
                {       
                        $target = ((Get-View $thisHBA.VMhost).Config.StorageDevice.ScsiTopology.Adapter | where {$_.Adapter -eq $thisHBA.Key}).Target
                        $nrPaths = ($target | %{$_.Lun.Count} | Measure-Object -Sum).Sum
                        $outHBA = "" | select Cluster,Host,ExpectedLUNs,ActualLUNs,VMHBA,NumberOfPaths
                        $outHBA.VMHBA = $thisHBA.device
                        $outHBA.Cluster = $thisCluster.name
                        $outHBA.Host = $thisHost.name
                        $outHBA.ActualLUNs = ($thisHBA | get-scsiLUN).count
                        $outHBA.NumberOfPaths = $nrPaths
                        if ($scsiLunCount -ne 0)
                        {
                                if ($scsiLUNCount -ne $outHBA.ActualLUNs)
                                {
                                        #Bad condition
                                        write-error "$thisHost $($thisHBA.name) does not have $scsiLUNCount LUNs."
                                }
                        }
                        else
                        {
                                $scsiLUNCount = $outHBA.ActualLUNs
                        }
                        $outHBA.ExpectedLUNs = $scsiLunCount
                        $outReport += $outHBA
                }
        }
}
$outReport

During one storage array node reboot, the event viewer in the Web Client showed APD on some ESXi hosts. I would have expected the script to show me zero paths available. Is anything wrong with it?

LucD · ‎01-24-2018

Could it be that the APD status was only for a brief moment during the reboot?

It would be useful to check the events to see when and how long the APD status stayed active.
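To get the "how long" part, the APD start and exit events can be paired up. Below is a minimal sketch, not a tested production script. It assumes the events are objects with CreatedTime and EventTypeId properties (as returned by Get-VIEvent for EventEx events) and that the event type IDs are 'esx.problem.storage.apd.start' and 'esx.clear.storage.apd.exit'; please verify those IDs in your own event log. The function name Get-ApdDuration is made up for this example, and it treats the host as a whole rather than tracking each device separately.

```powershell
# Pair each APD "start" event with the next "exit" event to estimate
# how long the APD condition lasted.
# Assumption: the two event type IDs below match what your hosts log.
function Get-ApdDuration {
    param(
        [object[]] $EventList = @(),
        [string] $StartId = 'esx.problem.storage.apd.start',
        [string] $ExitId  = 'esx.clear.storage.apd.exit'
    )
    $openedAt = $null
    foreach ($e in ($EventList | Sort-Object CreatedTime)) {
        if ($e.EventTypeId -eq $StartId -and $null -eq $openedAt) {
            # First start event of a new APD interval
            $openedAt = $e.CreatedTime
        }
        elseif ($e.EventTypeId -eq $ExitId -and $null -ne $openedAt) {
            # Exit event closes the open interval
            [pscustomobject]@{
                Start    = $openedAt
                End      = $e.CreatedTime
                Duration = $e.CreatedTime - $openedAt
            }
            $openedAt = $null
        }
    }
    if ($null -ne $openedAt) {
        # APD started but no exit event was found in the supplied events
        [pscustomobject]@{ Start = $openedAt; End = $null; Duration = $null }
    }
}
```

An interval with an empty End means the exit event was not in the set you passed in, either because APD was still active or because the event had already aged out.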

Blog: lucd.info Twitter: @LucD22 Co-author PowerCLI Reference

zenivox · ‎01-24-2018

right again...
10 secs. So I'll have to prepare another script that runs and checks past events within a defined time frame. I hope I can reach the ESXi events through Get-View.

LucD · ‎01-24-2018

You can, but be aware that the events on an ESXi node are only kept for a limited time and disappear after a reboot.
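As an alternative to going through Get-View, the Get-VIEvent cmdlet can pull events for a host within a time frame. A rough sketch, assuming an active Connect-VIServer session and assuming the three APD event type IDs listed in the code are the ones your hosts log (check this against your own events). Get-ApdEvent is a hypothetical helper name, not a PowerCLI cmdlet.

```powershell
# List APD-related events for a host within a time window.
# Requires VMware PowerCLI and a connected session (Connect-VIServer).
# Assumption: the default event type IDs below are the APD ones in your environment.
function Get-ApdEvent {
    param(
        [Parameter(Mandatory)] $VMHost,
        [datetime] $Start  = (Get-Date).AddMinutes(-15),
        [datetime] $Finish = (Get-Date),
        [string[]] $EventTypeId = @(
            'esx.problem.storage.apd.start',
            'esx.problem.storage.apd.timeout',
            'esx.clear.storage.apd.exit'
        )
    )
    Get-VIEvent -Entity $VMHost -Start $Start -Finish $Finish -MaxSamples ([int]::MaxValue) |
        # Events that are not EventEx have no EventTypeId and are dropped here
        Where-Object { $_.EventTypeId -in $EventTypeId } |
        Sort-Object CreatedTime |
        Select-Object CreatedTime, EventTypeId, FullFormattedMessage
}
```

Usage would be something like `Get-VMHost | ForEach-Object { Get-ApdEvent -VMHost $_ }`, with -Start and -Finish adjusted to the window you want to inspect.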

zenivox · ‎01-24-2018

Thanks Luc. I think if I grab the events from the past 30 min and run the script every 30 min (scheduled) during the upgrade (which lasts several hours), I'll be able to catch the events that allow me to raise the flag if needed.

cheers

LucD · ‎01-24-2018

In my experience, go for 15 mins.
I often see that the events are kept for around 15 mins, but your mileage may vary.
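If the script polls on a schedule, one way to avoid gaps is to look back a bit further than the polling interval and drop events already seen in an earlier run. A small sketch of that de-duplication, assuming each event carries an integer Key property (as the vSphere Event object does) and that keys stay unique for the duration of the upgrade. Select-NewEvent is a hypothetical helper, and the 20-minute lookback in the commented loop is only an example value.

```powershell
# Keys of events that earlier polls have already reported
$seenKeys = New-Object 'System.Collections.Generic.HashSet[int]'

# Return only the events whose Key has not been seen before.
function Select-NewEvent {
    param(
        [object[]] $EventList = @(),
        [System.Collections.Generic.HashSet[int]] $Seen
    )
    foreach ($e in $EventList) {
        # HashSet.Add returns $true only the first time a key is added
        if ($Seen.Add([int]$e.Key)) { $e }
    }
}

# Hypothetical polling loop, commented out because it needs a live session:
# while ($true) {
#     $events = Get-VIEvent -Entity (Get-VMHost) -Start (Get-Date).AddMinutes(-20) -MaxSamples ([int]::MaxValue)
#     Select-NewEvent -EventList $events -Seen $seenKeys |
#         Where-Object { $_.EventTypeId -like '*storage.apd*' }
#     Start-Sleep -Seconds (15 * 60)
# }
```

The overlap between the lookback and the interval means an event that arrives right at a poll boundary still gets picked up by the next run, and the key set keeps it from being reported twice.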
