We have an environment (500+ VMs) where processes often hang. One indication we have observed is stale (flat) CPU utilization over a period of time.
Say a VM shows a steady 40% CPU utilization over a day: it is safe to assume that a bad process is running, because an idle process should consume very little CPU.
That said, is it possible to query the VMs and return those with stale CPU utilization? I was thinking of using Get-Stat and looking for non-zero values, but that doesn't seem to work. Any thoughts?
This will report all VMs that have a daily average CPU usage over 40%
$vms = Get-VM
$stat = 'cpu.usage.average'
$start = (Get-Date).AddDays(-1)
$threshold = 40
Get-Stat -Entity $vms -Stat $stat -Start $start |
Group-Object -Property EntityId | %{
New-Object PSObject -Property @{
VM = $_.Group[0].Entity.Name
CPU = [math]::Round(($_.Group | Where {$_.Instance -eq ''} | Measure-Object -Property Value -Average | Select -ExpandProperty Average),1)
}
} |
where {$_.CPU -ge $threshold} |
Select VM,CPU
Blog: lucd.info Twitter: @LucD22 Co-author PowerCLI Reference
Is it possible to capture utilization that was constant in a fixed duration?
Yes:
For example: 2pm = 20%, 3pm = 20%, 4pm = 20%, 5pm = 20%
No:
For example: 2pm = 0%, 3pm = 40%, 4pm = 0%, 5pm = 40%
Let me see if I get this right: you want hourly intervals, and then only report the VMs that have the same value for all the intervals?
As a matter of fact, this is a very interesting question that touches on the world of statistics, one of my favorite subjects.
Since your cpu.usage.average values for each hour will hardly ever be exactly the same, you need a value that expresses how close together the values are.
In statistics, the standard deviation is the usual measure for this.
In the following script I use a function based on the Welford algorithm. As written, it returns the sample variance, which is the square of the standard deviation.
The script applies a threshold of 5 to that value to determine whether the CPU percentages are "close" together, which indicates a rather constant CPU usage.
Play around with the threshold.
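# Welford's single-pass algorithm: keeps a running mean and a running
# sum of squared differences (M2) without storing intermediate totals.
# Note: [int[]] rounds each sample to a whole number before the calculation.
function Get-WelfordStdVar{
  param([int[]]$Samples)
  $n = $mean = $M2 = 0
  $Samples | %{
    $n++
    $delta = $_ - $mean
    $mean += ($delta/$n)
    $M2 += ($delta*($_ - $mean))
  }
  # Fewer than two samples: no spread can be calculated
  if($n -lt 2){0}
  else{
    # Sample variance (the square of the standard deviation)
    $M2/($n-1)
  }
}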
$vms = Get-VM
$stat = 'cpu.usage.average'
$start = (Get-Date).AddDays(-1)
$threshold = 5
Get-Stat -Entity $vms -Stat $stat -Start $start |
Group-Object -Property EntityId,{$_.Timestamp.Hour} | %{
New-Object PSObject -Property @{
VM = $_.Group[0].Entity.Name
Hour = $_.Group[0].Timestamp.Hour
CPU = [math]::Round(($_.Group | Where {$_.Instance -eq ''} | Measure-Object -Property Value -Average | Select -ExpandProperty Average),1)
}
} |
Group-Object VM | %{
$values = $_.Group | Sort-Object -Property Hour | %{$_.CPU}
$stdvar = Get-WelfordStdVar -Samples $values
if($stdvar -le $threshold){
$_ | Select Name,@{N='StdVar';E={[math]::Round($stdvar,2)}},@{N='CPU';E={[string]::Join('/',$values)}}
}
}
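To get a feel for the threshold, here is a minimal sketch that runs the same Welford update on the two example series from earlier in the thread (the constant "Yes" case and the alternating "No" case). It needs no vCenter connection. The function name Get-SampleVariance and the [double[]] parameter are assumptions made for this illustration; they are not part of the script above, which rounds its samples to integers.

```powershell
# Illustrative sketch, not part of the script above.
# Get-SampleVariance is a hypothetical name; it applies the same Welford
# update but accepts doubles, so fractional percentages are not rounded.
function Get-SampleVariance {
    param([double[]]$Samples)
    $n = 0; $mean = 0.0; $M2 = 0.0
    foreach ($x in $Samples) {
        $n++
        $delta = $x - $mean
        $mean += $delta / $n
        $M2 += $delta * ($x - $mean)
    }
    # Sample variance; 0 when there are fewer than two samples
    if ($n -lt 2) { 0 } else { $M2 / ($n - 1) }
}

$flat  = 20, 20, 20, 20   # the "Yes" example: constant usage
$spiky = 0, 40, 0, 40     # the "No" example: alternating usage

$flatVar  = Get-SampleVariance -Samples $flat
$spikyVar = Get-SampleVariance -Samples $spiky

# Show the variance and its square root (the standard deviation) side by side
"flat : variance = {0:N2}, std dev = {1:N2}" -f $flatVar, [math]::Sqrt($flatVar)
"spiky: variance = {0:N2}, std dev = {1:N2}" -f $spikyVar, [math]::Sqrt($spikyVar)
```

The constant series gives a variance of 0, while the alternating series gives about 533.33 (a standard deviation of about 23.09), so a threshold of 5 separates the two cases clearly.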
"the same value for all the intervals" - Yes. This is a symptoms of hung process therefore we are looking to correlate it with vmware.
I know a bit of R but don't know how to do this in the PowerShell world. Will take a look.
Very interesting to compute this using Welford's algorithm.
I have run this for a couple of days and found it extremely useful!
Can you suggest where I should put a filter to only show VMs whose $stat is greater than 20, so that idle VMs are filtered out?
Name StdVar CPU
---- ------ ---
vm1 0.14 0.5/0.5/1.4/0.5/0.5/0.6/0.5/0.5/0.5/0.5/0.5/0.5/0.5/1.4/0.5/0.5/0.5/0.5/0.5/1.4/0.5/0.5/0.5/0.5
Thanks!
Try it like this; it adds a condition to check that the average CPU usage is above 20%.
Group-Object VM | %{
$values = $_.Group | Sort-Object -Property Hour | %{$_.CPU}
$avg = $_.Group | Measure-Object -Property CPU -Average | Select -ExpandProperty Average
$stdvar = Get-WelfordStdVar -Samples $values
if($avg -gt 20 -and $stdvar -le $threshold){
$_ | Select Name,@{N='StdVar';E={[math]::Round($stdvar,2)}},@{N='CPU';E={[string]::Join('/',$values)}}
}
}
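The combined filter can be tried without a vCenter connection. The following is a rough sketch on made-up data: the VM names vmA, vmB and vmC, their CPU values, and the helper Get-SampleVariance are all hypothetical and only stand in for the hourly averages that Get-Stat would deliver.

```powershell
# Illustrative sketch with made-up data; nothing here talks to vCenter.
# Get-SampleVariance is a hypothetical helper using the Welford update.
function Get-SampleVariance {
    param([double[]]$Samples)
    $n = 0; $mean = 0.0; $M2 = 0.0
    foreach ($x in $Samples) {
        $n++
        $delta = $x - $mean
        $mean += $delta / $n
        $M2 += $delta * ($x - $mean)
    }
    if ($n -lt 2) { 0 } else { $M2 / ($n - 1) }
}

$threshold = 5
$samples = @{
    vmA = 40.2, 40.5, 39.8, 40.1   # busy and flat: should be reported
    vmB = 0.5, 0.6, 0.5, 1.4       # flat but idle: dropped by the average test
    vmC = 0, 80, 5, 90             # busy but varying: dropped by the spread test
}

# Build one object per VM and hour, the shape the grouping step expects
$rows = foreach ($name in $samples.Keys) {
    $hour = 14
    foreach ($cpu in $samples[$name]) {
        New-Object PSObject -Property @{ VM = $name; Hour = $hour; CPU = $cpu }
        $hour++
    }
}

$report = $rows | Group-Object VM | ForEach-Object {
    $values = $_.Group | Sort-Object -Property Hour | ForEach-Object { $_.CPU }
    $avg = $_.Group | Measure-Object -Property CPU -Average | Select-Object -ExpandProperty Average
    $var = Get-SampleVariance -Samples $values
    # Report only VMs that are both busy (average) and flat (spread)
    if ($avg -gt 20 -and $var -le $threshold) {
        $_ | Select-Object Name, @{N='Avg';E={[math]::Round($avg,1)}}, @{N='Variance';E={[math]::Round($var,2)}}
    }
}
$report
```

With this data only vmA passes both conditions: vmB fails the average test and vmC fails the spread test.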
Works perfectly. Thanks a lot, LucD!