VMware Cloud Community
ae_jenkins
Enthusiast

ESX Host Down Time

Hi

I am working on a script that shows the total downtime of an ESX host for a month, using "Get-VMHost HOST | Get-Stat -Stat sys.uptime.latest".  I plan on keying off the Value field (in seconds) to roughly calculate how long the server has been down during the month.  It's complicated because sys.uptime.latest is of course cumulative: any daily increase smaller than 86400 (the number of seconds in a day) would indicate an outage happened at some point.

Before I go too deeply into this route, does anybody have any suggestions?

5 Replies
LucD
Leadership

I'm not sure that sys.uptime.latest is the ideal metric for finding the system uptime over a month.

This metric shows the seconds since the last system reboot.

A better way could be to use the events.

Something like this will show all connection changes over 1 month:

$esx = Get-VMHost MyEsx
Get-VIEvent -Entity $esx -MaxSamples 99999 -Start (Get-Date).AddDays(-31) |
where {"HostConnectionLostEvent","HostConnectedEvent" -contains $_.GetType().Name} |
Sort-Object -Property {$_.CreatedTime.DateTime} -Unique |
Select @{N="Hostname";E={$_.Host.Name}},
       @{N="Time";E={$_.CreatedTime.ToShortDateString() + " " + $_.CreatedTime.ToShortTimeString()}},
       @{N="Status";E={if($_.GetType().Name -eq "HostConnectedEvent"){"Connected"}else{"Not connected"}}}

Note that a reconnection event from a host often appears twice in the events. For that reason I added the -Unique parameter to the Sort-Object cmdlet.

From the list this produces it should be trivial to calculate the total uptime.
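
A minimal sketch of that calculation, assuming the Time column is kept as a [datetime] value (for example E={$_.CreatedTime}) rather than the formatted string used above. The function name Get-DowntimeFromEvents, its -End parameter and the sample data are made up for illustration. Note that this measures time disconnected from vCenter, which is not necessarily time the host was powered off.

```powershell
# Hypothetical helper: sums the time between each "Not connected" event
# and the next "Connected" event. Assumes Time is a [datetime].
function Get-DowntimeFromEvents {
    param(
        [object[]] $Events,              # objects with Time and Status properties
        [datetime] $End = (Get-Date)     # end of the reporting window
    )
    $total  = [timespan]::Zero
    $lostAt = $null
    foreach ($e in ($Events | Sort-Object -Property Time)) {
        if ($e.Status -eq 'Not connected') {
            # Keep only the first loss event of an outage
            if ($null -eq $lostAt) { $lostAt = $e.Time }
        }
        elseif ($null -ne $lostAt) {
            # First "Connected" after a loss closes the outage;
            # repeated "Connected" events are ignored
            $total += $e.Time - $lostAt
            $lostAt = $null
        }
    }
    # Host still disconnected at the end of the window
    if ($null -ne $lostAt) { $total += $End - $lostAt }
    $total
}

# Made-up sample data in the shape produced by the Select above
$sample = @(
    [pscustomobject]@{ Time = [datetime]'2011-03-01 10:00'; Status = 'Not connected' },
    [pscustomobject]@{ Time = [datetime]'2011-03-01 10:30'; Status = 'Connected' },
    [pscustomobject]@{ Time = [datetime]'2011-03-01 10:30'; Status = 'Connected' },
    [pscustomobject]@{ Time = [datetime]'2011-03-02 12:00'; Status = 'Not connected' },
    [pscustomobject]@{ Time = [datetime]'2011-03-02 12:15'; Status = 'Connected' }
)
$downtime = Get-DowntimeFromEvents -Events $sample -End ([datetime]'2011-03-31 00:00')
$downtime.TotalMinutes   # 45 for the sample data
```

Subtracting the result from the length of the reporting window gives the connected time for the month.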


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

ae_jenkins
Enthusiast

Hey LucD

No surprise you were the first to answer 😉

This is ingenious as usual, and helpful, but to me the disconnected time isn't true downtime.  If the host is disconnected it *could* still be running with the VMs all isolated.  I'm defining downtime here as the time when the ESX host was completely down.

I'm working on my script to post.

LucD
Leadership

True, but if your ESX(i) is disconnected for a longer time, you will lose some statistical data as well.

There is only a relatively small amount of performance data cached on the ESX(i) server afaik.

The only other alternative that I can think of is to use SNMP traps.

Your monitoring server will receive the traps when the ESX(i) is booted, regardless of whether it is connected to a vCenter or not.

But that won't be PowerCLI anymore I'm afraid.


ae_jenkins
Enthusiast

LucD

The attached script isn't very tight, but you get the idea of what I'm trying to do.  The systems are ESX, not ESXi, but we will end up using a syslog server for our ESXi systems.  The report will obviously change, but this is kind of what I was looking for.  Suggestions?

LucD
Leadership

Great. If you can live with a 1-day granularity, the script seems to work.
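
The attached script is not reproduced in this thread. As a rough sketch of the counter-reset idea from the original question, assuming daily samples with Timestamp and Value properties like those returned by Get-Stat for sys.uptime.latest: whenever the uptime value drops between two samples, the host rebooted somewhere in that interval. The function name Get-DowntimeFromUptimeSamples and the sample data are made up for illustration. The result is only an upper bound, because the host may have kept running for a while after the earlier sample, and several reboots inside one interval are seen as one.

```powershell
# Hypothetical helper: estimates downtime from a cumulative uptime counter.
# A drop in Value between two samples means a reboot happened in between;
# the gap not covered by the new uptime is counted as (at most) downtime.
function Get-DowntimeFromUptimeSamples {
    param([object[]] $Samples)   # objects with Timestamp ([datetime]) and Value (seconds)

    $sorted = @($Samples | Sort-Object -Property Timestamp)
    $downSeconds = 0
    for ($i = 1; $i -lt $sorted.Count; $i++) {
        $prev = $sorted[$i - 1]
        $cur  = $sorted[$i]
        if ($cur.Value -lt $prev.Value) {
            $elapsed = ($cur.Timestamp - $prev.Timestamp).TotalSeconds
            $downSeconds += [math]::Max(0, $elapsed - $cur.Value)
        }
    }
    [timespan]::FromSeconds($downSeconds)
}

# Made-up daily samples: the counter resets between March 2 and March 3
$samples = @(
    [pscustomobject]@{ Timestamp = [datetime]'2011-03-01'; Value = 864000 },
    [pscustomobject]@{ Timestamp = [datetime]'2011-03-02'; Value = 950400 },
    [pscustomobject]@{ Timestamp = [datetime]'2011-03-03'; Value = 3600 },
    [pscustomobject]@{ Timestamp = [datetime]'2011-03-04'; Value = 90000 }
)
$estimate = Get-DowntimeFromUptimeSamples -Samples $samples
$estimate.TotalHours   # 23 for the sample data (upper bound)
```

With one sample per day the estimate can be off by up to a day per reboot, which is the 1-day granularity mentioned above. A shorter sampling interval would tighten the bound, as far as the statistics are retained at that interval.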

