VMware Cloud Community
J_Virtual
Contributor

Exclude hosts in maintenance from host report?

Hi!

I'm trying to make a report to show host availability. I can show the host availability in %, so that works OK, but I can't seem to exclude data from when the hosts are in maintenance mode in vCenter.

I tried making a group that includes only hosts out of maintenance and running the report on this group, but it still displays a loss of availability when hosts are in maintenance.

6 Replies
sxnxr
Commander

The problem I find with availability is that it can be misleading.

Should it be 100% if a host is in maintenance mode?

Should it be 100% if a host is in maintenance mode and rebooted?

Should it be 100% if a host is not in maintenance mode but a user rebooted it?

If you are using the availability badge generated by vRealize Operations, it should only show you when a host has been rebooted, not when it is in maintenance mode.

J_Virtual
Contributor

I would prefer it to be 100% even if the host is rebooted (while in maintenance mode). For example, if we install new blade firmware, the host could easily take 20 minutes to reboot, and we would not want that reflected in the availability report.
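To make the desired behaviour concrete, here is a minimal Python sketch of the arithmetic, not anything vRealize Operations does itself. It assumes you can export outage intervals and maintenance windows as (start, end) pairs; the function names and the input shape are hypothetical. Outage time that falls inside a maintenance window is simply not counted as downtime.

```python
from datetime import datetime, timedelta


def overlap(a_start, a_end, b_start, b_end):
    """Return how long two time ranges overlap, as a timedelta (zero if none)."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max(end - start, timedelta(0))


def availability_excluding_maintenance(period_start, period_end, outages, maintenance):
    """Availability in % where outage time inside maintenance is not counted.

    outages and maintenance are lists of (start, end) datetime tuples.
    Assumption: outages do not overlap each other, and neither do
    maintenance windows.
    """
    total = period_end - period_start
    downtime = timedelta(0)
    for o_start, o_end in outages:
        # Clip the outage to the reporting period.
        o_start, o_end = max(o_start, period_start), min(o_end, period_end)
        if o_end <= o_start:
            continue
        # Part of this outage that is covered by maintenance windows.
        excused = sum(
            (overlap(o_start, o_end, m_start, m_end) for m_start, m_end in maintenance),
            timedelta(0),
        )
        downtime += (o_end - o_start) - excused
    return 100.0 * (1 - downtime / total)


if __name__ == "__main__":
    day = datetime(2020, 1, 1)
    outages = [
        (day + timedelta(hours=10), day + timedelta(hours=10, minutes=20)),  # firmware reboot
        (day + timedelta(hours=15), day + timedelta(hours=15, minutes=36)),  # unplanned
    ]
    maintenance = [(day + timedelta(hours=9, minutes=30), day + timedelta(hours=11))]
    pct = availability_excluding_maintenance(day, day + timedelta(days=1), outages, maintenance)
    print(round(pct, 2))
```

In this made-up example the 20-minute firmware reboot is ignored because it sits inside the maintenance window, so only the 36 unplanned minutes reduce the figure.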

sxnxr
Commander

There may be a way with a complicated super metric using if statements (that is, if you can use them in a super metric), but it would be beyond my skill level.

If there is another way, I don't know of it.

One thing though: why focus on the availability of the hosts? A host is a commodity and should be treated as one. All my availability reports are based on the VM, as it is the only object you should be concerned about. It should not matter if a host is offline for 6 months as long as all your VMs were available. I would look at getting management buy-in to stop worrying about host availability and focus on VM availability instead.

Using the VM's availability, the only time it would report an outage is if the VM is powered off (by an HA event or a guest shutdown, whether initiated in the guest or by a vCenter user). If the server is rebooted by a Windows engineer, that would not be counted.
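The "if statement" idea can be illustrated outside the product. The sketch below is plain Python, not super metric syntax, and whether a super metric can express it is not confirmed in this thread. It assumes one (is_up, in_maintenance) pair per collection interval, which is a hypothetical input shape.

```python
def adjusted_availability(samples):
    """Availability in % where down-while-in-maintenance counts as available.

    samples: list of (is_up, in_maintenance) booleans, one per collection
    interval (hypothetical input shape).
    """
    if not samples:
        raise ValueError("no samples to evaluate")
    # The conditional: a sample counts as available when the host is up,
    # or when it is down but was in maintenance mode at that time.
    counted_up = sum(1 for is_up, in_maintenance in samples if is_up or in_maintenance)
    return 100.0 * counted_up / len(samples)


if __name__ == "__main__":
    samples = [(True, False)] * 8 + [(False, True)] + [(False, False)]
    print(adjusted_availability(samples))  # 90.0
```

Of the ten samples, only the one that is down outside maintenance counts against the host.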

dtaliafe
Hot Shot

I agree with sxnxr; I don't think host availability is all that important. If I needed to report availability for something other than VMs, I would look at the cluster level, i.e. if the cluster has sufficient hosts available to meet HA requirements, it's good. Unfortunately, I don't know of a metric for this.

One idea I had, though: if you have a custom policy for hosts in maintenance that disables the availability metric, I wonder if that would work. For example, you create a custom group that hosts are added to when in maintenance and removed from when they're not (or vice versa). Either way, you have a separate policy for hosts in maintenance where the availability metric can be disabled, so the metric is only enabled when they are "in production". See this discussion for some details.
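Since there is no known built-in metric for this, the check itself can at least be written down. This is a rough Python sketch under stated assumptions: the function and parameter names are hypothetical, and "meets HA requirements" is simplified to "enough hosts remain to carry the load and still tolerate a given number of host failures".

```python
def cluster_meets_ha(total_hosts, unavailable_hosts, hosts_needed_for_load, host_failures_to_tolerate=1):
    """Return True if the remaining hosts can carry the load and still
    tolerate the configured number of host failures.

    All parameters are plain counts and the names are hypothetical.
    Hosts in maintenance or down both count as unavailable.
    """
    available = total_hosts - unavailable_hosts
    return available >= hosts_needed_for_load + host_failures_to_tolerate


if __name__ == "__main__":
    # 8-host cluster that needs 6 hosts for its load and tolerates 1 failure.
    print(cluster_meets_ha(8, 1, 6))  # True: one host in maintenance is fine
    print(cluster_meets_ha(8, 2, 6))  # False: two hosts out leaves no spare
```

With this view, a host in maintenance only shows up in the report when it eats into the spare capacity.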

How to stop alerts while ESXi Host is in maintenance mode

I don't know if that will get the desired results, but it could be worth a try.

J_Virtual
Contributor

OK, I get your point. :) So what would be a good way to make a cluster report that shows this, i.e. that I have sufficient resources available?

Do you have any examples? The other way to go is to create a group per customer and measure VM availability. What metrics should I pick there? And how do I handle a customer initiating a guest shutdown? That will show up in the stats, I guess?

sxnxr
Commander

I am not saying J_Virtual is wrong, as there are many ways to do this. The problem I see with reporting that a cluster has enough resources free is: what do you class as enough free?

Do you look at allocation and use math? Say you have a 5:1 vCPU to pCPU ratio: count up all the allocated vCPUs and take them away from the total the cluster allows, and if they don't go over the total, you are good. Because this is allocation, you will more than likely have enough capacity to exceed this.

Same with demand: what is your comfortable overcommit level?

I think it just makes it too complicated, to a level that management would not understand.

Now, to be clear, I am in a virtualization team and don't support the guests, so my next idea is how I think the Windows support team should do it, without knowing whether it will work. It will also come down to team policies and how well they are enforced, or how good your change control is.

You could create a view to show the % of time each VM has been powered off, take any VM where that is not 0%, and run a script against the Windows server (or the change management server) to extract who shut it down and the reason why, if it was filled in.

Either way, there is a bit of work involved.
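As a worked example of that allocation math, here is a small Python sketch. The 5:1 ratio comes from the post above; the core counts, VM sizes, and function name are made up for illustration.

```python
def vcpu_headroom(physical_cores, allocated_vcpus, ratio=5):
    """Return how many vCPUs can still be allocated at the given ratio.

    physical_cores: total pCPU cores in the cluster.
    allocated_vcpus: iterable with the vCPU count of each VM.
    A negative result means the cluster is over the chosen ratio.
    """
    capacity = physical_cores * ratio
    return capacity - sum(allocated_vcpus)


if __name__ == "__main__":
    cores = 4 * 16                 # 4 hosts with 16 cores each (made-up numbers)
    vms = [4] * 50 + [2] * 30      # 50 VMs with 4 vCPUs, 30 VMs with 2 vCPUs
    print(vcpu_headroom(cores, vms))  # 60
```

Here the cluster allows 320 vCPUs at 5:1 and 260 are allocated, so 60 remain. The answer depends entirely on the ratio you pick, which is the point being made about "enough free".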
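The first half of that idea, finding the VMs with any powered-off time, could look roughly like the Python sketch below. It assumes power-state samples have already been exported per VM; the data shape and function names are hypothetical, and the follow-up script that queries the Windows server is not shown.

```python
def powered_off_percent(power_samples):
    """Percentage of samples in which the VM was powered off.

    power_samples: non-empty list of booleans, True meaning powered on
    (hypothetical export format, one value per collection interval).
    """
    off = sum(1 for powered_on in power_samples if not powered_on)
    return 100.0 * off / len(power_samples)


def vms_to_investigate(samples_by_vm):
    """Return {vm_name: powered_off_percent} for VMs with any powered-off time."""
    result = {}
    for vm_name, samples in samples_by_vm.items():
        pct = powered_off_percent(samples)
        if pct > 0:
            result[vm_name] = pct
    return result


if __name__ == "__main__":
    data = {
        "vm-a": [True, True, True, True],
        "vm-b": [True, False, True, True],
    }
    print(vms_to_investigate(data))  # {'vm-b': 25.0}
```

Only the VMs returned here would need the shutdown reason looked up, which keeps the manual follow-up small.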
