VMware Cloud Community
gnovak
Contributor

Availability Counter - What is it measuring?

I have a quick question. When I look at the main platform page for a machine, there is an availability counter at the top. For most machines it is all green and says 100%. However, a few machines show bad numbers, such as 9% available, and the line is yellow and red.

The strange thing is that a machine named Bob, say, is available, hasn't gone down, and hasn't had any problems whatsoever, yet the counter on Bob's main page doesn't show the machine as 100% available.

What exactly is the availability counter measuring? Is it measuring the availability of all servers and services on that machine combined? Is it just measuring if the machine is online and accessible? I'm not sure why on some machines it's great, all green, 100% but on others it's not.

Any ideas?
11 Replies
deeboh
Enthusiast

If I understand your question, I think the platform availability indicators show a cumulative figure for all servers monitored. So if you have Apache, JBoss, HQ agents, and an sshd process all being monitored from the platform page, the availability stats for all of those entities make up the total availability for that platform. If you have Apache at 100% and JBoss at 50%, then your availability metric would be the average of those two, or of whatever other entities are being monitored. Click the Metric Data tab, which shows the raw numbers being used for the calculations, and this may be a bit clearer.
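
To make the averaging idea concrete, here is a minimal Python sketch. It assumes, as guessed above, that the platform figure is a plain unweighted mean of the per-resource availability values. The function name `platform_availability` and the sample values are hypothetical, and none of this is taken from Hyperic's own code.

```python
def platform_availability(resource_availability):
    """Return the unweighted mean availability (in percent) of the given resources.

    Assumption: the platform figure is a simple average of its resources.
    """
    if not resource_availability:
        raise ValueError("no resources to average")
    return sum(resource_availability.values()) / len(resource_availability)


# Hypothetical values matching the Apache/JBoss example in this post.
sample = {"apache": 100.0, "jboss": 50.0}
print(platform_availability(sample))  # 75.0
```

If the real calculation weights resources differently, the result would differ, so treat this only as an illustration of the averaging described above.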

Good luck,

Deeboh
gnovak
Contributor

Well, let's say I have some Favorite Resources on the Hyperic dashboard. I have "bob" under Favorite Resources. I click on Bob and get to the main page for the machine named Bob and everything that is being monitored under it.

Without clicking on any of the services or servers being monitored on Bob, I first notice the availability counter at the top, under the View drop-down menu. It is all red and yellow, and availability is at 8.9%. I know this isn't true because Bob hasn't gone down.

I look at every service and server being monitored. I click on each one and notice all are green and when clicking on an individual service or server, availability for them is 100%.

It's really odd, and I don't know why the main page for Bob doesn't show 100% as well.
gnovak
Contributor

I'll continue to watch it but it appears as if it hasn't changed that much. I might restart the agent or something...
ama_hyperic
Hot Shot

The percentage you see is the availability over the date range that you have specified in your metric viewer, that is, over the last 8 hours or whatever range you have defined.

If you are seeing odd availability stats, it may be because your servers are not time synced. Do you have NTP or anything similar to keep your servers' clocks in sync? A small difference in time between server and agent can really skew your availability results.
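
As a rough illustration of a percentage that depends on the selected date range, here is a minimal Python sketch. It assumes availability is simply the share of collected samples in the range that were marked up; the function name `availability_over_range`, the sample layout, and the data are all hypothetical, and this is not how Hyperic is confirmed to compute the figure.

```python
def availability_over_range(samples, start, end):
    """Percent of samples with start <= timestamp <= end that are marked up.

    `samples` is a list of (timestamp_seconds, is_up) pairs.
    Returns None when no sample falls inside the range.
    """
    in_range = [up for ts, up in samples if start <= ts <= end]
    if not in_range:
        return None
    return 100.0 * sum(in_range) / len(in_range)


# Hypothetical samples: one "down" reading at t=120.
samples = [(0, True), (60, True), (120, False), (180, True), (240, True)]
print(availability_over_range(samples, 0, 240))    # 80.0
print(availability_over_range(samples, 180, 240))  # 100.0
```

The same data gives different percentages for different ranges, which is why the selected range matters when reading the counter.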
deeboh
Enthusiast

That is bizarre, gnovak. AMA makes a good suggestion. I wouldn't worry about rebooting your agents; they don't do the availability calculation. I'd reboot the Server. If you do anything to the agent, you could remove the data directory and then re-install or re-set up the agent.

Good luck,

Deeboh
gnovak
Contributor

It is odd. I am very frustrated and haven't really found out what the problem is. I've checked everything being monitored and all are ok so I'm not sure where the odd stats are coming from in terms of availability.

I'm not sure what is running on the machine to sync the time, but I did check the time and it was accurate on the box.

I will poke around on the machine and look into the time sync.
gnovak
Contributor

I have checked the agent logs on one of the machines that is acting a bit strange in Hyperic. In the logs, there is this error:

Agent is 227 seconds behind the server. To ensure accuracy of the charting and alerting make sure the agent and server clocks are synchronized.

I'm going to see why this is happening, but yes, it looks like a time issue.
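
For anyone wanting to sanity-check an offset like this by hand, here is a minimal Python sketch that compares two clock readings. The function names, the 5-second tolerance, and the timestamps are hypothetical; it only shows the arithmetic and is not how Hyperic measures the offset.

```python
from datetime import datetime, timedelta


def clock_offset_seconds(server_time, agent_time):
    """Seconds by which the agent clock is behind the server (negative if ahead)."""
    return (server_time - agent_time).total_seconds()


def is_in_sync(offset_seconds, tolerance_seconds=5.0):
    """True when the absolute offset is within the (hypothetical) tolerance."""
    return abs(offset_seconds) <= tolerance_seconds


# Hypothetical readings reproducing the 227-second gap from the log message.
server_time = datetime(2010, 1, 1, 12, 0, 0)
agent_time = server_time - timedelta(seconds=227)

offset = clock_offset_seconds(server_time, agent_time)
print(offset)              # 227.0
print(is_in_sync(offset))  # False
```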
admin
Immortal

There is also a Server Offset metric, collected on the HQ Agent resource, that will show you what time difference the server thinks there is between the server and the agent.

Charles


gnovak
Contributor

Where can I find the Server Offset metric?
admin
Immortal

Go to an HQ Agent resource in the UI. It's an indicator metric.

Charles




gnovak
Contributor

Charles,

Thanks. I found it. It has been increasing in a steady pattern for some time now. This is most likely the problem. I am going to make sure NTP is running on all of the machines and see if this resolves the problem.