Hello:
We are using Hyperic version 4.5 to monitor the availability of Solaris servers. Some servers on the network are reported as unavailable for brief periods of time, which generates alerts.
Solaris version: 5.10
Here is the alert message:
============================================================
solaris-db has generated the following alert:
Solaris server component(s) down solaris-db Availability (0.0%)
-----------------------------------------
ALERT DETAIL
- Resource Name: solaris-db
- Alert Name: Solaris server component(s) down
- Alert Date / Time: April 22, 2011 3:11:00 AM CDT
- Triggering Condition(s):
If Availability<100.0% (actual value = 0.0%)
- Alert Severity: !!! - High
Last Indicator Metrics Collected:
[April 22, 2011 3:13:00 AM CDT] Availability = 0.0%
[April 22, 2011 3:10:00 AM CDT] Free Memory = 563.1 MB
[April 22, 2011 3:10:00 AM CDT] Load Average 5 Minutes = 0.9
[April 22, 2011 3:10:00 AM CDT] Swap Used = 7.0 GB
===========================================================
But there is no issue with the server itself. We also checked with the network team whether there were any connectivity problems between the server where Hyperic is running and the servers being monitored, and they confirmed that nothing unusual happened on the network during that time frame.
Could someone please help me figure out the following:
1) What command does Hyperic run to probe Solaris server availability? We would like to run it from the shell and verify.
2) Does the alert mean that the probe that was sent timed out?
3) Is there any other way to troubleshoot this issue?
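For item 3, one check we could run ourselves, independent of Hyperic, is a timestamped TCP connection test from the Hyperic server to each monitored host, so that the results can be compared with the alert times. The sketch below is only an illustration and is not the probe Hyperic itself uses. `tcp_check` and `monitor` are hypothetical helper names, and the port is an assumption: I believe 2144 is the default agent listen port, but it should be replaced with whatever port the agent is actually configured on.

```python
import socket
import time
from datetime import datetime


def tcp_check(host, port, timeout=5.0):
    """Try one TCP connection; return (ok, detail)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts and
        # name-resolution failures) if the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            return True, f"connected in {elapsed:.3f}s"
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"


def monitor(host, port, interval=60, count=None, timeout=5.0):
    """Run tcp_check every `interval` seconds and print timestamped results.

    If `count` is None the loop runs until interrupted; otherwise it
    stops after `count` checks. Returns the list of (timestamp, ok, detail).
    """
    results = []
    done = 0
    while count is None or done < count:
        stamp = datetime.now().isoformat(timespec="seconds")
        ok, detail = tcp_check(host, port, timeout=timeout)
        status = "UP" if ok else "DOWN"
        print(f"{stamp} {host}:{port} {status} {detail}")
        results.append((stamp, ok, detail))
        done += 1
        if count is None or done < count:
            time.sleep(interval)
    return results
```

Running something like `monitor("solaris-db", 2144, interval=60)` overnight from the Hyperic server would show whether plain TCP connectivity also drops around the time of the alerts (for example 3:11 AM), or whether only Hyperic sees the outage.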
Thanks in advance.