I'm a newbie (first post) and am looking at how I could use Hyperic HQ to monitor a bunch of remote servers from which I currently get a collection of emails that report the current status of the server and its applications.
I don't have direct access to these servers; they are behind firewalls in hospital networks. We have some limited manual VPN access to them, but each uses a different VPN type, and some require a security dongle that you have to read a number from in order to log in. Setting up a VPN tunnel is difficult and not possible in the short term.
So, for the moment, my only option is to collect stats and status via scripts on the servers and forward these on via email.
I was wondering if there are any plugins that might support this sort of monitoring?
From what I have read in the forums, I could write a script that receives the emails, parses them, and outputs the required information to stdout, and then write a Hyperic HQ plugin that reads this output and loads it into Hyperic.
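As a rough illustration of the parsing half of that idea, here is a minimal Python sketch. It assumes the status emails carry one `name=value` pair per line in a plain-text body, and that the plugin would read `name=value` lines from stdout. Both of those are assumptions for illustration, as are the function names and the sample email; none of it comes from Hyperic itself.

```python
from email import message_from_string


def extract_metrics(raw_email):
    """Return numeric name=value pairs found in the plain-text parts of an email."""
    msg = message_from_string(raw_email)
    bodies = []
    for part in msg.walk():
        # Skip multipart containers; only leaf text/plain parts carry a body.
        if part.is_multipart() or part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload:
            bodies.append(payload.decode("utf-8", "replace"))

    metrics = {}
    for line in "\n".join(bodies).splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a name=value line
        try:
            metrics[name.strip()] = float(value.strip())
        except ValueError:
            continue  # value is not numeric, so it is not a metric
    return metrics


def format_metrics(metrics):
    """Render metrics as name=value lines (assumed plugin input format)."""
    return "\n".join(f"{name}={value}" for name, value in sorted(metrics.items()))


# Hypothetical sample email, for demonstration only.
SAMPLE = (
    "From: monitor@example.org\n"
    "Subject: status report\n"
    "\n"
    "cpu_load=0.75\n"
    "queue_length=3\n"
    "status=OK\n"
)

print(format_metrics(extract_metrics(SAMPLE)))
```

In real use the raw message would come from wherever the mail is delivered (for example `sys.stdin.read()` when the script is invoked by a mail filter) rather than from the hard-coded sample.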
Your situation sounds complicated. Right now your idea sounds like it's the only approach.
However, in 4.0, which is out in a few weeks, this *may* get easier for you, as we're building in a single-directional agent protocol. This means the agent only talks out to the server (I assume this is possible, since you can send email and it's the getting in that is the problem).
However, this will be an HQ Enterprise only feature since Security is one of the axis points for us for Enterprise users. Are you by chance considering Enterprise?
Thanks for the info Stacey. Am I interested in HQ Enterprise? Probably not as I don't have any budget allocated for this and I am not sure that Hyperic can do what we need. One aspect is as described earlier, remote monitoring via email reports.
The second is that we need to do some reasonably complex analysis of incoming events from various sources to determine what is an alert and what isn't, e.g.:
- if we see more than 4 log messages of ERROR severity of a particular type within 30 minutes, then send an alert
- if there is a log message for a slow DB query where the time in the message is > 20000 ms and the incoming event queue length (retrieved via JMX) is greater than 5, then send an alert
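To make the two example rules concrete, they could be sketched in Python roughly as follows. This is only an illustration of the logic, not Drools and not anything Hyperic provides; the class and function names are made up, and the sketch assumes events are fed in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class ErrorBurstRule:
    """Alert when more than max_errors ERROR messages of one type fall within the window."""

    def __init__(self, max_errors=4, window=timedelta(minutes=30)):
        self.max_errors = max_errors
        self.window = window
        self.seen = {}  # error type -> deque of recent timestamps

    def observe(self, error_type, when):
        """Record one ERROR message; return True if an alert should be sent."""
        times = self.seen.setdefault(error_type, deque())
        times.append(when)
        # Drop timestamps that are older than the window relative to this event.
        while when - times[0] > self.window:
            times.popleft()
        return len(times) > self.max_errors


def slow_query_alert(query_ms, queue_length, max_ms=20000, max_queue=5):
    """Alert only when the query is slow AND the event queue is backed up."""
    return query_ms > max_ms and queue_length > max_queue


rule = ErrorBurstRule()
start = datetime(2008, 1, 1, 12, 0)
for minute in range(5):
    fired = rule.observe("db_timeout", start + timedelta(minutes=minute))
print(fired)                        # fifth error inside 30 minutes
print(slow_query_alert(25000, 6))   # slow query with a long queue
```

Because the window check compares each new event against older ones, events that arrive out of order would need to be sorted by their own timestamps before being passed to `observe`.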
Alerts based on the presence of single error messages from the log or metrics and thresholds are not very useful to us.
I believe we need a rules engine to process our events and logs to generate the alerts. We are considering building something ourselves based on Drools for the analysis on the customer systems and sending reports from this via email to a Hyperic plugin on our local system.
Thanks for the info on the upcoming 4.0 release. If Hyperic can do more complex analysis of metrics and alerts as described above, I would be interested; I really don't want to write my own.
You are quite welcome. I completely understand the no budget thing and the desire to use open source first.
That said, the multi-conditional alerts you are looking for are also in enterprise. Since Hyperic is designed to be so self-service for you to deploy, we create value around an enterprise offering by increased automation, intelligence (think reporting) and security features that fit more complex and large scale deployments. This is our business model and how we plan to stick around to continue to help and innovate with you.
I am not sure about your scale, but it does sound like you are crossing the complexity threshold where you would be better served by an enterprise trial. It may be worthwhile to talk to sales to get an idea of what it may cost before you pursue a trial, and see if you can justify the cost by having this extra stuff out of the box. We're still an open source company, so it is usually pretty affordable.
OK - ! For your sitch though, it sounds like it may be the best way to save time -- and time can equal money. Good luck, and we hope to keep seeing you here!
I'd like to query another aspect of monitoring via email.
By its very nature, email can arrive late and even out of order. We sometimes get delays of up to 3 days in email delivery from one customer site (problems in their internet, mail gets queued over the weekend).
These emails will contain batches of metrics and log messages taken from the customer servers and apps. These collected metrics and log messages all have timestamps identifying the time at which the value was collected.
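Since every value carries its own collection timestamp, late or out-of-order batches can at least be put back into collection order before they are replayed. A minimal Python sketch of that step, with a made-up `Sample` record and `merge_batches` helper (how the samples are then handed to Hyperic is not covered here):

```python
from datetime import datetime
from typing import NamedTuple


class Sample(NamedTuple):
    collected_at: datetime  # time the value was taken on the customer server
    metric: str
    value: float


def merge_batches(batches):
    """Merge email batches into one list ordered by collection time.

    Samples with the same (timestamp, metric) are treated as duplicates,
    e.g. when the same email is delivered twice; the last one seen wins.
    """
    unique = {}
    for batch in batches:
        for sample in batch:
            unique[(sample.collected_at, sample.metric)] = sample
    return sorted(unique.values(), key=lambda s: (s.collected_at, s.metric))


# Hypothetical data: Friday's batch arrives after Monday's.
monday = [Sample(datetime(2008, 6, 9, 8, 0), "queue_length", 2.0)]
friday = [Sample(datetime(2008, 6, 6, 17, 0), "queue_length", 7.0)]

for sample in merge_batches([monday, friday, friday]):
    print(sample.collected_at.isoformat(), sample.metric, sample.value)
```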
I have a simple Hyperic plugin that decodes the email and plays the values back into Hyperic, but Hyperic is using the time of replay as the time for the metric. Can I specify a timestamp for the metrics when they are played back to Hyperic so they appear at the time of collection rather than the time of replay?
I haven't forgotten about you, but you have me stumped. I see the need, but as far as I know the agent tracks the collection time for you (as you are seeing). This cuts down on the possibility of errors for most people, but in your case it is preventing you from doing the overrides you need. Here is the complete list of the XML descriptors, and it's not in there: http://support.hyperic.com/display/DOCSHQ30/Plugin+XML+Descriptor+Syntax
So, what I am thinking is that this may be a case for HQU, since theoretically it can write anything to the database, including that timestamp. Not sure though; I need to check with SIGAR god Doug on this one. Really good question though.