VMware Cloud Community
VoodooZ
Contributor

Unable to monitor local agent with version 3.2

Hi,

I've been using Hyperic HQ open source rev. 3.1.1 for months and decided to upgrade my server to 3.2 today. The server upgraded fine and worked with the old 3.1.1 agents...

Now, before going around updating all the agents, I tried updating the local one (on the same box as the server), and whatever I did, it never showed up as reporting to the server.
The strange thing is that starting the old 3.1.1 agent works like a charm...

Is it my agent config? Do I have to do anything special to run it on the same box?

Any help would be appreciated,

Thanks for a great product,
26 Replies
excowboy
Virtuoso

Hi,

Did you try setting up the auth tokens again with "hq-agent.sh setup"?
Anything interesting in the log?

Mirko
VoodooZ
Contributor

Yep, tried that. Tried deleting the data folder too.
Nothing odd in the logs either.
I'll try again with DEBUG on.
BradFelmey
Hot Shot

I have seen this as well. Upgrading an agent does not work unless and until the platform is deleted from the server, then set up as new.
admin
Immortal

Well, this is certainly unacceptable. You also were upgrading from a
3.1.1 agent to a 3.2 agent?

The only indication that the agent is 'not working' is that the
metrics aren't coming in?

-- Jon



On Feb 13, 2008, at 8:35 AM, Brad Felmey wrote:

> I have seen this as well. Upgrading an agent does not work unless
> and until the platform is deleted from the server, then set up as new.


VoodooZ
Contributor

Yep. production 3.1.1 agent/server to 3.2..
The odd thing is that this problem happens only on my server box as upgrading the agent on a monitored remote box works fine...

Running 'hq-agent.sh ping' on the server box succeeds, but the box is shown as down in HQ...

So I guess I'll have to put my upgrade on hold until this is fixed.
For now I've kept the server at 3.2 and all my targets' agents at 3.1.1...
admin
Immortal

> Yep. production 3.1.1 agent/server to 3.2..
> The odd thing is that this problem happens only on my
> server box as upgrading the agent on a monitored
> remote box works fine...
>

I've seen similar cases of this. Can you attach your server log?

-Ryan
admin
Immortal

Ack.. make that your agent log.. Sorry for the confusion.

-Ryan
BradFelmey
Hot Shot

> Well, this is certainly unacceptable. You also were
> upgrading from a
> 3.1.1 agent to a 3.2 agent?
>
> The only indication that the agent is 'not working'
> is that the
> metrics aren't coming in?

From a variety of versions, going all the way back to 2.7.3, up to 3.1.1.

The answer to your second question is "yes". The agent appears to be fine, and the dashboard showed that it saw the agent update, that it was imported, etc., but the server stopped displaying metrics for that platform. This has happened 100% of the time.
VoodooZ
Contributor

Here's my agent.log (for agent version 3.2).
Note that this log might contain multiple restarts, as I was trying different setup configs...

I did notice a few entries where the agent reports it can't talk to the server and then, after a retry, eventually does. Odd...

Thanks,
afrosheen
Contributor

I'm having the same problem on CentOS 5. My new server monitors everything but itself. The agent on the server box just dies about 12 hours after I launch it. It shows as still running, but it has stopped either collecting metrics or reporting them; I don't know which. I haven't had this problem with any other version of the Hyperic server or agent. I guess I'll dig into the agent log and see what it's doing.
VoodooZ
Contributor

No update on this?
I rolled back to 3.1.1, but there was a problem with it bringing our Oracle down, which is why I want to try 3.2 ASAP.

Thanks,
admin
Immortal

We think we may have identified the issue and applied a fix. The
problem is specifically with 3.2 agents. We are in the QA process and
will release a maintenance release for 3.2 soon. For now, I would
recommend that you continue to use your 3.1 agents, the server is
backwards compatible with older agents. Thanks for reporting the
problem.

Charles

admin
Immortal

Stephane, if you are able to reproduce consistently, would you be
open to trying out a patch to see if it fixes you up? We have
identified an issue with the 3.2 agent and have a fix for it, but it
may not be the same problem that you are seeing (and we'd like to
know that ASAP).

-- Jon


On Feb 22, 2008, at 10:17 AM, Charles Lee wrote:

> We think we may have identified the issue and applied a fix. The
> problem is specifically with 3.2 agents. We are in the QA process
> and will release a maintenance release for 3.2 soon. For now, I
> would recommend that you continue to use your 3.1 agents, the
> server is backwards compatible with older agents. Thanks for
> reporting the problem.
>
> Charles


VoodooZ
Contributor

Thank you very much for the update.

I'll keep checking the page for the release..

Keep up the excellent work,

PS: Are you aware of any bug (old or new) that would cause a monitored Oracle server to become overloaded because of the monitoring? Since my boss had a problem with that, we disabled Hyperic on that production box... I'll try again with 3.2, but I'd rather not have it go down again.
VoodooZ
Contributor

Of course.
Just PM me..
scottmf
Enthusiast

Hi, here is the patch for the 3.2 agent.

Unzip/untar it in your agent dir and you should be off and running.

Let us know if you have any issues.
VoodooZ
Contributor

Thanks.
I'm in the middle of something right now so I won't be able to test right away but I'll definitely get back to you this week..

Take care,
jtravis_hyperic
Hot Shot

It would be good if people seeing the issue can test with this patch ASAP, as we are trying to validate it for a bug-fix release this week.
VoodooZ
Contributor

OK, I tested the patch and it appears to work.
But I had to set the local agent to use localhost for the client connection, as using the hostname gave me a "client not allowed" type of message.

Thanks,
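For anyone who hits the same "client not allowed" message when the agent and server share a box, it may help to check what localhost and the machine's hostname actually resolve to there. The following is only a rough diagnostic sketch in Python. It assumes the message is related to which address the hostname maps to, which this thread does not confirm. The helper names (resolve_addresses, is_loopback) are made up for illustration and are not part of Hyperic.

```python
import ipaddress
import socket


def resolve_addresses(name):
    """Return the sorted, de-duplicated IP addresses that `name` resolves to."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Name could not be resolved at all.
        return []
    return sorted({info[4][0] for info in infos})


def is_loopback(address):
    """True if `address` is a loopback address such as 127.0.0.1 or ::1."""
    # Drop an IPv6 zone suffix (e.g. "%lo0") before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


if __name__ == "__main__":
    # Compare "localhost" with whatever this machine calls itself.
    for name in ("localhost", socket.gethostname()):
        addresses = resolve_addresses(name)
        if not addresses:
            print(f"{name}: does not resolve")
            continue
        for address in addresses:
            kind = "loopback" if is_loopback(address) else "non-loopback"
            print(f"{name}: {address} ({kind})")
```

If the hostname resolves to a non-loopback address while localhost resolves to 127.0.0.1, the two names reach the server from different addresses. That would at least be consistent with the hostname failing while localhost works.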