VMware Cloud Community
TheChristoph
Contributor

Can't connect to server

One of our agents can't connect to the server after a restart (which I did to deploy a plugin).

The command window gave the following output:

- Unable to load agent token file. Generating a new one ... Done
- Invoking agent
- Agent thread running
Agent successfully started

[ Running agent setup ]
What is the HQ server IP address: 192.168.8.106
Should Agent communications to HQ always be secure [default=no]: no
What is the HQ server port [default=7080]: 7080
- Testing insecure connection ... Success
What is your HQ login [default=hqadmin]: hqadmin
What is your HQ password: **Not echoing value**
What IP should HQ use to contact the agent [default=10.10.16.21]: 10.10.16.21
What port should HQ use to contact the agent [default=2144]: 2144
- Received temporary auth token from agent
- Registering agent with HQ
org.hyperic.hq.bizapp.client.AgentCallbackClientException: IO error: java.net.SocketTimeoutException: Read timed out
at org.hyperic.hq.bizapp.client.AgentCallbackClient.invokeLatherCall(AgentCallbackClient.java:171)
at org.hyperic.hq.bizapp.client.BizappCallbackClient.registerAgent(BizappCallbackClient.java:91)
at org.hyperic.hq.bizapp.agent.client.AgentClient.cmdSetup(AgentClient.java:662)
at org.hyperic.hq.bizapp.agent.client.AgentClient.main(AgentClient.java:1133)
- Error registering agent: IO error: java.net.SocketTimeoutException: Read timed out
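
A read timeout at this step does not by itself show which side is failing. As a rough sketch, the following checks whether a TCP port accepts connections at all, for example the server port 7080 from the agent host, or the agent port 2144 from the server host. `can_connect` is a hypothetical helper, not part of Hyperic HQ, and a successful TCP connect says nothing about how long the server then takes to answer the registration call.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

# Example calls, using the addresses and ports from the setup output above:
#   can_connect("192.168.8.106", 7080)   # run on the agent host
#   can_connect("10.10.16.21", 2144)     # run on the server host
```

If both calls return True, plain reachability is probably not the issue and the server-side logs are the next place to look.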


When I start the agent, it hangs for a few minutes at "Registering agent with HQ", and the server log outputs the following:

2009-04-09 13:22:14,433 INFO [Thread-372] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> agent:ping
2009-04-09 13:22:17,808 INFO [Thread-372] [org.hyperic.hq.bizapp.server.session.LatherDispatcher] Updating agent information for 10.10.16.21:2144


Then the timeout error appears in the command window.

I attached the agent log and the agent properties file.

A few minutes later I got hundreds of these entries in the server log file:

2009-04-09 13:47:35,035 INFO [AgentScheduleSyncListener1] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> rtm:unscheduleMeasurements
2009-04-09 13:47:35,441 INFO [AgentScheduleSyncListener1] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> rtm:unscheduleMeasurements
...
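
To see how many of these entries there are per agent and command, the server log can be summarized with a short script. This is only a sketch: the regular expression is based on the line format shown above, and `count_agent_calls` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> rtm:unscheduleMeasurements
AGENT_CALL = re.compile(
    r"\[org\.hyperic\.hq\.agent\.client\.AgentConnection\]\s+"
    r"(?P<agent>\S+)\s+->\s+(?P<command>\S+)"
)

def count_agent_calls(lines):
    """Count server-to-agent calls per (agent address, command) pair."""
    counts = Counter()
    for line in lines:
        match = AGENT_CALL.search(line)
        if match:
            counts[(match.group("agent"), match.group("command"))] += 1
    return counts

# Example (the log file name is an assumption):
#   with open("server.log") as log:
#       for (agent, command), n in count_agent_calls(log).most_common():
#           print(n, agent, command)
```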


Could anybody tell me what the problem is?

Message was edited by: TheChristoph
5 Replies
jvalkeal_hyperi

You didn't say which OS this agent is running on. When I develop plugins, I constantly need to restart the agent. I've seen these

2009-04-09 13:22:30,665 ERROR [Thread-1] [AutoinventoryCommandsServer] Unable to send autoinventory platform data to server, sleeping for 22 secs before retrying. Error: Unable to communicate with server -- provider not yet setup

messages many times, especially on Windows. There are a few things I usually try:

1. On Windows, skip starting the agent through the service and use the direct command instead: hq-agent.bat start.
2. I may also try deleting the agent's 'data' directory before restarting.
3. If that doesn't work, try the hq-agent.bat command under the bundle (bundles/agentXXX/bin/hq-agent.bat).

Sometimes this Tanuki wrapper stuff just seems to fail. I don't know why.
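
The first two steps can be sketched as a small script. Everything here is an assumption about the local layout: `remove_agent_data` and `start_command` are hypothetical helpers, the 'data' directory is assumed to sit directly under the agent's install directory, and the agent should be stopped before its data is deleted. Judging by the "Generating a new one" token message in the first post, the agent will probably ask for setup again afterwards.

```python
import shutil
from pathlib import Path

def remove_agent_data(agent_home):
    """Delete the agent's 'data' directory; return True if it existed."""
    data_dir = Path(agent_home) / "data"
    if not data_dir.is_dir():
        return False
    shutil.rmtree(data_dir)
    return True

def start_command(script_path):
    """Build the Windows command line that starts the agent directly,
    bypassing the service wrapper."""
    return ["cmd", "/c", str(script_path), "start"]

# Example (paths are placeholders for your own install):
#   import subprocess
#   remove_agent_data(r"C:\hyperic\agent")
#   subprocess.run(start_command(r"C:\hyperic\agent\hq-agent.bat"), check=True)
```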
TheChristoph
Contributor

Hello jvalkeal, thank you for your help.

The Operating System of my Agent is "Windows Server 2003 x64 Edition".

I think I have already tried these steps:

1. I'm not starting the agent on this computer as a service.
2. I have already deleted the 'data' directory, and also work and temp on the server (hq-engine\server\default).
3. I'm using Hyperic HQ 3.2.4, so I don't have the bundle stuff yet.

I have now found the following in the server log:

2009-04-09 13:32:14,402 WARN [http-0.0.0.0-7080-14] [org.hyperic.lather.jboss.JBossLatherServlet] Execution of 'registerAgent' exceeded 600 seconds
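
Warnings of this kind are easy to miss among the INFO lines. A minimal sketch for pulling them out of the server log, assuming they all follow the format of the line above (`find_slow_calls` is a hypothetical helper name):

```python
import re

# Matches e.g. "Execution of 'registerAgent' exceeded 600 seconds"
SLOW_CALL = re.compile(
    r"Execution of '(?P<method>[^']+)' exceeded (?P<seconds>\d+) seconds"
)

def find_slow_calls(lines):
    """Return (timestamp, method, seconds) for each slow-execution warning."""
    results = []
    for line in lines:
        match = SLOW_CALL.search(line)
        if match:
            # The timestamp is the first 23 characters, e.g. "2009-04-09 13:32:14,402".
            timestamp = line[:23]
            results.append(
                (timestamp, match.group("method"), int(match.group("seconds")))
            )
    return results
```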

Any further ideas?

Message was edited by: TheChristoph
jvalkeal_hyperi

Oh, gosh... that's something I have never seen. It looks like a deadlock or something similar... or maybe some DB update/query is jammed. It just seems that the servlet is taking too much time to finish.

Have you tried restarting the server? (It usually doesn't fix anything.)

Were you in the process of updating some plugins? Did it go well? Are there any complaints in any of the logs?

Is the data for your platform important? If not, as a last step you could delete the platform and discover it again.

This is a version I don't know very well. I'm a bit more familiar with 4.x.
TheChristoph
Contributor

Hello again,

I was ill for the last few weeks, so I couldn't answer.

I tried restarting the server several times, but this had no effect on the problem.

You are right about the plugin. I tried to install a query plugin, and at first it went wrong. Later I installed it on another agent and fixed some things; now it works.
I have deleted the plugin on the agent with the connection problem, and the log file shows nothing about plugin problems either.
TheChristoph
Contributor

I deleted the affected agent in the server's web interface and reinstalled it.

It took some hours, but the problem is solved now.