One of our Agents can't connect to the Server after a restart (for deploying a Plugin).
The command-window gave the following output:
- Unable to load agent token file. Generating a new one ... Done
- Invoking agent
- Agent thread running
Agent successfully started
[ Running agent setup ]
What is the HQ server IP address: 192.168.8.106
Should Agent communications to HQ always be secure [default=no]: no
What is the HQ server port [default=7080]: 7080
- Testing insecure connection ... Success
What is your HQ login [default=hqadmin]: hqadmin
What is your HQ password: **Not echoing value**
What IP should HQ use to contact the agent [default=10.10.16.21]: 10.10.16.21
What port should HQ use to contact the agent [default=2144]: 2144
- Received temporary auth token from agent
- Registering agent with HQ
org.hyperic.hq.bizapp.client.AgentCallbackClientException: IO error: java.net.SocketTimeoutException: Read timed out
    at org.hyperic.hq.bizapp.client.AgentCallbackClient.invokeLatherCall(AgentCallbackClient.java:171)
    at org.hyperic.hq.bizapp.client.BizappCallbackClient.registerAgent(BizappCallbackClient.java:91)
    at org.hyperic.hq.bizapp.agent.client.AgentClient.cmdSetup(AgentClient.java:662)
    at org.hyperic.hq.bizapp.agent.client.AgentClient.main(AgentClient.java:1133)
- Error registering agent: IO error: java.net.SocketTimeoutException: Read timed out
When I start the Agent, it hangs for a few minutes at "Registering agent with HQ" and the server log shows the following:
2009-04-09 13:22:14,433 INFO [Thread-372] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> agent:ping
2009-04-09 13:22:17,808 INFO [Thread-372] [org.hyperic.hq.bizapp.server.session.LatherDispatcher] Updating agent information for 10.10.16.21:2144
Then the timeout error appears in the command window.
I attached the Agent-Log and the Agent-Properties File.
A few minutes later I got hundreds of these entries in the server log file:
2009-04-09 13:47:35,035 INFO [AgentScheduleSyncListener1] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> rtm:unscheduleMeasurements
2009-04-09 13:47:35,441 INFO [AgentScheduleSyncListener1] [org.hyperic.hq.agent.client.AgentConnection] 10.10.16.21 -> rtm:unscheduleMeasurements
...
You didn't say which OS this agent is running on. When I develop plugins, I constantly need to restart the agent. I've seen these
2009-04-09 13:22:30,665 ERROR [Thread-1] [AutoinventoryCommandsServer] Unable to send autoinventory platform data to server, sleeping for 22 secs before retrying. Error: Unable to communicate with server -- provider not yet setup
messages many times, especially on Windows. There are a few things I usually try:
1. If on Windows, skip starting the agent through the service and use the direct command instead: hq-agent.bat start.
2. I may also try deleting the agent's 'data' directory before the restart.
3. If that doesn't work, try the hq-agent.bat command under the bundle (bundles/agentXXX/bin/hq-agent.bat).
Sometimes this tanuki wrapper stuff just seems to fail. Don't know why.
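Since the setup output shows the agent talking to HQ at 192.168.8.106:7080 and HQ contacting the agent at 10.10.16.21:2144, it may also be worth confirming that both TCP ports are reachable in each direction. Below is a minimal sketch in Python, assuming Python is available on both machines. The helper name `can_connect` is made up for illustration and is not part of Hyperic. It only checks that the port accepts a TCP connection; it does not prove that the registration call itself will complete.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; not part of Hyperic HQ.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect,
        # raising OSError (including timeouts) if it cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, run `can_connect("192.168.8.106", 7080)` on the agent machine and `can_connect("10.10.16.21", 2144)` on the server machine. If either returns False, a firewall or routing problem between the two networks would be the first thing to rule out.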
The Operating System of my Agent is "Windows Server 2003 x64 Edition".
I think I have already tried these steps:
1. I'm not starting the agent on this computer as a service.
2. I have already deleted the 'data' directory, and also work and temp on the server (hq-engine\server\default).
3. I'm using Hyperic HQ 3.2.4, so I don't have the bundle stuff yet.
I tried restarting the server several times, but this has no effect on the problem.
You are right about the plugin. I tried to install a query plugin and it went wrong at first. Later I installed it on another agent and fixed a few things, and now it works. I have deleted the plugin on the agent with the connection problem, and the log file shows nothing about plugin problems either.