VMware Cloud Community
ZacharyD
Contributor

PostgreSQL and Hyperic HQ Issue

We are running our Hyperic HQ software on a Windows 2003 server and we are running into problems with PostgreSQL. Once a day we get many errors in Event Viewer that say: "The description for Event ID (0) in Source (PostgreSQL) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: [2006-08-31 15:23:09.189] FATAL: could not read from statistics collector pipe: No such file or directory." This is usually accompanied by Hyperic HQ telling us that every platform or server set up in it is unavailable, when we know for a fact that everything is available. If I restart the Windows server, the software comes back online and works until that error is posted again in Event Viewer. Is there something wrong with PostgreSQL or with our Hyperic software? Please advise.
19 Replies

ZacharyD
Contributor

Here is a follow-up to the original message. I went into yesterday's log and pulled this error out of it. The log is very extensive and this is one of hundreds of errors. Let me know what you think.



2006-08-30 00:07:18,514 ERROR [org.hyperic.hq.events.server.session.SessionEJB] SQLException determining if alert definition is enabled
org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.
at org.postgresql.core.v2.QueryExecutorImpl.execute(QueryExecutorImpl.java:342)
at org.postgresql.core.v2.QueryExecutorImpl.execute(QueryExecutorImpl.java:231)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:389)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:330)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:240)
at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:78)
at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:296)
at org.hyperic.hq.events.server.session.AlertDefinitionManagerEJBImpl.getIdFromTrigger(AlertDefinitionManagerEJBImpl.java:507)
at sun.reflect.GeneratedMethodAccessor270.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
at org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:214)
at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:185)
at org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:130)
at org.jboss.webservice.server.ServiceEndpointInterceptor.invoke(ServiceEndpointInterceptor.java:51)
at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:105)
at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:139)
at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:624)
at org.jboss.ejb.Container.invoke(Container.java:873)
at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:413)
at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:88)
at $Proxy510.getIdFromTrigger(Unknown Source)
at org.hyperic.hq.events.ext.AbstractTrigger.fireActions(AbstractTrigger.java:159)
at org.hyperic.hq.bizapp.server.trigger.conditional.MeasurementThresholdTrigger.processEvent(MeasurementThresholdTrigger.java:240)
at org.hyperic.hq.bizapp.server.mdb.RegisteredDispatcherEJBImpl.dispatchEvent(RegisteredDispatcherEJBImpl.java:59)
at org.hyperic.hq.bizapp.server.mdb.RegisteredDispatcherEJBImpl.onMessage(RegisteredDispatcherEJBImpl.java:92)
at sun.reflect.GeneratedMethodAccessor185.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
at org.jboss.ejb.MessageDrivenContainer$ContainerInterceptor.invoke(MessageDrivenContainer.java:475)
at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:185)
at org.jboss.ejb.plugins.MessageDrivenInstanceInterceptor.invoke(MessageDrivenInstanceInterceptor.java:87)
at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:105)
at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
at org.jboss.ejb.plugins.RunAsSecurityInterceptor.invoke(RunAsSecurityInterceptor.java:94)
at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
at org.jboss.ejb.MessageDrivenContainer.internalInvoke(MessageDrivenContainer.java:389)
at org.jboss.ejb.Container.invoke(Container.java:873)
at org.jboss.ejb.plugins.jms.JMSContainerInvoker.invoke(JMSContainerInvoker.java:1090)
at org.jboss.ejb.plugins.jms.JMSContainerInvoker$MessageListenerImpl.onMessage(JMSContainerInvoker.java:1392)
at org.jboss.jms.asf.StdServerSession.onMessage(StdServerSession.java:256)
at org.jboss.mq.SpyMessageConsumer.sessionConsumerProcessMessage(SpyMessageConsumer.java:904)
at org.jboss.mq.SpyMessageConsumer.addMessage(SpyMessageConsumer.java:160)
at org.jboss.mq.SpySession.run(SpySession.java:333)
at org.jboss.jms.asf.StdServerSession.run(StdServerSession.java:180)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.write(Unknown Source)
at org.postgresql.core.PGStream.SendChar(PGStream.java:154)
at org.postgresql.core.v2.QueryExecutorImpl.sendQuery(QueryExecutorImpl.java:355)
at org.postgresql.core.v2.QueryExecutorImpl.execute(QueryExecutorImpl.java:336)
... 56 more
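
With hundreds of entries like this one, it can help to tally them before reading individual traces. The following is a minimal Python sketch, not part of HQ, that counts ERROR entries per logger and root causes per exception class. It assumes the line layout seen in the excerpt above (timestamp, level, logger name in square brackets) and that traces carry a "Caused by:" line; the name tally_errors is made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# "2006-08-30 00:07:18,514 ERROR [logger.name] message"
ENTRY = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ERROR \[(?P<logger>[^\]]+)\]"
)
# Root-cause lines in a Java stack trace: "Caused by: some.Exception: text"
CAUSE = re.compile(r"^Caused by: (?P<cause>[\w.$]+)")


def tally_errors(lines):
    """Return (errors per logger, root causes per exception class)."""
    by_logger = Counter()
    by_cause = Counter()
    for raw in lines:
        line = raw.strip()
        entry = ENTRY.match(line)
        if entry:
            by_logger[entry.group("logger")] += 1
            continue
        cause = CAUSE.match(line)
        if cause:
            by_cause[cause.group("cause")] += 1
    return by_logger, by_cause
```

Passing an open log file to tally_errors and printing by_cause.most_common() shows whether the errors share one root cause (such as the SocketException above) or come from several unrelated failures.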

ZacharyD
Contributor

Does anyone have any clue as to why the HQ software would be acting this way? I can't seem to stop it from telling me that all my platforms are unavailable when nothing has gone down. All platforms are running. Some help would be greatly appreciated. Thank you.

john_hyperic
Hot Shot

It looks like the PostgreSQL server on your machine is either not
working well or crashing. It's not clear why that would happen; I have
not seen this behavior before on any platform. There are several
successful installations of Postgres out there on Windows, so it's tough
to say.

If you have the resources, I would try running the HQ PostgreSQL
database on a separate machine. If you go that route, take a look at
the following page so you can be sure to configure your database correctly.

http://support.hyperic.com/confluence/display/DOCSHQ27/Database+Preparation

If you do not have another machine to run a database on, another option
to try would be to install PostgreSQL from Postgres (as opposed to using
the binaries that are in the HQ bundle, which I assume is what you are
using). Just download the PostgreSQL server for your platform from
http://www.postgresql.org/download/ and prepare it the same way as
documented in the link above.

Let us know what happens.


admin
Immortal

Do all the errors look like the one you posted earlier? I'm wondering if it's a problem with that specific API call, or if it's a more general database error. I'm assuming the latter.

What are the hardware specs of the machine running the HQ server? How many platforms are you monitoring? How often do these errors occur? It could be that we just need to tweak your embedded database settings.

If there is anything interesting in your server-2.7.3/logs/hqdb.log feel free to post it here.

Thanks,
-Ryan

ZacharyD
Contributor

The server we are running it on has two 2.8 GHz Intel Xeon processors with 1 GB of RAM. It is a Windows 2003 server with Service Pack 1 installed. We are currently monitoring 95 different platforms, mainly our Cisco routers, switches and access points. The Windows error messages are different every time; the error messages from yesterday's log are different from the one I posted. They look more like this:

2006-09-06 00:30:58,483 ERROR [org.quartz.core.ErrorLogger] An error occured while marking executed job complete. job= 'Data Purge Group.Data Purge Job'
org.quartz.JobPersistenceException: Couldn't delete fired trigger: ERROR: could not write block 28900 of relation 1663/16384/17118: Permission denied [See nested exception: org.postgresql.util.PSQLException: ERROR: could not write block 28900 of relation 1663/16384/17118: Permission denied]
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggeredJobComplete(JobStoreSupport.java:1945)
at org.quartz.impl.jdbcjobstore.JobStoreCMT.triggeredJobComplete(JobStoreCMT.java:1270)
at org.quartz.core.QuartzScheduler.notifyJobStoreJobComplete(QuartzScheduler.java:1490)
at org.quartz.core.JobRunShell.completeTriggerRetryLoop(JobRunShell.java:384)
at org.quartz.core.JobRunShell.run(JobRunShell.java:268)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
* Nested Exception (Underlying Cause) ---------------
org.postgresql.util.PSQLException: ERROR: could not write block 28900 of relation 1663/16384/17118: Permission denied
at org.postgresql.core.v2.QueryExecutorImpl.receiveErrorMessage(QueryExecutorImpl.java:515)
at org.postgresql.core.v2.QueryExecutorImpl.processResults(QueryExecutorImpl.java:439)
at org.postgresql.core.v2.QueryExecutorImpl.execute(QueryExecutorImpl.java:337)
at org.postgresql.core.v2.QueryExecutorImpl.execute(QueryExecutorImpl.java:231)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:389)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:330)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282)
at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeUpdate(CachedPreparedStatement.java:83)
at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeUpdate(WrappedPreparedStatement.java:316)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.deleteFiredTrigger(StdJDBCDelegate.java:3561)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggeredJobComplete(JobStoreSupport.java:1943)
at org.quartz.impl.jdbcjobstore.JobStoreCMT.triggeredJobComplete(JobStoreCMT.java:1270)
at org.quartz.core.QuartzScheduler.notifyJobStoreJobComplete(QuartzScheduler.java:1490)
at org.quartz.core.JobRunShell.completeTriggerRetryLoop(JobRunShell.java:384)
at org.quartz.core.JobRunShell.run(JobRunShell.java:268)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
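
The three numbers in "relation 1663/16384/17118" identify the file PostgreSQL could not write. In PostgreSQL 8.x messages they are the tablespace OID, the database OID, and the relation's file node, and 1663 is the OID of the default tablespace (pg_default). The sketch below, in Python, pulls those fields out of such a message. The function name is hypothetical, and the byte offset assumes the default 8192-byte block size, which this thread does not confirm for the bundled build.

```python
import re

# Default PostgreSQL block size; an assumption about the HQ-bundled build.
BLOCK_SIZE = 8192

PATTERN = re.compile(
    r"could not (?P<op>read|write) block (?P<block>\d+) of relation "
    r"(?P<tablespace>\d+)/(?P<database>\d+)/(?P<relfilenode>\d+): "
    r"(?P<reason>[^\[\]]+)"
)


def parse_relation_error(message):
    """Extract the block and relation identifiers from an I/O error message."""
    match = PATTERN.search(message)
    if match is None:
        return None
    block = int(match.group("block"))
    return {
        "operation": match.group("op"),
        "block": block,
        "tablespace_oid": int(match.group("tablespace")),
        "database_oid": int(match.group("database")),
        "relfilenode": int(match.group("relfilenode")),
        "reason": match.group("reason").strip(),
        # Offset of the block within the relation, under the block-size assumption.
        "byte_offset": block * BLOCK_SIZE,
    }
```

With the file node in hand, a query such as SELECT relname FROM pg_class WHERE relfilenode = 17118; run inside that database would name the table or index involved.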

This happens daily. I reboot the server and the HQ software works perfectly. Then about 15 minutes later, the HQ web page displays every platform as unavailable. The hqdb.log is very interesting because the log dates everything as July 14. Here's what it looks like:

[2006-07-14 15:20:05.482 ] LOG: database system was shut down at 2006-01-27 11:13:35 Central Standard Time
[2006-07-14 15:20:05.482 ] LOG: checkpoint record is at 0/38C5C0
[2006-07-14 15:20:05.482 ] LOG: redo record is at 0/38C5C0; undo record is at 0/0; shutdown TRUE
[2006-07-14 15:20:05.482 ] LOG: next transaction ID: 569; next OID: 16386
[2006-07-14 15:20:05.482 ] LOG: next MultiXactId: 1; next MultiXactOffset: 0
[2006-07-14 15:20:05.497 ] LOG: database system is ready
[2006-07-14 15:20:05.497 ] LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"
[2006-07-14 15:20:07.097 ] LOG: incomplete startup packet
[2006-07-14 15:20:17.483 ] ERROR: view "eam_virt_plat_view" does not exist
[2006-07-14 15:20:17.498 ] ERROR: view "eam_virt_svr_view" does not exist
[2006-07-14 15:20:17.498 ] ERROR: view "eam_virt_svc_view" does not exist
[2006-07-14 15:20:17.498 ] ERROR: view "eam_metric_prob_view" does not exist
[2006-07-14 15:20:17.725 ] ERROR: table "eam_autoinv_history" does not exist
[2006-07-14 15:20:17.830 ] ERROR: table "eam_autoinv_schedule" does not exist
[2006-07-14 15:20:17.830 ] ERROR: table "eam_plugin" does not exist
[2006-07-14 15:20:17.845 ] ERROR: table "eam_stat_errors" does not exist
[2006-07-14 15:20:17.845 ] ERROR: table "eam_error_code" does not exist
[2006-07-14 15:20:17.845 ] ERROR: table "eam_request_stat" does not exist
[2006-07-14 15:20:17.845 ] ERROR: table "eam_service_request" does not exist
[2006-07-14 15:20:17.845 ] ERROR: table "eam_qrtz_locks" does not exist
[2006-07-14 15:20:17.861 ] ERROR: table "eam_qrtz_scheduler_state" does not exist
[2006-07-14 15:20:17.861 ] ERROR: table "eam_qrtz_paused_trigger_grps" does not exist
[2006-07-14 15:20:17.861 ] ERROR: table "eam_qrtz_calendars" does not exist
[2006-07-14 15:20:17.861 ] ERROR: table "eam_qrtz_fired_triggers" does not exist
[2006-07-14 15:20:17.861 ] ERROR: table "eam_qrtz_trigger_listeners" does not exist
[2006-07-14 15:20:17.876 ] ERROR: table "eam_qrtz_blob_triggers" does not exist
[2006-07-14 15:20:17.876 ] ERROR: table "eam_qrtz_cron_triggers" does not exist
[2006-07-14 15:20:17.876 ] ERROR: table "eam_qrtz_simple_triggers" does not exist
[2006-07-14 15:20:17.876 ] ERROR: table "eam_qrtz_triggers" does not exist
[2006-07-14 15:20:17.891 ] ERROR: table "eam_qrtz_job_listeners" does not exist
[2006-07-14 15:20:17.891 ] ERROR: table "eam_qrtz_job_details" does not exist
[2006-07-14 15:20:17.891 ] ERROR: table "eam_config_props" does not exist
[2006-07-14 15:20:17.891 ] ERROR: table "eam_metric_prob" does not exist
[2006-07-14 15:20:17.906 ] ERROR: table "eam_srn" does not exist
[2006-07-14 15:20:17.906 ] ERROR: table "eam_measurement_bl" does not exist
[2006-07-14 15:20:17.906 ] ERROR: table "eam_numbers" does not exist
[2006-07-14 15:20:17.906 ] ERROR: table "eam_measurement_data_1d" does not exist
[2006-07-14 15:20:17.921 ] ERROR: table "eam_measurement_data_6h" does not exist
[2006-07-14 15:20:17.921 ] ERROR: table "eam_measurement_data_1h" does not exist
[2006-07-14 15:20:17.921 ] ERROR: table "eam_measurement_data" does not exist
[2006-07-14 15:20:17.936 ] ERROR: table "eam_measurement" does not exist
[2006-07-14 15:20:17.936 ] ERROR: table "eam_measurement_arg" does not exist
[2006-07-14 15:20:17.936 ] ERROR: table "eam_measurement_templ" does not exist
[2006-07-14 15:20:17.936 ] ERROR: table "eam_measurement_cat" does not exist
[2006-07-14 15:20:17.936 ] ERROR: table "eam_monitorable_type" does not exist
[2006-07-14 15:20:17.951 ] ERROR: table "eam_fired_trigger" does not exist
[2006-07-14 15:20:17.951 ] ERROR: table "eam_trigger_event" does not exist
[2006-07-14 15:20:17.951 ] ERROR: table "eam_event" does not exist
[2006-07-14 15:20:17.951 ] ERROR: table "eam_user_alert" does not exist
[2006-07-14 15:20:17.966 ] ERROR: table "eam_event_log" does not exist
[2006-07-14 15:20:17.966 ] ERROR: table "eam_alert_action_log" does not exist
[2006-07-14 15:20:17.966 ] ERROR: table "eam_alert_condition_log" does not exist
[2006-07-14 15:20:17.966 ] ERROR: table "eam_alert" does not exist
[2006-07-14 15:20:17.966 ] ERROR: table "eam_alert_condition" does not exist
[2006-07-14 15:20:17.981 ] ERROR: table "eam_registered_trigger" does not exist
[2006-07-14 15:20:17.981 ] ERROR: table "eam_action" does not exist
[2006-07-14 15:20:17.981 ] ERROR: table "eam_alert_definition" does not exist
[2006-07-14 15:20:17.981 ] ERROR: table "eam_control_schedule" does not exist
[2006-07-14 15:20:17.996 ] ERROR: table "eam_control_history" does not exist
[2006-07-14 15:20:17.996 ] ERROR: table "eam_user_config_resp" does not exist
[2006-07-14 15:20:17.996 ] ERROR: table "eam_subject_role_map" does not exist
[2006-07-14 15:20:17.996 ] ERROR: table "eam_subject" does not exist
[2006-07-14 15:20:17.996 ] ERROR: table "eam_role_operation_map" does not exist
[2006-07-14 15:20:18.012 ] ERROR: table "eam_role_resource_group_map" does not exist
[2006-07-14 15:20:18.012 ] ERROR: table "eam_role" does not exist
[2006-07-14 15:20:18.012 ] ERROR: table "eam_res_grp_res_map" does not exist
[2006-07-14 15:20:18.012 ] ERROR: table "eam_resource_group" does not exist
[2006-07-14 15:20:18.012 ] ERROR: table "eam_resource" does not exist
[2006-07-14 15:20:18.027 ] ERROR: table "eam_operation" does not exist
[2006-07-14 15:20:18.027 ] ERROR: table "eam_resource_type" does not exist
[2006-07-14 15:20:18.027 ] ERROR: table "eam_principal" does not exist
[2006-07-14 15:20:18.027 ] ERROR: table "eam_virtual" does not exist
[2006-07-14 15:20:18.027 ] ERROR: table "eam_cprop" does not exist
[2006-07-14 15:20:18.042 ] ERROR: table "eam_cprop_key" does not exist
[2006-07-14 15:20:18.042 ] ERROR: table "eam_service_dep_map" does not exist
[2006-07-14 15:20:18.042 ] ERROR: table "eam_app_service" does not exist
[2006-07-14 15:20:18.042 ] ERROR: table "eam_application" does not exist
[2006-07-14 15:20:18.057 ] ERROR: table "eam_app_type_service_type_map" does not exist
[2006-07-14 15:20:18.057 ] ERROR: table "eam_service" does not exist
[2006-07-14 15:20:18.057 ] ERROR: table "eam_svc_cluster" does not exist
[2006-07-14 15:20:18.057 ] ERROR: table "eam_aiq_service" does not exist
[2006-07-14 15:20:18.057 ] ERROR: table "eam_aiq_ip" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_aiq_server" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_aiq_platform" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_server" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_service_type" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_tier_type" does not exist
[2006-07-14 15:20:18.072 ] ERROR: table "eam_platform_server_type_map" does not exist
[2006-07-14 15:20:18.087 ] ERROR: table "eam_server_type" does not exist
[2006-07-14 15:20:18.087 ] ERROR: table "eam_ip" does not exist
[2006-07-14 15:20:18.087 ] ERROR: table "eam_platform" does not exist
[2006-07-14 15:20:18.087 ] ERROR: table "eam_platform_type" does not exist
[2006-07-14 15:20:18.087 ] ERROR: table "eam_application_type" does not exist
[2006-07-14 15:20:18.102 ] ERROR: table "eam_agent" does not exist
[2006-07-14 15:20:18.102 ] ERROR: table "eam_agent_type" does not exist
[2006-07-14 15:20:18.102 ] ERROR: table "eam_config_response" does not exist
[2006-07-14 15:20:18.434 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_config_response_pkey" for table "eam_config_response"
[2006-07-14 15:20:18.464 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_agent_type_pkey" for table "eam_agent_type"
[2006-07-14 15:20:18.495 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_agent_pkey" for table "eam_agent"
[2006-07-14 15:20:18.540 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_application_type_pkey" for table "eam_application_type"
[2006-07-14 15:20:18.555 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_platform_type_pkey" for table "eam_platform_type"
[2006-07-14 15:20:18.585 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_platform_pkey" for table "eam_platform"
[2006-07-14 15:20:18.630 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_ip_pkey" for table "eam_ip"
[2006-07-14 15:20:18.661 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_server_type_pkey" for table "eam_server_type"
[2006-07-14 15:20:18.721 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_tier_type_pkey" for table "eam_tier_type"
[2006-07-14 15:20:18.736 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_service_type_pkey" for table "eam_service_type"
[2006-07-14 15:20:18.766 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_server_pkey" for table "eam_server"
[2006-07-14 15:20:18.842 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_aiq_platform_pkey" for table "eam_aiq_platform"
[2006-07-14 15:20:18.902 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_aiq_server_pkey" for table "eam_aiq_server"
[2006-07-14 15:20:18.963 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_aiq_ip_pkey" for table "eam_aiq_ip"
[2006-07-14 15:20:19.008 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_aiq_service_pkey" for table "eam_aiq_service"
[2006-07-14 15:20:19.038 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_svc_cluster_pkey" for table "eam_svc_cluster"
[2006-07-14 15:20:19.053 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_service_pkey" for table "eam_service"
[2006-07-14 15:20:19.129 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_application_pkey" for table "eam_application"
[2006-07-14 15:20:19.159 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_app_service_pkey" for table "eam_app_service"
[2006-07-14 15:20:19.204 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_service_dep_map_pkey" for table "eam_service_dep_map"
[2006-07-14 15:20:19.234 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_cprop_key_pkey" for table "eam_cprop_key"
[2006-07-14 15:20:19.264 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_cprop_pkey" for table "eam_cprop"
[2006-07-14 15:20:19.280 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_virtual_pkey" for table "eam_virtual"
[2006-07-14 15:20:19.310 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_principal_pkey" for table "eam_principal"
[2006-07-14 15:20:19.340 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_resource_type_pkey" for table "eam_resource_type"
[2006-07-14 15:20:19.385 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_operation_pkey" for table "eam_operation"
[2006-07-14 15:20:19.446 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_resource_pkey" for table "eam_resource"
[2006-07-14 15:20:19.506 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_resource_group_pkey" for table "eam_resource_group"
[2006-07-14 15:20:19.551 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_res_group_res_mapping_key" for table "eam_res_grp_res_map"
[2006-07-14 15:20:19.581 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_role_pkey" for table "eam_role"
[2006-07-14 15:20:19.612 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_role_res_group_map_key" for table "eam_role_resource_group_map"
[2006-07-14 15:20:19.642 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_role_operation_map_key" for table "eam_role_operation_map"
[2006-07-14 15:20:19.672 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_subject_pkey" for table "eam_subject"
[2006-07-14 15:20:19.717 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_subject_role_mapping_key" for table "eam_subject_role_map"
[2006-07-14 15:20:19.763 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_user_config_resp_pkey" for table "eam_user_config_resp"
[2006-07-14 15:20:19.808 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_control_history_pkey" for table "eam_control_history"
[2006-07-14 15:20:19.838 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_control_schedule_pkey" for table "eam_control_schedule"
[2006-07-14 15:20:19.898 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_alert_definition_pkey" for table "eam_alert_definition"
[2006-07-14 15:20:19.929 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_action_pkey" for table "eam_action"
[2006-07-14 15:20:19.974 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_registered_trigger_pkey" for table "eam_registered_trigger"
[2006-07-14 15:20:20.004 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_alert_condition_pkey" for table "eam_alert_condition"
[2006-07-14 15:20:20.034 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_alert_pkey" for table "eam_alert"
[2006-07-14 15:20:20.065 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_alert_condition_log_pkey" for table "eam_alert_condition_log"
[2006-07-14 15:20:20.095 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_alert_action_log_pkey" for table "eam_alert_action_log"
[2006-07-14 15:20:20.125 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_event_log_pkey" for table "eam_event_log"
[2006-07-14 15:20:20.170 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_user_alert_pkey" for table "eam_user_alert"
[2006-07-14 15:20:20.200 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_event_pkey" for table "eam_event"
[2006-07-14 15:20:20.246 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_fired_trigger_pkey" for table "eam_fired_trigger"
[2006-07-14 15:20:20.276 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_monitorable_type_pkey" for table "eam_monitorable_type"
[2006-07-14 15:20:20.306 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_measurement_cat_pkey" for table "eam_measurement_cat"
[2006-07-14 15:20:20.351 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_measurement_templ_pkey" for table "eam_measurement_templ"
[2006-07-14 15:20:20.397 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_measurement_arg_pkey" for table "eam_measurement_arg"
[2006-07-14 15:20:20.427 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_measurement_pkey" for table "eam_measurement"
[2006-07-14 15:20:20.457 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "measurement_data_id_time_pk" for table "eam_measurement_data"
[2006-07-14 15:20:20.487 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "measurement_data_1h_id_time_pk" for table "eam_measurement_data_1h"
[2006-07-14 15:20:20.502 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "measurement_data_6h_id_time_pk" for table "eam_measurement_data_6h"
[2006-07-14 15:20:20.517 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "measurement_data_1d_id_time_pk" for table "eam_measurement_data_1d"
[2006-07-14 15:20:20.548 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_numbers_pkey" for table "eam_numbers"
[2006-07-14 15:20:20.563 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_measurement_bl_pkey" for table "eam_measurement_bl"
[2006-07-14 15:20:20.593 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "measurement_srn_res_id" for table "eam_srn"
[2006-07-14 15:20:20.638 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_config_props_pkey" for table "eam_config_props"
[2006-07-14 15:20:20.668 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_job_details" for table "eam_qrtz_job_details"
[2006-07-14 15:20:20.683 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_job_listeners" for table "eam_qrtz_job_listeners"
[2006-07-14 15:20:20.714 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_triggers" for table "eam_qrtz_triggers"
[2006-07-14 15:20:20.729 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_simple_triggers" for table "eam_qrtz_simple_triggers"
[2006-07-14 15:20:20.744 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_cron_triggers" for table "eam_qrtz_cron_triggers"
[2006-07-14 15:20:20.759 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_blob_triggers" for table "eam_qrtz_blob_triggers"
[2006-07-14 15:20:20.789 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_pk_qrtz_trigger_listeners" for table "eam_qrtz_trigger_listeners"
[2006-07-14 15:20:20.804 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_qrtz_fired_triggers_pkey" for table "eam_qrtz_fired_triggers"
[2006-07-14 15:20:20.819 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_qrtz_calendars_pkey" for table "eam_qrtz_calendars"
[2006-07-14 15:20:20.834 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_qrtz_paused_trigger_grps_pkey" for table "eam_qrtz_paused_trigger_grps"
[2006-07-14 15:20:20.849 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_qrtz_scheduler_state_pkey" for table "eam_qrtz_scheduler_state"
[2006-07-14 15:20:20.849 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_qrtz_locks_pkey" for table "eam_qrtz_locks"
[2006-07-14 15:20:20.895 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_service_request_pkey" for table "eam_service_request"
[2006-07-14 15:20:20.955 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_request_stat_pkey" for table "eam_request_stat"
[2006-07-14 15:20:21.016 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_error_code_pkey" for table "eam_error_code"
[2006-07-14 15:20:21.046 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_stat_errors_pkey" for table "eam_stat_errors"
[2006-07-14 15:20:21.076 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_plugin_pkey" for table "eam_plugin"
[2006-07-14 15:20:21.106 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_autoinv_schedule_pkey" for table "eam_autoinv_schedule"
[2006-07-14 15:20:21.182 ] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "eam_autoinv_history_pkey" for table "eam_autoinv_history"
[2006-07-14 15:20:28.322 ] LOG: received smart shutdown request
[2006-07-14 15:20:28.322 ] LOG: shutting down
[2006-07-14 15:20:29.167 ] LOG: database system is shut down

admin
Immortal

It looks like you are running into performance issues with the postgres server. What is the disk configuration you are running the postgres server on? Is it being run on a RAID? How many physical agents are reporting into the system? (This number could be less than 95 since our agents can monitor many network devices, which show up as platforms in our inventory.)

To better understand what's going on here we would need to see a few of the server.log files. Can you zip up a couple and attach them to the forum? You can also email them to support@hyperic.com if you don't want your logs available in a public forum.

Thanks,
-Ryan

ZacharyD
Contributor

I tend to believe it is a performance problem. The same server runs both the SQL database and the HQ software, and that server is very slow. I checked Task Manager in Windows and there are around 35 instances of postgres.exe running. The server is using all but 64 MB of the 1 GB of memory in the system, and most of that memory is being used by the postgres.exe processes. It doesn't seem right that there are that many running. I reboot continually (twice today already), but it doesn't seem to help. If I kill the database service, the postgres.exe processes go away. I have sent you the logs in an email; let me know what you think. I was wrong earlier: it isn't a dual-processor server, there is just one 2.8 GHz Xeon processor with hyperthreading. The hard drives are set up as RAID 1. A memory upgrade may be needed; we also use the same server for other monitoring programs, the main one being HP Insight Manager. How do I find out how many agents are running?
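
The process count and memory figures read off Task Manager can also be captured in a repeatable way. This is a rough Python sketch that parses the CSV output of the Windows tasklist command (tasklist /FO CSV). It assumes the English column headers "Image Name" and "Mem Usage" and memory values formatted like "25,600 K"; the function name is invented for illustration.

```python
import csv
import io


def summarize_processes(tasklist_csv, image_name="postgres.exe"):
    """Count processes with the given image name and sum their memory in KB.

    `tasklist_csv` is the text printed by `tasklist /FO CSV` (English headers
    assumed).
    """
    reader = csv.DictReader(io.StringIO(tasklist_csv))
    count = 0
    total_kb = 0
    for row in reader:
        if row["Image Name"].lower() != image_name.lower():
            continue
        count += 1
        # "Mem Usage" looks like "25,600 K"; keep only the digits.
        digits = "".join(ch for ch in row["Mem Usage"] if ch.isdigit())
        total_kb += int(digits) if digits else 0
    return count, total_kb
```

On the server itself the input text could come from subprocess.run(["tasklist", "/FO", "CSV"], capture_output=True, text=True).stdout. Logging the result every few minutes would show whether the number of postgres.exe processes climbs steadily before the platforms are reported as unavailable.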

Message was edited by: ZacharyD
admin
Immortal

You can find the number of agents by going to the Browse Resources link on the header, then click on Servers, then use the drop down to select 'HQ Agent'. That should give you a listing of all the agents in your environment.

For an environment of your size, it would be helpful to have more RAM available in the system. Disk IO throughput is also an important factor.

For now, there are a couple of things we can try to get the database healthy again:

Shut down the HQ server, then open a cmd window and change into your server-2.7.3/bin directory. From there, run db-start.bat to start the database, then db-psql.bat to open a connection to the database. From within psql, run 'VACUUM FULL ANALYZE VERBOSE;'. After it completes, run \q to exit the shell, then restart your HQ server.
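
The steps above could also be scripted. Here is a rough Python sketch, not something that ships with HQ. The helper names `vacuum_plan` and `run_vacuum` are made up for illustration. It assumes that db-start.bat returns once the database is up and that db-psql.bat passes standard input through to psql; neither is confirmed in this thread, so check both before relying on it.

```python
import subprocess
from pathlib import Path

# SQL from the post, followed by \q to leave psql when the vacuum is done.
PSQL_INPUT = "VACUUM FULL ANALYZE VERBOSE;\n\\q\n"


def vacuum_plan(bin_dir):
    """Return the ordered (description, script path) steps from the post."""
    bin_dir = Path(bin_dir).resolve()
    return [
        ("start the database only", bin_dir / "db-start.bat"),
        ("open psql and run the vacuum", bin_dir / "db-psql.bat"),
    ]


def run_vacuum(bin_dir):
    """Run the plan. Shut down the HQ server yourself before calling this."""
    (_, start_script), (_, psql_script) = vacuum_plan(bin_dir)
    # Assumption: db-start.bat returns once the database accepts connections.
    subprocess.run([str(start_script)], cwd=str(start_script.parent), check=True)
    # Assumption: db-psql.bat hands stdin to psql, so the SQL can be piped in.
    subprocess.run(
        [str(psql_script)],
        cwd=str(psql_script.parent),
        input=PSQL_INPUT,
        text=True,
        check=True,
    )


# Print the plan without executing anything.
for description, script in vacuum_plan("server-2.7.3/bin"):
    print(f"{description}: {script.name}")
```

Calling `run_vacuum(...)` with the path to your own bin directory would actually execute the two scripts. Running the steps by hand as described above is the safer route if you are unsure how the batch files behave.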

-Ryan
ZacharyD
Contributor

Ryan,

First, we currently have 6 agents running.

Second, I am running version 2.7.0 of the HQ software (not 2.7.3). Because of this, I can't find the db-psql.bat batch file in the bin directory. I searched the whole hard drive and can't find it anywhere. Is there a different way to connect to the database in version 2.7.0, or another way I can execute the command you suggested?

As a side point, I was watching the forum thread about another user having issues with too many postgres.exe instances running. Suggestions were made there to change the config files, and I was wondering if you think my problem is related, since I have approximately 35 instances of postgres.exe running when the HQ database is running. If it is related, I can make the changes that were suggested there.
admin
Immortal

My apologies, that script is not included in 2.7.0. It will be in our upcoming 2.7.4 release.

In the meantime, I have attached the file to this thread. Simply copy it to your server-2.7.0/bin directory and follow the instructions from before. The VACUUM will take some time to complete.

As for the tuning of the DB, let's wait until we have your current database back and functioning. Your environment is large enough (with 95 platforms) that you need some decent horsepower to ensure HQ is able to keep up with all the metrics you are collecting. If anything, I'd suggest putting in another GB of RAM and upping the shared buffers so that more of your monitoring data can be cached by the database.
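
To give a sense of what "upping the shared buffers" means in numbers, here is a small Python sketch. It assumes the shared_buffers setting in postgresql.conf is a count of buffers of 8 kB each (the default PostgreSQL block size), as in the 8.x releases of that era; check the documentation for the version bundled with your HQ install. The 256 MB figure is only an illustration, not a value recommended in this thread, and `shared_buffers_for` is a hypothetical helper name.

```python
# Assumption: default PostgreSQL block size of 8 kB per shared buffer.
BLOCK_SIZE_BYTES = 8192


def shared_buffers_for(megabytes):
    """Convert a cache size in MB into a shared_buffers buffer count."""
    return megabytes * 1024 * 1024 // BLOCK_SIZE_BYTES


# Illustrative target only: 256 MB of buffer cache.
print(f"shared_buffers = {shared_buffers_for(256)}")  # shared_buffers = 32768
```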

-Ryan
ZacharyD
Contributor

I ran the VACUUM command on the database and the same error messages are still showing up. I haven't upgraded the RAM yet, but I still plan on adding it. Since I am maxing out the RAM on the server, I think the upgrade will do a lot for the performance of the HQ software. I will get back to you after I upgrade the RAM; in the meantime, if you have any more suggestions, let me know.
admin
Immortal

The extra RAM will be good. HQ is usually limited by the backend database, so ensuring it has the proper specs is essential. In addition to RAM, disk I/O is also very important. Dan and Brad have a thread that discusses hardware requirements for their environments (which are in the range of 100-300 physical agents):

http://forums.hyperic.org/jiveforums/message.jspa?messageID=725#725

Hope that helps,
-Ryan
ZacharyD
Contributor

After reading the hardware specs thread: is it recommended that the database and HQ software be run on separate servers? We have them both running on one server. Even if it is not a general recommendation, should we look into running them on separate servers because of how many platforms are reporting to HQ? Once the system is up and running properly, we will be adding more platforms to monitor. We might be at around 150 monitored platforms when we have everything entered into the system.
BradFelmey
Hot Shot

Zach, I can tell you from our own experience that you will be very hard-pressed to run 150 agents against a single app/db HQ server, unless that server is very robust. CPU is not affected much, but it will need gobs of RAM and more importantly solid I/O (read: very fast disk subsystem). Our 3.2GHz, 3GB RAM, RAID-1 system started having serious slowdowns around the 50-agent mark, and we're not monitoring a whole lot of detailed metrics, either. Our CPU is almost pathetically unused, but the drives are being beaten to death.
borism
Contributor

After playing with all kinds of parameters, I found that increasing this parameter helps (for the vacuum process):
max_fsm_pages = 2000000
In addition, I raised the maintenance memory parameter to ~400 MB:
maintenance_work_mem = 416384
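
For anyone wondering what those two numbers cost in memory, here is a quick Python sketch of the arithmetic. It assumes PostgreSQL 8.x semantics, where maintenance_work_mem is given in kilobytes and each max_fsm_pages slot takes roughly six bytes of shared memory; verify both against the documentation for your version. The function names are just for illustration.

```python
# Assumption: about 6 bytes of shared memory per free-space-map page slot.
FSM_BYTES_PER_PAGE_SLOT = 6


def kb_to_mb(kilobytes):
    """Convert a setting given in kilobytes to megabytes."""
    return kilobytes / 1024


def fsm_shared_memory_mb(max_fsm_pages):
    """Approximate shared memory used by the free space map, in MB."""
    return max_fsm_pages * FSM_BYTES_PER_PAGE_SLOT / (1024 * 1024)


print(round(kb_to_mb(416384), 1))                 # 406.6
print(round(fsm_shared_memory_mb(2_000_000), 1))  # 11.4
```

So under those assumptions the maintenance setting allows about 407 MB per maintenance operation, which is a large share of a 1 GB machine, while the larger free space map costs only around 11 MB.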

Regards,
Boris Mikhailovski

Message was edited by: borism
dgorman_hyperic
Enthusiast

Yup. RAM and disks for the DB are critical. You can generally work around CPU bottlenecks, but the biggest bang for the buck is disks and RAM.
ZacharyD
Contributor

I am working on the upgrade right now, but I don't know if the company is ready to put the money into this yet. I would gladly accept any other ideas to improve performance outside of hardware upgrades. I totally agree that the hardware upgrades are needed, though.
admin
Immortal

As a side note, we are looking into possible solutions that would help minimize the database I/O, mainly by removing the need for the hourly database maintenance that compresses metrics from one time window to the next.

We're also looking into database partitioning, which would help with the fragmentation caused by deleting compressed metric data.
ZacharyD
Contributor

We have upgraded the server to 3 GB of RAM, and that appears to have solved the issue for now. It has only been about a day, but everything appears to be running great. We are looking into upgrading to version 2.7.5 and redoing the installation so that the database and HQ software aren't running on the same server. Everything runs fine, but quite slowly. I will try many of the suggestions I got in this thread to speed things up. I will mark my question as answered because my initial question has been solved. I would like to thank all who posted and helped me with this issue. The software works great.