1 Reply. Latest reply: Mar 18, 2009 3:49 PM

    [JIRA] Created: (HHQ-2899) HQ server needs better handling when HQ database becomes unavailable

      HQ server needs better handling when HQ database becomes unavailable
      --------------------------------------------------------------------

                       Key: HHQ-2899
                       URL: http://jira.hyperic.com/browse/HHQ-2899
                   Project: Hyperic HQ
                Issue Type: Improvement
                Components: Server
          Affects Versions: 4.0.3
               Environment: all
                  Reporter: Todd Rader
                  Assignee: Todd Rader
                  Priority: Critical
                   Fix For: 4.1.1, 4.2.0


      CBSi reported an incident whereby a crash of the HQ database caused a major alert storm.  In the case of an HQ database outage, HQ should stay up but attempt to do as little as possible -- metric data cannot be stored, alert states and escalations cannot be evaluated, etc., until the database recovers and HQ is able to connect.  The verification of this should focus on narrow issues: no alert storms (a small flurry of alerts is okay), no excessive exception output to the server log, and a reasonable ERROR-level message.

      If the user has configured an alert on the availability of the HQ database, it's very important to get that right -- the DB alert must fire in that case.
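The requested behavior can be illustrated with a minimal Java sketch. It is not taken from the Hyperic HQ code base: the class `DatabaseOutageGate`, its method names, and the `databaseReachable` probe are all hypothetical, and the sketch assumes that some cheap health check for the database exists. It shows three things from the description above: database-dependent work is skipped while the database is down, a single ERROR-level message (SEVERE in `java.util.logging`) is written per outage instead of one per failed task, and the alert on the database's own availability still fires.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;
import java.util.logging.Logger;

/** Hypothetical gate for illustration only; not part of Hyperic HQ. */
public class DatabaseOutageGate {
    private static final Logger LOG =
            Logger.getLogger(DatabaseOutageGate.class.getName());

    // Assumed health probe: returns true while the database is reachable.
    private final BooleanSupplier databaseReachable;
    private final AtomicBoolean outageReported = new AtomicBoolean(false);
    private final AtomicBoolean dbAlertFired = new AtomicBoolean(false);
    private final AtomicLong skipped = new AtomicLong();
    private final AtomicLong errorsLogged = new AtomicLong();

    public DatabaseOutageGate(BooleanSupplier databaseReachable) {
        this.databaseReachable = databaseReachable;
    }

    /**
     * Runs database-dependent work (metric storage, alert evaluation,
     * escalations) only while the database is reachable.
     * Returns true if the task ran, false if it was skipped.
     */
    public boolean runIfDatabaseUp(Runnable task) {
        if (databaseReachable.getAsBoolean()) {
            if (outageReported.compareAndSet(true, false)) {
                LOG.info("HQ database is reachable again; resuming processing");
            }
            task.run();
            return true;
        }
        // Log one ERROR-level message per outage, not one per skipped task.
        if (outageReported.compareAndSet(false, true)) {
            errorsLogged.incrementAndGet();
            LOG.severe("HQ database is unavailable; suspending database work");
        }
        skipped.incrementAndGet();
        return false;
    }

    /**
     * Fires the user-configured database-availability alert once per outage.
     * Returns true if the alert action ran on this call.
     */
    public boolean checkDatabaseAlert(Runnable dbAlert) {
        if (databaseReachable.getAsBoolean()) {
            dbAlertFired.set(false); // re-arm for the next outage
            return false;
        }
        if (dbAlertFired.compareAndSet(false, true)) {
            dbAlert.run();
            return true;
        }
        return false;
    }

    public long skippedCount() { return skipped.get(); }

    public long errorsLoggedCount() { return errorsLogged.get(); }
}
```

In this sketch the gate deliberately drops work during an outage rather than queueing it; whether HQ should buffer metric data for later storage is a separate design question that the issue does not address.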

      --
      This message is automatically generated by JIRA.
      -
      If you think it was sent incorrectly contact one of the administrators: http://jira.hyperic.com/secure/Administrators.jspa
      -
      For more information on JIRA, see: http://www.atlassian.com/software/jira

             

        • 1. [JIRA] Resolved: (HHQ-2899) HQ server needs better handling when HQ database becomes unavailable

               [ http://jira.hyperic.com/browse/HHQ-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

          Kashyap Parikh resolved HHQ-2899.
          ---------------------------------

              Resolution: Fixed

          Same as HHQ-2937, which is resolved.
