Tested with build #112 (Linux, MySQL, 1327 platforms) and found that the backfiller starts before the queue is completely empty, so alerts are fired for all the agents for the time the server was down.
Steps/explanation:
1. Stop hq-server #105 (~10:30 am)
2. Start it after 1 hour
3. Stop it within the next ~5 mins
4. Upgrade server to #112
5. Start upgraded server (now 2 hours since it was first brought down, ~12:30 pm)
The server stats and logs show that the backfiller starts at 1:03 and alerts are immediately fired for hundreds of agents. The availability queue size does decrease around this time, but it keeps spiking and falling for a few more minutes before it stabilizes at a lower value.
As discussed with Patrick and Scott, the avail queue size falling below 1000 does not seem to be a good indication that it is time to start the backfiller. Re-opening for a change in the logic that decides the backfiller start time.
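To illustrate why a single reading below 1000 is fragile, here is a minimal sketch of one possible alternative: only allow the backfiller to start once the queue size has stayed below the limit for a sustained quiet period. This is not the HQ implementation and not an agreed fix. The class name, the method names and the 300-second quiet period are all assumptions made for illustration; only the 1000 limit comes from the current behavior.

```python
class BackfillStartGate:
    """Hypothetical gate: allow the backfiller only after the queue has stayed small."""

    def __init__(self, max_queue_size=1000, quiet_seconds=300):
        self.max_queue_size = max_queue_size  # 1000 is the limit used today
        self.quiet_seconds = quiet_seconds    # assumed value, would need tuning
        self._quiet_since = None              # time the queue last dropped below the limit

    def observe(self, now, queue_size):
        """Record one sample; return True once the queue has been quiet long enough."""
        if queue_size >= self.max_queue_size:
            # A spike resets the quiet period.
            self._quiet_since = None
            return False
        if self._quiet_since is None:
            self._quiet_since = now
        return now - self._quiet_since >= self.quiet_seconds


def first_safe_time(samples, gate):
    """Return the timestamp of the first sample at which the gate opens, or None."""
    for now, size in samples:
        if gate.observe(now, size):
            return now
    return None
```

With made-up samples that dip below 1000 at 60 s, spike again at 120 s and then stay low from 180 s onward, a plain "size < 1000" check would start the backfiller at 60 s, while this gate waits until 480 s. The sample values are invented and are not taken from the attached stats.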
The attached hq-stats file has charts (in a separate sheet) for correlation, and the time/row at which the backfiller starts is marked in red.
> Implement Backfiller Startup "smart" logic
> ------------------------------------------
>
> Key: HHQ-4298
> URL: http://jira.hyperic.com/browse/HHQ-4298
> Project: Hyperic HQ
> Issue Type: Improvement
> Components: Alerts
> Affects Versions: 4.5
> Reporter: Patrick Nguyen
> Assignee: Patrick Nguyen
> Priority: Major
> Fix For: 4.5 Sprint 30, 4.5 M7, 4.5
>
>
> Need to find a way to deterministically know when it is acceptable to start the backfiller.