mnagel

Alerts on Longer Periods within Datasources

For a datasource, we would like to be able to set an alert threshold over a time range rather than against a single sample. You can already set the number of threshold violations needed to trigger an alert, but that is very different in nature from a threshold over a time range: compare 60% CPU over 2 hours with 60% CPU over 10 consecutive samples. CPU may fluctuate within that period, dipping below the threshold often enough to prevent an alert, even though the average over the longer period is the valuable signal. Similarly, we would like alerts not only on the average over a time period but also on the slope over a time period, though perhaps the latter should be a separate request.
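To make the difference concrete, here is a rough Python sketch of the three checks being compared: consecutive violations, average over a time window, and slope over a time window. The function names and the 5-minute polling interval used in the usage note are assumptions for illustration only; none of this is an existing product feature or API.

```python
from statistics import mean


def consecutive_violation_alert(values, threshold, count):
    """True if the most recent `count` values all exceed the threshold."""
    if len(values) < count:
        return False
    return all(v > threshold for v in values[-count:])


def window_average_alert(samples, threshold, window_seconds):
    """samples: list of (timestamp_seconds, value), oldest first.

    True if the mean of the values inside the trailing window exceeds
    the threshold, no matter how much they fluctuate within it.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    recent = [v for t, v in samples if latest - t < window_seconds]
    return mean(recent) > threshold


def window_slope(samples, window_seconds):
    """Least-squares slope (value units per second) over the trailing window."""
    if not samples:
        return 0.0
    latest = samples[-1][0]
    recent = [(t, v) for t, v in samples if latest - t < window_seconds]
    if len(recent) < 2:
        return 0.0
    t_bar = mean(t for t, _ in recent)
    v_bar = mean(v for _, v in recent)
    denom = sum((t - t_bar) ** 2 for t, _ in recent)
    if denom == 0:
        return 0.0
    return sum((t - t_bar) * (v - v_bar) for t, v in recent) / denom
```

With a hypothetical 5-minute poll and CPU alternating between 50% and 80% for 2 hours, the consecutive check at 60% never sees 10 violations in a row, while the windowed average (65%) crosses the threshold.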

Thanks,

Mark

The number of consecutive violations multiplied by the datasource polling interval tells you roughly how many minutes a condition must persist before an alert fires. Be advised that if you specify multiple severities in the dataPoint threshold and the severities are "too close" to each other, the count is reset whenever a polled value jumps into a different severity. This is true in both directions: warn -> error -> critical and critical -> error -> warn.
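As a rough illustration of both points, the sketch below computes the approximate time to alert and simulates a violation counter that restarts whenever the severity changes. This is a toy model of the behavior described above, not the product's actual implementation; the function names and threshold values are made up for the example.

```python
def minutes_to_alert(consecutive_violations, polling_interval_minutes):
    """Approximate minutes a condition must persist before an alert fires."""
    return consecutive_violations * polling_interval_minutes


def severity_of(value, thresholds):
    """thresholds: list of (name, limit) in ascending order of limit.

    Returns the highest severity whose limit the value exceeds, or None.
    """
    current = None
    for name, limit in thresholds:
        if value > limit:
            current = name
    return current


def first_alert_index(values, thresholds, required):
    """Index of the poll at which an alert would fire, or None.

    Assumption: the violation counter restarts whenever the severity
    differs from the previous poll, in either direction.
    """
    streak = 0
    last = None
    for i, value in enumerate(values):
        sev = severity_of(value, thresholds)
        if sev is None:
            streak, last = 0, None
            continue
        streak = streak + 1 if sev == last else 1
        last = sev
        if streak >= required:
            return i
    return None
```

For example, with hypothetical thresholds of warn > 60, error > 65, critical > 70 and 5 required violations, a steady 62% fires on the fifth poll, while a value bouncing between 62% and 67% never fires even though every poll is in violation.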

For myself, I'd love to see thresholds updated to support a more structured scripting language, so I could evaluate the last X values to decide when to fire the alert. Zabbix has this in its trigger expressions (https://www.zabbix.com/documentation/2.4/manual/config/triggers/expression).
