Dean Banks

Support alerts based on data aggregated from multiple hosts

We run a horizontally distributed architecture. As such, we really don't care (too much) if we lose one of N hosts, provided that a minimum number of hosts/processes/etc. are up and healthy. LogicMonitor makes it easy to make a graph of computed datapoints that span hosts, but doesn't let us configure alerts on the same computed data.

Tangible example: One application, when running, publishes capacity data to LM. This capacity data is aggregated and graphed, giving us great insight for planning purposes. However, the only alert configuration that LM supports requires us to alert on every single host, which sometimes causes unnecessary wake-ups in the middle of the night. Operationally, we'd be fine with one host being down, as long as we maintain adequate reserve capacity. System-wide reserve capacity can only be determined by aggregating data across the set of hosts (just like the graphs do).

We've been told to write custom scripts to do the collection and aggregation, and perhaps some rainy day we will. However, it seems like
1) LM does so much of the necessary bits already and
2) this would be a really useful capability for anyone that runs a horizontally distributed architecture.
This isn't a "holy cow, gotta have this now!" type of feature request, but certainly would be a great value-add.
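
To make the request concrete, here is a minimal sketch of the kind of evaluation being asked for, written as plain Python rather than anything LogicMonitor provides. The names (`HostSample`, `reserve_capacity_alert`), the capacity units, and the threshold are all hypothetical; the assumption is that each host reports a total and a used capacity figure, and that a host that is down contributes no reserve.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class HostSample:
    """One host's latest capacity reading (hypothetical shape)."""
    name: str
    up: bool
    capacity_total: float  # units the host can serve when healthy
    capacity_used: float   # units currently in use


def reserve_capacity(samples: Iterable[HostSample]) -> float:
    """Sum the unused capacity of every healthy host; down hosts add nothing."""
    return sum(s.capacity_total - s.capacity_used for s in samples if s.up)


def reserve_capacity_alert(samples: List[HostSample], min_reserve: float) -> bool:
    """Alert only when the fleet-wide reserve drops below min_reserve."""
    return reserve_capacity(samples) < min_reserve


# Illustrative data: three hosts, one of them down.
fleet = [
    HostSample("app1", True, 100.0, 40.0),
    HostSample("app2", True, 100.0, 50.0),
    HostSample("app3", False, 100.0, 0.0),
]
print(reserve_capacity(fleet))               # 110.0
print(reserve_capacity_alert(fleet, 100.0))  # False
```

With these made-up numbers a single host being down does not page anyone, because the remaining reserve (110) is still above the minimum (100).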

Edited by Mike Suding
  • Upvote 2

The ability to alert when N hosts are down in a given host group would be nice. So would service dependency alerts: if X is down or Y is down, I don't care, but if X is down and Y is down, then I do care. Realistically, this means adding logic to alerts so that they can take multiple inputs. If conditions A and B are both true, then check C. If C=D while the A and B condition is triggered, then alert.
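
The two rules described above can be sketched as small Python predicates. This is only an illustration of the logic; the function names and the shape of the inputs are hypothetical and do not correspond to any LogicMonitor feature.

```python
from typing import Mapping


def hosts_down_alert(status: Mapping[str, bool], max_down: int) -> bool:
    """Alert when more than max_down hosts in the group are down.

    status maps a host name to True if that host is up.
    """
    down = sum(1 for is_up in status.values() if not is_up)
    return down > max_down


def dependency_alert(a: bool, b: bool, c: object, d: object) -> bool:
    """If conditions A and B both hold, check C; alert only when C equals D."""
    return bool(a and b and c == d)


# Hypothetical host group: one host down is tolerated, two are not.
one_down = {"x": False, "y": True, "z": True}
two_down = {"x": False, "y": False, "z": True}
print(hosts_down_alert(one_down, max_down=1))  # False
print(hosts_down_alert(two_down, max_down=1))  # True
```

The "X and Y both down" case is the same check with `max_down` set to one less than the number of hosts that must fail before anyone cares.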

Edited by Mike Suding

Hi Steve, Great to hear from you. Cluster Alerts are close, but don't actually do what we want. I need the ability to sum or average a DataPoint across the host group, and then alert based on the computed value. Think of it as a multi-host complex datasource that supports alerting.
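
As a rough sketch of what "sum or average a DataPoint across the host group, then alert on the computed value" could look like, assuming the per-host values are already in hand. The `AGGREGATORS` table, the function names, and the sample values are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, Mapping, Sequence

# Aggregation functions a group-level datapoint might offer (assumed set).
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "avg": mean,
    "min": min,
    "max": max,
}


def group_datapoint(values: Mapping[str, float], how: str) -> float:
    """Collapse one datapoint's per-host values into a single group value."""
    return AGGREGATORS[how](list(values.values()))


def group_alert(values: Mapping[str, float], how: str, threshold: float) -> bool:
    """Alert when the computed group value exceeds the threshold."""
    return group_datapoint(values, how) > threshold


# Illustrative per-host readings for a single datapoint.
cpu_busy = {"web1": 70.0, "web2": 90.0, "web3": 95.0}
print(group_datapoint(cpu_busy, "avg"))    # 85.0
print(group_alert(cpu_busy, "avg", 90.0))  # False
```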
Cheers

Edited by Mike Suding

This is a late comment to this thread, but will maybe spark the discussion again. As an MSP, I have customer contracts that involve billing for disk space. Currently there is no way to aggregate and alert off of a single datasource, such as WinVolumeUsage. I don't particularly care how a customer uses their allocated space, but I do care when they cross a threshold, as that would result in a follow-up process for my accounting team. And prior to their crossing that threshold, I don't need to be manually monitoring the data, hence the request for alerting, as opposed to reporting.
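
A minimal sketch of that billing check, assuming per-volume usage has already been collected into `(customer, used_gb)` pairs. The function names, customer names, and figures are all hypothetical, and nothing here reads from WinVolumeUsage itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def usage_by_customer(volumes: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Total the used space (GB) of every volume belonging to each customer."""
    totals: Dict[str, float] = defaultdict(float)
    for customer, used_gb in volumes:
        totals[customer] += used_gb
    return dict(totals)


def over_allocation(totals: Dict[str, float],
                    allocated_gb: Dict[str, float]) -> List[str]:
    """Return the customers whose total usage exceeds their allocation.

    Customers with no recorded allocation are never flagged.
    """
    return sorted(
        customer
        for customer, used in totals.items()
        if used > allocated_gb.get(customer, float("inf"))
    )


# Illustrative data: two volumes for one customer, one for another.
volumes = [("acme", 120.0), ("acme", 400.0), ("globex", 80.0)]
allocated = {"acme": 500.0, "globex": 100.0}
print(over_allocation(usage_by_customer(volumes), allocated))  # ['acme']
```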

Edited by Mike Suding

I can't believe I am responding to a 4.5 year old topic, but aggregated data handling is sorely needed.  We have an example currently where we care about total ISDN B-channel usage across two or more voice gateways.  The only option we can come up with is to use the API within a custom datasource, which is a PITA without library support (topic of another FR I filed some time back).  Thanks to @Steve Francis we at least have some sample code to get API data collected, but...wow.  It should be possible to do this more directly within datasource definitions.  Alternatively, allow people to set alerts on widgets, which do allow aggregation of data like this.

@Sarah Terry OK, thanks.  I am a bit concerned about the SolarWinds-ification going on here where every useful feature is turned into an add-on, but this one seems to warrant it.  I've previously suggested to our CSM that all such features be clearly tagged as 'Premium' (or whatever word you prefer) in the documentation.  As it stands now, this and others (e.g., LMConfig) have documentation that gives no indication they need licensing to activate.

  • Upvote 1

Disappointed that this is a paid-for add-on feature; it should be core to the product.

Having tried it a bit, I have to agree 100%.  It is a long-missing core capability, and so far it seems to have limited functionality (useful regardless).  There is a wizard that creates a new type of datasource that has to live in a tree parallel to devices, and the names used for its groups are not allowed to overlap with device group names.  As an MSP, this means the per-client structure we have in place is broken immediately.  After you use the wizard, you have to edit the datasource it created to make any changes.  Not horrible, but clunky, and definitely not something I see as worth extra fees, unless I am missing something.
