• 0

Alerts that monitor other alerts


grantae

Question

Example: I have one router connected to another router on the network by 2 links (interfaces, tunnels, etc.). If one of the links goes down, the normal alert rule that emails me is fine. However, if BOTH links go down, I want a page.

Cluster alerts were close to what I needed, but it seems they can only be set to trigger if ANY 2 links go down, not if these 2 specific links go down. I care about the relationship between 2 specific links on a device, not about other ports going to random servers that happen to go down. (I have different alerts for those.)

Has anyone dealt with a similar issue and found a workaround or solution? Maybe an eventsource (or something similar) could check whether Alert A and Alert B exist at the same time?


3 answers to this question


  • 1
  • Administrators

You could create a service out of those two links. The service metric would be interface status. You would choose to aggregate the status data by "mean". If both links are up, they'd both return 1, so the average would be 1. If one link is down, you'd get the average of 1 and 2 (1.5). If both links are down, you'd average 2 and 2 (2). Set your threshold to >=2 and you should be good to go.

The only tedious part is setting this up for each pair of links you have.
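To make the arithmetic concrete, here is a minimal sketch of the mean-and-threshold logic in Python. It is not the product's implementation, only an illustration: it assumes the interface status datapoint reports 1 for up and 2 for down, as described above, and the function names `aggregate_status` and `should_page` are hypothetical.

```python
from statistics import mean

# Assumed status values, taken from the description above: 1 = up, 2 = down.
UP, DOWN = 1, 2


def aggregate_status(link_statuses):
    """Aggregate the member links' status values by mean."""
    return mean(link_statuses)


def should_page(link_statuses, threshold=2):
    """Return True only when the aggregated status reaches the threshold.

    With two links and a threshold of >= 2, this is True only when
    both links report DOWN.
    """
    return aggregate_status(link_statuses) >= threshold


if __name__ == "__main__":
    # Walk through the three cases for a pair of links.
    for statuses in [(UP, UP), (UP, DOWN), (DOWN, DOWN)]:
        print(statuses, aggregate_status(statuses), should_page(statuses))
```

Because the service contains only the two links you care about, other ports on the device never enter the average, which is the difference from an "any 2 links" cluster alert.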

  • Like 1

  • 0

Sure, you can use Service Insight for this, but it is a premium feature, so you end up using an expensive mallet to handle something that should be available without the extra cost. Alternatively, there should be a "light" version of Service Insight for this kind of thing, leaving the costly part for the intended enhanced features of Service Insight (like Kubernetes).

My recommendation on this was to extend cluster alerts so you could at least match up instances. My use case at the time was detecting an AP going offline on a controller cluster. There is no way to do this without SI, which, as you say, is complex and comes at an extra cost. We need features like this in the base product.

  • Like 1

  • 0
1 hour ago, Stuart Weenig said:

You could create a service out of those two links. The service metric would be interface status. You would choose to aggregate the status data by "mean". If both links are up, they'd both return 1, so the average would be 1. If one link is down, you'd get the average of 1 and 2 (1.5). If both links are down, you'd average 2 and 2 (2). Set your threshold to >=2 and you should be good to go.

The only tedious part is setting this up for each pair of links you have.

 

This sounds perfect! Thank you for the suggestion!

I put the 2 instances I want to monitor together and made a rule for them. I just need to test it to see if it works as intended.
