Dependencies or Parent/Child Relationships


Richard Ortiz

  • 1 month later...

@Mosh I had not considered building myself, but as you say that, I think I could put together at least a rudimentary solution using the API.  To be effective, however, it would need to either poll very frequently or be triggered by alerts (and this would still be leaky).  Will have to mock something up as the current situation is unbearable for us, let alone our clients who receive alerts from the system...


  • 1 month later...
  • LogicMonitor Staff

Agree - this has taken way too long to get into the product officially. (It is in the works, but as Mike said, is at least 6 months away. We're working on improving our processes and efficiencies, too...)

In the interim, the two datasources below, available from the registry with the locators shown, can provide dependencies at the device level.  Feedback appreciated!

SDT_Dependent_Devices: locator 24KKNG

SDT_Assign_Primary_For_Dependencies: locator NFTHXG

Creating Device Dependencies 

With these two datasources, LogicMonitor supports device dependencies in order to help reduce alert noise.

Dependent devices have a primary device. When the primary device reports a specific kind of alert (by default, a ping alert, but this is configurable), then the dependent devices are placed in scheduled downtime. This means that if the dependent devices report alerts, they will not be escalated.

Dependent devices will be placed in Scheduled Downtime for 30 minutes at a time. If the primary device is still in alert, the Scheduled Downtime will be refreshed for another 30 minutes before the existing Scheduled Downtime period expires. Note: when the alerts clear on the primary device, the dependent devices will remain in Scheduled Downtime for the remainder of the existing 30-minute period. This allows circuits to re-establish, alerts to clear, etc.

 Configuring Device Dependencies

Ensure your account has the SDT_Dependent_Devices and SDT_Assign_Primary_For_Dependencies datasources. Import them from the registry using the above locators if necessary.

You will need a LogicMonitor API token for a user that has rights to manage the primary and dependent devices. Create two properties on the root level of your LogicMonitor account: logicmonitor.access.id and logicmonitor.access.key, and set their values to the API token’s ID and Key, respectively.
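For anyone scripting against the API with the same token, here is a minimal sketch of how the ID and Key could be turned into a request header. It assumes the LMv1 signing scheme as described in LogicMonitor's REST API documentation (HMAC-SHA256 over verb + epoch + body + resource path, hex-encoded, then base64-encoded); the function name and parameters are illustrative and not taken from the datasources themselves, so check the current API docs before relying on it.

```python
import base64
import hashlib
import hmac
import time


def lmv1_auth_header(access_id, access_key, http_verb, resource_path,
                     data="", epoch_ms=None):
    """Build an LMv1 Authorization header value for a single REST request.

    Assumption: the signature is base64(hex(HMAC-SHA256(key, verb + epoch +
    body + resource path))), with the epoch given in milliseconds.
    """
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)
    epoch = str(epoch_ms)

    # The signed message is the plain concatenation of the four parts.
    message = http_verb + epoch + data + resource_path
    digest_hex = hmac.new(access_key.encode(), message.encode(),
                          hashlib.sha256).hexdigest()
    signature = base64.b64encode(digest_hex.encode()).decode()

    return f"LMv1 {access_id}:{signature}:{epoch}"


if __name__ == "__main__":
    # Placeholder credentials; in the datasources these would come from
    # logicmonitor.access.id and logicmonitor.access.key.
    print(lmv1_auth_header("my-id", "my-key", "GET", "/device/devices"))
```

The resulting string would be sent as the Authorization header; the resource path is signed without the query string.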

To create a dependency on device A, so that devices B and C will be automatically placed in scheduled downtime when device A is in alert:

Navigate to device A, and determine the device’s displayName as entered in LogicMonitor. Note: this is not the IP/DNS name, but the value of the name field when managing the device.

e.g. in the screenshot attached to the original post, the relevant name is ESXi1 - Dell iDRAC8

Now simply navigate to devices B and C in LogicMonitor, and add the property depends_on to each device, and set it to the value of the displayName of device A.

That’s it.

Within 30 minutes of the first device being set to depend on device A, LogicMonitor will configure itself so that if device A has an alert on the ping datasource, it will place all dependent devices into scheduled downtime for 30 minutes, as described above. (Note: You can cause the reconfiguration to happen immediately if you run Poll Now for the SDT_Assign_Primary_For_Dependencies datasource on one of the dependent devices.)

Once the primary device is in an alert that matches the alert conditions (any Ping alert, by default), it will SDT the dependent devices. You will see a property created on the primary device, dependents_sdtd, that contains a list of the devices that were most recently placed in SDT by the dependency action. There will also be another property, dependents_sdt_until, that contains the epoch time at which the most recently set SDT will expire. If the alert condition still exists 5 minutes before the expiration of the SDT, a new SDT will be created.
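The refresh rule described above can be sketched roughly as follows. This is an illustration of the timing only, not the datasource's actual script: the function and constant names are made up, epoch values are assumed to be in seconds, and a refreshed SDT is assumed to run 30 minutes from the moment it is created.

```python
SDT_LENGTH_SECONDS = 30 * 60      # each SDT lasts 30 minutes
REFRESH_MARGIN_SECONDS = 5 * 60   # refresh when 5 minutes or less remain


def next_sdt_until(now, sdt_until, primary_in_alert):
    """Return the SDT expiry (epoch seconds) that should be in effect.

    now              -- current epoch time in seconds
    sdt_until        -- current value of dependents_sdt_until, or None
    primary_in_alert -- True if the primary matches the alert conditions
    """
    if not primary_in_alert:
        # Alert cleared: leave any existing SDT to run out on its own.
        return sdt_until
    if sdt_until is None or sdt_until - now <= REFRESH_MARGIN_SECONDS:
        # No SDT yet, or it is about to expire: start a fresh 30 minutes.
        return now + SDT_LENGTH_SECONDS
    # Plenty of time left on the existing SDT: nothing to do.
    return sdt_until


if __name__ == "__main__":
    print(next_sdt_until(now=0, sdt_until=None, primary_in_alert=True))
    print(next_sdt_until(now=1500, sdt_until=1800, primary_in_alert=True))
    print(next_sdt_until(now=1500, sdt_until=1800, primary_in_alert=False))
```

Under these assumptions, the dependents can stay in SDT for up to 30 minutes after the primary's alert clears, which matches the behaviour described earlier in the post.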

Note that devices that are primary for one set of devices can themselves be dependent on other devices (e.g. a remote server can be dependent on a branch office router, but that router may be dependent on a VPN router).

If a dependent device has a depends_on property that is set to a device that does not exist, a warning alert will be raised on that dependent device. (Similarly, there will be a warning if the authentication credentials are not set correctly.)
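To check a set of depends_on values before applying them, something like the sketch below may help. It groups dependents under their primary and lists any device whose depends_on points at a displayName that does not exist. The function name and the input shapes (a plain dict and a set of names) are assumptions for illustration; the datasources do this work through the API.

```python
from collections import defaultdict


def group_dependents(depends_on_by_device, known_display_names):
    """Group devices by primary and flag depends_on values with no match.

    depends_on_by_device -- {device displayName: value of depends_on}
    known_display_names  -- set of displayNames that exist in the account
    Returns (dependents_by_primary, missing_primary_by_device).
    """
    dependents = defaultdict(list)
    missing = {}
    for device, primary in sorted(depends_on_by_device.items()):
        if primary in known_display_names:
            dependents[primary].append(device)
        else:
            # This is the case that would raise a warning on the device.
            missing[device] = primary
    return dict(dependents), missing


if __name__ == "__main__":
    # Hypothetical chain: server -> branch router -> VPN router.
    depends_on = {
        "remote-server": "branch-router",
        "branch-router": "vpn-router",
        "lab-switch": "decommissioned-router",
    }
    names = {"remote-server", "branch-router", "vpn-router", "lab-switch"}
    grouped, missing = group_dependents(depends_on, names)
    print(grouped)
    print(missing)
```

Note that branch-router appears both as a primary (for remote-server) and as a dependent (of vpn-router), which is the chained case described above.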

Optional - changing the alert conditions for the primary device to trigger dependencies

By default, primary devices will trigger SDT for dependent devices if the primary device is in any ping alert (either packet loss or latency) of any level. You can change the conditions that trigger the dependency action by setting the property primaryalert on the primary device.
This property can be set to any valid filter supported by the LogicMonitor REST API call that returns alerts for a device.
The value of the property is appended to the API query filter=resourceTemplateName:
Thus, in the simplest case, set the property primaryalert to another datasource's Displayed As field (not its name) to act on alerts about that datasource.

Setting property primaryalert to this value will suppress dependent devices’ alerts when the primary has this alert:

HTTPS-
    any alerts about the HTTPS- datasource.

HTTPS-,instanceName:HTTPS-443
    alerts on the 443 instance of the HTTPS- datasource.

HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect
    alerts on the datapoint CantConnect, on the 443 instance of the HTTPS- datasource.

HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect,severity:4|3
    as above, and also requires that the alerts are of level Error (3) or Critical (4).

For details of alert fields that can be used in filtering, see https://www.logicmonitor.com/support/rest-api-developers-guide/alerts/about-the-alerts-resource/
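To make the "appended to filter=resourceTemplateName:" rule concrete, here is a small sketch of how the final filter string would be formed from a primaryalert value. The function name is hypothetical, and the default of "Ping" is an assumption about the ping datasource's Displayed As value; only the prefix and the example values come from the post above.

```python
FILTER_PREFIX = "resourceTemplateName:"
DEFAULT_PRIMARY_ALERT = "Ping"  # assumption: Displayed As of the ping datasource


def build_alert_filter(primaryalert=None):
    """Return the alert filter string for a given primaryalert property value.

    The property value is appended verbatim after the prefix, so extra
    comma-separated conditions in the property simply extend the filter.
    """
    value = primaryalert if primaryalert else DEFAULT_PRIMARY_ALERT
    return FILTER_PREFIX + value


if __name__ == "__main__":
    for value in (
        None,
        "HTTPS-",
        "HTTPS-,instanceName:HTTPS-443",
        "HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect,severity:4|3",
    ):
        print("filter=" + build_alert_filter(value))
```

Because the value is appended as-is, a typo in the property produces an invalid or non-matching filter rather than an error at the time the property is set, so it is worth testing a new value with a known alert.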

Removing Dependencies

The dependency configuration will be automatically removed once no devices have the depends_on property pointing at a primary device, but not until the next time the primary device alerts. (You can manually remove the properties is_primary_device, dependents_sdt_until and dependents_sdtd to immediately remove the dependency datasource.)

Feedback appreciated.


  • 3 weeks later...

This will surely help a lot of customers, thank you for that.   You cannot add this at the moment - the import throws an error - any idea when it will be available (or similar functionality)?

 

503 : This LogicModule is currently undergoing security review. It will be available for import only after our engineers have validated the scripted elements.


  • LogicMonitor Staff

Fixed - these are now importable.

(Note that the locators changed - I edited the article above to the current ones. There was a slight improvement to using the globally unique displayname as the host reference in the depends_on property - the above article also reflects that...)


On 2/9/2018 at 6:14 PM, Steve Francis said:

Yes - it puts the whole device into SDT - so all interfaces, etc.

Great.  Not sure if assumed, but if we put node A into SDT, then the connecting interfaces NOT on this device, but connected to, would also go into SDT?  


  • 2 weeks later...
  • LogicMonitor Staff

As it's currently written, it doesn't support Services as either the dependent or primary. (It could be made to support that, with some extra scripting. Let me know if that's important.)

And yes, you can set the depends_on property at the group level (and then override on devices, if needed.)

 


This is great!  I'm testing this now in our environment. 

One thing that would be useful for us is allowing us to suppress alerts when DNS resolution is failing to a particular device.  For example, we have a tunnel between two locations, and several monitors on the other side of the DNS server.  If DNS resolution breaks, all monitors using hostnames are going to break.  It would be nice to only be alerted once for the DNS problem, not the individual devices that are having DNS problems.

Edit: Oh shoot, I just noticed this doesn't support services currently.  That is mostly what we'd use this for. Our use-case is that we have a large number of URL monitors that are monitoring from the location of users, to systems running in the cloud.  When we have a DNS problem, we get hundreds of alerts all at once for the same issue, which is obviously not ideal.  We would definitely need this to support service monitors.

Edited by Bryan Fehl

Hmm... This doesn't seem to be working for me. It appears to be, up until the point an alert fires.

I've set a group that has the depends_on property set. Then have placed all relevant devices into said group (same upstream device). Everything appears to be working as expected. I see all devices in the group have inherited the depends_on property. And when I look at the primary device, it has indeed got the is_primary_device = true property set. However, when I kick off a test alert for PingLossPercent on the primary device, none of the dependent devices are put into SDT.

Upon investigation I found that the is_primary_device property was removed from the primary device. And, with the test alert still active, when I poll now from one of the dependent devices, I see it get recreated briefly but then disappear again. Once the alert clears, I am able to poll now from a dependent device and see the is_primary_device property get recreated.

???

 


  • LogicMonitor Staff

Yep - I made a mistake in the device filtering, so it was only finding dependents that had the depends_on set directly on the device, not those that were inheriting it via groups (although I was sure I tested that...)

Anyway, I've found the error, and fixed it, and will publish tomorrow after a bit more testing.

Sorry about that..
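For illustration, the difference between the two lookups might look like the sketch below. It is not the datasource's code: the function names are made up, and the device shape (customProperties and inheritedProperties as lists of name/value pairs) is an assumption based on how the REST API appears to return device properties.

```python
def effective_property(device, name, include_inherited=True):
    """Return a property value set on the device or, optionally, inherited.

    Assumed device shape: {"customProperties": [{"name": ..., "value": ...}],
    "inheritedProperties": [{"name": ..., "value": ...}]}.
    A value set directly on the device takes precedence over a group value.
    """
    sources = ["customProperties"]
    if include_inherited:
        sources.append("inheritedProperties")
    for source in sources:
        for prop in device.get(source, []):
            if prop.get("name") == name:
                return prop.get("value")
    return None


def find_dependents(devices, primary_name, include_inherited=True):
    """Return displayNames of devices whose depends_on matches the primary."""
    return [
        d["displayName"]
        for d in devices
        if effective_property(d, "depends_on", include_inherited) == primary_name
    ]


if __name__ == "__main__":
    devices = [
        {"displayName": "B",
         "customProperties": [{"name": "depends_on", "value": "A"}]},
        {"displayName": "C",
         "inheritedProperties": [{"name": "depends_on", "value": "A"}]},
    ]
    print(find_dependents(devices, "A", include_inherited=False))
    print(find_dependents(devices, "A"))
```

With include_inherited=False only device B is found, which corresponds to the behaviour reported above when depends_on was set at the group level.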
