Mike Suding - Posted November 8, 2017 (edited February 15, 2018)
Thanks for your feedback and use cases - keep 'em coming. Dependencies is currently under active research, with a release planned for Q3 of 2018.
mnagel - Posted December 15, 2017
I have customers who really need this feature, and they are quite upset to learn the throttling stand-in could cause loss of knowledge about the actual root cause. This thread has been open since 2013. Exactly where on the roadmap is this?
Mark
Mosh - Posted December 19, 2017
What features are being targeted for Q2 2018? It would be good to have some idea, so we know whether it'll be worth waiting or whether to build something ourselves now.
mnagel - Posted December 20, 2017
@Mosh I had not considered building it myself, but as you say that, I think I could put together at least a rudimentary solution using the API. To be effective, however, it would need to either poll very frequently or be triggered by alerts (and this would still be leaky). Will have to mock something up, as the current situation is unbearable for us, let alone our clients who receive alerts from the system...
Steve Francis - Posted January 23, 2018
Agree - this has taken way too long to get into the product officially. (It is in the works, but as Mike said, is at least 6 months away. We're working on improving our processes and efficiencies, too...)

In the interim, these two datasources, available from the registry with these locators, can achieve dependencies at the device level. Feedback appreciated!

SDT_Dependent_Devices: locator 24KKNG
SDT_Assign_Primary_For_Dependencies: locator NFTHXG

Creating Device Dependencies

With these two datasources, LogicMonitor supports device dependencies in order to help reduce alert noise. Dependent devices have a primary device. When the primary device reports a specific kind of alert (by default a ping alert, but this is configurable), the dependent devices are placed in scheduled downtime. This means that if the dependent devices report alerts, those alerts will not be escalated.

Dependent devices are placed in Scheduled Downtime for 30 minutes at a time. If the primary device is still in alert, the Scheduled Downtime is refreshed for another 30 minutes before the existing Scheduled Downtime period expires.

Note: when the alerts clear on the primary device, the dependent devices remain in Scheduled Downtime for the remainder of the existing 30 minute period. This allows circuits to re-establish, alerts to clear, etc.

Configuring Device Dependencies

Ensure your account has the SDT_Dependent_Devices and SDT_Assign_Primary_For_Dependencies datasources. Import them from the registry using the above locators if necessary.

You will need a LogicMonitor API token for a user that has rights to manage the primary and dependent devices. Create two properties on the root level of your LogicMonitor account, logicmonitor.access.id and logicmonitor.access.key, and set their values to the API token's ID and Key, respectively.
To create a dependency on device A, so that devices B and C are automatically placed in scheduled downtime when device A is in alert:

Navigate to device A and determine the device's displayname as entered in LogicMonitor. Note: this is not the IP/DNS name, but the value of the name field when managing the device. For example, in the screenshot that accompanied this post, the relevant name is ESXi1 - Dell iDRAC8.

Now navigate to devices B and C in LogicMonitor, add the property depends_on to each device, and set it to the value of the displayName of device A.

That's it. Within 30 minutes of the first device being set to have device A as its primary device, LogicMonitor will configure itself so that if device A has an alert on the ping datasource, it will place all dependent devices into scheduled downtime for 30 minutes, as described above. (Note: you can cause the reconfiguration to happen immediately if you run Poll Now for the SDT_Assign_Primary_For_Dependencies datasource on one of the dependent devices.)

Once the primary device is in an alert that matches the alert conditions (any Ping alert, by default), it will SDT the dependent devices. You will see a property created on the primary device, dependents_sdtd, that contains a list of the devices that were most recently placed in SDT by the dependency action. There will also be another property, dependents_sdt_until, that contains the epoch time at which the most recently set SDT will expire. If the alert condition still exists 5 minutes before the expiration of the SDT, a new SDT will be created.

Note that devices that are primary for one set of devices can themselves be dependent on other devices (e.g. a remote server can be dependent on a branch office router, but that router may be dependent on a VPN router).

If a dependent device has a depends_on property that is set to a device that does not exist, a warning alert will be raised on that dependent device.
(Similarly, there will be a warning if the authentication credentials are not set correctly.)

Optional - changing the alert conditions for the primary device to trigger dependencies

By default, primary devices will trigger SDT for dependent devices if the primary device is in any ping alert (either packet loss or latency) of any level. You can change the conditions that trigger the dependency action by setting the property primaryalert on the primary device. This property can be set to any valid filter supported by the LogicMonitor REST API call that returns alerts for a device. The property value is appended to the API query filter=resourceTemplateName: so the simple case is to set the property primaryalert to another datasource's Displayed As field (not its name), to act on alerts about that datasource.

Setting the property primaryalert to each value below will suppress the dependent devices' alerts when the primary has the alert described:

  HTTPS-
      Any alerts about the HTTPS- datasource.
  HTTPS-,instanceName:HTTPS-443
      Alerts on the 443 instance of the HTTPS- datasource.
  HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect
      Alerts on the datapoint CantConnect, on the 443 instance of the HTTPS- datasource.
  HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect,severity:4|3
      As above, but also requires that the alerts are of level Error (3) or Critical (4).

For details of alert fields that can be used in filtering, see https://www.logicmonitor.com/support/rest-api-developers-guide/alerts/about-the-alerts-resource/

Removing Dependencies

The dependency configuration will be automatically removed once there are no devices that have the depends_on property pointing at a primary device - but not until the primary device next alerts. (You can manually remove the properties is_primary_device, dependents_sdt_until and dependents_sdtd to immediately remove the dependency datasource.)

Feedback appreciated.
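The way the primaryalert value is turned into an alert filter, as described in the post above, can be sketched in a few lines of Python. This is only an illustration, not the datasource's actual script: the function name build_alert_filter is hypothetical, and the default value Ping is an assumption about the Displayed As field of the Ping datasource. The sketch shows nothing more than the string concatenation the post describes.

```python
from typing import Optional

# Prefix that the post says the primaryalert property is appended to.
FILTER_PREFIX = "resourceTemplateName:"

# Assumption: the default ping condition corresponds to a datasource
# whose Displayed As field is "Ping".
DEFAULT_PRIMARY_ALERT = "Ping"


def build_alert_filter(primaryalert: Optional[str] = None) -> str:
    """Return the filter string for the alerts query (hypothetical helper).

    An unset or blank primaryalert falls back to the assumed default.
    """
    value = (primaryalert or "").strip() or DEFAULT_PRIMARY_ALERT
    return FILTER_PREFIX + value


if __name__ == "__main__":
    for prop in (
        None,
        "HTTPS-",
        "HTTPS-,instanceName:HTTPS-443",
        "HTTPS-,instanceName:HTTPS-443,dataPointName:CantConnect,severity:4|3",
    ):
        print(build_alert_filter(prop))
```

Because the value is appended verbatim, any extra conditions (instance, datapoint, severity) simply ride along as additional comma-separated terms of the same filter.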
Ray Scholl - Posted February 8, 2018
This will surely help a lot of customers, thank you for that. You cannot add this at the moment, though - the import throws an error. Any idea when it will be available (or similar functionality)?
503 : This LogicModule is currently undergoing security review. It will be available for import only after our engineers have validated the scripted elements.
Steve Francis - Posted February 8, 2018
Huh - thought I put that through the review already... Let me fix that today.
Steve Francis - Posted February 8, 2018
Fixed - these are now importable. (Note that the locators changed - I edited the article above to the current ones. There was a slight improvement to use the globally unique displayname as the host reference in the depends_on property - the above article also reflects that...)
Ray Scholl - Posted February 8, 2018
No, not so much (same error) - unless I need to log out and log in.
Steve Francis - Posted February 9, 2018
That looks like you are using the old locator code (as the new one is not v1.0.0). Can you try with locator JZ62NH? That should be what the above article shows - maybe there was some caching...
Ray Scholl - Posted February 9, 2018
That was it - I have added both datasources (I did refresh this page). Thanks Steve!
Don - Posted February 9, 2018
Is it safe to assume this stitched relationship would also carry through when putting a node into SDT, so that its uplinks (if switches...) also inherit SDT from the parent?
Steve Francis - Posted February 9, 2018
Yes - it puts the whole device into SDT - so all interfaces, etc.
Don - Posted February 16, 2018
On 2/9/2018 at 6:14 PM, Steve Francis said: "Yes - it puts the whole device into SDT - so all interfaces, etc."
Great. Not sure if this is assumed, but if we put node A into SDT, would the connecting interfaces that are NOT on this device, but are connected to it, also go into SDT?
Steve Francis - Posted February 17, 2018
So this set of datasources doesn't directly know anything about connections. It requires the user to set the depends_on property. If A is in alert, and B depends on A (via the property), then B and all its interfaces will be in SDT.
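The rule in the reply above, combined with the timing from the earlier article (30-minute SDT, renewed when the alert still exists 5 minutes before expiry, left to run out once the alert clears), can be modelled roughly as below. This is a sketch under stated assumptions rather than the datasources' real logic: the function names are hypothetical, devices are represented as a plain mapping from display name to depends_on value, and a renewed SDT is assumed to run for 30 minutes from the moment of renewal.

```python
from typing import Dict, List, Optional

SDT_SECONDS = 30 * 60            # SDT length stated in the article
REFRESH_WINDOW_SECONDS = 5 * 60  # renew when this close to expiry


def dependents_of(primary: str, depends_on: Dict[str, str]) -> List[str]:
    """Devices whose depends_on property names the given primary device."""
    return [name for name, target in depends_on.items() if target == primary]


def next_sdt_until(now: int, alert_active: bool,
                   sdt_until: Optional[int]) -> Optional[int]:
    """Return the epoch second at which the dependents' SDT should end.

    None means no SDT is (or should be) in effect.
    """
    still_running = sdt_until is not None and sdt_until > now
    if not alert_active:
        # Alert cleared: let any existing SDT run out, create nothing new.
        return sdt_until if still_running else None
    if not still_running or sdt_until - now <= REFRESH_WINDOW_SECONDS:
        # Assumption: the new SDT runs for 30 minutes from now.
        return now + SDT_SECONDS
    return sdt_until


if __name__ == "__main__":
    props = {"B": "A", "C": "A", "D": "Other"}
    print(dependents_of("A", props))
    print(next_sdt_until(now=0, alert_active=True, sdt_until=None))
```

Note that nothing here looks at interfaces or network topology: as the reply says, the only link between devices is the depends_on property the user sets.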
Parth Dave - Posted February 26, 2018
This is awesome. Do these datasources allow a Service to be a parent for a bunch of dependent devices, and vice versa? Can we set the depends_on property at the group level?
Steve Francis - Posted March 1, 2018
As it's currently written, it doesn't support Services as either the dependent or the primary. (It could be made to support that, with some extra scripting. Let me know if that's important.) And yes, you can set the depends_on property at the group level (and then override it on devices, if needed).
Bryan Fehl - Posted March 6, 2018 (edited March 6, 2018)
This is great! I'm testing this now in our environment. One thing that would be useful for us is being able to suppress alerts when DNS resolution to a particular device is failing. For example, we have a tunnel between two locations, and several monitors on the other side of the DNS server. If DNS resolution breaks, all monitors using hostnames are going to break. It would be nice to be alerted only once for the DNS problem, not for the individual devices that are having DNS problems.
Edit: Oh shoot, I just noticed this doesn't support services currently. That is mostly what we'd use this for. Our use case is that we have a large number of URL monitors that monitor from the location of users to systems running in the cloud. When we have a DNS problem, we get hundreds of alerts all at once for the same issue, which is obviously not ideal. We would definitely need this to support service monitors.
Steve Francis - Posted March 6, 2018
OK, I can extend this to support Services... I've got a few other things I'm in the middle of, so it may be a week or two...
Bryan Fehl - Posted March 6, 2018
Thank you sir, that would help us out tremendously.
Parth Dave - Posted March 6, 2018
+1 for adding services support from me as well.
Joe Flowers - Posted March 9, 2018
Hmm... This doesn't seem to be working for me. It appears to be, up until the point an alert fires. I've set up a group that has the depends_on property set, then placed all relevant devices into that group (same upstream device). Everything appears to be working as expected: all devices in the group have inherited the depends_on property, and the primary device has indeed got the is_primary_device = true property set. However, when I kick off a test alert for PingLossPercent on the primary device, none of the dependent devices are put into SDT. Upon investigation I found that the is_primary_device property was removed from the primary device. And, with the test alert still active, when I run Poll Now from one of the dependent devices, I see it get recreated briefly but then disappear again. Once the alert clears, I am able to run Poll Now from a dependent device and see the is_primary_device property get recreated. ???
Steve Francis - Posted March 9, 2018
Yep - I made a mistake in the device filtering, so it was only finding dependents that had depends_on set directly on the device, not those that were inheriting it via groups (although I was sure I tested that...). Anyway, I've found the error and fixed it, and will publish tomorrow after a bit more testing. Sorry about that.
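To make the difference between a directly set and an inherited depends_on concrete, here is a small Python model of the two lookups. It is purely illustrative and does not reproduce the datasource's code or the LogicMonitor API: the Device class, its own_props and inherited_props fields, and both function names are hypothetical, and inheritance is simplified to "a device-level value overrides the group-level value".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:
    display_name: str
    own_props: Dict[str, str] = field(default_factory=dict)        # set on the device
    inherited_props: Dict[str, str] = field(default_factory=dict)  # set on its group


def effective_prop(device: Device, name: str) -> Optional[str]:
    """Device-level value wins; otherwise fall back to the group value."""
    if name in device.own_props:
        return device.own_props[name]
    return device.inherited_props.get(name)


def dependents_direct_only(primary: str, devices: List[Device]) -> List[str]:
    """Matches only depends_on set directly on the device (misses group members)."""
    return [d.display_name for d in devices
            if d.own_props.get("depends_on") == primary]


def dependents_effective(primary: str, devices: List[Device]) -> List[str]:
    """Matches on the effective value, so group-inherited settings count."""
    return [d.display_name for d in devices
            if effective_prop(d, "depends_on") == primary]


if __name__ == "__main__":
    devices = [
        Device("B", own_props={"depends_on": "A"}),
        Device("C", inherited_props={"depends_on": "A"}),
        Device("D", own_props={"depends_on": "Other"},
               inherited_props={"depends_on": "A"}),
    ]
    print(dependents_direct_only("A", devices))
    print(dependents_effective("A", devices))
```

In this model, device C is found only by the second lookup, which matches the symptom reported above for devices that inherit depends_on from a group, while device D shows a device-level override taking precedence over the group value.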
Joe Flowers - Posted March 9, 2018
@Steve Francis thank you sir! Please do let us know when the update has been published.
Steve Francis - Posted March 9, 2018
Published, locator 24KKNG. Updated the documentation article above. Please let me know if there are any other issues.