mnagel

Members
  • Content Count

    321
  • Days Won

    61

Community Reputation

94 Excellent

4 Followers

About mnagel

  • Rank
    Community All Star
  • Birthday July 17

  1. I just cannot bring myself to paste the same complex code into multiple LogicModule scripts, leaving little land mines scattered randomly. I was working today on a general template for using the API from within LogicModules using code I found scattered around different modules (we keep backups of everything, making it somewhat easy to search for those). Just a few things I noticed:
     * all the code is different
     * nothing I found so far accounts for API rate limiting
     * various inefficiencies exist in at least some of what I found
     The correct solution to all of this is to make a library feature available so we can maintain Groovy functions and such in one place, calling them from LogicModule scripts. It is very sad to see how little re-use is possible within the framework at all levels, and this one is especially bad in terms of maintenance and things breaking easily when changes are made in the API backend.
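A minimal sketch of the kind of shared helper such a library could hold: one retry wrapper that accounts for API rate limiting. It is written in Python purely for illustration (a LogicModule would need the Groovy equivalent). The function name, the `send` callable, and the `X-Rate-Limit-Window` header name are assumptions, not something taken from the modules mentioned above.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, body) as produced by the caller's HTTP layer.
Response = Tuple[int, Mapping[str, str], str]


def call_with_rate_limit(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    window_header: str = "X-Rate-Limit-Window",  # assumed header name
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry while the API answers HTTP 429 (rate limited)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(1, max_attempts):
        status, headers, _body = response
        if status != 429:
            break
        try:
            # Prefer the wait time advertised by the server, if any.
            wait = float(headers.get(window_header, ""))
        except ValueError:
            # Header missing or unparsable: fall back to exponential backoff.
            wait = default_wait * 2 ** (attempt - 1)
        sleep(wait)
        response = send()
    # May still be a 429 if every attempt was rate limited.
    return response
```

Keeping the HTTP call behind a `send` callable means the same wrapper can serve every module regardless of how each one builds and signs its request.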
  2. Oh it is, but it is definitely a non-obvious side-effect of disabling alerts and re-enabling. I frequently get the feeling different aspects of LM were written by summer interns :).
  3. Right now, ACK and SDT work, but miss important functionality. Please consider addressing all of these:
     * ACK should be able to expire (for critical issues that should not be lost forever, or to set a maximum expected recovery time period -- not possible with SDT).
     * ACK should be able to clear if a worse condition occurs (in Nagios, this is a non-sticky ACK).
     * ACK and SDT notices should be shipped to custom email integrations (this one is a bug as far as I am concerned).
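The first two requests can be sketched as a small decision function. This is only an illustration of the requested semantics in Python; the `Ack` fields, the severity names, and their ordering are hypothetical and do not describe how LM stores acknowledgements.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical severity ordering; a higher number means a worse condition.
SEVERITY = {"warning": 1, "error": 2, "critical": 3}


@dataclass
class Ack:
    acked_at: float                        # epoch seconds when the ACK was placed
    severity_at_ack: str                   # alert severity at that moment
    expires_after: Optional[float] = None  # seconds; None means never expire
    sticky: bool = True                    # False: drop the ACK if the alert worsens


def ack_is_active(ack: Ack, now: float, current_severity: str) -> bool:
    """Return True while the ACK should keep suppressing notifications."""
    if ack.expires_after is not None and now - ack.acked_at >= ack.expires_after:
        return False  # expired: the alert resurfaces
    if not ack.sticky and SEVERITY[current_severity] > SEVERITY[ack.severity_at_ack]:
        return False  # non-sticky ACK cleared because the condition got worse
    return True
```

With the defaults (`expires_after=None`, `sticky=True`) the function behaves like a permanent ACK, so the two new behaviours would be strictly opt-in.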
  4. I have raised this numerous times with my CSM, account manager, etc. but it does not seem to be getting traction. It is a lawsuit waiting to happen, so it really deserves attention. Right now, the trigger for LMConfig is entirely arbitrary depending on who wrote which code. Most often it is based on ssh.user and ssh.pass being defined in a device. The problem with this is there are other reasons to have those properties (e.g., Err-Disabled port detection), so you can enable LMConfig across many devices and incur a large cost (especially with the new contract terms) without intent. It should be required to check off an "Enable LMConfig" option at the device or group level, and similarly for any other premium feature. Minimally, all the configsources should be changed to have an "enable_lmconfig" or similar property required in the Applies To logic.
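The proposed gate amounts to one extra condition. The sketch below models it in Python rather than in Applies To syntax; the function name and the choice of "true" as the opt-in value are assumptions made for illustration.

```python
from typing import Mapping


def lmconfig_applies(props: Mapping[str, str]) -> bool:
    """Model of the proposed gate: credentials alone are not enough."""
    has_credentials = bool(props.get("ssh.user")) and bool(props.get("ssh.pass"))
    # Assumed convention: the opt-in property must be set to "true".
    opted_in = props.get("enable_lmconfig", "").strip().lower() == "true"
    return has_credentials and opted_in
```

A device that carries ssh.user and ssh.pass only for something like Err-Disabled port detection would then stay out of LMConfig until someone sets the opt-in property deliberately.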
  5. There are at least two reasons not to use LMConfig. First is cost -- it is a premium feature and, as applicable as it might be here, it is insane to incur an extra charge to get this basic concept implemented. Second (more important) is that LM does not actually tell you what changed. We work around this by using the API to download the configs, committing them to a git repo and using a hook to get email on changes. That could also work here, but again seems like a lot to ask of users. The file storage method could work, but if there is a collector failover or change you lose state. Building redis or similar into the toolset would help with this sort of thing.
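The "what changed" part of that workaround is the piece a git hook provides. As a rough stand-in, here is a minimal Python sketch that diffs two downloaded config versions with the standard library; the function name and the labels in the diff header are hypothetical, and this is not the hook described above.

```python
import difflib
from typing import List


def config_diff(old: str, new: str, name: str = "running-config") -> List[str]:
    """Return unified-diff lines between two versions of a config."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{name} (previous)",
            tofile=f"{name} (current)",
            lineterm="",  # input lines carry no newlines, so add none to headers
        )
    )
```

An empty list means nothing changed; otherwise the lines can be joined with newlines and mailed as the body of a change notice.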
  6. I am trying to get an eventsource that reports when the firmware version has changed (this is something other tools "just do"). To do this, my "applies to" for auto.firmware_version works great, but then the script needs to use this logic:

       if auto.firmware_version != auto.firmware_version.prev then
         generate event that says "firmware version has changed from old to new"
         set auto.firmware_version.prev to auto.firmware_version
       end

     I imagine I could use the API for the "set" operation, but using the API in logicmodules always makes me cringe due to lack of library support. I detest maintaining the same code across many different modules as it is error-prone. If there could be a hostProps.set method, that would be very helpful. I understand this could be dangerous, so if it must have the same restrictions as propertysources, I can live with that.
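The logic above can be sketched with the "set" operation passed in as a callback, which stands in for the requested hostProps.set method (that method does not exist today; the callback is hypothetical). The sketch is in Python for illustration. One deliberate difference from the pseudocode: on the first run, when no previous value is stored, it records a baseline without raising an event.

```python
from typing import Callable, MutableMapping, Optional

CURRENT = "auto.firmware_version"
PREVIOUS = "auto.firmware_version.prev"


def check_firmware(
    props: MutableMapping[str, str],
    set_property: Callable[[str, str], None],  # stand-in for a hostProps.set
) -> Optional[str]:
    """Return an event message if the firmware version changed, else None."""
    current = props.get(CURRENT)
    if current is None:
        return None  # nothing discovered yet, nothing to compare
    previous = props.get(PREVIOUS)
    event = None
    if previous is not None and previous != current:
        event = f"firmware version has changed from {previous} to {current}"
    if previous != current:
        # Persist the new baseline (also covers the first run).
        set_property(PREVIOUS, current)
    return event
```

Because the setter is injected, the same function works whether the value is persisted through an API call, a collector-side file, or a future built-in method.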
  7. I just tried this as well and it is definitely cumbersome. There is no completion when you start with ! and if you use completion, there is no opportunity to prepend the !. I would hope that with such a major revamp, a complex expression editor would be part of the upgrade.
  8. Tossed this together today to track throughput license usage on platforms that license maximum levels (e.g., ISR4K) as the impact of exceeding this can be otherwise tricky to identify. Definitely could use more work, but a decent starting point. 7ZYRDH
  9. The key here is "if BGP was supported...". What if it is not? Do you think it would be given this specific case? I think it could be (i.e., peering topology identified), but to the extent it is not (or anything else is not), I think we need a way to reflect the dependency without serious programming effort to avoid alarm storms. I guess we have something to chat about next time we meet.
  10. I believe this is now possible with Service Insight. Unfortunately, that is an expensive premium feature targeted at Kubernetes and such. This use case can also be handled, but should be part of the base product. We have other basic use cases, like total PRI channels in use across multiple voice gateways. I have conveyed this concern to anyone who will listen and have had some hopeful feedback, but no change yet.
  11. We continue to do battle with LM when alerts trigger due to dependent resource outages. I know the topology mapping team is working on alert suppression, but I am not convinced that will solve all problems regardless of how well they succeed. We really need a way to set up dependencies within logic modules and it should not need dozens of lines of API code each time (most of which should be made available as a library function IMO). One fresh specific example -- site with multiple firewalls in a VPN mesh running BGP. One firewall goes down, then all other firewalls report BGP is down. We care about BGP down, so we have alerts trigger escalation chains. It should be possible to define a dependency in the datapoint that suppresses the alert if the remote peer IP is in a down state. There is no way to express this in LM right now and that leads to many alerts in batches, and that leads to numb customers who ignore ALL alerts.
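The dependency in the firewall example reduces to a simple filter: drop a BGP-down alert when its remote peer is itself known to be down. A minimal Python sketch, assuming the set of down IPs is already available from host status; the `BgpAlert` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class BgpAlert:
    device: str   # firewall reporting the session as down
    peer_ip: str  # remote BGP peer address


def filter_dependent_alerts(
    alerts: Iterable[BgpAlert], down_ips: Set[str]
) -> List[BgpAlert]:
    """Keep only alerts whose remote peer is not itself in a down state."""
    return [alert for alert in alerts if alert.peer_ip not in down_ips]
```

In the mesh scenario, the one firewall that is down still raises its own host alert, while the BGP-down alerts that its peers raise against its address are suppressed.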
  12. Just a word of caution -- we found long ago that using groups for taxonomy creates massive security problems if you also want to grant different users access to functional groups (e.g., SQL admins to SQL servers). With RBAC as it is, if a device is in two or more such groups, you cannot give access that way without giving access to all groups the device is in (this is apparently not considered a bug). There really needs to be an option to mark a group as a security group to avoid this. In the meantime, we have moved from static groups almost entirely to dynamic groups. Our biggest problem before was this one -- using location-based groups to organize devices and to avoid setting the location string many times. Now we use custom propertysources to set a location property value to define a dynamic location group, and that group gets the location string. As for your issue, I assume you could recurse to get the data, but there should definitely be a way to do this in one shot, just like inherited properties.
  13. @Mike Suding Wanted to try this, but I guess it is very complicated -- still pending :(.
  14. We find at times the need to monitor usage on one device interface but show traffic information from another source. For example, we may get a utilization alarm from the physical crossconnect on an external switch to the ISP, but we have no useful traffic data (or no data) on that switch. The next step would be to go to traffic details on downstream devices, like firewalls. It would be helpful to have a "Related To" URL list available to avoid manual navigation each time. Ideally, this would be in the UI and available in alert tokens.
  15. In this case, yes. I never noticed myself, but can see why someone might take the instructions literally. I just hate too-strict systems that error out like this and frustrate users unnecessarily. We also link LM to ticketing in some cases, but found that when it is done via email integration (easier with the ticketing system we use), ACK and SDT notices are not sent: LM made the decision not to send them via custom email integration, and there is no way to fix that short of development changes. Really need at least some folks over there focusing on the basics -- some of the new advanced stuff is nice, but poor alert handling (not this one specifically, which is annoying but at least can be worked around) is a shame.