mnagel

  1. multiple parallel datapoint thresholds

    Never saw any feedback on this, and here we are a year later. I just ran into another case where this would be useful (there are many). The current Cisco Stack DS has a State datapoint with a built-in threshold of "> 4" as a warning. Really, this should be "< 4" as a warning and "> 4" as critical, but that is currently impossible. Please make fixing this a priority!
  2. event handlers

    Alert processing happens outside the detection point (in "the cloud"), so there needs to be a way to trigger an event handler that operates in the collector context. One possibility would be to create datasources that don't actually collect data, but instead perform the check-and-repair operation and emit a datapoint as a side effect (see the sketch below). It would be easier if datasource code could cross-reference other datasource/instance datapoints without replicating the same API code in each (e.g., code library support), but it is feasible. Triggers would be much cleaner.
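
    A minimal sketch of that check-and-repair idea, as a collector-side script whose standard output is consumed as datapoints. Everything here is illustrative, not an existing LogicMonitor datasource: the target service, the restart command, and the key=value output convention are assumptions.

      #!/usr/bin/env perl
      # Hypothetical collector-side "event handler" datasource: check a local
      # condition, attempt a repair, and emit the outcome as datapoints.
      use strict;
      use warnings;

      my $service = 'ntpd';    # illustrative target; would come from a device property in practice

      # Check: is the service running?  (system() returns 0 on success)
      my $running = system("systemctl", "is-active", "--quiet", $service) == 0 ? 1 : 0;

      my $repaired = 0;
      if (!$running) {
          # Repair: try a restart, then re-check.
          system("systemctl", "restart", $service);
          $repaired = system("systemctl", "is-active", "--quiet", $service) == 0 ? 1 : 0;
      }

      # Side-effect datapoints, one key=value pair per line for the collector to parse.
      printf "running=%d\n",  $running;
      printf "repaired=%d\n", $repaired;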
  3. Currently, disabling an alert on an actively broken element causes a "cleared" message to be generated. This is what you might call "fake news". What should happen is that the alert enters a disabled state and the message reflects that. It is of some value that it does clear, since there is currently no other way to reset an accidental ACK, but that is a hack side effect. It is very annoying to have to explain to customers when this happens that, no, the problem did not really go away, we just toggled alerts off.
  4. I tried to come up with a solution to this -- I added custom properties to devices/device groups that I hoped could be inserted into the alert template, but that really can't work since there is no single master template. What you CAN do is use a custom email integration, which lets you fetch the resultant alert message as ##MESSAGE## and then wrap whatever you like around it. But if you choose to do this, you lose all the features of normal email alerts, like reply commands (ACK, SDT, etc.). The current alert system is very limited, but I am hopeful it will be improved in the somewhat near future based on conversations I've had with LM folks.

    What we have been using is a custom integration that feeds into an actual template system (not just unconditional token substitution), from which we generate markdown output that feeds into a ticket. In that I can synthesize some of the links that allow for ACK/SDT, and API calls can be used to populate more information into the ticket. A rough sketch of that pattern is below.
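
    A minimal sketch of the custom-integration-plus-template pattern, assuming the integration delivers a JSON payload to this script and that the field names ('client', 'message') and the template file name are purely illustrative rather than the exact pipeline described above:

      #!/usr/bin/env perl
      # Hypothetical receiver for a LogicMonitor custom HTTP integration: read the
      # alert payload, then render per-client markdown for a ticketing system.
      use strict;
      use warnings;
      use JSON qw(decode_json);
      use Template;                       # Template Toolkit

      my $payload = decode_json(do { local $/; <STDIN> });

      my $tt = Template->new(INCLUDE_PATH => 'templates') or die Template->error();

      my $markdown = '';
      $tt->process(
          'ticket.md.tt',                       # illustrative template name
          {
              client  => $payload->{client},    # assumed per-client property passed by the integration
              message => $payload->{message},   # the expanded ##MESSAGE## body
          },
          \$markdown,
      ) or die $tt->error();

      print $markdown;                    # hand off to the ticketing API from here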
  5. The alert "template" system currently has no way to reference another datapoint's value. Why would this be good? If you have a datapoint that alerts on status, you want to insert the value of the thing you care about, not the status value. I see the contortions LM datasource developers have gone through to work around this (e.g., Cisco Nexus Temperature). Please make it possible to reference datapoints, at least within the same datasource, in alert templates. There are many other issues with alerts, but I will stop here for now :).
  6. Another option here would be to avoid stranding anything bound to a user -- when you remove a user in other systems, like G Suite, you are required to assign a new owner for all of their documents.
  7. I have brought this up with our CSM previously, but I thought it might help to note the issue here as well. Currently, iDRAC monitoring requires consuming a second device slot in addition to the device itself. This is a non-starter for most people due to the cost factor (especially with hundreds of servers), so they just don't bother. There is really no reason the datasources could not be adjusted to use something like a drac.ip property to monitor the iDRAC as an element of the host. The only glitch would be the network interfaces, which I think would be fine to leave out. Technically, I understand why the DS authors went this way, but it is an unnecessary cost burden on customers. Thanks, Mark
  8. I would prefer not to see every possible integration listed under every user when adding recipients to escalation chains or recipient groups. Please enable a user list within the integration definitions so they can be restricted to only those users. Thanks, Mark
  9. So I found out today that you can't even generally insert properties into existing alerts. If you want an email alert that you can reply to in order to ACK or SDT, you cannot use a custom integration. If you do not use a custom integration and want to append something to alerts, you would have to edit EVERY datapoint template. I was actually advised to use the API to append my custom token expansion to every datapoint template, which is not going to happen, for many, many reasons. This is a huge issue that needs immediate attention -- we have to be able to customize what appears in alerts per client. The closest I have is a custom integration, since ##MESSAGE## is the result of whatever template is filled out. If the custom email integration could act the same as regular email, that would be a feasible solution.
  10. Currently, if a device was deleted and you then try to add that device back into the system, you get an error dialog. How about we have the computer do the work here instead of me, and add a button to the dialog to restore that device if desired? Thanks, Mark
  11. GRAPE support for groovy scripts

    Dredging this one up from almost 3 years ago -- we could really use this, since there are limitations in the built-in libraries. Either Grape or a JAR manager, please!
  12. It does seem like Sarah's idea would work -- just aggregate results across all devices to find datasources that have no associated device instances (it looks like instanceNumber is the field). That said, sometimes no instances being found for a DS is itself a problem, and that is hard to know within LM currently. I have a script I am working on that will evaluate what LM discovers against a template, specified by a property. This will allow detection of faulty AD logic or problems on the device itself. For example, a switch should discover interfaces, preferably all of them, but at least one. If for some reason SNMP is busted (which happens due to platform issues, or in LM when a device produces too many results and discovery times out due to lack of bulkwalk support), I would want to know that and not be surprised later when the data is needed and not available. I wrote this Perl snippet, which shows the instance count for each device/datasource pair and a final report of any datasource with no instances:

      # ($COMPANY, $MININSTANCES, $MAXINSTANCES are set by the surrounding script's option handling.)
      my $lmapi = WM::LMAPI->new(company => $COMPANY) or die;

      my %foundanyinstances;

      # Walk every device, then every datasource applied to it, counting instances.
      if (my $devices = $lmapi->get_all(path => "/device/devices", fields => "id,displayName")) {
          for my $device (@$devices) {
              if (my $devicedatasources = $lmapi->get_all(path => "/device/devices/$device->{id}/devicedatasources", fields => "id,dataSourceName,instanceNumber")) {
                  for my $devicedatasource (@$devicedatasources) {
                      my $dsname = $devicedatasource->{dataSourceName};
                      $foundanyinstances{$dsname} = 0 if not exists $foundanyinstances{$dsname};
                      my $numinstances = $devicedatasource->{instanceNumber};
                      $foundanyinstances{$dsname} = 1 if $numinstances > 0;

                      # Optional filters on the per-device report.
                      next if defined $MININSTANCES and $numinstances < $MININSTANCES;
                      next if defined $MAXINSTANCES and $numinstances > $MAXINSTANCES;

                      printf "%s: %s: %d\n", $device->{displayName}, $dsname, $numinstances;
                  }
              }
          }
      }

      # Final report: datasources that have no instances anywhere.
      for my $dsname (keys %foundanyinstances) {
          print "$dsname: NO INSTANCES\n" if $foundanyinstances{$dsname} == 0;
      }
  13. netflow data access via API

    Updated WM::LMAPI.pm at https://github.com/willingminds/lmapi-scripts.git -- still working on more issues, but this handles most of my pain.
  14. netflow data access via API

    OK -- under control for now. DISCLAIMER: everything is subject to change, as it was before when accessing undocumented REST resources. The short answer is you must now include an 'X-version: 2' header in your request headers, or add 'v=2' as a query param. When you do this, it is supposed to work for any endpoint except one (can't recall which), but I updated my module to use v2 only as needed for specific endpoint patterns. The other important finding is that the data structure returned for the v2 API is different. Instead of items being a subkey of data, it is (for now anyway) a top-level key. I also had been checking the status key (copy of the HTTP response code), but that was removed. I am still working on updates to my scripts, but the netflow script is working again. I will post my revised LMAPI.pm to github when it is stable.
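
    A minimal sketch of the v2 differences described above, assuming authentication headers are added elsewhere (the LMv1 signature code is omitted, and the account name and endpoint here are illustrative):

      # Sketch only: shows the v2 header and the changed response shape.
      use strict;
      use warnings;
      use LWP::UserAgent;
      use JSON qw(decode_json);

      my $company = "example";                       # illustrative account name
      my $base    = "https://$company.logicmonitor.com/santaba/rest";
      my $ua      = LWP::UserAgent->new;

      # Request the v2 form of an endpoint; appending 'v=2' as a query param works as well.
      my $resp = $ua->get("$base/device/devices?size=50",
          'X-version' => 2,                          # ...plus the usual Authorization/Content-Type headers
      );
      die $resp->status_line unless $resp->is_success;

      my $body = decode_json($resp->decoded_content);

      # v1 nests the list under {data}{items}; v2 returns {items} at the top level
      # and no longer includes the old 'status' copy of the HTTP response code.
      my $items = exists $body->{items} ? $body->{items} : $body->{data}{items};

      printf "%d devices returned\n", scalar @{ $items // [] };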
  15. netflow data access via API

    I have been given a tentative fix, will post if I get it working...