mnagel

Members
  • Content Count

    255
  • Days Won

    44

Community Reputation

71 Excellent

3 Followers

About mnagel

  • Rank
    Community All Star
  • Birthday July 17

  1. mnagel

    Remote Session File Transfer

    It would also be nice to be able to cut and paste :).
  2. One of the biggest challenges we face when putting together new datasources is the need to identify every possible result in advance and convert it to a number. Often it is not possible to know all the possible outputs, but you may know what is good and that anything else is bad. An example we are running into right now is using the Cisco Nexus NXAPI to extract information that is otherwise difficult to obtain, like "invalid interface SFP" or "bad monitor session". As we were going over options, it seemed like the best thing would be to set a datapoint for error status and to store the actual error/reason string in an ILP that could then be emitted as a token if needed. Right now you can only set ILPs during AD or via propertysources. My request is that it be possible to set ILPs as the result of data collection, which could then include these text values. I understand they will be ephemeral, but as long as they are maintained alongside each datapoint, it would still be helpful to include those values in the alert body. I'm sure this is not the best approach, but it seems like the only method that would fit with the LM architecture as it stands (i.e., I am fairly sure there will never be a DPLP option).
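
To make the "known good, everything else is bad" idea concrete, here is a rough Python sketch of the collection side. It is only an illustration: the set of good strings and the `error_status` / `error_reason` key names are made up for this example, and nothing here is an existing LogicMonitor or NX-API interface.

```python
# Hypothetical sketch: the "good" strings and the output key names are
# illustrative assumptions, not values defined by NX-API or LogicMonitor.
KNOWN_GOOD = {"ok", "up"}


def classify(raw):
    """Return (status, reason): 0 for a known-good result, 1 for anything else."""
    text = raw.strip()
    if text.lower() in KNOWN_GOOD:
        return 0, ""
    # Anything unrecognised counts as bad; keep the original text as the reason.
    return 1, text


def format_result(raw):
    """Render the numeric status plus the free-text reason as key=value lines."""
    status, reason = classify(raw)
    return f"error_status={status}\nerror_reason={reason}"


print(format_result("invalid interface SFP"))
```

The numeric status is what a threshold could alert on today; the reason string is the part that currently has nowhere to go, which is the gap the request is about.
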
  3. Yeah, I posted at least one FR on this -- it would be necessary to define a correlation key that tracks an incident. We have used SEC for this previously which provided primitives for handling incidents using that key, but there is no similar capability in LM. We will probably look at moving Windows event capture into SumoLogic as we were forced to after finding syslog from routers and switches does not work.
  4. mnagel

    Alert Suppression Logs

    What we ended up doing to deal with this issue early on was to set a CC on every escalation that routes to a folder (in our case, via procmail rules). The result is we retain every alert sent so we have a reference for later analysis, including those messages.
  5. I just ran into a concrete example of why this missing function is crucial. Because LM does not support multiple parallel thresholds, it is sometimes necessary to create a virtual datapoint that remaps values to a virtual alert datapoint. Yet it is impossible to reference the underlying datapoint in the alert template, and the alerting datapoint imparts little useful information other than severity. Here is the use case I have right now -- out of the box, LM has no datasource to detect a Catalyst 4K/6K supervisor redundancy failure (and, as it turns out, anything Cisco IOS-like properly uses this MIB, including wireless LAN controllers in SSO mode). I created one based on a previous plugin we used in Nagios, but unfortunately the various states are not linearly mapped as would be required for the existing threshold mechanism. As a result, I had to construct this: if(or(lt(PeerUnitState,4),and(gt(PeerUnitState,4),lt(PeerUnitState,9))),2,if(or(eq(PeerUnitState,9),eq(PeerUnitState,14)),0,1)) The alerting DP maps "normal" to 0, "error" to 1, and "critical" to 2. The right thing to include in the alert template is the value of UnitState and PeerUnitState so those could be cross-referenced against the table of possible values, but that is impossible in the current product. Please make it a priority to fix this problem.
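
For anyone trying to read that nested expression, the same logic can be restated in plain Python. This is only a readability aid: the function name is invented here, and it is not LogicMonitor expression syntax.

```python
def alert_severity(peer_unit_state):
    """Mirror the virtual datapoint expression: 0 normal, 1 error, 2 critical."""
    # Outer if(): states below 4, or strictly between 4 and 9, are critical.
    if peer_unit_state < 4 or 4 < peer_unit_state < 9:
        return 2
    # Inner if(): states 9 and 14 are the only ones treated as normal.
    if peer_unit_state in (9, 14):
        return 0
    # Everything else (4, 10 through 13, and anything above 14) is an error.
    return 1


for state in (3, 4, 9, 12, 14):
    print(state, alert_severity(state))
```

Written this way it is easier to see why the raw state value matters in the alert: many different states collapse onto the same severity number.
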
  6. mnagel

    SDT "groups"

    We have clients who have planned maintenance on specific locations requiring SDT. This should be easy, but in practice it is very error prone because of the manual effort required to think through everything impacted by a "site" outage. Each different element requires a different approach to setting downtime. For example, we recently had a location maintenance notice and had to think through all of the following:

    * resources at the location (reasonably easy due to using location-based device groups, even though using those completely breaks the RBAC security model)
    * websites used to monitor that location (internal and external)
    * collectors at the location
    * we are not using the new service feature yet at this client, but that would be yet another SDT requirement

    It would be far simpler and less error-prone if all of these could be scheduled for downtime at one time based on the fact that they are impacted by the location maintenance. I was thinking one option would be an SDT group that could contain all element types, then that group could easily be scheduled in one fell swoop.
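
A rough sketch of what such an SDT group might look like as a data structure, in Python. Everything here is hypothetical (the class, the field names, the request dictionaries); it does not correspond to any existing LogicMonitor object or API call, it only shows one group fanning out to every element type in a single step.

```python
from dataclasses import dataclass, field


@dataclass
class SdtGroup:
    """Hypothetical container for every element affected by a site outage."""
    name: str
    device_groups: list = field(default_factory=list)
    websites: list = field(default_factory=list)
    collectors: list = field(default_factory=list)
    services: list = field(default_factory=list)

    def sdt_requests(self, start, end):
        """Expand the group into one downtime request per member element."""
        members_by_type = (
            ("device_group", self.device_groups),
            ("website", self.websites),
            ("collector", self.collectors),
            ("service", self.services),
        )
        return [
            {"type": kind, "target": target, "start": start, "end": end}
            for kind, members in members_by_type
            for target in members
        ]


site = SdtGroup(
    name="example-site",
    device_groups=["Locations/example-site"],
    websites=["example-site-internal", "example-site-external"],
    collectors=["example-site-collector"],
)
for request in site.sdt_requests("2024-01-01T22:00", "2024-01-02T02:00"):
    print(request)
```
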
  7. mnagel

    custom speed for interfaces

    Just wondering when this feature will be generally applied to the various interface datasources within LogicMonitor? There are a plethora of interface datasources that do not support this method (e.g., FortiGate interfaces, etc.)
  8. mnagel

    option to require FQDN

    As we have run into numerous cases now where the device name uniqueness requirement has bitten us, my recommendation to LM is now this:

    * add a portal option to require a FQDN for all devices -- failure to include would cause rejection
    * add an option to have a default FQDN suffix defined as a group property which would be used when adding a new device within that group

    I am not sure what to suggest for Websites and the new Services naming, but similar issues apply, especially for the current RBAC-based MSP model.
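
The two options could interact roughly as in the Python sketch below. The function name and parameters are invented for illustration, and "contains a dot" is a deliberately crude stand-in for a real FQDN check (it would also accept an IP address, for example).

```python
def resolve_device_name(name, default_suffix=None, require_fqdn=True):
    """Return the name to store, applying the group's default suffix if needed."""
    cleaned = name.strip().rstrip(".").lower()
    # Simplifying assumption: any name containing a dot is already qualified.
    if "." in cleaned:
        return cleaned
    if default_suffix:
        return f"{cleaned}.{default_suffix.strip('.').lower()}"
    if require_fqdn:
        raise ValueError(f"device name {name!r} is not a FQDN and no default suffix is set")
    return cleaned


print(resolve_device_name("core-sw1", default_suffix="client-a.example.com"))
```

With the portal option on and no suffix defined for the group, a short name is rejected instead of silently colliding with another client's device of the same name.
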
  9. I tried to get this handled as a bug a while back, but I was told no in very clear terms by support and our CSM at the time. Here is the situation and what is needed: if a storm of alarms begins at a location (e.g., due to maintenance), the quickest fix is to set downtime for the location. Unfortunately, even though a bunch of alerts have already been sent, recoveries for those do not get sent during downtime. We need to either have a rule that recoveries are ALWAYS sent for corresponding sent alerts, or we need to be able to enable that via an alert rule flag. This is important when you integrate alerting into ticketing -- as a result of this behavior, we end up with stranded tickets that cannot autoclose because the recovery was never sent.
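
The rule being asked for is small enough to write down. This Python sketch is only a statement of the desired behavior, with invented names; it says nothing about how LM routes notifications internally.

```python
def should_send(event, in_sdt, sent_alert_ids):
    """Decide whether a notification goes out.

    event is a dict with an 'id' and a 'kind' of 'alert' or 'recovery';
    sent_alert_ids holds the ids of alerts that were actually delivered.
    """
    if event["kind"] == "recovery":
        # A recovery follows its alert: send it whenever the alert was sent,
        # even if the resource has since been placed in downtime.
        return event["id"] in sent_alert_ids
    # New alerts are still suppressed while downtime is active.
    return not in_sdt


sent = {"alert-1"}
print(should_send({"id": "alert-1", "kind": "recovery"}, in_sdt=True, sent_alert_ids=sent))
print(should_send({"id": "alert-2", "kind": "alert"}, in_sdt=True, sent_alert_ids=sent))
```
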
  10. Having tried it a bit, I have to agree 100%. It is a long-missing core capability, and so far it seems to have limited functionality (useful regardless). There is a wizard that creates a new type of datasource that has to live in a parallel tree to devices, but the names used for groups are not allowed to overlap with device group names. As an MSP, this means the per-client structure in place is broken immediately. After you use the wizard, you have to edit the datasource it created to make changes. Not horrible, but clunky and definitely not something I see as worth extra fees, unless I am missing something.
  11. @Sarah Terry OK, thanks. I am a bit concerned about the SolarWinds-ification going on here where every useful feature is turned into an add-on, but this one seems to warrant it. I've suggested to our CSM previously that all such features be tagged as 'Premium' or whatever word you prefer clearly in the documentation. As it stands now, this and others (e.g., LMConfig) have documentation with no way to know they need licensing to activate.
  12. @Sarah Terry Is this being rolled out staggered? Not in our portals...
  13. Wow, good timing then! We shall check it out!
  14. I can't believe I am responding to a 4.5-year-old topic, but aggregated data handling is sorely needed. We have an example currently where we care about total ISDN B-channel usage across two or more voice gateways. The only option we can come up with is to use the API within a custom datasource, which is a PITA without library support (topic of another FR I filed some time back). Thanks to @Steve Francis we at least have some sample code to get API data collected, but...wow. It should be possible to do this more directly within datasource definitions. Alternatively, allow people to set alerts on widgets, which do allow aggregation of data like this.
  15. mnagel

    Dependencies or Parent/Child Relationships

    Never mind -- I waded through the docs more thoroughly and found the section about defining the API credentials.