mnagel

Members
  • Content Count

    496
  • Days Won

    88

Everything posted by mnagel

  1. Check Mike Suding's blog page -- lots of cool stuff, including this. A bit old, but probably still works :). http://blog.mikesuding.com/2016/09/20/restart-a-service-alert-if-restart-fails/ As for the debugger, yeah -- that stuff freaks me out a lot given that LM more or less requires Domain Admins on collectors (it really should be Performance Monitor Users, especially after the recent SolarWinds incident). You can run those debugger commands from the API as well, which is even scarier.
  2. I 100% agree this is needed -- we have to hack around this all the time with escalation chains that have one or more empty stages, and still that does not prevent alerts from registering in the system. But this is just one case that would be trivial to solve with DS inheritance, something I have been pushing for well over four years now. The issue with creating new DSes is they are then freestanding clones, meaning each must now be maintained independently (and this is commonly pushed by support as a solution, sadly). If we could just get inheritance done (not just for DSes, but that would b
  3. I guess not ASAP: "This LogicModule is currently undergoing security review. It will be available for import only after our engineers have validated the scripted elements." I guess I will check back at some indeterminate date in the future.
  4. Thank you! I have been asking for this via "proper" channels for some time with no results -- will try it out ASAP as I have an 8320 cluster waiting. FWIW, I recommend using a standard property name alongside the ssh.user/ssh.pass (e.g., lmconfig.enabled) to allow disabling this premium feature at the group (client) level when it has not been subscribed to. I know it is an uphill battle to get those all fixed, but I sure wish it could be done. We still cannot use the new AD and DHCP modules due to lack of ability to disable LMConfig per client.
  5. Almost certainly there is code, as Palo Alto checks virtually always require API access. In most cases I have been involved with, review has seemed to be a mostly ad hoc process (or if not, definitely opaque). I suggested in one of our UI/UX meetings that there be a "Request Review" button or similar to create or escalate a request for security review. As a bonus, use a ticketing system (this would be welcome for feedback as well, which as I understand generates internal-only tickets). A unified customer-visible ticket system for feedback and module review would be very helpful.
  6. Been there, done that -- you can't reference those in widgets, sadly. You have to just create your own datasource that sets values equal to the properties and then reference those. The first time I ran into this I wanted to chart device usage against subscription levels, the latter of which was a property. In ours, the collector is a Groovy script that does nothing (not sure why that was how we did it, but it works). The CDP is just equal to ##property## in each case. It is Groovy mode, but the code is literally just that.
  7. Yes, and LM actually agreed with me and others (eventually) and fixed this in v133. And then they broke it sometime after that, no ETR that I am aware of.
  8. FWIW, having also come originally from Nagios, I miss the ability to transmit arbitrary string data back via alerts. Some of this can be emulated with auto properties, but those can be set only during discovery, not collection. I posted a feature request previously to allow definition of enums that can be bound to datapoints (global values and overridden values within specific datasources/datapoints). These could then be used to avoid the current awkward legend method and actually show the intended purpose of DP values where needed via tokens. Imagine a line that showed the actual meaning of t
  9. Eventsources don't support embedded PowerShell, though they certainly should. You can upload a script though. That said, eventsources are also almost entirely unsuited for monitoring; they are more like additional information to see along with monitoring. Among other things, you cannot ACK them in a meaningful way due to lack of correlation across eventsource results. I'm sure the yet-another-premium-module LMLogs will fix all those problems, though.
  10. You can do this under Alert Tuning at the group level. There is no similar option for specific devices short of editing the AppliesTo code.
  11. I would not hold your breath -- I have had to fight just to get and keep SPF enabled on our email. Regardless, even if you could use the builtin alerts with a distinct From address, you would still have portal links embedded in the message that reveal it is LogicMonitor. You could do what we do and submit everything via a custom email integration (or a web integration via an API handler), then handle the data any way you like. In our case, we feed the tokens into an actual template system to format messages using conditional logic and all that stuff missing in the LM blind token substitutio
  12. Here are at least two items that need to be added to make the dashboard token feature more useful: (1) adjust widgets that cannot use tokens so they can (e.g., Alerts, Netflow, etc.); (2) allow arbitrary tokens to be inserted as needed within widget fields (e.g., device patterns, instance patterns, etc.). A concrete example of the latter came upon me this morning. We have multiple locations with similar equipment for which we want to display Internet usage details, one set per dashboard (cgraph and netflow widgets). The edge device names vary as do the uplink ports to the ISPs in each
  13. There are many examples of using WMI from Groovy. None select from Win32_Service, but it should be simple enough to adjust the query. See Microsoft_LyncServer_StorageService as one example.
  14. The normal way I monitor services is via AD (Active Discovery), but you would end up with a new instance wildvalue each time it was changed if you use the normal option (WMI-based datasource). If you use a Groovy script DS instead, you could strip the PID portion to build the wildvalue so that the data is stable. There should be some examples of that in the existing datasource repo, need to dig around....
  15. Yeah, brings back horrible memories of me requesting repeatedly the documentation on how to pass parameters and getting the most insane response from support :).
  16. The best we have been able to do here is a script that uses the API to download as many endpoints as we are able to access and checks the results into a Git repo. It works, but needs frequent tweaking as things change on the backend. Having a way to revert to a previous snapshot or similar would be very handy. My script came about originally after I implemented alert rule resequencing with an error and lost some rules. My latest incarnation of this script has an option to check the items as well for problems (e.g., broken widgets).
  17. I recommend opening a support ticket on this -- it is a bug or at least an undesirable limitation in the SSH library. Possibly fixed in a newer collector version.
  18. Seems like a no-brainer to support that, right? The only way you could as it stands is via API calls, which, without library support, is a non-starter for me. It could be done if you want to maintain many copies of the same code across modules (which seems to be the norm now, except they are all slightly different based on who writes them as you would expect).
  19. Since I can't wait for this, we now have code to grab widget data for all supported widget types incorporated into our existing backup script (pulls virtually anything I can from the API into a Git repo regularly). Most issues can be detected via exception (non-200 status code), some require a bit more analysis (no data in any line in a cgraph, for example). Working reasonably well now for the first phase, which is to be aware of busted widgets before we are embarrassed during client review. Next phase will be to analyze data more specifically to the context (once I figure out how to represen
  20. I am incorporating this into my resource check script. For now, the first item I am testing is Windows_DHCP, which requires that the DHCP Server role is installed (we have an auto.winfeatures property that is populated by a PropertySource. I don't recall where we got it, somewhere in these forums IIRC :). The PS code is simple:

      import com.santaba.agent.groovyapi.win32.WMI

      // Query the installed server features over WMI for this host
      def hostname = hostProps.get("system.hostname")
      def my_query = "Select NAME from Win32_serverfeature"
      def session = WMI.open(hostname)
      def result = session.queryAll("CIMv2", my_query, 15)
      // Print the feature names as a key=value property line
      println "WinFeatures=" + result.NAME
  21. It seems like a bug to me that assigning Thresholds in a role allows you to edit thresholds, but not to toggle alerts or notifications on and off. Please either include that within Thresholds, or add another permission to enable that without having to assign full Manager permissions.
  22. Unfortunately, they will need to make instance groups into actual groups first, which I have requested many times. As it stands they are basically limited tags where you can only have a single tag for any instance and then you can apply thresholds to the tag, but those don't apply to new items tagged the same way. This is why they throw up the warning about how new instances won't get thresholds automatically. We've had to work around that with API scripts to refresh those instance group tags and thresholds when new volumes are added to servers (as just one example). I assume if they ever ar
  23. I would start here: https://openvpn.net/community-resources/management-interface/ There is an example of how to use this (not in an LM-friendly way) at https://kifarunix.com/how-to-monitor-openvpn-connections-using-openvpn-monitor-tool/ There is also an example (a bit dated) on how to expose data via SNMP here: https://github.com/Phhere/openvpn-snmp The problem seems to be a general lack of standard monitoring since OpenVPN runs on so many platforms. If check_nrpe works, perhaps just punt and use that via an LM script datasource :).
  24. It sounds like you are expecting a function to be able to take arguments, because "function". I did as well many years ago when I wanted to create a single isClient function with the client name as an argument, but found after a very painful support ticket that they absolutely are not functions, just macros. You would need to write a separate version for each check. In this case, you would have to add a new function like this and a new one for each version: system.virtualization=~"VMware" && auto.version_number =~"6\\.5\\.0*"
  25. I would like to have SSL expiration profiles I can apply to websites with automatic selection based on the CA. This is (for now) based on the fact that Let's Encrypt certificates tend to have short expiration intervals, but it seems like a good general solution to discover stuff automatically as elsewhere within LM. Thanks, Mark
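
Post 14 above suggests building the wildvalue from the service name with the PID portion stripped so the instance stays stable. A minimal sketch of that idea follows, in Python for illustration only (the post itself refers to a Groovy script DS). The name format (a trailing `_<digits>` suffix) and the function name `stable_wildvalue` are assumptions and do not come from the original post.

```python
import re

# Assumption: the changing part is a trailing "_<digits>" suffix on the
# service name, e.g. "ExampleSvc_1234". Adjust the pattern to the real format.
_PID_SUFFIX = re.compile(r"_\d+$")


def stable_wildvalue(service_name: str) -> str:
    """Return the service name with a trailing numeric suffix removed."""
    return _PID_SUFFIX.sub("", service_name)


if __name__ == "__main__":
    # Hypothetical service names, used only to show the effect.
    for name in ("ExampleSvc_1234", "ExampleSvc_98765", "Spooler"):
        print(name, "->", stable_wildvalue(name))
```

Only the last suffix is removed, so a name that legitimately contains underscores elsewhere keeps them.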
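
Posts 16 and 19 above describe pulling API results into a Git repo on a regular basis. The sketch below covers only the file-writing half of such a script, under the assumption that the API responses have already been fetched and parsed into Python objects. The function name `write_snapshot`, the one-file-per-endpoint layout, and the sample payload are all hypothetical. Sorting keys and fixing the indentation is one way to keep unchanged objects from showing up as diffs.

```python
import json
import tempfile
from pathlib import Path


def write_snapshot(items: dict, repo_dir: Path) -> list:
    """Write each payload to <repo_dir>/<name>.json and return the paths.

    Keys are sorted and indentation is fixed so that an unchanged object
    produces identical text on every run and therefore no diff in Git.
    """
    repo_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, payload in sorted(items.items()):
        path = repo_dir / f"{name}.json"
        path.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n")
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical payload standing in for a real API response.
    sample = {"alert_rules": [{"priority": 10, "name": "critical"}]}
    with tempfile.TemporaryDirectory() as tmp:
        for p in write_snapshot(sample, Path(tmp)):
            print(p.name)
```

Committing the directory afterwards (for example with `git add` and `git commit`) is left out here, as is the API access itself.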
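
Post 19 above mentions two kinds of widget problems: those visible from a non-200 status code, and those that need a look at the data (no data in any line of a cgraph). A rough sketch of that two-step check is below. The response shape (a `lines` list whose entries each carry a `data` list) is an assumption made for illustration, not the documented API format, and `check_widget` is a hypothetical name.

```python
def has_any_data(lines: list) -> bool:
    """True if at least one line contains at least one non-null value."""
    return any(
        value is not None
        for line in lines
        for value in line.get("data", [])
    )


def check_widget(status_code: int, body: dict) -> list:
    """Return a list of problem descriptions; an empty list means no problem found."""
    problems = []
    if status_code != 200:
        # Failure visible from the HTTP status alone.
        problems.append(f"HTTP {status_code}")
    elif not has_any_data(body.get("lines", [])):
        # Request succeeded, but no line carries any value.
        problems.append("no data in any line")
    return problems


if __name__ == "__main__":
    # Hypothetical responses, shaped per the assumption above.
    print(check_widget(200, {"lines": [{"data": [None, 3.2]}]}))
    print(check_widget(200, {"lines": [{"data": [None, None]}]}))
    print(check_widget(404, {}))
```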