All Activity

  1. Yesterday
  2. Last week
  3. Was this ever resolved, or were you able to figure out how a dimensionless CloudWatch metric could be monitored? I am facing the same issue.
  4. ... or Jenkins Shared Library.
  5. Hi Team, I have a server that is triggering an alert for a FileServer instance on the ErrorsLogon datapoint. I checked the Event Log and there are no events for failed logins. The WMI query is SELECT * FROM Win32_PerfRawData_PerfNet_Server. Anyone have any ideas on how I can troubleshoot this? Has anyone else experienced this issue? It is only 1 server; no others are having this issue. As I stated, Event Viewer shows no failed login attempts. This triggers several times a day. Any help would be appreciated.
  6. This is what our tool handles: it uses the API to download configs, then checks them into git, which generates the reports. I had published this via GitHub a while back, and just ran a refresh with code changes we've made in the interim, plus a change suggested by someone here to abstract the property name we use to split changes per client. https://github.com/willingminds/lmapi-scripts
  7. In resources, under the device you can view the configuration changes that were made. If this could be exported into a viewable report that generates every time a config change is detected, we could meet our auditors' requests.
  8. I am not sure if you are talking about LMConfig or device (resource) settings or something else. For LMConfig, we had to solve this by using the API to download configurations, then committing to a git repo (local gitlab instance) with post-commit hooks for notifications. Sadly, this reveals numerous flaws in the LMConfig process, which we have to work around by detecting and skipping bad updates prior to saving for repo commit. For device settings, we have a similar method where we regularly download devices and other endpoints from the API. If you mean yet another thing, like actual target device configuration details not handled by LMConfig, then LMConfig may be a good option for tracking those (but may require a custom module or two).
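The download-then-commit workflow described above can be sketched as follows. This is a hypothetical illustration in Python, not the poster's actual tool (see the linked lmapi-scripts repo for that); `is_usable_config` stands in for the "detect and skip bad updates" step, and the repo path and filename scheme are assumptions.

```python
import subprocess
from pathlib import Path

def is_usable_config(text: str) -> bool:
    # Skip "bad updates" before they reach the repo: empty pulls or
    # obvious error banners should not be committed as config changes.
    # (The real-world checks would be more elaborate than this.)
    stripped = text.strip()
    return bool(stripped) and not stripped.startswith("ERROR")

def save_and_commit(repo: Path, device: str, config_text: str) -> None:
    """Write the latest pulled config and commit it, so `git log -p`
    (or a post-commit hook) becomes the change report/notification."""
    if not is_usable_config(config_text):
        return  # leave the last-known-good copy in place
    path = repo / f"{device}.cfg"
    path.write_text(config_text)
    subprocess.run(["git", "-C", str(repo), "add", path.name], check=True)
    subprocess.run(
        ["git", "-C", str(repo), "commit", "-m", f"Config update: {device}"],
        check=True,
    )
```

A post-commit hook on the repo (local GitLab in the setup described above) can then send the notification with the diff attached.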
  9. Something akin to ServiceNow's Script Includes feature would be good.
  10. I like the ability to display side-by-side comparisons of device configuration changes. If this could be made into a report, that would meet our financial auditors' compliance standards.
  11. Earlier
  12. @BCO assuming you have a website in LM with the exact name 'sitename.com', you should be on the right track. Though, in v2 of the API you should enclose filter values in double quotes (per https://www.logicmonitor.com/support/rest-api-developers-guide/), so try '/website/websites?filter=name:"sitename.com"'.
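The quoting advice above also has to survive URL encoding when the request is actually sent. A minimal sketch of building the v2 filter path in Python (the helper name is my own; only the `/website/websites?filter=` path and the double-quote requirement come from the thread):

```python
from urllib.parse import quote

def website_filter(name: str) -> str:
    """Build a v2 API filter path with the value in double quotes.

    quote() percent-encodes the quotes themselves (%22) so the filter
    arrives intact; dots and letters pass through unencoded.
    """
    return "/website/websites?filter=name:" + quote(f'"{name}"')
```

Appending the result to the account's base URL, e.g. `website_filter("sitename.com")`, yields `/website/websites?filter=name:%22sitename.com%22`.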
  13. In our environment we have a mixture of thin and thick provisioned datastores and my problem is that all of the ones that are thick provisioned immediately trigger the percent used thresholds set on the datapoint. Have any of you tinkered with splitting out thin vs thick into 2 datasources? I'm looking for ideas. Thanks.
  14. Good Morning All, As you can imagine, we had some human error, where an engineer accidentally dragged and dropped a group into an incorrect location. In working with support, I see that I could make a user account and modify that account to work as needed. However, if there were an easier or more direct method of preventing users from moving groups while still allowing engineers to do their tasks, I feel it would be a great benefit to the product and to the sanity of management. Thank you, Rudy
  15. Problem: How do you know how many collection tasks are failing to return data on any given device? You could set "no data" alerting, but that's fraught with issues. An SNMP community can give access to some parts of the OID tree and not others, so unless you set "no data" alerts on every SNMP metric or DataSource (DO NOT DO THIS!!) you might not see an issue. If you do do this, be prepared for thousands of alerts when SNMP fails on one switch stack...
Here is a suite of three LogicModules that cause the collector to run a '!tlist' (task list) debug command per monitored resource, which produces a summary of the task types being attempted on the resource, counts of those task types, and counts of how many have some or all metrics returning 'NaN' (no data). As the collector is running the scripts, no credentials are needed.
Unusually, I've used a PropertySource to do the work of Active Discovery, because right now the Groovy import used isn't available in AD scripts, and an API call (and therefore credentials) would otherwise have been necessary. Additionally, creating a property for instances gives the DataSources further abilities in terms of comparing what the collection scripts find vs what they were expecting to find, meaning they can "fill in the blanks" and identify a need to re-run Active Discovery. There are then two DataSources: one returning counts and NaN counts per task type, and the other returning total counts and NaN counts, plus counts of task types not yet discovered by the PropertySource (i.e., Active Discovery is needed - don't worry, that'll sort itself out with the daily Auto Properties run).
There are no alert thresholds set as presented here, for various reasons. Firstly, there's no differentiation between tasks that have *some* NaN values and tasks with *all* NaN values - that would demand massively more (unfeasibly more) scripting - so it's a bit fuzzier than just being able to say "zero is fine, anything else is bad". Secondly, some DataSources sometimes have some NaN values without this indicating any sort of failure. Every environment is different, so what we're looking for here is patterns, trends, step changes, that sort of thing - these metrics would be ideal presented in top-N graphs in a dashboard, at least until you get a feel for what's "normal" in your environment. This will help guide you to resources with high percentages of tasks returning no data without generating alert noise. Enjoy...
PropertySource: "NoData_Tasks_Discovery": v1.3: NPEMD9
DataSources: "NoData_Tasks_By_Type": v1.3: N6PXZP; "NoData_Tasks_Overall": v1.3: 3A4LAJ
Substantial kudos goes to @Jake Cohen for enlightening me to the fact that the TlistTask import existed and that these modules were therefore possible. Standing on the shoulders of giants, and all that.
NB. Immediately after a collector restart, the NoData counts and percentages will likely drop to zero: while the collector will know the tasks it's going to run, none of them will have failed since the restart because they haven't been attempted yet. Therefore, don't set delta alerts.
It might look a bit like this in a dashboard for total tasks per resource: [screenshot] Or for a specific task type on a resource: [screenshot] Yes, I have a lot of NaN on some of my resources, thanks to years of experimenting. I probably ought to tidy that up, and now I can see where I need to concentrate my efforts...
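The per-task-type tallying the modules above perform can be sketched like this. To be clear, this is an illustrative Python sketch, not the Groovy in the published modules, and the assumed line format (task type first, a literal 'NaN' marking a failing task) is a stand-in for the collector's real '!tlist' output:

```python
from collections import Counter

def summarize_tasks(lines):
    """Count tasks per type, and how many of each type returned NaN.

    Assumes each line starts with the task type and that failing tasks
    contain the literal token 'NaN' somewhere on the line - a simplified,
    hypothetical format for illustration only.
    """
    totals, nans = Counter(), Counter()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines in the debug output
        task_type = parts[0]
        totals[task_type] += 1
        if "NaN" in line:
            nans[task_type] += 1
    return totals, nans
```

The "by type" DataSource would report `totals`/`nans` per instance, while the "overall" one would report the summed counts per resource.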
  16. OK, I understood the first part. I'm not sure on the expression for the second part. I'm not familiar with the complex datapoint syntax for the expression. Should it look like if(maxrtt>100, 1, NaN)?
  17. Does LogicMonitor have the ability to schedule automated reboots of Windows Servers AND provide notifications if/when a scheduled restart fails? We are currently using Desktop Central to schedule groups of servers to reboot on different days/times, but it is limited in regards to notifications/reporting. If, for whatever reason, ServerA does not reboot, no one knows unless you manually check. It would be great if I could schedule a server (or group of servers) to be rebooted at a certain day and time, and if for some reason one of them doesn't or can't reboot, a notification is sent out informing someone that it did not reboot.
  18. I am relatively new to LM API and may just have a syntax problem. This works: ..... '/website/websites?filter=id:4096' This does not '/website/websites?filter=name:sitename.com' Any thoughts on what is wrong with the second syntax?
  19. I just cannot bring myself to paste the same complex code into multiple LogicModule scripts, leaving little land mines scattered randomly. I was working today on a general template for using the API from within LogicModules using code I found scattered around different modules (we keep backups of everything, making it somewhat easy to search for those). Just a few things I noticed: * all the code is different * nothing I found so far accounts for API rate limiting * various inefficiencies exist in at least some of what I found The correct solution to all of this is to make a library feature available so we can maintain Groovy functions and such in one place, calling them from LogicModule scripts. It is very sad to see how little re-use is possible within the framework at all levels, and this one is especially bad in terms of maintenance and things breaking easily when changes are made in the API backend.
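The rate-limiting gap called out above is the kind of thing a shared helper would solve. A minimal sketch, in Python rather than the Groovy a LogicModule would actually use, assuming the API signals throttling with HTTP 429 and an `X-Rate-Limit-Window` header giving the window length in seconds (verify the exact headers against your API's documentation; the 60-second fallback is my own assumption):

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen

def retry_wait(status: int, headers: dict) -> float:
    """Seconds to wait before retrying a throttled call; 0.0 means don't retry."""
    if status != 429:
        return 0.0
    # Fall back to a full assumed window if the header is absent.
    return float(headers.get("X-Rate-Limit-Window", 60))

def get_with_backoff(url: str, attempts: int = 3) -> bytes:
    """One shared place for rate-limit handling, instead of pasting
    slightly different copies into every LogicModule script."""
    for _ in range(attempts):
        try:
            with urlopen(Request(url)) as resp:
                return resp.read()
        except HTTPError as err:
            wait = retry_wait(err.code, dict(err.headers or {}))
            if wait == 0.0:
                raise  # not a throttle; surface the real error
            time.sleep(wait)
    raise RuntimeError(f"still throttled after {attempts} attempts: {url}")
```

Until a real library feature exists, even a copy-pasted helper like this at least concentrates the retry logic in one function per script.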
  20. Assuming you leverage and consume custom alert messaging, you can define the KB article at the datasource template level. Taking your CPU utilization example, go to your CPU datasource LogicModule, and add the URL to the custom alert messaging for the desired datapoint triggering alerts. If the KB is different for different subset of resources, then the alert messaging should be updated to reference a custom property that would be assigned to or inherited by the resource. Example ##CPU_KB_URL##, then you would assign/inherit the cpu_kb_url property to your different subsets of resources. This does mean you will have to maintain these properties in LM.
  21. If you have a website built that will take a URL structure that can be married to device/instance property values, you can have the alert generate the URL from the inciting instance's properties to direct you to the appropriate page. You may need to build out a redirection page within your site that receives and interprets those URLs for you. ##DATASOURCE## might be the right token to use for building that decision/redirection tree.
  22. Thank you for your reply Cole. I should have been more specific. In the alert that is generated, we would like it to include a link to a knowledge base article outside of LogicMonitor that we are building. That way, when an alert is routed to us, we can simply click the link and follow our organization's instructions for how to troubleshoot it and what procedures to follow. Hope you can help!
  23. Hi, can we please get 'Threshold History' and a 'Last Modified' date as optional columns in the Alert Thresholds report? This would allow huge time savings in reviewing modified thresholds. Currently we have to run the report, which results in hundreds of entries, each of which then requires someone to go to the corresponding device, locate the alert, and go in to edit the thresholds in order to view the history and the time the change was made. This current method is not practical now, when we've just started using LogicMonitor, let alone when we add hundreds more devices.
  24. Hi guys, I've noticed that the audit logs don't capture all events all the time. E.g., when changing a threshold on an alert, the accompanying note is only captured by auditing some of the time. This makes me worry about what else is not being captured reliably. Support suggested I log this as a feature request, so please make auditing capture ALL events ALL the time. Thanks.
  25. Thinking about this, I would find it useful if an acknowledged alert that is placed into SDT could also be automatically unacked at the end of the SDT. (I'd also like the ability to limit SDTs to no later than n days from the current time.)
  26. Oh it is, but it is definitely a non-obvious side-effect of disabling alerts and re-enabling. I frequently get the feeling different aspects of LM were written by summer interns :).
  27. ACK should be removable if determined it was checked incorrectly by a user.