Popular Content

Showing content with the highest reputation since 02/26/2016 in all areas

  1. 12 points
    Allow devices to be dependent on one another. If a router goes down, the switch behind it will most likely go down or have an error as well.
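
The dependency idea in this request can be sketched in a few lines. This is only an illustration, not LogicMonitor functionality: the `parents` map, the device names, and the `root_causes` helper are all hypothetical. Given which device sits behind which, only the down devices with no down device upstream would alert.

```python
from typing import Dict, Optional, Set


def root_causes(parents: Dict[str, Optional[str]], down: Set[str]) -> Set[str]:
    """Return the down devices that have no other down device upstream of them."""
    roots = set()
    for device in down:
        seen = {device}                      # guards against accidental cycles
        ancestor = parents.get(device)
        suppressed = False
        while ancestor is not None and ancestor not in seen:
            if ancestor in down:
                suppressed = True            # an upstream device already explains this outage
                break
            seen.add(ancestor)
            ancestor = parents.get(ancestor)
        if not suppressed:
            roots.add(device)
    return roots


# Hypothetical topology: each device maps to the device it depends on.
parents = {"router1": None, "switch1": "router1", "server1": "switch1", "server2": "switch1"}
down = {"router1", "switch1", "server1"}
print(sorted(root_causes(parents, down)))
```

With the sample topology, only the router would alert; the switch and server behind it would be treated as dependent outages.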
  2. 4 points
    As we move towards a DevOps model, we increasingly have a need for small teams to have full admin access to the tools they use to manage their IT services. When it comes to LogicMonitor, this has proven difficult with the existing role permission model. DevOps pods would like to be able to manage their own datasources, alerts, and escalation chains, but this isn't possible unless we give them broad access rights to those areas, which could cause significant harm to other groups of monitoring users. For example, an inexperienced DevOps user could inadvertently modify a datasource that applies to all Windows devices, or they could create an alert rule that causes alerts not to be delivered to other users. To solve this problem, I'd propose that LogicMonitor offer alert groups and escalation chain groups alongside the existing datasource groups. Then, LogicMonitor could provide the ability to restrict roles to managing these specific groups. DevOps pods could be given the ability to manage their own custom subset of datasources and set up their own alerts in a rule range after the main set of rules.
  3. 3 points
    Rather than have Websites as a separate section in the product with a separate hierarchy to manage, how about making all of the Websites stuff part of the Device Tree and renaming the Devices section to something that covers both? Then if I want to add a website or service check, I simply do it against the "group". This way I wouldn't have to maintain two hierarchies of business services. What do other folks think of this?
  4. 3 points
    Please add the option to alert on a "no data" condition to the instance-level Alert Tuning configuration dialog. We don't want to generate "no data" alerts for everything and we don't want to split the data sources (extra maintenance when updating), so it would be easier to have this as an instance-level override.
  5. 3 points
    I would appreciate it if datapoints marked for alerts on No Data were indicated in the Alert Tuning page with the designated alert level displayed. Right now, to know this, you have to dive into the datasource definition to find out. Thanks, Mark
  6. 3 points
    Currently, a table must have static columns and rows defined before the widget will display data. It would be great to be able to dynamically build a table's rows based on * To expand on this, it would be great for the table to have the option to exclude instances with zero/no data from the list. For example, I would like a table that displays all MSMQ queue names and the number of messages in each queue - but not display anything if the current queue length = 0
  7. 3 points
    I have customers who really need this feature, and they are quite upset to learn the throttling stand-in could cause loss of knowledge about the actual root cause. This thread has been open since 2013. Exactly where on the roadmap is this? Mark
  8. 2 points
    In my opinion the colours in the new alerts UI are terrible. Please allow us to adjust them, or create a few other themes to choose from (a dark theme or the old grey theme worked great).
  9. 2 points
    It would be useful to have SNMP traps that trigger within a specific timeframe to be considered the same alert. We have a few cases where devices start throwing traps every minute and by the time we react to fix we already have dozens of alerts. It would be better to consider the same trap within a time frame to be the same alert to avoid this alert flood.
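
The grouping requested here could work roughly as follows. This is a minimal sketch under assumptions: traps are represented as hypothetical `(timestamp, device, oid)` tuples, `group_traps` is an illustrative name, and the window is fixed from the first trap of each alert rather than sliding. It is not how LogicMonitor processes traps.

```python
from typing import Dict, List, Tuple


def group_traps(traps: List[Tuple[float, str, str]], window_seconds: float) -> List[dict]:
    """Collapse traps with the same (device, oid) that arrive within
    window_seconds of the first trap of the current alert."""
    open_alerts: Dict[Tuple[str, str], dict] = {}
    alerts: List[dict] = []
    for timestamp, device, oid in sorted(traps):
        key = (device, oid)
        current = open_alerts.get(key)
        if current is not None and timestamp - current["first_seen"] <= window_seconds:
            # Same trap inside the window: count it instead of raising a new alert.
            current["count"] += 1
            current["last_seen"] = timestamp
        else:
            current = {"device": device, "oid": oid,
                       "first_seen": timestamp, "last_seen": timestamp, "count": 1}
            open_alerts[key] = current
            alerts.append(current)
    return alerts


LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"  # standard linkDown trap OID
traps = [(0, "sw1", LINK_DOWN), (60, "sw1", LINK_DOWN), (120, "sw1", LINK_DOWN),
         (700, "sw1", LINK_DOWN), (30, "sw2", LINK_DOWN)]
for alert in group_traps(traps, window_seconds=300):
    print(alert["device"], alert["first_seen"], alert["count"])
```

A device sending the same trap every minute would then produce one alert with a growing count per window instead of one alert per trap.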
  10. 2 points
    NOC is an acronym for Network Operations Center. Heads-up monitors are typically called NOC monitors, so you can see why calling the widget a NOC widget is useful for those types of dashboards.
  11. 2 points
    Hello LM Team, It would be great if the NOC widget in Dashboards could filter out inactive devices and show only the ones that actually have alerting enabled. For example, we have a vCenter with 1500 virtual machines, but only about 10% of them need to be monitored at all times, so by default alerting is disabled for all VMs from vCenter except the ones we actually need. Unfortunately, the NOC widget shows us everything from that datasource, which is problematic. Thank you.
  12. 2 points
    Please add the ability to select the first, second, third, or fourth weekend or day of the month. So, for example, we need to be able to configure:
    o Third weekend of the month, 0200 to 0400 - meaning both the third Saturday and Sunday of the month
    o Second Tuesday of the month, 0400 to 0600
    And so on. At the moment we would have to manually add multiple entries and keep updating them.
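
The date arithmetic behind this request is small enough to sketch. The function names below are illustrative, not part of any LogicMonitor API. The sketch reads "third weekend" literally as the third Saturday and the third Sunday, which are not adjacent days when a month starts on a Sunday.

```python
import calendar
from datetime import date


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Date of the n-th given weekday (Monday=0 ... Sunday=6) in a month."""
    first_weekday, days_in_month = calendar.monthrange(year, month)
    offset = (weekday - first_weekday) % 7      # days from the 1st to the first such weekday
    day = 1 + offset + 7 * (n - 1)
    if day > days_in_month:
        raise ValueError(f"{year}-{month:02d} has no occurrence number {n} of weekday {weekday}")
    return date(year, month, day)


def nth_weekend(year: int, month: int, n: int):
    """The n-th Saturday and the n-th Sunday of a month (not necessarily adjacent)."""
    return (nth_weekday(year, month, calendar.SATURDAY, n),
            nth_weekday(year, month, calendar.SUNDAY, n))


print(nth_weekday(2016, 3, calendar.TUESDAY, 2))   # second Tuesday of March 2016
print(nth_weekend(2016, 3, 3))                     # third Saturday and Sunday of March 2016
```

A scheduler built on this would compute the dates once per month instead of requiring manually maintained entries.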
  13. 2 points
    Good Afternoon. Would it be possible to set up some sort of color/theme options for the portal? My biggest issue is having to triple check when I am in our production environment vs our sandbox. Sure, I can look at the url - just wanted to suggest that there would be some more options to customize the color scheme either at a user level or to differentiate between production and sandbox. My suggestion from support was to create a new logo, which I can and will do temporarily. Thanks!
  14. 2 points
    We would like to overlay the Warn, Error and Critical thresholds onto our graphs so you have a visible view of any resources / metrics trending towards a threshold. The graph background could be green for everything under Warn, yellow for Warn, amber for Error and red for Critical. It would make it very clear on all dashboards if a host or service was under load, and would clearly indicate the threshold on all metrics, whether the datasource default was being used or the thresholds had been alert-tuned for different times of day / use.
  15. 2 points
    I'd like to suggest the possibility of adding a "category" to the datapoints. This would assist in alerting and metrics around what types of alerts are occurring. Adding a field to the datapoint would allow us to categorize the type of datapoint for automatic correlation of the events in other systems, as well as provide additional logic for dashboards where you can show alerts based on category type. My idea of default categories would be (not exhaustive; I'd like to see these customizable in the admin area so they make sense for everyone's organization):
    - Capacity
    - Configuration
    - Other
    - Outage/Uptime
    - Performance
    I've been toying with the idea of adding these just to the alert message template for every alert we have configured and stripping it out in my automation, but I'm sure there are others that would love to see this as well to help classify the types of alerts that are generated.
  16. 2 points
    I would love it if we could reference ##SERVICESRESPONSE## on an overall alert. We don't deliver alerts for singular test location failures, since our mandate is to only notify on systemic issues across all test locations. So the question will probably be: which response to include in the event there are differing responses? Why not include all of them! Or only include the first one in the test location array for that service. Or pick a random one. Or arbitrarily decide certain failure reasons have a "higher" priority than others and choose the "highest" one.
  17. 1 point
    Have you looked at the data APIs? I haven't used them myself but seems to fit the request. https://www.logicmonitor.com/support/rest-api-developers-guide/v1/data/get-graph-data/#Get-widget-data https://www.logicmonitor.com/swagger-ui-master/dist/#/Data/
  18. 1 point
    One day we might get a dark theme... 😀
  19. 1 point
    I am not sure exactly how to describe this other than by example. We created an API-based method a while back to control alerting on interfaces based on the interface description. This arose because LM discovered interfaces that would come and go (e.g., laptop ports), and then would alarm about the port being down. With our change, those ports are labeled with a string that we examine to enable or disable alerting. The fly in the ointment is that if an up and monitored port went down due to some change, our clients think they should be able to change the description to influence behavior. Which they should. Unfortunately, because LM will not update the instance description due to the AD filter, the down condition is stuck until either the description is manually changed in LM or until the instance is manually removed in LM. Manual either way, very annoying. My proposal is that there should be a way to update the instance description even if the AD filter triggers. Or a second AD filter for updates to existing instances. I am sure there are gotchas here and perhaps a better way exists. I considered using a propertysource, but I don't think that applies here. The only other option is a fake DS using the API to refresh the descriptions, but then you have to replicate the behavior of many different datasources for interfaces.
  20. 1 point
    If you don't want to change the event timing itself, you can add a blank line to the escalation chain... it will use the escalation interval on that blank step, which adds time. We use this for services that take a long time to restart. We need to know that they've restarted, but also need to know if they don't finish restarting. So we have an escalation chain just for the service alerts that alerts our team, then waits 20 more minutes before alerting us again. If you add a blank to the end of the escalation chain, you can stop repeated messaging as well. This works especially well if you are using a ticketing system that only accepts email as an incoming connector.
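
The timing effect of a blank stage can be illustrated with a toy model. Everything here is an assumption made for illustration: the model advances one stage per escalation interval, repeats the final stage, and treats `None` as a blank stage that notifies nobody. It mirrors the behavior described in this post, not LogicMonitor's actual implementation.

```python
from typing import List, Optional, Tuple


def notification_schedule(stages: List[Optional[str]], interval_minutes: int,
                          horizon_minutes: int) -> List[Tuple[int, str]]:
    """Toy model: one stage per interval, final stage repeats, None = blank stage."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    schedule: List[Tuple[int, str]] = []
    if not stages:
        return schedule
    index = 0
    minute = 0
    while minute <= horizon_minutes:
        recipient = stages[index]
        if recipient is not None:
            schedule.append((minute, recipient))   # a blank stage sends nothing
        if index < len(stages) - 1:
            index += 1                             # otherwise stay on (repeat) the final stage
        minute += interval_minutes
    return schedule


# A blank middle stage delays the second notification by one extra interval.
print(notification_schedule(["team", None, "team"], 10, 40))
# A blank final stage stops the repeats entirely.
print(notification_schedule(["team", None, "team", None], 10, 60))
```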
  21. 1 point
    In Nagios, there is a concept of an event handler that can run to try to fix problems (e.g., restart a service, remove old files, etc.). I see no similar capability in LM and it is of course something customers want to see happen. For example, I just deployed a custom DS for someone to check for too many files in a share, indicating a service problem. Once I implemented that, the next question was "Can you restart the service when that goes into warning?" I see no facility for this in LM, but perhaps a custom alert could be used to trigger the behavior. If I used that approach, I would insert a custom HTTP alert into the escalation chain earlier on to give the problem a chance to be corrected, then I will have to create a secure REST API server to accept those and trigger the correct behavior. So in theory it could be done (if I am not missing something), but it feels like using a screwdriver to hammer in a nail. Thanks, Mark
  22. 1 point
    Hey there, I'm sharing my datasources for CUCM that use the Serviceability APIs using PowerShell and SOAP (I'm a PowerShell scripter, still haven't gotten the time to fully understand Groovy). I'll expand on this post as I get further along adding additional datasources from the Serviceability APIs. All of these require a cucm.user and cucm.pass entered on the devices that are the call manager servers.

    Cisco CUCM Device Status - DXZRTZ
    - Reports the following datapoints for all devices registered to CUCM: StatusReason, IsRegistered, IsNotRegistered, NumOfLines, RegistrationAttempts
    - Uses https://$($hostname):8443/realtimeservice2/services/RISService70?wsdl
    - Needs to have serviceability features turned on (you can check just by going to this URL)

    Cisco CUCM Service Status - 3LNYR9
    - Reports the service status of all services running for CUCM servers
    - Uses https://$($hostname):8443/controlcenterservice2/services/ControlCenterServices?wsdl

    Cisco CUCM Statistics - ZN494P
    - Reports about 100 datapoints from CUCM's PerfMon "Cisco CallManager" object
    - Uses https://$($hostname):8443/perfmonservice2/services/PerfmonService?wsdl
  23. 1 point
    Can I also make a feature request to retain the custom thresholds / attributes (user optional, probably by means of a toggle button to choose between overwrite or leave as is ) while updating LogicModules? I did notice related requests from the past and it seems that it is not yet released.
  24. 1 point
    @Sarah Terry Please address urgently. These new verbose error dialogs expose the WMI password. Ideally I'd like a Settings option to disable such verbose error messages, or to restrict them by role. (Also, can these dialogs be more responsive? On a 1920x1080 screen they appear as narrow panels in the middle.)
  25. 1 point
    There are two main types of SNMP checks: SNMP Get/Walk and SNMP Traps. They work very differently. With SNMP Get/Walk, LogicMonitor directly queries your device for state/performance data. This is what most of LogicMonitor wants to use, it is the best option, and it is what !snmpwalk does. With SNMP Traps, you set up the device to send alerts out to the monitoring system. The setup for each of these is completely different. Many devices support both, but some devices only support SNMP Traps (looking at you, EMC). If the device supports SNMP Get/Walk, there is likely a section for this in the device config, separate from the SNMP Trap section. You may also need to whitelist the IP address of the collector on the device. If the device only supports SNMP Traps, you can still set it up in LogicMonitor, but it's far more limited: https://www.logicmonitor.com/support/eventsources/types-of-events/snmp-trap-monitoring/
  26. 1 point
    I would like to see an ability to select instances for widgets (and alert rules, etc.) based on other factors than the instance label or glob pattern. It should be possible to select based on an applies-to-like function using instance properties, instance description and instance group (latter I requested separately some time back). A good example for this would be a widget that displays all WAN or DIA links in one place. Selection would be determined either by instance description match (with instructions to the client to ensure those are labeled to cause inclusion) or perhaps by an instance property. I can do some of this with an API-based tool for widget maintenance, but that would not address all places this should be possible.
  27. 1 point
    Alert processing happens outside the detection point (in "the cloud") -- there need to be triggers to an event handler that operate in the collector context. One possibility would be to create datasources that don't actually collect data, but do the check and repair operation, with a datapoint as a side effect. It would be easier if datasource code could cross-reference other datasource/instance datapoints without having to replicate the same API code into each (e.g., code library support), but it is feasible. Triggers would be much cleaner.
  28. 1 point
    After speaking with John via chat session to find out why reports were being generated with blank data, we discovered that the last person who had modified the report was a user who was no longer with the company. The feature being requested is simply to allow reports to run and be sent out as scheduled, regardless of the user who put them in place. While touching the reports and saving them as a different user does allow them to function, it does not fix the root cause. The rest of LM (datasources, dashboards, and graphs) functions regardless of the creating user, so reports should continue to function as configured as well.
  29. 1 point
    I see that you've added this to Github . I've created two PRs to address issues I've been having with I'm assuming the Public directory scripts are "bundled" when releases are created and uploaded to PowerShell Gallery.
  30. 1 point
    Please apply a monospace font to the Query field when the datasource is using JDBC. This would hugely improve readability and editing of the SQL.
  31. 1 point
    Any update on this? - April 2018?
  32. 1 point
    Hello all, This is my first time posting a question in the forums. I wanted to know if there is a limit to the number of Dashboards that can be added to a Slideshow dashboard rotation? I work in my company's NOC (network operations center) and have somewhere in the neighborhood of 70 dashboards that my group needs to keep track of. It seems the number of dashboards I can add before a previously added dashboard is removed is 26. Does anyone know if this is a hard limit or is this a setting that our LM administrators can adjust? Looking forward to responses, Steve V.
  33. 1 point
    We would like to use our dashboard at a kiosk. Is there any way to pass credentials via the URL, using an API key or user credentials?
  34. 1 point
    The entire RBAC mechanism is way too coarse. I had a client ask yesterday why they can't disable alerts for a device group. As far as I can see, that comes along with Manage, and I see no reason why this should be true -- I don't want them to have that level of control but it is all or none -- RBAC granularity improvements are sorely needed.
  35. 1 point
    Assume I have two web servers behind a load balancer. There are three web apps that I would like to monitor -- abc.company.com, qrs.company.com, and xyz.company.com. The web server expects a host header value since all three web apps are bound to both 80 and 443. I know I can easily add a web service or ping check (either external or internal) using the DNS-resolvable hostnames and collect both up-time and response times, depending on what check I create. My concern is that since it is going through the load balancer (and potentially hitting different nodes each time, or even the same node every time if affinity is turned on), I would not have statistics on the up-time and responsiveness of each node. How should I go about collecting this additional data? One approach would be adding additional steps to the check to check each node. But would I be recording individual node data or just the check as a whole? What about name resolution? The original abc.company.com resolves to the load balancer IP. How do you target the individual node IP? (Remember, the host header is still needed for IIS to know which web app (app pool) the request is for.) Can I set up multiple collectors in the same location, where each collector has a local hosts entry to override the load balancer IP with one node IP? (1 collector targets 1 node.) Can a single step in a multi-step check use different collectors? I guess I could define multiple checks, with each check using a different collector that targets a different node... But how does that impact license count usage? (I assume each one takes up a license.) Thanks for any info you can provide.
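
One generic way to target an individual node while still satisfying the host header requirement is to request the node's IP directly and set the `Host` header explicitly. The sketch below uses only Python's standard library; the node IPs are placeholder documentation addresses, and `build_node_checks` and `probe` are hypothetical helper names. It covers plain HTTP only. For HTTPS, SNI and certificate validation would also need handling, and this is not a description of how LogicMonitor checks work.

```python
import time
import urllib.error
import urllib.request
from typing import List, Optional, Tuple


def build_node_checks(node_ips: List[str], hostnames: List[str]) -> List[urllib.request.Request]:
    """One request per (node, web app): connect to the node IP, send the app's Host header."""
    checks = []
    for ip in node_ips:
        for hostname in hostnames:
            checks.append(urllib.request.Request(f"http://{ip}/", headers={"Host": hostname}))
    return checks


def probe(request: urllib.request.Request, timeout: float = 5.0) -> Tuple[Optional[int], float]:
    """Send one request and return (HTTP status or None if unreachable, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:      # the server answered, but with an error status
        status = exc.code
    except (urllib.error.URLError, OSError):   # connection refused, timeout, and similar
        status = None
    return status, time.monotonic() - start


# Placeholder node addresses (TEST-NET-1 range); replace with the real node IPs.
checks = build_node_checks(["192.0.2.11", "192.0.2.12"],
                           ["abc.company.com", "qrs.company.com", "xyz.company.com"])
for check in checks:
    print(check.full_url, check.get_header("Host"))
    # status, seconds = probe(check)   # uncomment to actually send the request
```

This yields per-node, per-app status and response time, which is the data the load-balanced check cannot provide.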
  36. 1 point
    Sorry -- to clarify: one graph shows LogicMonitor 1-minute polling and the other shows 10-second polling intervals from ManageEngine.
  37. 1 point
    When updating datasources from the repository, there should be an easy way to maintain your existing AppliesTo and alert threshold settings.
  38. 1 point
    PLEASE PLEASE bring this as a feature. This has caused alerts to not be routed because of an accidental change. Considering the way alert routing works by folder structure it is stupid that folders can be dragged and dropped WITHOUT confirmation.
  39. 1 point
    Would like to be able to retrieve the "Alert Rules" and "Escalation Chains" data that you see in "Alert Settings". Either a report, API, or export would work. Also would like to be able to list which devices those alert rules pertain to. We need this information to audit our current environment and to see which device alerts are going to which groups. Our LogicMonitor environment is distributed and has hundreds of rules and escalation chains. Recent reorgs and shifting of responsibilities have made it necessary to identify what is monitored and who gets those alerts. With this information I hope to reorganize device groups, collapse and reduce redundant and unused rules and escalation chains, and move to a more centralized and manageable system.
  40. 1 point
    We have found cases where the default SNMP AD method is woefully inadequate (e.g., try to discover interfaces on a Cisco ISR4K with voice enabled; you will discover no interfaces and get no indication this has failed due to the full ifTable download taking way, way too long). To work around this, we plan to convert the interface AD to a script, but found the SNMP library provided does not support bulk operations (or they are undocumented). Can these please be added? If not, we will look into something like http://btisystems.github.io/snmp-core/site/project-summary.html as an alternative via Grape. Not sure yet how well that handles SNMP version abstraction, which is a nice feature of the LM library. Ideally we could get the bulk operations included in the LM library. Thanks, Mark
  41. 1 point
    Just ran into this today, needed to add a %. I'm also not a fan of the default dash between unit label and bottom label. If I don't want to use a unit label, my bottom label always begins with a dash.
  42. 1 point
    The data sources below will gather the service packs from Windows Servers and alert if the system is not on the latest service pack. As new service packs are released, you will need to update the alerting level. Since you do this manually, it gives you the opportunity to decide when to start alerting, or you can set it to the latest to create a list of servers in your organization that need upgrading.
    - 2003 - CLDJYT
    - 2008 - TYAYTD
    - 2008 R2 - 97DY2F
    - 2012 - HMKPFL
    - 2012 R2 - W2HKGR
    - 2016 - YZWNTM
  43. 1 point
    Like most of you, I have a long list of "To-do's" in our LogicMonitor deployment. One that I just recently crossed off is capturing Netflow. More accurately, capturing sFlow from Juniper EX/QFX switches. It's worth noting that the actual implementation was surprisingly easy, aided by LM's netflow doc page and Juniper's references (see links below). I've included the needed Juniper commands for those of you who are also in non-Cisco environments. I can't stress enough though that you pay close attention to LM's suggested best-practices, carefully consider Juniper's caveats, and put a lot of effort into planning the details of your deployment (which switches, which physical interfaces, etc) so that you understand what data you are actually getting from sFlow and how it is being delivered. Remember: more data at your disposal does you no good if you can't place it in proper context.

    ###Global sflow enable, globally define polling interval and sampling rate, define sflow source address and agent id; define the sflow collector and export port
    set protocols sflow agent-id
    set protocols sflow polling-interval 1
    set protocols sflow sample-rate ingress 100
    set protocols sflow sample-rate egress 100
    set protocols sflow source-ip
    set protocols sflow collector udp-port 6343

    ###enable sflow sampling on individual switch ports (note: polling and sampling values set here are not required and will override global values)
    set protocols sflow interfaces ge-0/0/9.0 polling-interval 1
    set protocols sflow interfaces ge-0/0/9.0 sample-rate ingress 100
    set protocols sflow interfaces ge-0/0/9.0 sample-rate egress 100
    set protocols sflow interfaces ge-0/0/11.0 polling-interval 1
    set protocols sflow interfaces ge-0/0/11.0 sample-rate ingress 100
    set protocols sflow interfaces ge-0/0/11.0 sample-rate egress 100

    http://www.logicmonitor.com/support/monitoring/networking-firewalls/netflow/
    http://www.juniper.net/techpubs/en_US/junos14.1/topics/concept/sflow-ex-series.html
    http://www.juniper.net/techpubs/en_US/junos14.1/topics/task/configuration/sflow-ex-series-cli.html

    Next up is piping Netflow into our collector from Juniper's MX routers. From a LogicMonitor perspective there will be almost no difference. Unfortunately the degree of difficulty is substantially higher in terms of Juniper configuration. I'll put that into a separate post; check back later if you are interested.
  44. 1 point
    I felt the same way for a long time, but I have grown to love the new UI. I harassed every LogicMonitor support person for a year or more, telling them over and over that it was just terrible. But no more! The current version does everything that the original one did, and much more. My favorite feature is that the UI allows you to easily set thresholds on virtually any grouping that exists in the system. You can set thresholds for any group that you can imagine (just keep track of conflicts!). The previous UI was often confused as to what group you wanted to set thresholds on. The new device tree in the left pane is super powerful. I'd also encourage you to click around in the breadcrumb trail along the top. Bottom line for me was the learning curve, since there are so many new features.
  45. 1 point
    Just in case someone else runs into the same thing: the key is that the user needs manage rights on his collector(s) and device group(s). Another thing is that he has to select the group that the new device should be added to. Otherwise the default (root) group is used, and the user does not have any permissions there... Thanks to David from LM support who helped me out! Jeroen
  46. 1 point
    We have also run into this issue. A group was accidentally moved, and any datasource that was being applied based on that group no longer worked, so data was lost. A workaround we used was to edit the datasource to be applied based on a system category added to these groups rather than the actual group. I'd still like to see LM be able to track when a group is moved and edit any resources referring to that group. As it is now, it is a pain to fix dashboards and reports once a group has moved.
  47. 1 point
    It would be very helpful if we can run the traceroute command from the Collector "Debug Command" window. This is helpful with complex networks, and troubleshooting some routing issues. Optionally, the traceroute implementation could take arguments similar to those found in Linux or Windows traceroute utilities.
  48. 1 point
    Switch from WMI-based event log monitoring to PowerShell-based. The Get-WinEvent command does not require PS-Remoting to be enabled and can use WMI pass-thru credentials (wmi.user and wmi.pass). This is going to be much more efficient than using WMI to grab event log data. Example to get via JSON (though I was not yet able to get this to work properly with a script-based eventsource):

        # $hostname and $remotecredential are assumed to be set earlier (e.g., a credential built from wmi.user and wmi.pass)
        $events = Get-WinEvent -ComputerName $hostname -Credential $remotecredential -LogName Application

        # Emit the selected fields directly as JSON
        $events | Select-Object @{Name = "happenedOn"; Expression = {$_.TimeCreated}},
                                @{Name = "Severity"; Expression = {$_.LevelDisplayName}},
                                message,
                                @{Name = "Source"; Expression = {$_.ProviderName}} | ConvertTo-Json

        # Or wrap the same fields under an "events" key, with the timestamp cast to a string
        $arr = @{}
        $arr["events"] = @{}
        $arr.events = $events | Select-Object @{Name = "happenedOn"; Expression = {[string]$_.TimeCreated}},
                                              @{Name = "Severity"; Expression = {$_.LevelDisplayName}},
                                              message,
                                              @{Name = "Source"; Expression = {$_.ProviderName}}
        $arr | ConvertTo-Json
  49. 1 point
    I upvote this, and suggest the same for web services.
  50. 1 point
    I would like to propose that LogicMonitor needs a better way for external systems to input data into the LogicMonitor system, similar to Zabbix Sender: https://www.zabbix.com/documentation/2.2/manpages/zabbix_sender
    The use case: suppose that alongside LM, a company runs an APM like New Relic, a log monitoring tool like Elasticsearch/Splunk, and a custom data warehouse for analytics. As the NPM, LM should be the one source of alarming and trending. I believe the best way to integrate is to provide a direct API for sending data, or to allow external systems to interface with the collector to send data. This way any application, whether custom or a common public application, can input data into LogicMonitor. For example:
    - If Elasticsearch/Splunk finds a critical error while munching through logs, it opens a connection to LogicMonitor and sends data saying that this error log occurred 5 times in the last 2 minutes. LogicMonitor is configured to alert if > 1, so there is an alert to our NOC.
    - If the APM finds that a website has an immense increase in traffic from one location, causing performance issues, it opens a connection to LogicMonitor and sends data saying that this has occurred.
    - If the mining of our data warehouse finds that customers' interest in or purchases of one of our products has dipped 20% in the last month, it opens a connection to LogicMonitor and sends this data.