Search the Community

Showing results for tags 'alerts'.

Found 53 results

  1. Extending alert information from LogicMonitor to other 3rd-party systems is pretty common for us; however, the tokens available today to describe the alert are missing a few bits of data (we feel). It would be incredibly helpful to have an alert token that contains the LM user responsible for acknowledging the alert, and a separate token for the Ack comment. Having these tokens would allow us to better map alerting details to upstream and downstream integrations.
  2. We have a team that handles all alert escalations, which spans 3 different shifts. If an alert can't be corrected immediately and requires a follow-up, a note is entered for the alert on the alerts dashboard with basic info: date, person contacted, and any information the shift can review to determine whether or not a follow-up is required from them. Unfortunately, once multiple notes are entered, legibility decreases, and unless you enter your name there's no easy way to determine who entered the note. The ability to enter a note that records the time of entry and the user would improve functionality; it would work more like a log for the alert itself.
  3. Hey guys! I wanted to bring up the idea of clearing alerts manually. I searched the feature request threads and haven't found an answer or a thread that matched what I was looking for, so I thought I would take a shot at starting one. Apologies in advance if this has been discussed already, or if I don't make much sense. I'm fairly new to the platform, so I might not be fully up to speed with all the lingo. Let me explain what brought me to this request. I have set up monitoring on our virtual machines to track CPU usage as a percentage (x100). I then have an alert set up to indicate a stuck process, which fires if a datapoint hasn't changed (+/-3%) over the next 3 intervals (which is set to 3 minutes). The alert clears if the value changes after the next 4 intervals. This has been working great so far, but I quickly realized that we didn't really care about anything stuck between 0-50%; we only wanted to focus on values stuck at 50% or above. I then changed the valid value range to 5000-10000 (50-100%), which produced much more useful results. I did notice that when a CPU that was stuck within the 50-100% range then moves to a value outside the valid value range (X<50), this produces NO DATA, leaving the initial alert in limbo forever. You can manually clear such alerts by going to the device and toggling alerting off and on again, but doing that for a large number of alerts takes a lot of time. I'm okay with the way I have it set up (though I do believe the above may be a bug); I just wish we could manually clear alerts from the alerting window without having to take extra steps. Maybe something next to the acknowledge button? I might have jumbled this up, so please ask if I need to clarify any of the information above. I can provide screenshots if needed as well. Thanks for taking the time to read this!
TL;DR = Let us manually clear alerts from the alert window without having to go into specific devices and toggle alerting.
  4. Please add an option so that when a device is in the IdleInterval state (HostStatus DS), then all other alerts are automatically removed. At the moment some devices retain their ping loss alert even though the HostStatus DS has triggered the IdleInterval alert (no data being received). Our users are finding it confusing when some devices have both alerts, while others have only the IdleInterval alert.
  5. Use Case: Provider. I am a provider with a substantial number of customers being monitored by the platform. A single customer requests that monitoring be suspended for 14 days while they do a physical DC move. The move will be 1:1, so all systems will come back up in the same logical location and only the physical location changes. Requests are filed, meetings are had, the day of the move comes, and the NOC turns alerting off for the customer. Uneventful days go by, and on the day alerting is supposed to be turned back on, a regional event happens that the provider NOC is responding to for other customers (you can insert any normal, well-defined chaos that happens in a NOC here), and alerting does not get re-enabled for the customer with the physical DC move. Use Case: Enterprise. The Oracle team notifies the NOC that a weekend upgrade will be happening on the Oracle customer, and the upgrade team does not want to be notified of any alarms, as they will have their hands full with the upgrade and will call back when it is complete. The NOC turns alerting off, and the upgrade team never calls to say that they are done working. Request: Much like SDT, enable calendaring and scheduling as an option for enabling/disabling alerting, as a backup in case manual processes fail.
  6. Hi, Every morning I have to clear a couple of hundred alerts from my inbox that come in while our customers' servers are running backups. We often get 'disk latency' and 'network latency' type alerts while the backups are running. As they run outside business hours, we do not really need these. Could you please add a way of creating SDT based on alert severity or, better yet, build a mechanism to schedule backup windows that filter noise alerts like disk latency? I'm sure I'm not the only one to encounter this issue. Kris
  7. Please document which properties of alerts are searched by the Search function of the alerts view. We sometimes see alerts in the search results whose inclusion we cannot explain. For example, if I search for "WMI" I get alerts that do not contain "WMI" anywhere that I can see. Perhaps a list of the properties that are searched could be displayed in the UI as a tooltip on the search string input textbox.
  8. You must have set up your Alert Rules & Escalation Chains hoping that they are set up correctly. What if they were not set up accurately and do not alert the right group or, even worse, do not alert at all? The worst thing is not to receive an alert when a device is down or, let's say, when a disk is filling up with logs because one of your teammates left logging in verbose mode after troubleshooting. In this article, you will be guided through setting up an effective Alert Rule & Escalation Chain. In addition, we will show you how to deliver a live alert without creating any impact on the system in question. Before diving into the steps, here is the difference between Alert Rules and Escalation Chains. Alert Rules are used to select the respective Escalation Chain when a certain device reaches the defined severity level. You could define an Alert Rule to use an Escalation Chain only when a certain datapoint threshold is reached. Escalation Chains are used to set the delivery method for alerts. This could be set to deliver your alerts via email, SMS, ticketing systems, custom HTTP integrations, etc. You may also set your Escalation Chain to route to different groups of people at different times or on different days. This is useful when different sets of standby engineers cover a 24x7 operation. Alert Rules & Escalation Chains are very powerful if used correctly. To begin, we will first create an Escalation Chain. For this example, I will create it for Windows devices. We recommend enabling the rate limit, as you will not want to receive a flood of alerts. It limits the maximum number of alerts delivered in the defined time. If you are wondering, I created 3 stages for different delivery methods (email, HipChat & voice). The time it takes to move from one stage to the next is defined by the Escalation Interval of the Alert Rule.
This is an optional section where we can route alerts to different people depending on the time and day. It is quite simple: just select the days and times for the respective stages. The next section, the creation of Alert Rules, requires good planning. Alert Rules are evaluated based on their priority level, from the lowest number to the highest, so they should be ordered from the most granular rule to the one with the most wildcards. A common use case is: (1) create an Alert Rule to send interface-related alerts to the network team; (2) create an Alert Rule to send hardware or performance alerts to the sysadmin team; (3) create an Alert Rule to send Exchange alerts to the messaging team; (4) create an Alert Rule to send all other alerts to the sysadmin team. Another essential part we need to focus on is the Group to which the rule is applied. We get asked about this countless times. It's an easy fix once you know what to fix. If you set it to * it will apply to all groups, which is great. However, we know that we can't always apply the same Alert Rule to all devices. We might need to apply different Alert Rules to different types of devices (e.g. servers, switches, routers, WAN links, etc.). Let's say you have a router “wan01” which resides in the group “Infrastructure -> Critical -> Networking -> Routers -> WAN”. If you apply the Alert Rule to “Infrastructure/Critical/”, your device will not pick up this Alert Rule, as it resides in a subtree. The fix is simple: just apply the Alert Rule to “Infrastructure/Critical/*”. This will apply to all subgroups under Critical. Now, once you have set that up, I'm sure you would like to verify that the Alert Rule is picked up by the datasource or instance in question. To do so, navigate to the datasource or instance in question. Click on the COG button and it will show you the Alert Rule, Escalation Chain and delivery method for each stage. This is how you can determine whether your Alert Rule or Escalation Chain is picked up.
The next thing is to validate the delivery of an alert. Yes, we could click on “Send Test Alert”, but I'm sure we would prefer to have an actual alert to see how it works. My favourite datasource for this is the Ping datasource with the PingLossPercent datapoint. To trigger an alert, we can change its threshold to “>=0”. This sends an alert when the ping loss is greater than or equal to 0, which is always the case. Doing so is quite easy too. Click on the pencil icon in the PingLossPercent row. Click on the + sign, as this will create an instance-level threshold. Then set the value to 0 for critical. You should receive the alerts quite soon after. Once you have received the alerts and verified it's all working, remember to remove the threshold, as you don't want to get flooded with alerts. I hope this article has provided you with sufficient information on how to set up, test and trigger alerts.
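The rule ordering and group matching described in this article can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LogicMonitor's implementation: the `AlertRule` class, the `first_matching_rule` function, the datasource names and the chain names are all hypothetical, and Python's `fnmatch` stands in for whatever glob matching the product actually uses.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class AlertRule:
    """Hypothetical model of an Alert Rule; not a LogicMonitor API object."""
    priority: int           # lower numbers are evaluated first
    group: str              # group path pattern, e.g. "Infrastructure/Critical/*"
    datasource: str         # datasource name pattern, "*" matches anything
    escalation_chain: str   # name of the chain the rule hands the alert to


def first_matching_rule(rules: List[AlertRule], device_group: str,
                        datasource: str) -> Optional[AlertRule]:
    """Return the lowest-priority-number rule whose patterns both match."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(device_group, rule.group) and fnmatchcase(datasource, rule.datasource):
            return rule
    return None


# Most specific rules get the lowest numbers; the catch-all comes last.
RULES = [
    AlertRule(100, "*", "*", "sysadmin-team"),
    AlertRule(10, "Infrastructure/Critical/*", "Interface*", "network-team"),
    AlertRule(20, "*", "Exchange*", "messaging-team"),
]

WAN_GROUP = "Infrastructure/Critical/Networking/Routers/WAN"

# Without the trailing wildcard the pattern does not reach the subtree.
NO_WILDCARD = [AlertRule(10, "Infrastructure/Critical/", "*", "network-team")]
```

With these sample rules, an interface alert from a device in `WAN_GROUP` lands on the priority 10 rule, an Exchange alert from any group lands on priority 20, and everything else falls through to the catch-all. The `NO_WILDCARD` list returns no match for `WAN_GROUP`, which mirrors the “Infrastructure/Critical/” pitfall described above.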
  9. Please can you make the "No Alerts" check mark icon fill at least 90% of the widget height for improved visibility at a distance. Otherwise it looks like a widget is not showing anything.
  10. It would be good to be able to set AlertRules based on DataSourceInstanceGroups and DataSourceInstanceProperties
  11. Per discussion with Jeff Woeber, I want to submit this as a feature request to LogicMonitor, since each alert threshold within each datasource (e.g. Tomcat ThreadPool- ) can have its own wiki troubleshooting page. It would be a great feature if LogicMonitor enabled users to specify their own troubleshooting page as an optional field for each datasource. Users could then set a specific wiki page as the recommendation whenever an alert is sent to PagerDuty.
  12. After receiving a call about an alert and acking it, I do not need a call to tell me once it is resolved. This is particularly true when 20 things go down: I get 20 calls, I solve the issue, and then I get 20 more calls saying the issue is cleared. Please add the ability to disable calls for cleared alerts.
  13. We have regular event log entries which on their own are not significant; in bulk, however, they become a problem. It would be beneficial to have a mechanism whereby I can tell LogicMonitor to alert me differently if it detects a certain number of a specific alert in a time window, e.g. trigger escalation chain '500Errors' when we see >100 errors of event ID "2070" in the last 1h.
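The kind of check requested here can be sketched as a sliding-window counter. This is a minimal sketch, not an existing LogicMonitor feature: the `WindowedEventCounter` class and its parameters are hypothetical, timestamps are assumed to arrive in non-decreasing order, and the intent is assumed to be "fire once the count in the window exceeds the threshold".

```python
from collections import deque


class WindowedEventCounter:
    """Hypothetical helper: counts events seen within a trailing time window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()  # event times still inside the window

    def record(self, timestamp: float) -> bool:
        """Record one event; return True if the window now exceeds the threshold.

        Assumes timestamps are passed in non-decreasing order.
        """
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


# Small numbers for readability; the post's example would be
# WindowedEventCounter(threshold=100, window_seconds=3600).
counter = WindowedEventCounter(threshold=3, window_seconds=3600)
results = [counter.record(t) for t in (0, 10, 20, 30, 4000)]
```

In the sample run, the fourth event pushes the count past the threshold, and the event at 4000 seconds arrives after the earlier ones have aged out, so the check goes quiet again. A separate escalation chain could be selected whenever `record` returns `True`.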
  14. Alert history is currently limited to only 30 days of data. This severely handicaps identification of false positives/negatives, the understanding of alert history and its implications across the organization, and severely limits the value of reports. Please increase alert and event retention to match the "Data Retention" length explicitly stated on LogicMonitor's official pricing page for a given plan. Thank you!
  15. We have a datasource (in this case on an ESX host) that has several ESX virtual machines in it. We have painfully separated the VM instances into two different instance groups, 'dev' and 'prod'; this allows us to easily disable alerting for an instance group. But what we really want to do is set up two different alerting schemes. For any alert coming from the 'prod' instance group I want to blast out emails and escalate, while for all alerts from the 'dev' instance group simply displaying them in LogicMonitor is enough. I have looked at the pages on using globs for groups, but that appears to work only for device groups; I have been unable to use the same logic to apply an alert rule to just a single instance group. Without this, I wonder what the point of instance groups is. Why would I want to group my instances if I can't act on those groups independently? Visual Example:
     Device: vSphere
       Group: Data Center 1
         ESX Virtual Machine:
           Instance Group: Prod
             VM: A
             VM: B
           Instance Group: Dev
             VM: X
             VM: Y
             VM: Z
  16. While viewing a list of Alerts, I would like the Value field to be clickable / sortable, exactly the same as all of the other columns in the Alert list. I was surprised when I discovered this field could not be sorted. Use Case 1 - I am looking at Alerts for Disk Volume usage. I want to sort on value to focus on the volumes with the least free space (or the most used space). Use Case 2 - I am looking at Alerts for CPU utilization. I want to sort on value to focus on the processors with the highest utilization. Thank You. Todd
  17. Today we were flooded with hundreds of alerts in our alert dashboard. AWS was having an issue in the ap-southeast-1 region with launching new instances. The "AWS Service Health" datasource found this issue and then alerted on it for each instance & ebs volume we had in that region. That was too many alerts, especially since the issue wasn't with our existing devices in that region. I would like this alert to happen on the AWS Device Group itself - per region, so that we can know about it, but it won't generate an excessive number of alerts with the same exact information.
  18. Hi! Is there a chance that in the near future alerts can go in an email along with a png or jpeg of the moment the alert was triggered? Nowadays we're only getting an external link, which is kind of a pain to check over a phone or something. Thanks! from Mercadolibre S.R.L.
  19. I have at least one case where it would be handy to use an external integration in a filter-only mode. The idea is we could pass the tokens into the integration, and the integration would pass back a formatted message suitable for inclusion in an alert. In my case, the goal is to build a more powerful alert templating tool for the standard email method rather than requiring the integration also deliver alerts. One major result of this is that the integration server reliability becomes less critical -- if it is not responding, the default presentation could be used instead, for example. Hopefully that makes sense, please let me know if I can answer any questions. Thanks, Mark
  20. Scenario: Alerts are acknowledged or SDT'd either via email or from within LM web interface but no description or identification is entered when ack'd/SDT'd. Some alerts are long term due to the nature of the issue but could go un-noticed and cause real problems if incorrectly ack'd. Determining who ack'd an alert and why would be paramount in Root Cause Analysis. Query: How can we determine which user acknowledged or SDT'd an alert from within LM? Email ack/SDT responses show the user account, but LM does not and not all auditing parties would necessarily receive the emails sent when someone acks an alert. Request: Add a small field to the Alert Page when clicking on an alert to display the username or email address of the user who performed the action. Either that or auto-append said username/email address to the Notes field. Forgive my mspaint skills.
  21. Hi, It appears that all alerts generated from eventsources are cleared in 60 mins by default. We had the setup of LogicMonitor -> PagerDuty but the 'cleared' alerts didn't get cascade cleared in PagerDuty. Here are the examples: 1. 2. Pls verify and see whether there is a quick way to add this feature to cascade clear in PagerDuty. Thanks! Horace
  22. Hello, I recently had a case where we were trying to find out if there was a way to disable alerting on specific instances during Active Discovery. Currently we have instances discovered via SNMP for which we do not want alerts enabled, since some of the networks we are monitoring are test networks. To turn these alerts off, we have to manually find the instances and turn off alerting from there. For LogicMonitor Support, the case number was 54822.
  23. Hi, I had a discussion with John Matthews this morning and would like to raise this as a feature request. Let's assume a Linux VM is monitored by LogicMonitor via SNMP. If its disk space usage exceeds 90%, I (the IT administrator) want to issue an SNMP set query to execute a script that either troubleshoots that VM or runs some clean-up routines. As of today, LogicMonitor only allows users to post the alert to an external system ( Is there any chance this feature request can be enabled in a future LM release? Thanks & Best Regards, Horace
  24. On the Alerts page it sometimes gets hard to scroll left/right with long lists of alerts. Currently the scrollbar is at the bottom of the page and it is static. FR: Please make a persistent horizontal scrollbar that is separated off the main table.
  25. Hi, I've discussed this with John (in Austin) from the support team. The 'value' & 'datapoint' fields are not currently displayed in the alert details window. It would be nice if end users could see these two fields in the alert details window (i.e. a lot more intuitive). Thanks & Best Regards, Horace