Leaderboard


Popular Content

Showing content with the highest reputation since 01/18/2018 in all areas

  1. 5 points
    I would love to see LM implement a new feature for taking a built-in, self-prescribed action on an alert. To minimize any exposure LM might have if an action goes awry, the actions could occur as the result of a script uploaded into the Escalation Chain. Ideally you could define multiple actions, or multiple retries of an action, and whether they run before or after the recipient notification in the notification chain. This would allow very basic alerts (disk, service restarts, etc.) to be resolved programmatically. Supporting various scripting languages and tools such as PowerCLI and Ansible would also open up some very creative ways to integrate with solutions like VMware or Ansible Tower, letting more expert folks craft very complex actions.
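    For illustration only, a minimal Groovy sketch of the kind of remediation script that might be uploaded to an escalation chain; the service name is a placeholder and this is not an existing LM feature:

        // hypothetical remediation action: restart a stalled Windows service
        def sout = new StringBuilder(), serr = new StringBuilder()
        def service = "Spooler"   // placeholder service name
        def proc = ["powershell", "-Command", "Restart-Service -Name ${service}"].execute()
        proc.consumeProcessOutput(sout, serr)
        proc.waitForOrKill(60 * 1000)   // don't let the remediation hang forever
        println(proc.exitValue() == 0 ? "restarted ${service}" : "restart failed: ${serr}")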
  2. 5 points
    When there is a legitimate reason for disabling alerts for a device, it would be very useful to be able to leave a note as to why (and by whom). This would prevent confusion with teams, where the case of "why would this be disabled" would come up frequently. For example, there is a known bug with a certain version combination of ESXi and HPE servers that triggers a false-positive hardware alert internally, so we disable alerts for that instance on servers that meet the criteria as we encounter them. Or, some QNAPs will give false-positive alerts that their disk is full when in fact it is "full" due to a RAIN configured as a LUN (we thus rely on the server alerting when the iSCSI volume is actually full). However, another technician may log in and flip alerting for these instances back on, assuming it was a mistake or something, and then we would get flooded with these false-positive alerts, prompting technicians to look into them; as you can see, this causes a loop of wasted time. Simply putting a note associated with the "Alerting Off / On" switch and tagging it with the user invoking it would easily solve issues like this. Something like what is shown for Acknowledgements would be adequate. Perhaps even an admin option to require a note or not?
  3. 4 points
    We've recently run into issues with users accidentally changing a setting or deleting a device and would like the ability to allow users to Create new devices, but not be able to delete anything or change alert settings. I'd like to either split Manage into Write/Delete groups or add a deny action role that would allow me to give users manage access with a deny delete:*
  4. 4 points
    As we move towards a DevOps model, we increasingly have a need for small teams to have full admin access to the tools they use to manage their IT services. When it comes to LogicMonitor, this has proven difficult with the existing role permission model. DevOps pods would like to be able to manage their own datasources, alerts, and escalation chains but this isn't possible unless we give them broad access rights to those areas, which could cause significant harm to other groups of monitoring users. For example, an inexperienced DevOps user could inadvertently modify a datasource that applies to all Windows devices or they could create an alert rule that causes alerts not to be delivered to other users. To solve this problem, I'd propose that LogicMonitor offer alert groups, escalation chain groups, along with the existing datasource groups. Then, LogicMonitor could provide the ability to restrict roles to manage these specific groups. DevOps pods could be given the ability to manage their own custom subset of datasources and set up their own alerts in a rule range after the main set of rules.
  5. 3 points
    We have a multi-tenant MSP environment, and we find that we cannot reuse the same display name across our clients even though the similarly named systems are in different child folders. So 'DC01' for a Domain Controller has to be unique across all of our clients. Please consider changing this.
  6. 3 points
    We use the Debug Console a lot, and usually to debug an issue on a particular device, which means more often than not, we access the Debug Console from the device in question's Raw Data view. We would like for there to be a token (##THIS## or ##DEVICE##) that could be used with Debug Console commands. So if we were to issue the command !ping ##THIS##, the console would automatically interpolate ##THIS## with the IP address of the device on which we opened the Debug Console. It's a small productivity gain, but when you're doing this a lot it makes a big difference.
  7. 3 points
    I'm new to LogicMonitor. I think LM has done a pretty good job on their monitoring tool. I love how we can manage our network devices through SSH remote access from the management console. Without LM, we would have to VPN to our internal network and then SSH to the network devices from there. One downside of LM is that we can't manage network devices through HTTPS (GUI). Nowadays, new technologies like firewalls are managed over HTTPS/SSH. I would love to have HTTPS remote access right from the management console; it would be much easier and faster to get to the GUI. I have tested Auvik's monitoring software with HTTPS access, and it was nice and smooth. This would be a nice feature for LM to provide. Thanks, Pao
  8. 3 points
    Don't know if anyone else noticed, but MS released a pretty slick script that enables WMI access remotely without admin rights. I have done a brief test with LM and it seems to be working well. https://blogs.technet.microsoft.com/askpfeplat/2018/04/30/delegate-wmi-access-to-domain-controllers/ That's the article. I created an AD group instead of a user to delegate, and I put the LM collector service in that group. Everything else I've followed as documented. I haven't tested anything else, but this alone is a huge step in the right direction.
  9. 3 points
    Please add the option to alert on the "no data" condition to the instance-level Alert Tuning configuration dialog. We don't want to generate "no data" alerts for everything, and we don't want to split the datasources (extra maintenance when updating), so it would be easier to have this as an instance-level override.
  10. 3 points
    Hi All, We have really been enjoying the Remote Management feature of LogicMonitor. For sites that we don't have a direct interconnect with, it's great being able to quickly SSH onto our devices to make adjustments or check config without having to open a separate VPN tunnel. However, with HTTP/HTTPS management becoming common on firewalls, controllers, routers, etc., I feel there is a huge opportunity for LogicMonitor to fit almost every management use case by implementing HTTP/HTTPS remote session functionality in the same way RDP and SSH remote sessions work. We as a company would primarily use this feature to help manage networking equipment, but the functionality would extend to printers, IP cameras, security systems, phone systems, UPSes and many more. Let me know your thoughts, Thanks, Will.
  11. 3 points
    There are currently far too many opportunities to commit errors in LM from which it is difficult to recover, since there is no version tracking. Ideally, it would be possible to revert to a previous version of any object, but especially very sensitive objects like LogicModules, alert policies, etc. I have created my own method of dealing with this, which leverages the API to regularly store JSON dumps of all critical elements, with changes committed via git (certain adjustments to the original results are needed to avoid a constant update stream). Recovery would be very manual, but at least possible. This would be far more useful within the system itself. Thanks, Mark
  12. 3 points
    I would appreciate it if datapoints marked for alerts on No Data were indicated in the Alert Tuning page with the designated alert level displayed. Right now, to know this, you have to dive into the datasource definition to find out. Thanks, Mark
  13. 2 points
    Allow devices to be dependent on one another. If a router goes down, the switch behind it will most likely go down or have an error as well.
  14. 2 points
    It would be great for the Map widget to be able to display websites. We monitor over 2000 devices, and it would be good to be able to display geographical outages of our devices.
  15. 2 points
    I'd love to be able to record the actual status code that a website check returns, so that we can easily look at the raw data and know that we're getting a 200, 404, 502, etc and pass that directly to an alert, instead of having to figure out what an "8" or whatever means. This would be especially handy when sending alerts to teams that don't spend time in LogicMonitor regularly.
  16. 2 points
    It would be nice to have an export button on any alert table. When we're doing research on an issue and we've finally narrowed the criteria to see the info we need, it helps to have an export button right there, rather than having to go to reports and reconfigure all the parameters to hopefully get the same data.
  17. 2 points
    Hi, with Meraki enabling webhooks, can LogicMonitor receive alerts for any of the events you enable on the dashboard? https://meraki.cisco.com/blog/2018/10/real-time-alerting-with-webhooks/ Additionally, is it any different to poll Meraki devices directly versus receiving information from the dashboard?
  18. 2 points
    Most often, when people export to a csv or excel format their intent is to receive table data in a tabular format because they're going to pivot it out, chart it, or conduct some sort of analytics/BI function. It would be nice if your csv exports didn't require manipulation of the data to remove erroneous data/whitespace for consumability as a table datasource. This is specifically a problem in Website Overview reports.
  19. 2 points
    So I know this is an old thread, and the above community locator isn't really needed now that the EIGRP peers datasource is in the LM repo... but I thought I'd post here in case someone else hits a similar issue and wants a fix. The built-in Groovy discovery script was using hex addresses for the peer addresses in our instances, and it was a pain to decipher them every time one went down. So I added a hex-to-decimal conversion to the discovery script so that the instance names look like "172.20.0.50" instead of "AC:14:00:32". Here is the updated block of the discovery script:

        // get all IP addresses of connected peers.
        peerAddr_walk.eachLine { line ->
            regex_pattern = "(${peerAddr_OID})" + /.(.*)\s=\s(.*)/;
            regex_match = ~regex_pattern
            line_match = regex_match.matcher(line)
            handle = line_match[0][2]
            val = line_match[0][3]
            // val here may be SNMP data in the format "Hex-STRING: AC 14 00 32" and arrive as
            // AC:14:00:32 instead of an IP address in the instance name
            if (val.contains(":")) {
                tempaddr = val.split(":")
                def newaddr = []
                tempaddr.each { hexint ->
                    newaddr.add(Integer.parseInt(hexint, 16))
                }
                val = newaddr.join(".")
            }
        }

    Hope it helps someone else.
  20. 2 points
    The alert "template" system has no way to reference another instance datapoint value currently. Why would this be good? If you have a datapoint that alerts for status, you want to insert the value of the thing you care about, not the status value. I see the contortions LM datasource developers have gone through to workaround this (e.g., Cisco Nexus Temperature). Please make it possible to reference datapoints at least within the same datasource within alert templates. There are many other issues with alerts, but I will stop here for now :).
  21. 2 points
    I think it would be great if you added some threading headers to your alert emails. This would help mail programs group each alert and its cleared message into a conversation. Right now we only get separate messages:

        LMD... critical - Host1 Ping PingLossPercent
        LMD... critical - Host2 Ping PingLossPercent
        LMD... ***CLEARED***critical - Host2 Ping PingLossPercent
        LMD... ***CLEARED***critical - Host1 Ping PingLossPercent

    In my opinion, it would be better if the messages formed a conversation per alert:

        LMD... ***CLEARED***critical - Host1 Ping PingLossPercent
        LMD... critical - Host1 Ping PingLossPercent
        LMD... ***CLEARED***critical - Host2 Ping PingLossPercent
        LMD... critical - Host2 Ping PingLossPercent

    As far as I know, the relevant header is Thread-Index: https://excelesquire.wordpress.com/2014/10/17/use-excel-to-count-the-number-of-emails-in-each-email-chain/ https://stackoverflow.com/questions/5506585/how-to-code-for-grouping-email-in-conversations
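    To illustrate the idea (this is not how LM actually builds its mail, just the standard JavaMail threading headers with placeholder values):

        import javax.mail.Session
        import javax.mail.internet.InternetAddress
        import javax.mail.internet.MimeMessage

        def session = Session.getDefaultInstance(new Properties())

        // derive a stable Message-ID per alert so the cleared mail can reference it
        def alertId = "LMD2375"                        // placeholder alert id
        def alertMsgId = "<${alertId}@alerts.example>" // placeholder domain

        def cleared = new MimeMessage(session)
        cleared.setFrom(new InternetAddress("alerts@example.com"))
        cleared.setSubject("***CLEARED*** critical - Host1 Ping PingLossPercent")
        // standard threading headers; most mail clients group conversations on these
        cleared.setHeader("In-Reply-To", alertMsgId)
        cleared.setHeader("References", alertMsgId)
        cleared.setText("Alert ${alertId} has cleared.")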
  22. 2 points
    Please add the name of the collector to the debug console screen. The ID in the URL is not very user friendly.
  23. 2 points
    Hi, Grape is a dependency manager for Groovy scripts. I know LogicMonitor is working on adding a screen for custom JARs that scripts can use. An even more forward-thinking approach would be to support Grape. It actually partially works right now, but it seems there are still a few errors that would need to be worked out. The benefit would be that DevOps teams could develop Groovy scripts more freely and deploy them to larger environments without the manual process of importing custom JARs, which simply does not scale. In my opinion, it's one of the few remaining barriers to LogicMonitor's extensibility and scalability, and it's very close to working. Please investigate and support it.
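    For example, Grape lets a script declare its own dependency inline via the @Grab annotation (library chosen here purely for illustration):

        // resolved from a Maven repository at runtime instead of a hand-copied JAR
        @Grab(group='org.apache.commons', module='commons-lang3', version='3.7')
        import org.apache.commons.lang3.StringUtils

        println StringUtils.capitalize("grape resolved this dependency at runtime")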
  24. 2 points
    While monitoring of websites from pre-defined LogicMonitor servers does provide a good bit of insight, it doesn't really give the full flavor of what end users are experiencing along with insight as to performance issues with specific browsers and versions. What I'd love to see from LogicMonitor is extension of functionality to include APM collection of metrics from the end user's browser itself (this is accomplished using cookies by other monitoring suites.) Further, the metrics should be used to calculate the Apdex (http://apdex.org/members.html) score for the health of the site.
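    For reference, the standard Apdex calculation is simple enough to sketch in Groovy (the threshold and response times below are made up):

        // Apdex at threshold t: samples <= t are "satisfied", samples <= 4t are "tolerating",
        // the rest are "frustrated". Apdex = (satisfied + tolerating / 2) / total samples.
        def apdex(List<Double> responseTimesSec, double t) {
            def satisfied  = responseTimesSec.count { it <= t }
            def tolerating = responseTimesSec.count { it > t && it <= 4 * t }
            return (satisfied + tolerating / 2) / responseTimesSec.size()
        }

        def times = [0.2, 0.4, 1.1, 3.0, 9.0]   // example response times in seconds
        println apdex(times, 0.5)               // 2 satisfied, 1 tolerating -> (2 + 0.5) / 5 = 0.5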
  25. 2 points
    It does seem like Sarah's idea would work; you just have to aggregate results across all devices to find datasources that have no associated device instances (except it looks like instanceNumber is the field). That said, sometimes the indication of no instances found for a DS is itself a problem, and this is hard to know within LM currently. I have a script I am working on that will evaluate what LM discovers against a template, specified by a property. This will allow detection of faulty AD logic or problems on the device itself. For example, a switch should discover interfaces, preferably all of them, but at least one. If for some reason SNMP is busted (happens due to platform issues, or in LM when a device produces too many results and discovery times out due to lack of bulkwalk support), I would want to know that and not be surprised later when the data is needed and not available. I wrote this Perl snippet for you, which shows the instance count for each device/datasource pair, and a final report of any datasource with no instances:

        my $lmapi = WM::LMAPI->new(company => $COMPANY) or die;

        my %foundanyinstances;
        if (my $devices = $lmapi->get_all(path => "/device/devices", fields => "id,displayName")) {
            for my $device (@$devices) {
                if (my $devicedatasources = $lmapi->get_all(path => "/device/devices/$device->{id}/devicedatasources", fields => "id,dataSourceName,instanceNumber")) {
                    for my $devicedatasource (@$devicedatasources) {
                        my $dsname = $devicedatasource->{dataSourceName};
                        $foundanyinstances{$dsname} = 0 if not exists $foundanyinstances{$dsname};
                        my $numinstances = $devicedatasource->{instanceNumber};
                        $foundanyinstances{$dsname} = 1 if $numinstances > 0;
                        next if defined $MININSTANCES and $numinstances < $MININSTANCES;
                        next if defined $MAXINSTANCES and $numinstances > $MAXINSTANCES;
                        printf "%s: %s: %d\n", $device->{displayName}, $devicedatasource->{dataSourceName}, $numinstances;
                    }
                }
            }
        }
        for my $dsname (keys %foundanyinstances) {
            print "$dsname: NO INSTANCES\n" if $foundanyinstances{$dsname} == 0;
        }
  26. 2 points
    We are a global company with resources in Minnesota, New Jersey, Australia, Ukraine, and India all using the Logic Monitor tool set. It would be incredibly useful to be able to set the timezone at a user level instead of only at the company level.
  27. 2 points
    It would be great to have the granularity to change the Alert Trigger Interval on an instance in the same way a threshold can be modified on an instance. Example: disk usage on a NetApp. There are 100 volumes, but one of them needs to be over the threshold for a longer amount of time before it is a concern. The archive log volume will fill up no matter what space is allocated, but it is not a concern as long as it is back below the threshold within an hour. A custom datasource will not work for this, because we would like to alert immediately on all the other volumes on this device when they go above the threshold.
  28. 2 points
    A workaround using the !groovy command:

        def sout = new StringBuilder();
        def serr = new StringBuilder();
        // Shell command for your OS
        def proc = 'cmd.exe /c tracert www.google.com'.execute();
        proc.consumeProcessOutput(sout, serr);
        // Adjust timeout as needed
        proc.waitForOrKill(20 * 1000);
        println "out> $sout err> $serr"
  29. 2 points
    We would like to be able to query historic SDTs via the API, to determine whether a device, group, or instance has recently come out of SDT.
  30. 2 points
    WALDXL - Download Speed. This datasource runs a PowerShell script that downloads a 10MB file and then figures out the speed in Mbps at which it was downloaded. CAUTION: This datasource will download a 10MB file for every Windows machine matched by the "applies to" field (default is not applied), every poll (default is 20 minutes); depending on your environment, this could raise your monthly ISP bill, specifically if your ISP rates ramp up with usage. I would recommend applying this to: hasCategory("speed") and isWindows(). Then of course you just need to add the system property "speed" to any Windows machine you want to monitor download speed on.
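    The actual datasource uses PowerShell; purely to illustrate the measurement idea, here is a rough Groovy equivalent (the URL is a placeholder):

        // time the download of a fixed-size file and convert bytes/second to Mbps
        def start = System.nanoTime()
        def data = new URL("http://speedtest.example.com/10MB.bin").bytes   // placeholder URL
        def seconds = (System.nanoTime() - start) / 1000000000
        def mbps = (data.length * 8) / (seconds * 1000000)
        println "DownloadSpeedMbps=${mbps}"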
  31. 2 points
    Let's start with an example: I have a router with many ports, and I want to put some of those ports into SDT. I have two options: put each port into SDT one by one, or put SDT on the whole group and then remove some ports from it. The current logic is: choose a device, then apply what you want to do to it. It would be good to also have it the other way around (e.g., clicking through a tree structure): first I choose what I want to do, and then I select where to apply it. It could speed up some tasks...
  32. 2 points
    We would like the ability to generate alerts on events in the vSphere event log. I really can't believe this isn't in place already.
  33. 2 points
    There currently doesn't appear to be a way to create an EventSource through the REST API. This functionality would be very useful as part of a CI/CD pipeline.
  34. 2 points
    Hi, I just got done working with VMware support on an issue where our ESXi 6.5 hostd process would crash during the boot phase. We eventually traced it back to a bug in some vSAN code that LM monitoring polls; it doesn't matter whether you're running vSAN in your environment or not. Our workaround has been to disable host-level monitoring in LM for our ESXi hosts for now, and it's been stable ever since. The fix is expected from VMware in a Q3 2018 release.
  35. 2 points
    Amazon Web Services (AWS) announced a free time synchronization service at re:Invent today. The announcement is at https://aws.amazon.com/blogs/aws/keeping-time-with-amazon-time-sync-service/. Interestingly, they recommend that people uninstall NTP and install chrony instead. I tried this on a local Linux host and got the following error from LM:

        Alert Message: LMD2375 warn - 127.0.0.1 NTP ntpqNotFound
        ID: LMD2375
        The ntpq binary was not found on the agent monitoring 127.0.0.1. Please install the ntpq binary (typically by yum install ntp, or apt-get install ntp)

    If LM customers running in AWS follow the recommendation, their NTP monitoring will fail. Could / will LM support chrony in addition to ntpq?
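    A minimal Groovy sketch of what a chrony-aware check could look like; the "Last offset" field name is assumed from typical chronyc tracking output, not taken from an existing LM datasource:

        // fall back to chronyc when ntpq is not installed
        def out = "chronyc tracking".execute().text
        def offsetLine = out.readLines().find { it.startsWith("Last offset") }
        if (offsetLine) {
            // line looks roughly like: "Last offset     : +0.000001270 seconds"
            def seconds = offsetLine.split(":")[1].trim().split(/\s+/)[0] as BigDecimal
            println "ChronyOffsetSeconds=${seconds}"
        } else {
            println "chronyc output did not contain a 'Last offset' line"
        }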
  36. 2 points
    NOC is an acronym for Network Operations Center. Heads-up monitors are typically called NOC monitors, so you can see why calling the widget a NOC widget is useful for those types of dashboards.
  37. 2 points
    After spending some time trying to display instance level auto properties on an Alert widget on a dashboard, it has just been confirmed to me by Dave Lee that it is not possible and there is currently no way to display instance level properties on a dashboard. Since auto properties can hold very useful descriptive information detected during active discovery it would be nice to be able to include this on dashboards as additional information.
  38. 2 points
    Please add a token for alert notes so that we can include an alert's notes in the email notification when an alert clears.
  39. 2 points
    Hey Joe, I'm not sure this exactly meets your needs, but I think it's a good start. Basically, you can call the hostProps.toProperties() method, which gives you a collection of key=value entries that you can dig through and filter using a regex. Something like this:

        def allProps = hostProps.toProperties()
        allProps.each {
            if (it ==~ /.*\.databases=.*/) {
                println it
            }
        }

    Let me know if this doesn't address what you're trying to accomplish.
  40. 2 points
    We have a use case to show "Response Times" from a subset of configured Websites. Ideally I'd like this to be in the Big Number widget. We also want to be able to chart a subset of our Websites' response times over time in the Chart widget. Has anyone found a useful workaround to achieve this? Would LM consider "upgrading" widgets to allow the presentation of Website data? Currently only the SLA widget seems capable of handling Website data.
  41. 2 points
    Hello LM Team, It would be great if the NOC widget in Dashboards had the option to filter out inactive devices and show only the ones that actually have alerting enabled. For example, we have a vCenter with 1500 virtual machines, but only about 10% of them need to be monitored at all times, so alerting is disabled by default for all VMs from that vCenter except the ones we actually need. Unfortunately, the NOC widget will show us everything from that datasource, which is problematic. Thank you.
  42. 2 points
    I would love to get my devs more involved in creating LogicModules and incorporating them into a pipeline. Since LogicModules can be expressed in either pure XML or JSON, I would love to give them documentation on constructing LogicModules natively in these structured data formats.
  43. 2 points
    I see a need for alerting on deviation from a rolling average.

    Example 1: Temperature alerting for hardware is based on a fixed baseline (default or manually adjusted) or on a fixed delta. In real-world use it would make a lot more sense to alert on deviation from a 5-day or 30-day rolling average temperature for the box. The reason: units alarm on weekends because the office shuts off the AC during the summer, or they alert during the week 9-5 because in the winter the offices crank the heat. All of this ignores the nuance of the range and average expected for the location; the alerting should just be about how far outside the average range for the site we are. My Nashville facility hovers from 56 to 59 all week. I have the threshold set at 57, so I get alerts at least once a weekend. I could move it to 59, but that's a band-aid. The real solution would be to have the software track the last 30 days and alert when we're outside the norm for that location. Furthermore, with hardware it is not the specific temperature that kills it, it's the rate at which the temperature changes. So the alerts should be based on the average range the system has seen in the last 30 days, and fire only when the rate of change accelerates; but I imagine that request would be more challenging to reduce to an algorithm.

    Example 2: Ping times. I have sites where the latency range is extreme (Mumbai, Johannesburg, Taipei, etc.). I wish ping monitoring would track the 30-day range and common deviation from the norm, and alert when a site sees latency way outside the expected fluctuation range: 30 ms typical 90% of the time, plus 200-500 ms spikes 10% of the time. When ping times hit 300 ms for more than 10% of the last hour of sampling, send a warning to flag the change in trend, rather than alerting on a fixed threshold against the immediate sample.
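    Purely to illustrate the kind of check being requested (not an existing LM feature), a small Groovy sketch that flags a sample falling outside a trailing window's norm:

        // flag a new sample that is more than nSigma standard deviations
        // away from the mean of a trailing window of prior samples
        def deviatesFromNorm(List history, double latest, double nSigma = 3.0) {
            if (history.size() < 2) return false
            def mean = history.sum() / history.size()
            def variance = history.collect { (it - mean) * (it - mean) }.sum() / (history.size() - 1)
            def stddev = Math.sqrt(variance as double)
            return Math.abs(latest - mean) > nSigma * stddev
        }

        // e.g. 30 days of hourly temperature readings hovering between 56 and 59
        def window = (1..720).collect { 56 + new Random(it).nextDouble() * 3 }
        println deviatesFromNorm(window, 58.5)   // false: within the usual range
        println deviatesFromNorm(window, 75.0)   // true: well outside the trailing norm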
  44. 2 points
    As we expand past 2600+ devices and have to manually spread and adjust device counts over multiple collectors, a nice option for using our collector resources better would be the ability to set the "Preferred Collector" for a device or group to a "Collector Group"; during device creation, the LogicMonitor backend would then assign each new device to the collector within that group with the lowest device count. It's a rough way to balance collectors, but that's how we're doing it manually today, and while not perfect it has worked very well to keep the collectors evenly utilized without causing collector task count queuing, "unavailable to schedule" errors, or failed active discoveries. Eric Feldhusen
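    A sketch of the selection logic only; fetching the collector list from the API is left out, and the numberOfHosts field name is an assumption about what the collector records would expose:

        // hypothetical: pick the least-loaded collector in the preferred collector group
        def collectorsInGroup = [
            [id: 11, numberOfHosts: 412],
            [id: 12, numberOfHosts: 388],
            [id: 13, numberOfHosts: 401]
        ]
        def target = collectorsInGroup.min { it.numberOfHosts }
        println "Assign new device to collector ${target.id}"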
  45. 2 points
    Please add the ability to select the first, second, third, or fourth weekend or day of the month. For example, we need to be able to configure:

        o Third weekend of the month, 0200 to 0400, meaning both the third Saturday and Sunday of the month
        o Second Tuesday of the month, 0400 to 0600

    And so on. At the moment we would have to manually add multiple entries and keep updating them.
  46. 2 points
    Good Afternoon. Would it be possible to set up some sort of color/theme options for the portal? My biggest issue is having to triple check when I am in our production environment vs our sandbox. Sure, I can look at the url - just wanted to suggest that there would be some more options to customize the color scheme either at a user level or to differentiate between production and sandbox. My suggestion from support was to create a new logo, which I can and will do temporarily. Thanks!
  47. 2 points
    Since scheduled reports are the only way to deliver results via email, please add a "Run Now" button to trigger as though the scheduled time arrived. This comes up in many situations. For example, a new monthly report scheduled after the normal date or a change to an existing report you want to run, both with the delivery options used for scheduled mode. I have to hack this now by setting the date a bit ahead of now, then back. Much easier to use the computer for what it is good at instead of fighting with it :). Thanks, Mark
  48. 2 points
    So I am currently having an issue with my NOC and their love of putting alerts into SDT to stop them from escalating to our alert aggregator (PagerDuty). I believe the privilege for applying SDTs is tied to the acknowledgement privilege; I still want them to be able to acknowledge and leave notes, just not set SDTs. Making the ability to apply SDTs a separate permission, like acknowledging alerts, would fix this issue.
  49. 2 points
    4 months later, any news? I have over 800 changed data sources in the LM repository and it would be a pain to handle in the current UI
  50. 1 point
    Hi @Ali Holmes My priority is item 1 on your list. We can live with administration, configuration and time stamps in alert messages being in UTC, but having local times in the presentation layer would address most of the complaints I receive from my users.