Mike Moniz

Members
  • Content Count

    111
  • Joined

  • Last visited

  • Days Won

    18

Community Reputation

24 Excellent

1 Follower

About Mike Moniz

  • Rank
    Community All Star


  1. Hmm, how would you then reset $alerthasbeenraised back to false once it is set to true? If it's once the alert has cleared, then that is how the system works now. If you never reset it, the same alert will never occur again, even if the same problem happens months later. If it's after the alert has been clear in the system for X number of checks, then you have just re-implemented Alert Clear Interval (see the sketch below this item). You might want to look at the AIOps features like Dynamic Thresholds, which seem to be more of LogicMonitor's focus for limiting flapping, or check whether the ticketing system itself can handle auto-merging tickets or the like.

     "Sounds like you're saying that keeping an alert from clearing will keep a new ticket from being created, yet..." Exactly: if you don't clear the alert until the cause of the alert is really fixed, it will not create extra tickets, because there are no new alert instances to create tickets for. So prevent the flapping from occurring rather than dealing with it afterwards. LM will not send another Active message until after the alert clears, and by "clears" I mean it is no longer active in the system, not that it sends a clear message. If you're still not sure what I mean, perhaps you can let me know how you think/expect the integration works (with an example) so I get a better idea of where the confusion might be.
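     A minimal sketch of the flag idea being discussed (hypothetical pseudologic, not LM internals; $polls and Send-ActiveMessage are made-up names standing in for poll data and "open a ticket via the integration"):

     function Send-ActiveMessage($value) { "Active message for value $value (new ticket)" }
     $polls = 95, 40, 95                  # breach, recovery, breach again months later
     $alertHasBeenRaised = $false
     foreach ($value in $polls) {
         if ($value -gt 90 -and -not $alertHasBeenRaised) {
             Send-ActiveMessage $value
             $alertHasBeenRaised = $true  # never reset...
         }
         # ...so the second breach is silently swallowed. Resetting the flag once the
         # alert has been clear for X checks is just Alert Clear Interval again.
     }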
  2. No problem. So instead of trying to merge multiple alerts into one, you can just make one long alert that doesn't clear until it's really fixed. A DataSource alert would basically work like this, as an example: you have a CPU check with a threshold of > 90 that runs every minute, with "Alert Trigger Interval" configured to 3 and "Alert Clear Interval" to 2.

     1:00pm: CPU is at 40%. No alerts.
     1:01pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 0).
     1:02pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 1).
     1:03pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 2).
     1:04pm: CPU is at 100%. Warning alert created. Active message sent to integration, creating a new ticket. (at 3!)
     1:05pm: CPU is at 100%. Alert ACKed. ACK message sent to integration.
     1:10pm: CPU is at 40%. LM notes it's under threshold but waits for Alert Clear Interval to hit 2 (at 0).
     1:11pm: CPU is at 40%. LM notes it's under threshold but waits for Alert Clear Interval to hit 2 (at 1).
     1:12pm: CPU is at 40%. Alert cleared. Clear message sent to integration. (at 2!)
     1:15pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 0).
     1:16pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 1).
     1:17pm: CPU is at 100%. LM notes it's over threshold but is waiting for Alert Trigger Interval to hit 3 (at 2).
     1:18pm: CPU is at 100%. Warning alert created. Active message sent to integration, creating a new ticket. (at 3!)

     So what is happening here is that the CPU was only OK for 5 minutes before the alert flapped, and LM was told to wait only 3 minutes before it should clear the alert, allowing a new ticket. If you increase Alert Clear Interval to, say, 10, it will wait 11 minutes before clearing the alert, hence treating it as a single alert and not creating a new ticket. So you can customize the "flapping timeout" (for lack of a better name) by changing the Alert Clear Interval; a counter sketch follows this item. One possible problem is that this is a per-DataPoint option and not system wide.

     Now perhaps you would prefer LM to just keep using the same ticket, regardless of the delay between flaps, until the ticket has been resolved. But LogicMonitor does not know the state of the ticket; it doesn't know whether a ticket is still open or not. To do this you would need to handle it within the ticketing system itself or with some special system between the two.

     It's the sending of the Active message that creates a new ticket, not the clear message. LM doesn't track tickets, so it's not going to send Active messages only if it has also sent a clear message. They are all independent, and it's more straightforward: an alert has occurred = send Active message; an alert has escalated = send ACK message; an alert has been ACKed = send ACK message; an alert has cleared = send Clear message. Hope that helps clear things up a bit.
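     A runnable sketch of my reading of the trigger/clear counting above (not LM source; the sample values compress the example timeline, one value per one-minute poll):

     $samples = 40,100,100,100,100,100,40,40,40,100,100,100,100
     $triggerInterval = 3   # consecutive polls over threshold before the alert goes Active
     $clearInterval   = 2   # consecutive polls under threshold before it clears; raise this to ride out flaps
     $over = 0; $under = 0; $alertActive = $false
     foreach ($cpu in $samples) {
         if ($cpu -gt 90) {
             $under = 0
             if (-not $alertActive) {
                 if ($over -ge $triggerInterval) { $alertActive = $true; $over = 0; 'Active message -> new ticket' }
                 else { $over++ }
             }
         } else {
             $over = 0
             if ($alertActive) {
                 if ($under -ge $clearInterval) { $alertActive = $false; $under = 0; 'Clear message' }
                 else { $under++ }
             }
         }
     }
     # With clearInterval = 2 this prints two Active messages (two tickets); with a larger
     # clearInterval the dip is too short to clear, so only the first ticket is created.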
  3. What do you mean by status flaps? Do you mean a datapoint that alerts, then clears, then alerts, then clears, etc., or the literal "StatusFlap" datapoint (de)escalating that some interface checks have? I can't speak for the ConnectWise integration itself since I don't use it, but most of the integrations (without client-side addons) work the same way: I would expect the former to cause a new ticket each time but the latter to update the existing one. An alert that occurs triggers the "Active" action in the integration, which usually creates a new ticket. LM doesn't consider how long an alert has been cleared before it sends an Active message for a re-occurring alert; it doesn't matter if the alert cleared 1 minute ago or 3 years ago before it occurred again. I believe the point is to change the Alert Clear Interval to make sure the condition has really cleared and is stable before clearing the alert.
  4. None that I know of; I typically do want instances to auto-open. Perhaps some browser JavaScript-modifier plugin could block the onClick events? No idea. When I need to go down a list of items manually that auto-open, I typically start from the bottom and work my way up, so that even if one opens it doesn't get in the way of clicking on the one above it. Also, if manually going through a list of devices is a frequent thing, you might want to look into using/learning the API and automating the work; a rough example follows this item.
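     As a rough, hypothetical example of the API route (PowerShell, using LogicMonitor's documented LMv1 token auth; the portal name and keys are placeholders):

     $company   = 'yourportal'            # assumption: your portal subdomain
     $accessId  = 'YOUR_ACCESS_ID'
     $accessKey = 'YOUR_ACCESS_KEY'
     $path      = '/device/devices'
     $epoch     = [DateTimeOffset]::UtcNow.ToUnixTimeMilliseconds()
     # LMv1 signature = Base64( hexstring( HMAC-SHA256( verb + epoch + body + path ) ) )
     $hmac      = New-Object System.Security.Cryptography.HMACSHA256
     $hmac.Key  = [Text.Encoding]::UTF8.GetBytes($accessKey)
     $hashHex   = ([BitConverter]::ToString($hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes("GET$epoch$path")))).Replace('-','').ToLower()
     $signature = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($hashHex))
     $auth      = "LMv1 ${accessId}:${signature}:${epoch}"
     $resp      = Invoke-RestMethod -Uri "https://$company.logicmonitor.com/santaba/rest$path" -Header @{ Authorization = $auth }
     $resp.data.items | Select-Object id, displayName   # walk the device list in code instead of clicking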
  5. Not that I'm aware of. I haven't looked too closely at collector polling details, but from my understanding tasks get queued at the collector; you can see this queue by using the !tlist debug command. So when a given check actually runs can change depending on the workload.
  6. So PowerShell is converting it to a String, causing encoding problems. If you switch to Invoke-WebRequest, the Content property will be a proper Byte array, which you can save to a file. See https://www.reddit.com/r/PowerShell/comments/719oip/downloading_a_file_via_http_directly_to_a_variable/dn9snst/ . Note that the pipeline version was extremely slow on my system, but this version was quick:

     # Invoke-WebRequest keeps the body as raw bytes in .Content rather than decoding it
     $Response = Invoke-WebRequest -Uri $url -Method $Verb -Header $headers -ContentType "application/binary" -Verbose
     # Write the bytes straight to disk
     $File = [System.IO.FileStream]::new('lm-filestream.exe', [System.IO.FileMode]::Create)
     $File.Write($Response.Content, 0, $Response.Content.Length)
     $File.Close()
  7. It seems to work for me if I use the -OutFile parameter with Invoke-RestMethod, though I didn't figure out how to have it assigned to a variable that is then written out. I also changed the content type to binary instead of json, since you are not looking for JSON data, though it seems LM will send binary data anyway:

     Invoke-RestMethod -Uri $url -Method $Verb -Header $headers -ContentType "application/binary" -OutFile "lm-out.exe" -Verbose
  8. You can take your cloned copy and modify the Active Discovery filter so it ONLY contains your Loopback0 interface, and that can be a dedicated "snmp64_loopback" (or whatever name you want) DataSource. You can then set up global thresholds on this new DataSource which will only apply to Loopbacks. You can still keep the original snmp64_if for primary and other ports with its own separate thresholds.
  9. I'm not sure I follow, but I don't see how you can set up an ignore, since LM SNMP DataPoints use GET requests only (afaik), which means it needs to know the exact OID to use. So if I understand correctly, can you do this? Set up Active Discovery to use 1.3.6.1.4.1.41916.9.2.1.6.4.64.68.161.220.4.107.199.135.25.2.12426 with wildcard discovery. That will get you wildvalues of 12386, 62520, etc. Then have a DataPoint using an OID of 1.3.6.1.4.1.41916.9.2.1.10.4.10.0.41.2.4.107.199.135.25.2.12386.##WILDVALUE## ?

     The fact that the base Active Discovery OID and the DataPoint OIDs are different is normal. Or are you saying that the OID for latency can have all sorts of different groups of numbers, instead of just the last number changing, so that you need a wildvalue of more than just the last number? I might be misunderstanding, since I've never seen OIDs that long before. Also, at worst you can use scripting to do SNMP (see https://www.logicmonitor.com/support/terminology-syntax/scripting-support/access-snmp-from-groovy/ ) and make the wildvalue, or the check itself, whatever is needed, including using walks.
  10. I haven't dealt with this issue myself but did find a reference to the same issue and someone using a 3rd party tool to track it down: https://community.spiceworks.com/topic/2199890-nagios-xi-windows-logon-errors?page=1#entry-8292751
  11. Have you looked at the data APIs? I haven't used them myself, but they seem to fit the request; a rough sketch follows.
      https://www.logicmonitor.com/support/rest-api-developers-guide/v1/data/get-graph-data/#Get-widget-data
      https://www.logicmonitor.com/swagger-ui-master/dist/#/Data/
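      Hypothetically, building on the LMv1 signing sketch under item 4 (sign this new $path the same way; the IDs are placeholders you would look up first via the device and instance endpoints):

      $path = '/device/devices/42/devicedatasources/7/instances/99/data'
      $resp = Invoke-RestMethod -Uri "https://$company.logicmonitor.com/santaba/rest$path" -Header @{ Authorization = $auth }
      $resp.data   # per my reading of the docs: datapoint names, timestamps, and values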
  12. When you add snmp.meraki.com to LogicMonitor, with the correct SNMP details from the cloud website, LogicMonitor should pull in all the devices hosted in that cloud account. You can also add the devices directly, assuming the collector is on the same network, which provides a bit more detail, like listing each interface for switches, but also some overlap with the cloud version. I would suggest setting up the cloud version first, then perhaps adding local versions of each type of device and reviewing the differences. You can always do both and disable checks that overlap if needed. Also note that if you have multiple Meraki Cloud accounts, they each need to be on their own collector, OR you can use DNS tricks like those discussed here: Meraki Multiple Organizations https://www.logicmonitor.com/support/monitoring/networking-firewalls/meraki-cloud-wireless-access-controllers/
  13. Some possible workarounds: I think LM Config might be the better option for this, although I haven't played with it personally yet. It's designed for config files, but I don't see why the "config" file can't just be the version info. Also, rather than attempting to use properties to store state, you can try using a file on the collector, for example "../tmp/firmware_${hostname}.LastRun", which would then be easy to read in the script (see the sketch below); there might be an issue if the resource bounces around collectors, though. You might also be able to use Auto Discovery Instances with DataSources, which lets you set auto properties, but that might be tricky to implement.
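      A hypothetical PowerShell sketch of the state-file idea ($hostname and $currentVersion stand in for values your script already collects; the path mirrors the example above):

      $stateFile   = "..\tmp\firmware_$hostname.LastRun"
      $lastVersion = if (Test-Path $stateFile) { (Get-Content $stateFile -Raw).Trim() } else { '' }
      if ($currentVersion -ne $lastVersion) {
          'firmwareChanged=1'                                   # e.g. emit a datapoint on change
          Set-Content -Path $stateFile -Value $currentVersion   # remember for the next poll
      } else {
          'firmwareChanged=0'
      }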
  14. If you are not aware, you should be able to edit the SOAP XML values to fill in whatever fields you need. I don't use Autotask myself, but it looks very similar to other integrations. When you set up an integration, you fill in some values on the top half and then click on a Generate button, which auto-populates the HTTP Delivery section based on your values. You can edit the various parts of the HTTP Delivery section and modify the default SOAP XML request to add in any other field you want. You can use various LM tokens or hard-coded values; see the fragment below.
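      An illustrative fragment only (the element names are made up, since the real ones depend on Autotask's SOAP schema; tokens like ##HOST##, ##LEVEL## and ##MESSAGE## are standard LM alert tokens that get expanded at delivery time):

      <Title>##LEVEL## alert on ##HOST##</Title>
      <Description>##MESSAGE##</Description>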
  15. When you create the Escalation Chain, you need to create Stages. In the escalation chain, choose the "+" next to Stages to create a new stage. This will pop up an "Add new recipients" window. Click on the "+", which asks for a user, and pick/type in a user as mentioned above. Once you pick a user, it will offer a "Contact Method" box to the right where you can pick your PD integration. This will only show up if you pick a normal user; it does not work for groups or API-only users. Click on the white Save button, then the blue Save button.