Michael Rodrigues

LogicMonitor Staff
  • Content count

    58

Everything posted by Michael Rodrigues

  1. Groovy Expect Scripting -- "]$" prompts

    @Joe Tran you were close. You should be able to match it with this: '\\[.*\\]\\$' The expect() method actually takes a Java Pattern, which is compiled down to a regex object. You can check whether your pattern converts to the expected regex by using: println Pattern.compile('yourpatternstring') We should really add a method that takes a plain old regex; sorry for any confusion. Let me know if the above doesn't work for you.
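As a rough illustration of what that pattern matches, here is a minimal sketch using Python's `re` module as a stand-in for Java's Pattern (the two agree on these escapes). The prompt strings and the `matches_prompt` helper are made up for the example; once Groovy's string escaping is resolved, '\\[.*\\]\\$' is the regex `\[.*\]\$`.

```python
import re

# The regex that the Groovy string '\\[.*\\]\\$' resolves to:
# a literal '[', anything, a literal ']', then a literal '$'.
PROMPT_PATTERN = re.compile(r"\[.*\]\$")


def matches_prompt(text: str) -> bool:
    """Return True if the text contains a bracketed prompt ending in ']$'."""
    return PROMPT_PATTERN.search(text) is not None


# Hypothetical prompts, purely for illustration.
print(matches_prompt("[admin@host ~]$ "))
print(matches_prompt("admin@host:~# "))
```

Without the backslashes, `[`, `]` and `$` would be read as a character class and an end-of-line anchor, which is the usual reason a "]$" prompt fails to match.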
  2. How to calculate IOPS in my dashboard

    Hi @Archana, if you have total read/writes for 2 minute intervals, you should be able to make a graph with a Virtual DataPoint that divides reads and writes by 120 to get average IOPS for those intervals.
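The arithmetic behind that Virtual DataPoint can be sketched as follows. The function name and the sample totals are hypothetical; the only figure taken from the post is the 120-second interval.

```python
INTERVAL_SECONDS = 120  # a 2-minute collection interval


def average_iops(total_reads: float, total_writes: float,
                 interval_seconds: int = INTERVAL_SECONDS) -> float:
    """Average I/O operations per second over one interval."""
    return (total_reads + total_writes) / interval_seconds


# Hypothetical totals for one 2-minute interval.
print(average_iops(36000, 12000))  # → 400.0
```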
  3. 497 days and counting........

    Hey @Kwoodhouse, sorry for the confusion. The fix does rely on your host reporting system uptime as defined in the Host Resources MIB (specifically, hrSystemUptime at .1.3.6.1.2.1.25.1.1.0). If that OID doesn't return anything, we fall back to using snmpEngineTime. This isn't necessarily the uptime of the system, but rather the uptime of the SNMP agent, and it will reset with the agent even if the system does not reboot. The fix was never ported to the module that retrieves Engine Uptime, but it should be easy enough to do. I've put a fix in with the ME team to get this done. I did go ahead and update the alert message in the meantime. Thanks for bringing this to our attention!
  4. Issues With Creating A Datasource

    Taking Mike's advice, you might just try swapping out your filter string with the URLEncoded version: "startEpoch%3E%3A1538370000%2CendEpoch%3C%3A1541048399%2Ccleared%3A*"
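If you'd rather generate the encoded string than type it by hand, a minimal sketch with Python's standard library looks like this. The raw filter is simply the decoded form of the string above; the variable names are illustrative.

```python
from urllib.parse import quote

# Decoded form of the filter string from the post.
raw_filter = "startEpoch>:1538370000,endEpoch<:1541048399,cleared:*"

# Percent-encode everything except unreserved characters and '*'.
encoded_filter = quote(raw_filter, safe="*")

print(encoded_filter)
# → startEpoch%3E%3A1538370000%2CendEpoch%3C%3A1541048399%2Ccleared%3A*
```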
  5. SNMP Trap Event Consolidation

    Reviving an old thread, but we're currently reevaluating EventSource suppression logic. Some of the other EventSource types already use a timeout-like mechanism to avoid duplicates, but we don't do anything like that for SNMP traps. The general idea right now is to let the user decide which fields indicate a duplicate event, and suppress anything within the "effective interval" of the original alert. I think it makes sense to have the timer reset logic be optional. I also like the idea of providing more visibility on how many events were suppressed. We've also had a fair number of requests for a mechanism like the DataSource "trigger interval", where we only trigger an alert if we see the same event N times in the interval. Anyway, any additional feedback is appreciated.
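To make the idea easier to discuss, here is a purely hypothetical sketch of that kind of suppression. It is not LogicMonitor's implementation; the class, field names and interval values are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class _Window:
    start: float         # when the current suppression window began
    suppressed: int = 0  # duplicates dropped during this window


class DuplicateSuppressor:
    """Suppress events whose chosen fields match an earlier event
    seen within the effective interval (hypothetical design)."""

    def __init__(self, key_fields, effective_interval, reset_timer=False):
        self.key_fields = tuple(key_fields)
        self.effective_interval = effective_interval
        self.reset_timer = reset_timer  # optional: duplicates extend the window
        self._windows = {}

    def _key(self, event):
        return tuple(event.get(field) for field in self.key_fields)

    def should_alert(self, event, now):
        key = self._key(event)
        window = self._windows.get(key)
        if window is not None and now - window.start < self.effective_interval:
            window.suppressed += 1
            if self.reset_timer:
                window.start = now
            return False
        # First occurrence, or the previous window has expired.
        self._windows[key] = _Window(start=now)
        return True

    def suppressed_count(self, event):
        window = self._windows.get(self._key(event))
        return window.suppressed if window else 0


suppressor = DuplicateSuppressor(["host", "trap_oid"], effective_interval=300)
trap = {"host": "sw1", "trap_oid": "1.3.6.1.6.3.1.1.5.3", "message": "link down"}
print(suppressor.should_alert(trap, now=0))    # first event alerts
print(suppressor.should_alert(trap, now=100))  # duplicate inside the interval
print(suppressor.suppressed_count(trap))
```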
  6. What Is My IP as found from a Google search

    @wanabeninja@helient it should be out of the review holding cell now. I can't imagine Google would have done anything to break this. It works for me. Those hosts/customers aren't behind the same NAT gateway, are they? Or using a shared proxy? Have they tried other sites that do this to see if they get the same result? I was always partial to ifconfig.me and ipchicken.com, though please don't take that as an official LM endorsement :).
  7. 497 days and counting........

    @Kwoodhouse the one that includes the fix is SNMP_HostUptime_Singleton. It requires the addCategory_snmpUptime PropertySource to work without manual intervention. "HostUptime-" (no space) is deprecated and no longer in core. Unfortunately there's no way for you to get that information in your account currently. SNMPUptime and SNMP_Engine_Uptime- are more or less duplicates. They both get the uptime for the agent, not the host. This seems to be an oversight. Originally, we just looked at the uptime counter with a gauge datapoint. If the value indicated uptime of less than 60 seconds, we'd alert. Of course, this also happens during a counter wrap. To fix it, we started tracking the uptime counter with a counter datapoint as well. Given that the rate of time is constant, we should always see a rate of 100 ticks/second coming back from the counter datapoint if the host hasn't been rebooted. The logic in the UptimeAlert CDP looks at both that tick rate and the raw uptime to determine whether the host has rebooted or the counter has just wrapped. If it's just a counter wrap (no reboot), we'll see 100 ticks/second, even if we see less than 60 seconds of uptime with the gauge. If the host has rebooted, the UptimeCounter datapoint will either return No Data (counters need 2 consecutive polls), or it will return a huge value because no polls were missed and LM assumed the counter wrapped when it was really reset by the reboot. This is explained in the datapoint descriptions, but is admittedly a bit difficult to grok without an intimate understanding of how LM's counter/derive datapoints work. I do still think it's a rather ingenious solution. We use "102" instead of "100" ticks/second in the CDP to avoid false positives, as the collection interval isn't always exactly a minute.
    I recommend this blog if you're interested in learning more about counter/derive: https://www.logicmonitor.com/blog/the-difference-between-derive-and-counter-datapoints/ I will talk to the Monitoring team about removing some of those duplicates, and getting a public document up explaining it all.
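The wrap-versus-reboot decision described above can be sketched roughly like this. It is not the actual UptimeAlert complex datapoint expression: the function name is invented, and treating No Data as a reboot is an assumption on my part. Only the 60-second and 102 ticks/second figures come from the explanation.

```python
from typing import Optional

MIN_UPTIME_SECONDS = 60  # gauge uptime below this looks like a restart
MAX_TICK_RATE = 102      # 100 ticks/second plus slack for interval jitter


def host_rebooted(uptime_seconds: float, tick_rate: Optional[float]) -> bool:
    """Tell a real reboot apart from an uptime counter wrap.

    uptime_seconds: raw uptime from the gauge datapoint.
    tick_rate: per-second rate from the counter datapoint,
               or None when the counter returned No Data.
    """
    if uptime_seconds >= MIN_UPTIME_SECONDS:
        return False  # uptime is not low, nothing to decide
    if tick_rate is None:
        return True   # assumption: No Data alongside low uptime means reboot
    # A plain wrap still shows about 100 ticks/second; a reset counter
    # misread as a wrap shows a huge rate instead.
    return tick_rate > MAX_TICK_RATE


print(host_rebooted(30, 100.0))       # counter wrap, no reboot
print(host_rebooted(30, 71582788.0))  # reset misread as a wrap
print(host_rebooted(86400, 100.0))    # normal operation
```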
  8. DataSources_List PropertySource

    @pperreault it should be out of security review now.
  9. Monitor File System - extend the built in UNC monitor

    This module should be out of security review now.
  10. Generic RSS EventSource

    This is a generic RSS EventSource. Set rss.url on a host with an RSS URL and it will start monitoring it. Of course, for an LM EventSource your events must include key/value pairs for "happenedOn" and "message". If your RSS feed doesn't use these keys, you can override them with the rss.event.map property. For example, if the event timestamp is labelled pubDate and the event message is labelled title you can use happenedOn:pubDate,message:title for your rss.event.map property. You can also use rss.event.map to add other attributes. Locator: YHM79Y Feel free to clone/rename the EventSource if you want more context in the name.
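For anyone unsure how the rss.event.map value is meant to be read, here is a small sketch of the mapping idea. It is not the EventSource's own script; the function names and the sample feed item are made up.

```python
def parse_event_map(value: str) -> dict:
    """Parse 'happenedOn:pubDate,message:title' into
    {'happenedOn': 'pubDate', 'message': 'title'}."""
    mapping = {}
    for pair in value.split(","):
        event_key, _, feed_key = pair.partition(":")
        mapping[event_key.strip()] = feed_key.strip()
    return mapping


def map_item(item: dict, event_map: dict) -> dict:
    """Build an event by reading each mapped feed key from an RSS item."""
    return {event_key: item.get(feed_key)
            for event_key, feed_key in event_map.items()}


event_map = parse_event_map("happenedOn:pubDate,message:title")

# Hypothetical RSS item, already parsed into a dict.
item = {"title": "Service degraded",
        "pubDate": "Mon, 05 Nov 2018 10:00:00 GMT"}
print(map_item(item, event_map))
```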
  11. VMware VSAN

    We have some vSAN LogicModules in the pipeline, but we've been waiting to complete and release them until after we release our update for the base VMware modules. There isn't currently a plan to pull the vCenter-defined alerts through directly. We'd prefer to pull the metrics out and define the alarms within LogicMonitor to avoid noise and allow configuration within the product. That being said, if there's enough interest in just pulling VMware's alarms through, we can look at that too.
  12. Scripts for deleting datasource instances

    Hey @BrianG, you're talking about device instances, right? If they share a common property or name, you can make a Dynamic Group that holds them all, then delete that group. When you delete it, you can delete the devices from the account along with it.
  13. DNS Domain registration expiry

    @Mike Graham, try this updated version: 37WCMA
  14. Hi @Archana, check Visual Average for the general trend over the month. Given you're looking at CPU usage, it's probably also worth taking a look at the VaST version to see if you're getting lots of spikes. You won't see the spikes with Visual Average, but it will be harder to see the trends in VaST view.
  15. Hi @Archana, I think we do what you're looking for, but the feature is sort of hidden. Open the expanded graph, then expand the instances pane at the bottom. On the right side of the instances pane, there's another downward-pointing arrow. If you click on that arrow and select "Show Boundaries" you'll see min/max/avg for each instance for the selected time range. You may have to aggregate instances into one to get the average across multiple instances.
  16. Hi @Archana, there's a great blog entry about VaST here: https://www.logicmonitor.com/blog/vast-opportunity-with-logicmonitor/ Generally, you'll want to go with VaST when you're looking at a longer time frame, and if you're concerned about seeing peak utilizations that would otherwise be hidden with "visual average". The visual average option is better when you're just looking for trends.
  17. Cisco EIGRP Peers

    Hey @Richard Collisn, thanks for this, we'll include this fix in the version in core.
  18. Scripts for deleting datasource instances

    Hey @BrianG, I don't have a script for you, but I wanted to see if you could expand on this. Are you clearing out old instances or something? AD filters and instance deletion options should be able to solve most instance deletion needs, so I'm curious about what's going on in your environment.
  19. Certificate Expiration Notification

    Hey @ugamike, we do ship a DataSource called "SSLCerts-" that will do this for you. If your hosts with the management webservers are already in monitoring, the SSLCerts- module should already be applied and monitoring them.
  20. How does data are stored in Logicmonitor

    Hi @Archana, we store the data in our proprietary TSDB: https://thenewstack.io/logicmonitor-debuts-time-series-database/
  21. Download Speed

    @mkerfoot, @Nathan Sanders this module is through review, so you should be able to import it now. Thanks for contributing!
  22. VMware_Status Datasource update

    Hey @jrhoat, great addition. Thanks for sharing, and for keeping with the style! I will put in a ticket to get this added to the core offering.
  23. FYI: LM can trigger ESXi 6.5 hostd to crash

    We've released an updated version of VMware_vSphere_HostPerformance. It breaks backwards compatibility with the version 1 series. It also only applies to vCenter by default, to further mitigate vSAN calls triggered directly on the host. When applied directly, AD now triggers 5 vSAN calls as opposed to 30. If you want to keep the historical data before upgrading, you can rename version 1.x of the DataSource and then disable it. The locator code to get version 2 is 99EKKN
  24. FYI: LM can trigger ESXi 6.5 hostd to crash

    @Eric Singer, @Ryan B, @PatrickATL, I've got an update on this, and I appreciate your patience. First off, I want to note that neither the collector nor any DataSources are explicitly calling vSAN methods, whether you have it installed and enabled or not. We ship the vSAN SDK with newer collector versions, but there aren't any core DataSources using it just yet. The vSAN queries that brought down @Eric Singer's device appear to be triggered on the server when we call methods to get hardware and version information about a given host. This is based on seeing the same behavior with both the official VMware SDK and the open source YAVIJAVA SDK. This means we can't avoid making at least some inadvertent calls to vSAN unless VMware changes this, but we do have a mitigation route. Specifically, these three things can trigger the calls:
      • Auto Properties identifying your ESX host and updating version info. This runs infrequently and doesn't generate many calls, so it's likely not an issue.
      • The VMware_vSphere_HostPerformance AD script. This is the biggest offender, and kicks off about 30 vSAN calls in our test environment. A fix is in the works, but it won't be backwards compatible with the current version as the instance names will change. The fix currently only triggers 5 vSAN calls for each AD run when applied directly to ESX.
      • The VMware_vSphere_HardwareSensors AD script. This only triggers once per call, so it's likely not an issue.
    The effect is larger when the modules are applied to ESX directly. When those modules are applied to vCenter, some vSAN calls are still made on the host, but not as many (1-4). Based on the great info we got from @Eric Singer and VMware, we're confident that the changes to VMware_vSphere_HostPerformance will sufficiently mitigate this issue. We haven't yet been able to reproduce the crash in our lab by rebooting and forcing AD repeatedly. @PatrickATL, I appreciate the offer. You might check /var/log/hostd.log for floods of calls to vSAN.
    Luckily the conditions for a crash seem fairly difficult to come by. I will update this ticket when the fix is released and make sure our Customer Success team gets this info to the other 6.5 users. In the meantime, you should consider disabling VMware_vSphere_HostPerformance on ESX hosts you expect to reboot; you can still safely monitor them through vCenter. Expect a fix early next week. Thanks again for your help on this. Please reach out if you have additional questions or concerns.
  25. FYI: LM can trigger ESXi 6.5 hostd to crash

    Hey @Eric Singer, thanks for bringing this to our attention. We've got our Collector Team looking into how to mitigate this now. We're also working to identify customers monitoring ESXi 6.5 so we can notify them proactively. I will update this thread as we learn more.