Brandon

Members
  • Content Count

    76
  • Days Won

    25

Community Reputation

33 Excellent

4 Followers

About Brandon

  • Rank
    Contributor

956 profile views
  1. Hi @Nicklas Karlsson and @Jonathan Hill, So you guys are running into an issue where the datasource isn't finding any instances. That's no good! The datasource uses remote PowerShell commands from the Windows collector to the target server. I would start by opening an RDP session to a collector and running the script from an elevated PowerShell session. Before you run it, you'll need to modify the script to use whatever creds you have set up on that target server. For a quick test, just check whether these two WMI calls return anything from the remote system:

     $hostName = '<target server>'
     gwmi -Namespace "root\MicrosoftDFS" -ComputerName $hostName -Query 'SELECT * FROM DfsrReplicatedFolderInfo'
     gwmi -Namespace "root\MicrosoftDFS" -ComputerName $hostName -Query 'SELECT * FROM DfsrConnectionConfig WHERE Inbound = "False"'

     If these two work, DM me and we can try to figure out what the issue is. I'm guessing we're running into a character limit and I need to account for that. Thanks!
  2. Hi @John Biniewski, The alert thresholds are largely going to depend on how large your shares are. You can set the alert threshold to something quite low, like 10, and wait for an alert, but unfortunately this is one of those times when you'll need to revisit your thresholds to determine what is normal for your environment(s). Wish I could be more help here.
  3. We have begun implementing a tagging standard in our cloud accounts to better control discovered resources and route alerts accordingly. I would like to be able to route alerts by default based on the value of a tag. I'm aware that I can already set up specific users and then achieve exactly what I'm requesting, but I would much prefer to have a blanket rule that uses the tag's value as the recipient email address(es) directly. Some examples below:

     system.aws.tag.MonitorAlertEmail=ThisIsAnExample@PleaseOfferThisFeature.com
     system.aws.tag.SendAlertsHere=AnotherExcellentExample@WeWillPayExtraForThisFunctionality.gov AnotherEmailAddress@OkayNotReally.net

     See the screenshots below for a visual example of how I'd like to structure this automation.
  4. I have built a generic StatusPage.io datasource to allow for monitoring the status of various services we use. Since so many companies are using StatusPage.io, I figured it's a good idea to have a heads up in the event of an outage at one of our many service providers. This has worked well as an early warning system for our service desk guys to know about issues before they start getting calls from end users. LogicMonitor actually uses StatusPage, but of course there are many, many others. Attached is a screenshot of the Box.com StatusPage data that we've collected from https://status.box.com. This datasource should be universal to any statuspage.io site. So far it has worked against every site I have tested. NYJG6J
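As a rough illustration of the idea in the StatusPage post above (and not the datasource's actual script, which isn't shown here), the sketch below turns a Statuspage-style status payload into a number that a datapoint could alert on. It assumes the public `/api/v2/status.json` endpoint that Statuspage-hosted sites expose; the sample payload, the `Convert-StatusIndicator` function name, and the numeric mapping are all made up for the example.

```powershell
# Hypothetical sample shaped like a Statuspage /api/v2/status.json response.
$sampleJson = '{"page":{"name":"Example Service","url":"https://status.example.com"},"status":{"indicator":"minor","description":"Minor Service Outage"}}'

# Assumed mapping from the indicator string to a number a datapoint can threshold on.
$indicatorMap = @{ none = 0; minor = 1; major = 2; critical = 3 }

function Convert-StatusIndicator {
    param([Parameter(Mandatory)][string]$Json)

    # Parse the payload and pull out the overall status block.
    $status = ($Json | ConvertFrom-Json).status

    if ($indicatorMap.ContainsKey($status.indicator)) {
        return $indicatorMap[$status.indicator]
    }
    return -1   # unrecognized indicator value
}

# Against a live page the raw JSON could be fetched with something like:
#   (Invoke-WebRequest -Uri 'https://status.box.com/api/v2/status.json' -UseBasicParsing).Content
$value = Convert-StatusIndicator -Json $sampleJson
Write-Output "indicator=$value"
```

Returning a number rather than the raw string keeps alerting simple: a threshold such as "> 0" would fire on any degraded state.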
  5. I have been using a custom datasource to collect the metrics for each resource and method (excluding OPTIONS) behind an API Gateway stage. It has been extremely useful in our production environments. I would share the datasource via the Exchange, but the discovery method I'm using will not be universal, so I think it would be best if that discovery were to work natively. If possible, could we please have a discovery method for AWS API Gateway Resources by Stage?

     *Something to note - This has the potential to discover quite a few resources and thus create a substantial number of CloudWatch calls, which might affect customer billing. For this reason, I added a custom property ##APIGW.stages## so that I could plug in the specific stages I wish to monitor instead of having each one automatically discovered.

     The Applies To looks like this:

     system.cloud.category == "AWS/APIGateway" && apigw.stages

     Autodiscovery is currently written in PowerShell (which is why not everyone can take advantage of it):

     $apigwID = '##system.aws.resourceid##'
     $region = '##system.aws.region##'
     $stages = '##APIGW.Stages##'

     # List every resource (path) defined on this REST API.
     $resources = Get-AGResourceList -RestApiId $apigwID -Region $region

     # Emit one instance per stage, resource path, and method (skipping OPTIONS).
     $stages.split(' ') | %{
         $stage = $_
         $resources | %{
             if ($_.ResourceMethods) {
                 $path = $_.Path
                 $_.ResourceMethods.Keys | where { $_ -notmatch 'OPTIONS' } | %{
                     $wildvalue = "Stage=$stage>Resource=$Path>Method=$_"
                     Write-Host "$wildvalue##${Stage}: $_ $Path######auto.stage=$stage"
                 }
             }
         }
     }
  6. I agree. Adding instance groups with auto-inclusion rules similar to device groups would be extremely useful for alerting, graphing, etc.
  7. We're experimenting with NetFlow now and we are also struggling with these very real limitations. It would be great if we could get a response as to whether or not enhancements to NetFlow are going to be prioritized. Currently we're finding that we have no other choice but to rely on multiple tools to gather this data.
  8. It would be immensely helpful if I could see and test alert routing from the Cluster Alerts page at the device group level similar to the existing Alert Routing button on the Alert Tuning tab. As we begin to more heavily utilize this functionality, it's critical that we can verify that alerts are routed correctly wherever we set it up.
  9. Wow @mnagel! Thanks so much for this! I'm going to look into running this so I can get that list put together. I don't have any plans to systematically delete the datasources - I'm just wanting to compile a list so I can review them. I'll feed the obvious ones into a script as a one-time purge and once I've done that, I can take a closer look at those that should be working, but aren't for whatever reason.
  10. Hi @Sarah Terry, Thanks for the tip! I also saw that as a workaround, but unfortunately it wouldn't really help with what I'm attempting to accomplish. What I'm trying to do is find datasources with applies-to functions like "isWindows()" or "isLinux()" that aren't actually discovering any instances. It's almost always because the datasource is built to monitor a product or service that we don't use and likely never will. I'd like our datasource list to only contain datasources that are actually in use and applicable to our systems/services. Similarly, there are datasources that apply to specific hardware that we don't own. I'm currently going through manually and removing them so we don't have to scroll past them when browsing our list of datasources. If and when we ever deploy new hardware/software, I'll go and re-import those (updated) datasources from the LM Repository. I hope this makes sense. Thanks again for your response!
  11. I'm trying to clean up datasources in our account that do not have any instances associated with them and likely never will. Currently I have to do this manually by inspecting each datasource in the GUI. It would be really great if the datasource instance count was returned as a property. Even better would be if the instances and associated device IDs were returned as well, but for now I'd be happy with just the device/instance counts.
  12. Have you tried enabling rate-limiting? At least until it all gets sorted out - I'd consider setting up a duplicate escalation chain and an alert rule specifically for some of your syslog eventsources and enable rate-limiting on them. Before my LogicMonitor days, I had this happen a few times and it sucks dealing with a crippled Exchange server while also trying to work out a firewall issue. Syslog is unpredictable sometimes.
  13. We have several clustered devices where metrics are gathered on each node. However, the instances on each node are identical. To graph this data today, I need to add a new datapoint for each instance and use a glob pattern to select the devices from which to pull those instances. This can mean that a lot of time goes into creating these graphs if there are several instances to monitor. Examples:

        • Three Solr nodes, each servicing search requests for the same 10 collections. In order to see the total number of GET requests for each of those collections, I would need to create a graph that has 10 individual datapoints. Instead, I would like to add one datapoint and have the graph intelligently aggregate all instances that have the same name, regardless of the node.
        • Several device groups exist under a parent group. If I want to see the average CPU utilization across each of these groups on a single graph, I would need to add a separate datapoint for each group.

      A potential solution could be to allow regex instead of glob patterns so that capture groups can be used. Otherwise, a simple checkbox for "aggregate instances by device group" and "aggregate instances by instance names" when selecting aggregated graph types would be extremely useful and time-saving.
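To make the "aggregate instances by instance name" request above concrete, here is a minimal sketch of the grouping being asked for, done outside the product. The node names, collection names, and request counts are invented sample data; nothing here reflects how LogicMonitor stores or graphs datapoints.

```powershell
# Hypothetical samples: one row per node and collection (instance).
$samples = @(
    [pscustomobject]@{ Node = 'solr1'; Instance = 'products'; GetRequests = 120 }
    [pscustomobject]@{ Node = 'solr2'; Instance = 'products'; GetRequests = 95 }
    [pscustomobject]@{ Node = 'solr3'; Instance = 'products'; GetRequests = 110 }
    [pscustomobject]@{ Node = 'solr1'; Instance = 'orders';   GetRequests = 40 }
    [pscustomobject]@{ Node = 'solr2'; Instance = 'orders';   GetRequests = 55 }
)

# Group rows that share an instance name, regardless of node, and sum each group.
$totals = $samples | Group-Object -Property Instance | ForEach-Object {
    [pscustomobject]@{
        Instance         = $_.Name
        Nodes            = $_.Count
        TotalGetRequests = ($_.Group | Measure-Object -Property GetRequests -Sum).Sum
    }
}

$totals | Format-Table -AutoSize
```

With this sample data the 'products' rows sum to 325 across three nodes and the 'orders' rows to 95 across two, which is the single aggregated line per collection that the post describes wanting on a graph.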
  14. @Eric Singer - Any chance VMware provided you with a KB that documents this as a known issue / bug? I'd like to provide as much context as possible to our ESX admins. Thanks!
  15. DO NOT comment out the Applies To field on the datasource! This will remove all historical data - which I can only imagine most of us want to keep. Instead, you can disable the datasource by creating a device group (if you don't have one already) and populating it with all of the ESX hosts. Then, at the group level, select the Alert Tuning tab and uncheck the box next to the datasource. This disables polling and alerting, but allows you to keep historical data.