
I would love to see LM implement a new feature for taking a built-in, self-prescribed action on an alert.  To minimize any exposure that LM might have if an action goes awry, the action could run as a script that one uploads into the Escalation Chain.  Ideally you could define multiple actions, or multiple retries of an action, and choose whether each one runs before or after the recipient notification in the chain.

This would allow very basic alerts (disk, service restarts, etc.) to be resolved programmatically.  Supporting various scripting tools such as PowerCLI, Ansible, etc. would also allow some very creative integrations with solutions such as VMware or Ansible Tower, so that more expert folks could craft very complex actions.



I'd like to second this. We need this functionality (like Nagios event handlers):

If ____ fails, restart the service.
If the service has been restarted _____ times and _____ is still failing, THEN send an alert.

Or, as a worded example: let's say DNS resolution is failing. It's rare, but usually named just needs to be restarted and life goes on. Why do I need to be paged to do that at 2 in the morning when LM can do it for me? :)

  • "DNS resolution has failed"
  • Try to restart the service twice (user-specified)
  • If DNS resolution still has not recovered, then page that the service is down and that we have tried to restart the service twice without resolution.
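
To make the flow above concrete, here is a minimal sketch in Python of "restart N times, page only if it is still broken". Nothing here is an LM feature: `remediate`, `Outcome`, and the `check`/`restart` callables are hypothetical names for illustration, and the real check and restart commands would be whatever fits your service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Outcome:
    healthy: bool   # True if the check passed by the end
    restarts: int   # number of restart attempts that were made
    message: str    # text for the page; empty when no page is needed


def remediate(check: Callable[[], bool],
              restart: Callable[[], None],
              max_restarts: int = 2) -> Outcome:
    """Restart up to max_restarts times; escalate only if the check still fails."""
    if check():
        return Outcome(True, 0, "")
    for attempt in range(1, max_restarts + 1):
        restart()
        if check():
            # Recovered on its own after a restart: nobody gets paged.
            return Outcome(True, attempt, "")
    return Outcome(
        False,
        max_restarts,
        f"service is down; tried {max_restarts} restart(s) without resolution",
    )


if __name__ == "__main__":
    # Fake service that only comes back after the second restart.
    state = {"restarts": 0}
    result = remediate(
        check=lambda: state["restarts"] >= 2,
        restart=lambda: state.update(restarts=state["restarts"] + 1),
    )
    print(result)
```

Passing the check and restart steps in as callables keeps the retry policy separate from the commands themselves, so the same logic could wrap a DNS lookup, a service status query, or anything else.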
Edited by Mike Suding
typo


Hello,

We're currently evaluating LogicMonitor as a potential replacement for Microsoft System Center Operations Manager (SCOM). Before SCOM became our enterprise monitoring tool, we had IBM Tivoli Monitoring (ITM) in place within our organization.

So we come from well over 10 years of being able to take corrective actions within the tools themselves in response to various alerts. Based on our initial demo with the LogicMonitor team, we understand that's not a feature of the product, as they don't want to be in the config management business, which I understand.

However, we can't be the only organization with this issue, so I'm curious how others have worked around it and whether they would be willing to share their solutions.

Here are some simple things we do today:

  1. Windows Service Restarts (in most cases we only alert if the corrective restart action fails)
  2. Linux Process Counts (we'll attempt to restart the process or execute some other type of scripted action)
  3. IIS Application Pool failures (we'll attempt to recycle the AppPool using built-in Windows functionality)
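
As a rough illustration of how the three cases above could share one dispatch point, here is a small Python sketch that only builds the command line for each kind of corrective action and does not run anything. The `kind` labels and `build_command` are made-up names. It also assumes that the Linux process is managed as a systemd unit and that `appcmd` (which normally lives under `%windir%\system32\inetsrv`) is reachable on the PATH; neither is guaranteed in a given environment.

```python
import re
from typing import List

# Allow only characters commonly seen in service, unit, and pool names,
# so a name taken from an alert cannot smuggle in extra shell syntax.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.@ -]+")


def build_command(kind: str, target: str) -> List[str]:
    """Return the argument list for one corrective action (hypothetical kinds)."""
    if not _SAFE_NAME.fullmatch(target):
        raise ValueError(f"refusing unsafe target name: {target!r}")
    if kind == "windows_service":
        return ["powershell", "-NoProfile", "-Command",
                f"Restart-Service -Name '{target}'"]
    if kind == "linux_process":
        # Assumption: the process is a systemd unit with this name.
        return ["systemctl", "restart", target]
    if kind == "iis_app_pool":
        # Assumption: appcmd is on the PATH.
        return ["appcmd", "recycle", "apppool", f"/apppool.name:{target}"]
    raise ValueError(f"no corrective action defined for {kind!r}")


if __name__ == "__main__":
    for kind, target in [("windows_service", "Spooler"),
                         ("linux_process", "named"),
                         ("iis_app_pool", "DefaultAppPool")]:
        print(kind, "->", build_command(kind, target))
```

Whatever runs the returned command would still need to re-check the service afterwards and raise the alert only if the corrective action did not help.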

Appreciate the responses, thanks!

 

 


Hey @NBM,

Because our collector can act as a script-runner, some folks have written scripts that look for those conditions, act on what they see, and then report back to LogicMonitor after taking one of those actions.
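
For anyone wondering what the "report back" part might look like, here is a minimal Python sketch. The status codes are invented for this example and are not the codes used by any published DataSource. It also assumes the DataSource is set up to read a `key=value` line from the script's output; check your own DataSource configuration before relying on that.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical codes for this sketch only.
    RUNNING = 0          # service was already up, nothing was done
    RESTARTED = 1        # service was down and the restart fixed it
    RESTART_FAILED = 2   # service was down and is still down


def classify(was_running: bool, running_after_restart: bool) -> Status:
    """Collapse what the script saw and did into a single numeric status."""
    if was_running:
        return Status.RUNNING
    return Status.RESTARTED if running_after_restart else Status.RESTART_FAILED


def report(status: Status) -> str:
    """Format the status as one key=value line for the monitoring side to parse."""
    return f"status={int(status)}"


if __name__ == "__main__":
    print(report(classify(was_running=False, running_after_restart=True)))
```

With a scheme like this, the alert threshold would sit on the "restart failed" value only, so a successful self-heal is recorded but does not page anyone.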

Check out @Mike Suding's blog post and DataSource for Windows Services as an example:

http://blog.mikesuding.com/index.php/2016/09/20/restart-a-service-alert-if-restart-fails/

Best,

Kerry


@Kerry DeVilbiss Thanks a ton!  We're working towards an official POC and appreciate the community's responses to some of the initial questions we've had.  I'll definitely want to try this out to see how it functions once we get further along.


I've created two more DataSources: one that restarts a Linux service, and one for Windows that runs specified commands when it detects an alert.

Monitor a Linux service and restart it if needed:  http://blog.mikesuding.com/2018/12/27/service-restart-for-linux/

Generic: run some PowerShell (Windows) commands or a script if an alert happens:   http://blog.mikesuding.com/2018/12/03/automatic-action-triggered-by-an-alert/


This would be an INCREDIBLY USEFUL feature in my environment right now.  Mike's script looks great, but I can't seem to make it work.  I downloaded it, installed it, and added the monitored service, but it only ever shows a "4" and I get the errors below.  I thought it might be a permissions error or something, so I logged onto the collector using the same credentials and ran the commands in PowerShell.  This worked fine.  Any ideas what might be wrong here?

 

 

Get-Service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:8 char:20
+ ... e_status = (Get-Service -Name $service -ComputerName $hostname).Statu ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand
 
get-service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:11 char:6
+ if ((get-service -name $service -ComputerName $hostname).Status -eq " ...
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand
 
Get-Service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:15 char:32
+ ... ervice -InputObject $(Get-Service -Computer $hostname -Name $service)
+                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand
 
Start-Service : Cannot validate argument on parameter 'InputObject'. The 
argument is null or empty. Provide an argument that is not null or empty, and 
then try the command again.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:15 char:30
+ ... ervice -InputObject $(Get-Service -Computer $hostname -Name $service)
+                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidData: (:) [Start-Service], ParameterBindi 
   ngValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.Power 
   Shell.Commands.StartServiceCommand
 
get-service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:21 char:8
+   if ((get-service -name $service -ComputerName $hostname).Status  -e ...
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand
 
Get-Service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:25 char:38
+ ... ervice -InputObject $(Get-Service -Computer $hostname -Name $service)
+                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand
 
Start-Service : Cannot validate argument on parameter 'InputObject'. The 
argument is null or empty. Provide an argument that is not null or empty, and 
then try the command again.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:25 char:36
+ ... ervice -InputObject $(Get-Service -Computer $hostname -Name $service)
+                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidData: (:) [Start-Service], ParameterBindi 
   ngValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.Power 
   Shell.Commands.StartServiceCommand
 
get-service : Cannot find any service with service name 'hidserv'.
At C:\Program Files (x86)\LogicMonitor\Agent\tmp\scr1001001-Service_restart-Ser
vice_restart-Human_Interface_Device_Service.ps1:31 char:14
+ ...        if ((get-service -name $service -ComputerName $hostname).Statu ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (hidserv:String) [Get-Service],  
   ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell. 
   Commands.GetServiceCommand


