Horace Cheung


  1. Hi, Our team recently found certain error scenarios at multiple production sites. Today we monitor specific exceptions (via keyword match or regular expression) in LogicMonitor and trigger alerts on them. This solution has a few drawbacks: (1) It requires us to know ahead of time which specific exception(s) to monitor in each log file (e.g. Tomcat, ActiveMQ). (2) It requires us to download all the logs from each production site that has the issue (some of our customers require VPN/secure access, and it is very inefficient to download these logs from each site to analyze). Our team then ran a quick log streaming POC and found that Datadog is one of the vendors that provides a decent log streaming solution (to the cloud) and allows us to search the logs and run analytics (see https://www.datadoghq.com/log-management/). It would be great if LogicMonitor could implement something similar so that we can search these logs in the cloud, Elasticsearch-style, for faster troubleshooting analysis. Thanks & Best Regards, Horace
  2. Hi, Per discussion with Russ G., Kenyon W. and Jake C. yesterday, I would like to submit this as a feature request to the DEV team and see whether it can be added to a future roadmap. In short, it would be great if end users could configure multiple incidents/alerts into one group and generate only one alert (with the highest severity). Here is an example of a Tomcat shutdown, which generates a number of alerts: (1) a Tomcat shutdown 'critical' alert (1 alert); (2) ActiveMQ 'Error' alerts because the consumer count of specific queues has reached zero (about 10-12 alerts in our case). In this case the end user would like to configure LM to consolidate all of these alerts into one critical alert (i.e. all AMQ 'Error' alerts are cleared). I saw something like this in PagerDuty and must say it would be a great feature to have in LogicMonitor to reduce the number of alerts processed by the TechOps team: https://www.pagerduty.com/blog/alert-triage/ Thanks & Best Regards, Horace
  3. Our team has verified that secure syslog forwarding (via TLS) is not currently supported, and we would like to submit a feature request asking the LogicMonitor DEV team to assess whether it can be implemented. An example would be syslog-ng forwarding secure (i.e. encrypted) syslog messages to the LogicMonitor collector: https://www.balabit.com/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/concepts-tls.html This would enable a centralized logging server to forward secure syslog messages to the LogicMonitor collector. Thanks & Best Regards, Horace
  4. Thanks Mark & Mike for your feedback! (I have been working on another project and just caught up on Mark's update.) I still think it is worthwhile to ask the LogicMonitor DEV team to implement this, since every customer will need a custom wiki troubleshooting page configured for each specific alert threshold within a datasource. I'll bring this up with my technical account manager to discuss further.
  5. @Mike Suding I can't access http://confluence.acme.com/whatever, so I'll try to elaborate one more time. Let's assume I have only two alert threshold settings:

     Data source | Alert threshold (critical case) | Wiki Page Value
     1. WinVolumeUsage | 90% | http://confluence.arthrex.io/WinVolumeUsagealert
     2. TomcatLogDirSize | 2 GB | http://confluence.arthrex.io/How+to+Adjust+Log+Cleanup+Interval

     I want to customize LogicMonitor so that whenever one of these alerts is generated, the matching wiki page is included as a "recommendation" property in the JSON message that is sent to PagerDuty. Can you show me how to customize this (e.g. with a new screenshot)? The screenshot above shows a generic JSON message format that is sent to PagerDuty, and I did not see any option to customize the recommendation field other than hard-coding it for now.
  6. Per discussion with Jeff Woeber, I want to submit this as a feature request to LogicMonitor, since each alert threshold within each datasource (e.g. Tomcat ThreadPool- ) can have its own wiki troubleshooting page. It would be a great feature if LogicMonitor let users specify their own troubleshooting page as an optional field for each datasource. Users could then include a specific wiki page as a recommendation whenever an alert is sent to PagerDuty.
  7. Hi, It appears that all alerts generated from eventsources are cleared after 60 minutes by default. We have a LogicMonitor -> PagerDuty setup, but the 'cleared' alerts were not cascade cleared in PagerDuty. Here are two examples: (1) https://arthrex.logicmonitor.com/santaba/uiv3/alert/index.jsp#detail~id=LME152627&type=eventAlert (2) https://arthrex.logicmonitor.com/santaba/uiv3/alert/index.jsp#detail~id=LME152625&type=eventAlert Please verify and see whether there is a quick way to add cascade clearing in PagerDuty. Thanks! Horace
  8. Hi, I had a discussion with John Matthews this morning and would like to raise this as a feature request. Let's assume a Linux VM is monitored by LogicMonitor via SNMP. If its disk usage exceeds 90%, I (the IT administrator) want to issue an SNMP SET request that executes a script to either troubleshoot that VM or run some cleanup routines. As of today LogicMonitor only allows users to post the alert to an external system (https://www.logicmonitor.com/support/settings/integrations/custom-http-delivery/). Is there any chance this feature could be added in a future LM release? Thanks & Best Regards, Horace
  9. Hi, Recently I had a chance to chat with the LogicMonitor support team, and they recommended submitting a new feature request to the LogicMonitor DEV team. In short, when a specific exception is thrown and logged in the Tomcat log file, we would like to see not just the line that reports the exception but also multiple lines before and after it. Right now LogicMonitor can display only the "exception" line itself. It would be really helpful in both production and QA environments if LogicMonitor could display multiple lines before and after the exception line matched by the regex. Here are 2 examples:

     1. Production - (Pattern to match: "HTTP/1.1failed with response Service Unavailable")

        2016-08-04 10:13:49,372 ERROR [NmsThumbnailProvider] GET http://10.101.84.12:8080/barco-webservice/rest/NetworkWall/proxy-source/dvi1-1-mna-2530007307/thumbnails/snapshot HTTP/1.1failed with response Service Unavailable
        2016-08-04 10:13:50,733 DEBUG [NmsEventMonitor] longPollNmsEvent response [{"id":1273,"properties":{"attribute":"MODE","object":["OFFLINE"],"type":"Device","name":"NETVIZDONGLE"},"value":["OFFLINE"],"values":["OFFLINE"],"affectedAttributes":["MODE"],"uuid":"d9b13d40-0870-1c02-e000-0004a5281cd0","elementID":"d9b13d40-0870-1c02-e000-0004a5281cd0","source":null,"device":true}] code 200
        2016-08-04 10:13:50,733 INFO [RoutingEventServiceImpl] got NMS event: [{"id":1273,"properties":{"attribute":"MODE","object":["OFFLINE"],"type":"Device","name":"NETVIZDONGLE"},"value":["OFFLINE"],"values":["OFFLINE"],"affectedAttributes":["MODE"],"uuid":"d9b13d40-0870-1c02-e000-0004a5281cd0","elementID":"d9b13d40-0870-1c02-e000-0004a5281cd0","source":null,"device":true}]
        2016-08-04 10:13:50,733 INFO [RoutingServiceImpl] onDeviceChanged: controller = 10.101.84.12

     2. QA testing - (Pattern to match: "Caused by: java.lang.NullPointerException")

        at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:992)
        at java.lang.Thread.run(Thread.java:724)
        Caused by: java.lang.NullPointerException
        at com.arthrex.synergy.routing.nexxis.RoutingControllerServiceImpl.attemptToTelnetRoutingController(RoutingControllerServiceImpl.java:66)
        at com.arthrex.synergy.routing.nexxis.RoutingControllerServiceImpl.tenetToNmsService(RoutingControllerServiceImpl.java:46)

     Please let me know whether this can be turned into a feature for a future release. It would help reduce troubleshooting time. Thanks & Best Regards, Horace Cheung
  10. Hi, I've discussed this with John (in Austin) from the support team. The 'value' and 'datapoint' fields are not currently displayed in the alert details window. It would be nice if end users could see these two fields there, as that would be a lot more intuitive. Thanks & Best Regards, Horace
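
The alert grouping requested in post 2 (many related alerts collapsed into a single alert carrying the highest severity) can be illustrated with a minimal Python sketch. This is not LogicMonitor or PagerDuty functionality: the `Alert` class, the `SEVERITY_RANK` ordering, the `consolidate` function and the group name `tomcat-outage` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Assumed severity ordering for this sketch; a higher number is more severe.
SEVERITY_RANK = {"warning": 1, "error": 2, "critical": 3}


@dataclass(frozen=True)
class Alert:
    source: str    # component that raised the alert, e.g. "Tomcat"
    severity: str  # one of the keys of SEVERITY_RANK
    message: str


def consolidate(alerts, group_of):
    """Return a dict with one alert per group: the most severe one.

    group_of maps an alert source to a user-defined group name.
    A source without a mapping forms a group of its own.
    """
    winners = {}
    for alert in alerts:
        group = group_of.get(alert.source, alert.source)
        best = winners.get(group)
        if best is None or SEVERITY_RANK[alert.severity] > SEVERITY_RANK[best.severity]:
            winners[group] = alert
    return winners


# Ten ActiveMQ 'error' alerts plus one Tomcat 'critical' alert, as in the post.
alerts = [Alert("ActiveMQ", "error", f"consumer count is zero on queue {i}") for i in range(10)]
alerts.append(Alert("Tomcat", "critical", "Tomcat shutdown"))

# Hypothetical user configuration: both sources belong to the same group.
groups = {"Tomcat": "tomcat-outage", "ActiveMQ": "tomcat-outage"}
result = consolidate(alerts, groups)
```

With this configuration the eleven alerts collapse into the single Tomcat 'critical' alert; the ActiveMQ 'error' alerts are suppressed rather than forwarded individually.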
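
The per-datasource "recommendation" field described in posts 5 and 6 could look roughly like the sketch below, which attaches a wiki URL to an outgoing alert message. The two datasource names and URLs come from post 5; the payload keys (`datasource`, `threshold`, `message`, `recommendation`) and the `build_payload` function are assumptions for illustration and do not reflect the actual LogicMonitor or PagerDuty message schema.

```python
import json

# Datasource-to-wiki-page mapping, taken from the two examples in post 5.
RECOMMENDATIONS = {
    "WinVolumeUsage": "http://confluence.arthrex.io/WinVolumeUsagealert",
    "TomcatLogDirSize": "http://confluence.arthrex.io/How+to+Adjust+Log+Cleanup+Interval",
}


def build_payload(datasource, threshold, message):
    """Build a JSON alert message (hypothetical schema).

    A "recommendation" key is added only when a wiki page is
    configured for the datasource, so the field stays optional.
    """
    payload = {"datasource": datasource, "threshold": threshold, "message": message}
    url = RECOMMENDATIONS.get(datasource)
    if url is not None:
        payload["recommendation"] = url
    return json.dumps(payload)


print(build_payload("TomcatLogDirSize", "2 GB", "Tomcat log directory is too large"))
```

Keeping the mapping outside the message template is what avoids hard-coding one recommendation for every alert.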
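
The behaviour requested in post 9 (showing several lines before and after a log line matched by a regex) is similar to what `grep -B`/`-A` does on the command line. A minimal Python sketch follows; the function name `match_with_context` and its `before`/`after` parameters are hypothetical, and the sample lines are the QA stack trace quoted in the post.

```python
import re


def match_with_context(lines, pattern, before=2, after=2):
    """Return one block of lines per regex match.

    Each block holds the matching line plus up to `before` lines
    ahead of it and up to `after` lines following it.
    """
    regex = re.compile(pattern)
    blocks = []
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            blocks.append(lines[start:end])
    return blocks


sample = [
    "at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:992)",
    "at java.lang.Thread.run(Thread.java:724)",
    "Caused by: java.lang.NullPointerException",
    "at com.arthrex.synergy.routing.nexxis.RoutingControllerServiceImpl.attemptToTelnetRoutingController(RoutingControllerServiceImpl.java:66)",
    "at com.arthrex.synergy.routing.nexxis.RoutingControllerServiceImpl.tenetToNmsService(RoutingControllerServiceImpl.java:46)",
]

blocks = match_with_context(
    sample, r"Caused by: java\.lang\.NullPointerException", before=1, after=1
)
for block in blocks:
    print("\n".join(block))
```

Overlapping matches produce overlapping blocks in this sketch; a production version would probably merge them.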