Allen Chan

Members
  • Content Count

    40
  • Days Won

    5

Community Reputation

7 Neutral

About Allen Chan

  • Rank
    Community Whiz Kid

  1. Allen Chan

    Polling interval < 1 min

    LM Team, any plans to support polling intervals under 1 minute (e.g., 30 or 15 seconds)? Various teams in my org have requested this functionality.
  2. Our NOC is complaining that there are a lot of transient no-data alarms that are hard for the Ops teams to troubleshoot. Please allow administrators to set a number of consecutive no-data polls before alarming, to reduce alerts and engage the Ops team only for sustained issues.
  3. Allen Chan

    WMI testing utility

    Does anyone know of a good, safe utility to test WMI from a LogicMonitor Windows collector to Windows servers? I know the collector has features in debug mode to test WMI, but I usually like to validate with third-party tools as a second source of information. I currently use the WMI Query tool by Ben Coleman, but it is hosted on semi-shady sites that I worry about visiting. If anyone has good experience with another utility, please let me know. Thanks.
  4. Allen Chan

    Filter instances by those that are alarming

    LogicMonitor team, any response on this request?
  5. Allen Chan

    New GUI pop up legend creates a problem

    This is a problem for us too. With a large number of datapoints, we cannot easily tell which server some of the lines relate to. Please fix ASAP.
  6. This kind of feature would be amazing.
  7. Allen Chan

    No data threshold

    Mike, are you saying that if we set the alert trigger interval (consecutive polls), the no-data alert will honor it as well? I.e., if the trigger interval is set to 2, would the no-data alarm only fire if the collector receives no data after three consecutive polls?
  8. Allen Chan

    No data threshold

    I would like this as well.
  9. Allen Chan

    Pause Monitoring per device

    Steve, does the fact that you call your solution a hack mean that you guys are working on a better way to do this?
  10. We have network devices that have hundreds, if not thousands, of interfaces. We do not let LM automatically delete instances when they go down because servers typically do not reboot often, so it is the kind of event we need to be informed about. Once in a while we need to clean up the instances on these network devices. Scrolling through 26 pages of instances to delete the ones in alarm is a tedious task. The feature request is to let us add a filter that displays only instances that are in alarm. Then we can check the empty box to select all and do a group delete. This would save us a lot of time.
  11. Allen Chan

    Pause or halt polling

    Most monitoring systems have a way to suspend monitoring of a host. Use case: we had an outage and the Ops team blamed monitoring. They asked us to stop polling to prove their theory. Unfortunately, right now the only way is to delete the host from monitoring or hack at it by changing the IP to some fake IP. Neither is a real solution. Please add this basic feature.
  12. Based on the scaling collectors page, it sounds like increasing thread counts is the first step in scaling collector capacity. It would be nice to provide metrics on the # of configured threads and # of used threads for the popular collection types. With this information, we can see when we are getting close to running out of threads for a collection type and add more.
  13. I have brought this up before and was shot down with "works as designed". We 100% agree with this statement: "Second, when an alert crosses a threshold the second time a week after the original acknowledgement (as we saw in my first post) I think it is safe to assume that should be considered a new 'alert session.'" We have cases with the following conditions:

      1. Alert triggers on warning threshold
      2. NOC acks with "monitoring"
      3. Alert crosses error threshold
      4. NOC escalates to SME
      5. NOC acks with "escalating to SME"
      6. Alert crosses critical threshold
      7. NOC acks with "incident created. Management informed"
      8. SME remediates just enough to move the alert down to warning
      9. SME informs NOC the issue is fixed
      10. NOC closes the incident and resumes watching the alert page
      11. Alert crosses error threshold
      12. No notification
      13. Alert crosses critical threshold
      14. No notification
      15. Server crashes
      16. People ask why there was no alert

      As a monitoring service, over-communication is 100x more acceptable than a server crashing.
  14. We used to have the ability to poll at much longer intervals than the current options allow. Please bring back polling intervals of 6 hours, 12 hours, and 1 day.
  15. Allen Chan

    Logicmonitor monitored items report

    Instead of every single host, wouldn't it be more efficient to iterate through the active datasources (those that have associated hosts > 0) and print out the details needed? Then we get a concise list of monitored items (that relate to our infrastructure), and it is up to the administrators to explain the appliesTo. For example:

        datasource 1 appliesTo
          datapoint1 description threshold
          datapoint2 description threshold
          ....
          datapointN description threshold
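
The report idea in the last post could be sketched roughly as follows. This is only an illustration of the iteration being proposed, not LogicMonitor's API: the `DataSource` and `Datapoint` classes, their field names, and `monitored_items_report` are all hypothetical, and the datasource list is assumed to have been exported from the portal by some other means.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Datapoint:
    # Hypothetical record for one datapoint of a datasource.
    name: str
    description: str
    threshold: str


@dataclass
class DataSource:
    # Hypothetical record; host_count is the number of associated hosts.
    name: str
    applies_to: str
    host_count: int
    datapoints: List[Datapoint] = field(default_factory=list)


def monitored_items_report(datasources: List[DataSource]) -> str:
    """List only active datasources (host_count > 0) with their datapoints."""
    lines = []
    for ds in datasources:
        if ds.host_count <= 0:
            # Skip datasources that are not applied to any host.
            continue
        lines.append(f"{ds.name} | appliesTo: {ds.applies_to}")
        for dp in ds.datapoints:
            lines.append(
                f"  {dp.name} | {dp.description} | threshold: {dp.threshold}"
            )
    return "\n".join(lines)
```

Iterating per datasource rather than per host keeps the report size proportional to the number of active datasources, which is the efficiency the post is asking about.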