Steve Francis

LogicMonitor Staff
  • Posts

    267

Everything posted by Steve Francis

  1. The new UI shows the dashboard title when in full screen mode.
  2. So option 2 would be disabling alert evaluation and display during the period of SDT. We will be adding that option. Can you elaborate on option 3? Are you saying you want an alert based on SDT duration, or something?
  3. So, firstly, apologies that this has taken so long to get to... What I describe above (allowing filters per rule, for Ack, clear, SDT, etc.) is still planned, but due to various dependencies on things changing in the UI, it is still a ways out. What I would like to do in the short term is to simply make, as dfischer proposed, the "Suppress Alert Clear" option also suppress the Ack and SDT messages. So if this is selected (and probably renamed something like "Suppress alert status updates") an alert will still escalate (through multiple stages); an increase in severity will still be sent; but messages regarding Acks, SDTs and clears will not. To my mind these are logically similar. If you do not want a notification of an alert ending (so you no longer have to deal with it), you probably also do not want to know that someone put it in SDT, or Acked it, so you also do not have to deal with it. So this does not address people that use the first stage for ticketing systems, but should cover most other use cases. Feedback?
  4. So, firstly, apologies that this has taken so long to get to... What I would like to do in the short term is to simply make the "Suppress Alert Clear" option also suppress the Ack and SDT messages. So if this is selected (and probably renamed something like "Suppress alert status updates") an alert will still escalate (through multiple stages); an increase in severity will still be sent; but messages regarding Acks, SDTs and clears will not. To my mind these are logically similar. If you do not want a notification of an alert ending (so you no longer have to deal with it), you probably also do not want to know that someone put it in SDT, or Acked, so you also do not have to deal with it. So this does not address people that use the first stage for ticketing systems, but should cover most other use cases. Feedback?
  5. So, firstly, apologies that this has taken so long to get to... What I would like to do in the short term is to simply make the "Suppress Alert Clear" option also suppress the Ack and SDT messages. So if this is selected (and probably renamed something like "Suppress alert status updates") an alert will still escalate (through multiple stages); an increase in severity will still be sent; but messages regarding Acks, SDTs and clears will not. To my mind these are logically similar. If you do not want a notification of an alert ending (so you no longer have to deal with it), you probably also do not want to know that someone put it in SDT, or Acked, so you also do not have to deal with it. So this does not address people that use the first stage for ticketing systems, but should cover most other use cases. Feedback?
  6. This often comes up, but it's a tricky issue, because the things we graph (say, total disk space, and used disk space, in GB) are not necessarily the things we alert on (such as percentage of disk space in use). So we could do it for those graphs where the alert threshold is set on a datapoint that is graphed, but that would be inconsistent. We're still mulling this.
  7. So, firstly, apologies that this has taken so long to get to... What I would like to do in the short term is to simply make the "Suppress Alert Clear" option also suppress the Ack and SDT messages. So if this is selected (and probably renamed something like "Suppress alert status updates") an alert will still escalate (through multiple stages); an increase in severity will still be sent; but messages regarding Acks, SDTs and clears will not. To my mind these are logically similar. If you do not want a notification of an alert ending (so you no longer have to deal with it), you probably also do not want to know that someone put it in SDT, or Acked, so you also do not have to deal with it. So this does not address people that use the first stage for ticketing systems, but should cover most other use cases. Feedback?
  8. This is done in the most recent netflow refresh, in the new UI. Note that the "As percentage of line speed" idea is not - I'll open a separate ticket for that.
  9. The new Netflow now stores the top 1000 flows for each minute, and does flows by port, destination, etc. It came out in January... Check it out and let us know what you think.
  10. I don't think this is a feature we'll be adding in (the capacity for confusion seems pretty high...) but you can achieve a similar effect. You could leave the polling at 1 minute at all times, but have time-based escalation chains, so that during work hours, when you want a 15-minute window in which work could be done, you have a first stage that does nothing with the alerts, and a 15-minute escalation interval. So for 15 minutes, the alerts show in the UI, but do not get routed to anyone. After that, they get escalated. The rest of the time, the first stage would immediately route to someone. This is not ideal, as you still see the alert in the UI during the 15 minutes, but it's close...
  11. There is not currently an API call to update instance descriptions - but if you just want to stop them from being removed automatically, that's easy enough. Just edit the datasource and remove the OID from the description field. Then you can manually edit the instance descriptions, and they will never be wiped out. (Of course, this means that if you do have switches that support interface descriptions, those will not be pulled in.)
  12. Actually, you can do this now. There is just no clue in the UI that you can, unfortunately. But when adding a metric to a metric trend report, for example, instead of adding InOctets, you can make it InOctets*8, say, to convert to bps. Or use any expression you can use in a datapoint.
  13. We will be adding this in - but in the interim, you could use the LogicMonitor/Pagerduty integration (http://help.logicmonitor.com/integrations/pagerduty/) which can open/clear issues, but does not send SDT alerts.
  14. This works if you reference the root element first, i.e. $[0].value works to extract the value element of the first member of the array. There is more documentation at http://help.logicmonitor.com/using/datasource/creating-datasources/defining-datapoints/post-processor-methods/JSON (Caveat: I tested this in v54, but there were improvements to JSON processing in v54. It's possible this may not work in v53 - but you'll be upgraded to v54 within a week or two.)
  15. This is part of release 54, which should be active on your account within 2 weeks. http://help.logicmonitor.com/using/event-sources/windows-event-log-monitoring/ComplexEvents
  16. You can do it now via Groovy scripting, but look for an easy integrated way to do this (and much more Amazon-related) to be announced next month, around re:Invent.
  17. Would this data be actionable (used proactively, possibly with alerts on)? Or just used in cases of troubleshooting?
  18. Does this control make more sense at a per-user level, or a per-chain level? I.e. do I as user Steve want to say "for all alerts, regardless of how I received them, I only ever want to get the Acks or Clears via at most SMS (even if I got them via voice call)"? Or "for all alerts, regardless of how I received them, I only ever want to get the Acks or Clears via email (even if I got them via SMS or voice call)"? This would affect that user, for all chains, but be controllable by that user. Or does it make more sense to have this control be at the chain level, so whoever creates the alert escalation controls which mechanisms are used for Acks/Clears? This would be specific to a chain (so a less critical set of alerts may have the Acks sent via email, while a more critical set will have Acks sent via SMS, too). But users would not be able to individually control this themselves (unless they had rights to change the chain). If it was changed, it would affect all users that are destinations of that chain.
  19. Hmm - that seems like a mistake. We'll fix that. If alerting is disabled, so the alerts go away, it should not show as dead...
  20. Yes, this (and a whole bunch of other workflow improvements) is definitely being thought about, but won't be tackled until after the initial rollout of the new UI.
  21. We don't make this easy to determine - I'll open some enhancement requests now. But, in this case, there are two issues:
      - There is a dead host being monitored by that collector - TWM... Each thread has to time out for each of the things being collected on that host. Deleting it will help things a bit.
      - The bigger issue is that you have the wmi.user and wmi.pass properties set on all the hosts monitored by this collector. This means that the collector has to check its authentication every single time, for every request for every host. It's much faster if you run the collector services as a domain account, and do not set wmi.user/wmi.pass. That removes several Windows authentication steps.
  22. Currently the most frequent polling interval is 1 minute. (There are actually a bunch of backend changes that have happened to enable greater frequency sampling and better historical resolution. But these are still going through regressions, running in parallel with the current system. They won't be out for a few months.) So... I cannot think of a good solution to this, except to also collect the Latency Session Deviation. You could plot the deviation plus the average to get the absolute worst possible case that latency could have been. Maybe there is a better idea, though...
  23. In the collector config file, agent.conf, there is a line: collector.wmi.threadpool=50. You can increase this to 100, then restart the collector service. You can also do this from within LogicMonitor - from Settings...Collectors, select the tool icon for the collector, then Restart Collector, check Override Agent.conf, and edit the line there, and submit. One issue that causes this is monitoring dead devices (they may have 100 checks associated with them, each of which now ties up a thread as it times out...). There are a bunch of improvements coming in this area soon, so there will be much more scalability with much less tuning required...
  24. Yes, we are guilty as charged here. We don't yet make it easy to extract textual data from the devices. It is planned (but not scheduled yet). But, there are two workarounds, one more kludgy than the other... The less kludgy one is that we may already be collecting the data (e.g. the sysinfo on network devices usually includes the software version. This is already collected, and you can run a report on this. It's visible in the system tab on the devices page.) The more kludgy one is that you can write a datasource, which discovers the info you want via SNMP, and uses that as the name for an instance, but doesn't actually do anything or collect anything on that instance. I can provide more details, or an example, if you want...
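
As a rough illustration of the JSON path in post 14, here is what $[0].value selects, written as plain Python rather than as LogicMonitor's own post-processor. The sample payload and the helper name first_value are made up for the example; only the path expression itself comes from the post.

```python
import json

# Hypothetical payload: the response is a top-level JSON array of objects.
payload = '[{"name": "cpu", "value": 42}, {"name": "mem", "value": 17}]'


def first_value(raw):
    """Plain-Python equivalent of the path expression $[0].value."""
    root = json.loads(raw)   # "$" refers to the root element (here, the array)
    first = root[0]          # "[0]" picks the first member of the array
    return first["value"]    # ".value" reads that member's value element


print(first_value(payload))
```

The point is the same as in the post: because the root of the document is an array, the path has to start at the root and index into it before naming the element.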