Eric Egolf


  1. If I understand this approach correctly, it may also solve another item I have been looking at for years: comparing the average CPU usage for a period of time, let's say the average of an hour on Monday, with the average for the same hour on the previous Monday, and alerting if it is higher by, say, 30%. The same holds true for bandwidth on my 300-some-odd customer firewalls, where we always seem to find out after a customer calls saying the internet is slow...then a quick check in LogicMonitor shows they are using way more bandwidth than usual. I would much prefer to simply have a datasource called "Internet Usage Increase" that calls these things out.
  2. Cole, are you suggesting that I could create a datasource, say "Smoothed CPU"? That datasource would be a PowerShell script datasource. Then, in the manner described in your post, that script would pull the last 10 datapoints from another datasource, say CPU Processor Queue, via the API and then do the averaging/smoothing?
  3. We have datapoints that are very spiky by nature. In order to see the signal through the noise, so to speak, we need to average something like 10 datapoints together, effectively smoothing the data. For example, if we took 1-minute polls of CPU Processor Queue or CPU Ready, we would want to plot the average of the past 10 datapoints. If anyone has suggestions on how to do this, or on how they approach datasets that are inherently too noisy for threshold-based alerting, I would love to hear about it.
  4. Perfect, thanks Cole. This worked very well for me. My only comment is that I had to find the location of the Applications and Services Logs. I found this article that helped.
  5. Thanks Cole...great approach...any chance you can share your PowerShell code or datasource?
  6. Background - We have a fairly large Citrix environment (70 customers, 1200 users). Each customer has one or more XenApp servers, depending on how many users they have. The environment is set up in a manner that oftentimes the first step in troubleshooting is having the users log on/log off (which obviously creates an event ID). We would like to plot the number of logons/logoffs (via event IDs) per 10-minute period and look for anomalies (periods of high logons/logoffs relative to normal or relative to the number of users in the environment). The first step for us is simply plotting the data. Any ideas on the best way to approach this problem? My initial thought is simply to write a PowerShell script to search for the event IDs over the last 10 minutes and return the number, then apply this to each XenApp server in LogicMonitor, but maybe there is a better approach? I also don't know the best approach to aggregate by customer or even factor in the number of users...I'm assuming we would need to export to Excel to handle some of that. Ideas welcomed.
  7. Is there a way to use a wildcard or something else to select host groups in the NOC widget? We have all our customers listed in customer/*, and it is an ever-growing list, about 80 long. Is there a way to automatically add them? Eric Egolf President, CIO Solutions
  8. 100% agree. 99% of the time, no data is caused by 1.) a credential issue, 2.) an issue with services on the server, 3.) a datasource instance issue, or 4.) some other configuration issue. Only 1% of the time is it an issue with the server that could keep it from functioning properly. This means that the path to resolution is much different than for an alert. As an MSP, handling that path differently, i.e. with a different skill set and priority, would be nice.
  9. We find many instances where, even with Group Policy set to auto-restart the services, they still hang. In addition, Group Policy works when 1.) you have a very logical structure (it is not as good for one-offs) and 2.) you want it to be the authoritative location for settings/standards. For us, LogicMonitor is the authoritative place to define the services we care about monitoring. It is authoritative through a combination of Active Discovery and manual adjustments. Because it is authoritative, it is also my de facto standard. To then have to create the standard in a document outside LogicMonitor and keep that standard up to date is something we will fail at. Every time we bring on a new customer, we need to set up monitoring in LogicMonitor and then modify the Group Policy. Anytime that customer deviates from the standard, I need to tweak LogicMonitor and my Group Policy. Simply put, that is lots more work when you manage 80-some-odd customers that are the same yet different, and those differences are already being captured in the monitoring tool.
  10. More often than not I have a host-centric view of the system rather than a datasource/alert view, i.e. I want to know which hosts are impacted and then which datasources on each host. The current and past UI only allow the datasource/alert view of the work. It would help if you allowed us to group on certain fields, in my case the host name field or customer name (group name represents customer name), that could collapse/expand to show the alerts. This would be huge for us, and I'm sure it would add value to a number of other use cases. Something like:
      + Server1(2)
          - Disk Space Alert
          - CPU Alert
      - Server2(4)
      - Server3(1)
      + Server4(3)
          - CPU No Data
          - Memory No Data
          - Disk Volume No Data
  11. The normal SDT works for some scenarios but not all. I see at least 3 different SDT options to use when scheduling an SDT:
      SDT Option 1 - Scheduled Maintenance - This is what SDTs are currently for. I usually use this when we know about the work, either through scheduling or once a problem has been reported on a host we are working on. It still displays the alerts but doesn't trigger emails or create tickets.
      SDT Option 2 - Normal SDT + disable visible alert - This option would NOT display the alerts for the given datasource or host while in SDT. Usually we do this when we know we won't care about the datasource or host for a week or a month. I recognize that a lot of the reports and views have options to show alerts in SDT or not, but when you have 10 or 15 people using LM, the fact that alerts are still in most of the views by default (coloring servers, icons at the top) gives the appearance of a messy and unkempt system, which in turn leads those 10 people to think they should leave it unkempt.
      SDT Option 3 - Option 2 + alert trigger if still in alert state - This option would be the same as Option 2 but would cause a specific alert to trigger if the host/datasource is still in SDT.
  12. Steve, I emailed you about this but thought I might post it to draw from the collective. As we add more and more customers, integration of their network maps into LogicMonitor would be very helpful. Here is what I am thinking…maybe a tall order, but it doesn’t hurt to think big. There could be a separate Maps tab with a directory of our customer maps. Each map object could be given a unique name and would also have a .jpg associated with it. Then any datasource instance could have one or more map objects it is associated with, along with the X,Y coordinates on those maps. When you click on a map object in the Maps tab, it would render to JPG with red/yellow/green lights at the X,Y coordinates of the associated datasources. You would also need to provide a visual way to associate host datasources with the right locations (X,Y) on the maps. Borrowing some ideas from InterMapper might be helpful as well.
      Bonus: If there was a “slide show” widget that would rotate through all our maps, we would put it on the NOC LCD and allow everyone in our support department to hold each other accountable for updated/updating maps.
      Bonus 2: If the map object could be an http link with credentials, then we could actually have the map object point to our Confluence page, which is the authoritative location for customer network maps.
      Bonus 3: If the map object also had an optional section with something like the following categories with icons, then I could associate customer server datasources with logical categories. This, combined with the map, would effectively provide an unparalleled “quick view” into both the maps and the app status of all our 50-some customer environments.
          - DB
          - Email
          - App Server
          - Firewall
          - Network
          - Internet (or Network) Speed –
          - Disk Space -
      Bonus 4: Where the is located the ability to provide a datasource value such as percent free space. If you provided integration into something like iperf, then it could also indicate network/internet speeds gathered from iperf periodically.
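
For the ideas in posts 1, 3, and 6 (week-over-week comparison, smoothing the last 10 datapoints, and counting logon/logoff events per 10-minute period), the arithmetic itself can be sketched in a few lines of Python. This is only an illustration, not LogicMonitor code and not Cole's script: the function names and parameters are made up for the example, and it assumes the datapoints or event timestamps have already been pulled from wherever they live (the API, the event log, etc.).

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def moving_average(values, window=10):
    """Smooth a series: each point becomes the mean of itself and up to
    `window - 1` preceding points (fewer at the start of the series)."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def exceeds_baseline(current, baseline, threshold_pct=30.0):
    """Return True if the mean of `current` (e.g. this Monday's hour) is
    more than `threshold_pct` percent above the mean of `baseline`
    (e.g. the same hour last Monday)."""
    base = mean(baseline)
    if base == 0:
        # No meaningful percentage against a zero baseline;
        # treat any usage at all as an increase.
        return mean(current) > 0
    increase_pct = (mean(current) - base) / base * 100.0
    return increase_pct > threshold_pct


def count_per_window(timestamps, window_minutes=10):
    """Count events per fixed window, keyed by the window's start time.
    Assumes `window_minutes` divides 60 evenly (10, 15, 30, ...)."""
    counts = Counter()
    for ts in timestamps:
        floored_minute = ts.minute - ts.minute % window_minutes
        window_start = ts.replace(minute=floored_minute, second=0, microsecond=0)
        counts[window_start] += 1
    return dict(sorted(counts.items()))
```

In a script datasource, the smoothed value or a 0/1 "over baseline" flag would be what the script returns as its datapoint, and the per-window counts would be computed per XenApp server and summed by customer group afterwards. How the raw values get fetched is deliberately left out, since that depends on the API and credentials in use.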