Kwoodhouse

Members
  • Content count

    10

Community Reputation

0 Neutral

About Kwoodhouse

  • Rank
    Community Whiz Kid
  1. 497 days and counting........

    Hi Michael. After spending two weeks troubleshooting this with LogicMonitor support, it turns out that the updated datasources will not fix the problem for my devices. The initial support rep said these datasources would solve the problem, and this blog post makes it seem as though updated datasources are all you need.

    I think a more detailed write-up here would be great, letting people know that while LogicMonitor has updated datasources that are capable of resolving the problem, the requirement is that your hosts respond to .1.3.6.1.2.1.25.1.1.0. If you update your software to the latest version and still don't get an snmpwalk response from OID .1.3.6.1.2.1.25.1.1.0, then these updated datasources will NOT resolve the 497-day uptime issue. At that point you need to work on an SSH/Telnet datasource to check the uptime from the CLI, or train your internal team to recognize that if the reported uptime was 497 days, the alert is likely a false alarm. A better alert message containing info about the known 497-day uptime issue would also be good.

    Either way, having this info included here would likely have saved me two weeks of troubleshooting. Hopefully this helps someone else. The devices involved for me were Cisco fabric interconnect switches, UCS-FI-6248UP, with the latest 3.2.3i software. Best Regards.
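For anyone wondering where 497 days comes from: the standard sysUpTime object (.1.3.6.1.2.1.1.3) is an SNMP TimeTicks value, an unsigned 32-bit count of hundredths of a second, so it rolls over to zero after 2^32 ticks. The sketch below works through that arithmetic and shows one possible way to tell a rollover from a real reboot. The function names and the one-day margin are made up for illustration; this is not how LogicMonitor's datasources do it.

```python
# sysUpTime is TimeTicks: an unsigned 32-bit counter of 1/100 s.
TICKS_PER_SECOND = 100
MAX_TICKS = 2 ** 32
SECONDS_PER_DAY = 86_400


def wrap_period_days() -> float:
    """Days until a 32-bit TimeTicks counter rolls over to zero."""
    return MAX_TICKS / TICKS_PER_SECOND / SECONDS_PER_DAY


def looks_like_wrap(previous_ticks: int, current_ticks: int,
                    margin_days: float = 1.0) -> bool:
    """Guess whether a drop in uptime is a counter wrap rather than a reboot.

    margin_days is a hypothetical tolerance: the previous reading must be
    within that many days of the 32-bit maximum for the drop to count as
    a wrap.
    """
    margin_ticks = int(margin_days * SECONDS_PER_DAY * TICKS_PER_SECOND)
    dropped = current_ticks < previous_ticks
    was_near_max = previous_ticks >= MAX_TICKS - margin_ticks
    return dropped and was_near_max


if __name__ == "__main__":
    print(f"Counter wraps after {wrap_period_days():.2f} days")
    print(looks_like_wrap(MAX_TICKS - 1_000, 500))
```

A check like this only helps if you keep the previous poll value around; a device that genuinely reboots on day 497 would still be misread, so treat it as a hint, not proof.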
  2. 497 days and counting........

    Is it possible to add more detail on this topic? I tried to implement the fix last night for a couple of Cisco fabric interconnect switches, but it didn't seem to work. I have several "Uptime" datasources now, and I don't know which one includes the fix or what device types they apply to.

    I also have a question about how this fix works. Is the fix purely on the LM datasource side, or is a software update of some kind required on the Cisco side?

    I now have at least 5 different datasources that appear to monitor uptime. What's the best way to identify which ones include the fix?

      • Host Uptime- SNMP OID field says .1.3.6.1.2.1.25.1.1
      • HostUptime- SNMP OID field says .1.3.6.1.2.1.25.1.1
      • SNMP_Engine_Uptime- SNMP OID field says 1.3.6.1.6.3.10.2.1.3
      • SNMP_HostUptime_Singleton SNMP OID field says .1.3.6.1.2.1.25.1.1.0
      • SNMPUptime- SNMP OID field says .1.3.6.1.2.1.1.3
  3. Cisco WLC total connected AP's

    Thanks. This worked, but I had to change the Applies To field to the following for it to show up on our AIR-CT5508 and AIR-CT2504 units: system.sysinfo =~ "Cisco Controller"
  4. The alert acknowledgement email template seems to be the only one not available for editing in the settings section. It would be nice to add ##GROUP## to the subject line to match the initial alert and alert cleared emails. The ack emails don't seem to list the group in the subject or body of the email, which makes them difficult to search for. You need to search based on the alert ID or IP address rather than being able to search by group. I find people search by group because it is included on all other emails, and then I have to help them understand why the ack emails aren't included in their search results. If there is some other way to add ##GROUP## to the ack email template, let me know. Hopefully this is an easy fix. Thank you!
  5. Internal Service Monitors

    Glad someone else already posted the idea. I would love to monitor our internal ticketing system via the services tab. I need some kind of graph that could show latency and downtime, but I can't use the services page because the source would need to be one of our internal collectors. I would love to see the services tab get some love and be able to use our collectors instead of the LM servers.

    Thanks!
  6. netscan progress indicator

    Tom, I agree with you 100%. I hope to see changes in the visibility of a netscan. It's a great feature, but you have no idea how far along you are or what errors were encountered along the way. I would add to this and say the following:

    1. The netscan seems to happen in stages: nsplist, nspdetails, etc. State the current stage, its progress, and possibly a time indicator until the next stage occurs. Currently the debug commands used to acquire some of this information display inaccurate data and give no progress indicator.
    2. Results page. How did the scan go? Which devices failed to respond to my SNMP string? Which IPs responded to ping but not WMI? How long did the scan take?
    3. Similar page or progress indicator for Active Discovery. Once the netscan is done, you have your hosts in a group, and Active Discovery takes over and starts scanning the hosts to determine which polls to load on each host. Right now, repeatedly refreshing the page or running debug commands are the only ways to see progress. But these methods give you no progress indicator showing how much time has passed, how much time is left, or any errors that occurred during the discovery.
    4. Possibly a retry option for any of the hosts that responded to ping but failed SNMP. I wouldn't want to run a /16 scan twice just to try a second SNMP string. How about letting me pick the items that failed to communicate via the first SNMP string and try them with a second SNMP string?
    5. Filters. Larger netscans can take many hours to finish. What if I only want my HP servers, or Cisco switches? Maybe add some filter options to the netscan to increase the speed. For example, only load hosts that respond and have a MAC address for Cisco. Then you only need to run Active Discovery on those hosts. (The MAC address was just an example; I don't know the best way to actually verify a specific OEM.)

    Either way, I think the netscan process is very important for loading large numbers of devices or scanning entire subnets. There is lots of room for improvement in this area, and I hope to see some development here. Happy to discuss this with any DEV should they have questions or need testing.
  7. Is it possible to enhance the new collector datasource with the ability to see how many hosts that collector is monitoring? Right now I believe you need to go into settings, collectors, wrench, show associated hosts, which will show all hosts that collector is polling. Currently an end user or customer has no way of seeing which collector is being used without access to the settings page. Is it possible to create a datasource within the new collector datasources that would show the number of hosts that collector is monitoring? This might provide more context for the new collector stats. It would be great to say: look at the graphs when I have 25 hosts being monitored by this collector, now look at the graphs when I have 50 hosts. Possibly a "Hosts Associated" graph would work wonders.
  8. Inventory / Site survey tool

    Philop Schorr suggested I drop a comment here for a feature that Jeff Behl and I were working on. You can look at ticket #2722, or get my number from Jeff and call me if you want a rundown on the details.

    Basically, I am looking for a site survey or inventory tool. It would be great to drop a collector on a customer's computer, run a scan on their local subnet or a range of IP addresses with one or multiple SNMP strings, have all the devices loaded into a group, and then report specific output to an Excel report. Things I would be interested in seeing on the report:

    ITEM ID    SERIAL NUMBER  SOFTWARE VERSION     SNMP STRING  IP ADDRESSES  HOSTNAME    UPTIME
    CISCO2801  FTXxxxxxxx     12.4(28)T8 IPBASEK9  Test         192.168.1.2   2801CMElab  1 day, 13 hours, 15 minutes

    It would be great to get a report of all servers and networking equipment at their site so you could generate a quote to monitor everything. As an added bonus, all their equipment is already loaded into monitoring! You could offer health check services, inventory analysis for purchasing their equipment or offering upgrades to outdated gear, offer warranty services, check software versions for vulnerabilities, etc. (I could keep going here, but you get the idea.)

    Before you say Netformx or Whatsconnected, let me tell you I have used both and they have issues. Jeff already figured out how to load two new fields into our portal, and it shows all Cisco devices' serial numbers and part IDs ("Cisco Serial Numbers by number"). This MIB or OID or whatever he used has already proven better than both Whatsconnected and Netformx at detecting the correct PID and serial of not only the chassis but also the sub-modules (power supplies, HWICs, NM cards, SUP720 blades, etc.). So he already figured out the first part, and all I really need is a reporting function to export the fields listed above to an Excel report, and you have Netformx and Whatsconnected beat. At least on the accuracy of part IDs and serials, that is. Sure, they might have more fields they can report on, but once you get the framework down I'm sure you could include any field that can be retrieved via SNMP in the report/scan.

    Also worth noting is that the first thing most ticketing systems or hardware providers will want is the serial number of the problematic device. Our ticketing system now links to LogicMonitor alerts, and a ticket is created. We are working to have the chassis serial number populate on the ticket, as that serial number is what brings up the warranty status and service levels for that customer 99% of the time. I could imagine this is similar for most networking equipment and server tech support companies. It just makes sense to have those serial numbers in LogicMonitor and in the alert emails if needed.

    Long post, sorry. Check out my portal, talk to Jeff, give me a call. I would love to work on it more with you guys. Cheers,
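To make the report idea concrete, here is a rough sketch of the export step only, assuming the scan has already produced one record per device. It writes CSV (which Excel opens directly) using the column names from the post. The function name and the record layout are hypothetical; nothing here reflects an existing LogicMonitor feature or API.

```python
import csv
import io

# Column names taken from the report fields listed in the post.
COLUMNS = [
    "ITEM ID", "SERIAL NUMBER", "SOFTWARE VERSION", "SNMP STRING",
    "IP ADDRESSES", "HOSTNAME", "UPTIME",
]


def inventory_to_csv(devices):
    """Render a list of device dicts as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(devices)
    return buffer.getvalue()


# The sample row from the post; a real scan would supply many of these.
example_devices = [{
    "ITEM ID": "CISCO2801",
    "SERIAL NUMBER": "FTXxxxxxxx",
    "SOFTWARE VERSION": "12.4(28)T8 IPBASEK9",
    "SNMP STRING": "Test",
    "IP ADDRESSES": "192.168.1.2",
    "HOSTNAME": "2801CMElab",
    "UPTIME": "1 day, 13 hours, 15 minutes",
}]

report = inventory_to_csv(example_devices)

if __name__ == "__main__":
    print(report)
```

The uptime value contains commas, so the csv module quotes it automatically; that is the main reason to use it instead of joining strings by hand.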