Michael Dieter

Everything posted by Michael Dieter

  1. R2NNTG My group supports several hundred switches across all of our buildings and locations, but we don't always receive reliable or timely information about events that change the local density of wired connectivity needed to support constituents and their various devices. This frequently results in a significant amount of wasted switch port capacity and wasted electricity: when a specific location is vacated or its use changes and we are not told, we are left with a 48-port switch where 24 ports or fewer would be sufficient, for example. (Worst of all is the scenario where we purchase additional switch hardware to support growth or expansion in one location while under-utilized switch hardware needlessly burns power and annual maintenance dollars elsewhere.) The referenced datasource here is expected to help this situation. It is not a switch-by-switch, port-by-port uptime measurement; rather, it shows the percentage of ports with link (i.e., in use) over a range of time. Its value is expected to come from providing a longer-term trend of ports in use for specific locations, highlighting those locations where switches can be removed due to persistent over-capacity, but it can certainly be effective for use-cases with shorter time periods as well. It is extremely well-suited to presentation on a dashboard. Note that this datasource was built for Juniper switches [and includes specific interface filtering, so you will need to adjust that (along with Applies To and any thresholds) to meet your needs], but it is most likely not difficult to substitute other vendors' MIBs/OIDs into it. Much thanks to Josh L on the pro-services team, who did all the real development work. R2NNTG
  2. Hey Andrey, this use-case came to mind immediately when I first saw the release notes announcing propertySources, but my reasons for not actually using a propertySource include: 1) as a dense, slow and dim-witted "network guy," I've had a hard time clearly and accurately understanding propertySources; 2) my level of Groovy-scripting competence is zero (see #1); and 3) I thought it would be easier to modify an existing datasource than to start from scratch (so basically I cheated, especially since Johnny Y took all my development notes and did the actual modification). I think it would be cool, and I hope that somebody does post this functionality as a propertySource; in the meantime, hopefully other Juniper customers might get some value out of this one.
  3. Name: Juniper Virtual Chassis_lmsupport Displayed As: Juniper Virtual Chassis_HW Info Locator Code: YWWE74 I modified (actually, with help from Support) the datasource Juniper Virtual Chassis so that values that were originally displayed in the UI only as descriptive text on per-instance mouse-overs are now presented as properties. Juniper switches present difficulty to the datasource Device_Component_Inventory; this modification allows a single-step way to associate with standalone switches and virtual chassis while getting inventory data on a per-member basis, instead of on just whichever switch happens to be the virtual-chassis routing engine at the time of polling. And it comes with the huge bonus that using ILP as the instance grouping method produces a great presentation in the UI. It collects from jnxVirtualChassisMemberEntry (there are other values there that may be of interest to you, so walk it) for each member of a virtual chassis, but it does require that you enter any specific info you would like to see with the command <set...member n location [any desired text value]>. We chose to enter building and room location along with asset tag number, and it is stored by the property auto.vcmemberassetinfo. Also, you will need to configure even standalone switches as a virtual chassis: <set...member 0 location [any desired value]>. The descriptive text in the original datasource is not reportable, but properties are, and this lets us create a great report using the Device Inventory report.
It gives us, for each individual switch in a virtual chassis: device name, building, room number and asset tag; serial number; HW model info; and Junos version. Special thanks to Support Engineer Johnny Y for doing most of the heavy lifting (after recognizing that I was trying to pound a screw in with a hammer), all the other Support Engineers who patiently answered my questions in a series of cases, and to CSM Kyle for kickstarting me.
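For anyone curious what "collecting from jnxVirtualChassisMemberEntry" yields, here is a minimal sketch (plain Python, not the actual LogicMonitor Groovy script) of grouping walked column values into one record per virtual-chassis member. The column names follow the MIB table named in the post, but the sample walk output below is invented for illustration.

```python
# Sketch: turn "column.memberIndex = value" SNMP walk lines into one dict per
# VC member, roughly what a per-member inventory datasource ends up with.

def parse_vc_members(walk_lines):
    """Group walked jnxVirtualChassisMemberEntry columns by member index."""
    members = {}
    for line in walk_lines:
        oid, _, value = line.partition(" = ")
        column, _, index = oid.rpartition(".")
        members.setdefault(index, {})[column] = value.strip()
    return members

# Invented sample walk of two VC members (values are not real devices):
sample_walk = [
    "jnxVirtualChassisMemberSerialnumber.0 = AB0123456789",
    "jnxVirtualChassisMemberModel.0 = ex4300-48t",
    "jnxVirtualChassisMemberLocation.0 = Bldg-7 Rm-112 Asset-5521",
    "jnxVirtualChassisMemberSerialnumber.1 = AB0123456790",
    "jnxVirtualChassisMemberModel.1 = ex4300-48t",
    "jnxVirtualChassisMemberLocation.1 = Bldg-7 Rm-112 Asset-5522",
]

members = parse_vc_members(sample_walk)
```

The Location column is where the <set...member n location ...> text from the post surfaces, which is why per-member building/room/asset info becomes reportable.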
  4. Name: Juniper Router Inventory Displayed As: Juniper MX and SRX HW Info Locator Code: RJEAXH I cloned Device_Component_Inventory and modified it to work with Juniper MX and SRX devices, since D_C_I didn't work with Juniper OoB. There is more information available within Juniper's jnxBoxAnatomy if you'd like to modify it further, but serial number and box description were most valuable to us. We added snmp sysContact, which is how/where we've chosen to record building location and asset tag number in a router's local configuration file (we wanted to reserve snmp sysLocation for future use), represented that with a property called "auto.assetinfo", and then added the system.sysinfo property. We've used this very effectively with the Device Inventory report, which gives us an inventory with the following: device name (in LogicMonitor); serial number; building, room number and asset tag; HW description; and Junos version. Thanks very much to my CSM Kyle and to each of the Support Engineers who provided answers to questions that helped me put this together.
  5. Any ideas, or in-use methods, for how to track switch port capacity? By this I mean tracking how many ports are actually being used, now and in historical time-frames. Right now this is a labor- and time-intensive manual process, and it is basically impossible to see any kind of trending. But it has an awful lot of value for a few reasons: being able to reclaim ports no longer in use to support additional wired connections is far less expensive than adding a new switch when demand for new wired connections is generated (new switch = purchase price + maintenance support + electricity + install labor); being able to identify wired connections no longer in use can permit switch consolidation, and reducing the number of switches deployed lowers electricity bills and maintenance support bills and will have a big effect on reducing cost in the HW-refresh cycle; and switches are dense devices from a monitoring perspective, so with fewer of them, collector deployments and resources can possibly be consolidated, reduced and/or simplified. I've cloned and then customized the Interfaces datasource and adjusted the filtering to return ports that are down (the default Interfaces datasource filter returns, of course, ports that are up). But this is where I have gotten stuck. Some math (addition, division, multiplication) is now required to arrive at a "capacity percentage": (ports up / total port count [i.e., ports down + ports up]) * 100 = capacity percentage. And then some way to keep a historical record in graphical or tabular format is needed. I know this functionality exists out there in the market; thought I'd ask here to see if anyone has figured out how to do it on their own before exploring a feature request. Thanks.
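The "capacity percentage" formula in the post is simple enough to sketch directly. This is just the arithmetic (in plain Python, not a LogicMonitor complex datapoint); the 18-up/30-down example is invented:

```python
# (ports up / (ports up + ports down)) * 100 = capacity percentage,
# exactly as described in the post above.

def capacity_percentage(ports_up, ports_down):
    """Percent of physical ports currently in use (with link)."""
    total = ports_up + ports_down
    if total == 0:
        return 0.0  # avoid dividing by zero on an empty instance list
    return (ports_up / total) * 100

# e.g. a 48-port switch with only 18 ports in use:
pct = capacity_percentage(18, 30)   # 37.5
```

A persistently low percentage over a long window is the signal the post is after: a location where a smaller switch (or no switch) would do.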
  6. shamelessly replying to my own topic (even if its text is not totally accurate: there are OSPF and BGP datasources which provide valuable information about adjacencies and peers, respectively), just to bring it back to the top of the list in the hope that others might see it, find value in it and give it a vote.
  7. I've published these customized interface datasources for use with Juniper Networks' switches and routers. Combined with additional snmp configuration of the devices, these have helped make Juniper devices a little easier to deal with. I think there is much room for additional customization to permit further grouping. NOTES: In all three, the Collection schedule is unchanged, but the Discovery period has been reduced to the minimum (10 minutes), which may not be necessary for all use-cases/environments. None of the default datapoints (normal and complex) were removed or edited in any way. "snmp64_If_juniper_VCP_interfaces" does not capture every single VCP port in a VC larger than 2 members; additional investigation is needed to understand how Juniper makes VCP accessible via snmp, and whether or not it is possible to discover and monitor every such instance. snmp64_If_juniper_logical: MM6C96 snmp64_If_juniper_physical: WZ2AZC snmp64_If_juniper_VCP_interfaces: YZ42H7
  8. OK, I've finally had a chance to validate this configuration and I can tell you that it works, with a few minor alterations... see below. I have deployed this on an MX-80 running Junos 13.3R9.13. One other relevant addendum to my original "you need to know your MX HW & SW in detail" caveat: I have a 20 x 1GE and 2 x 10GE MIC-3D powering my physical interfaces; if you have anything else, consult Juniper documentation for sampling support information. Good luck with that.

set chassis fpc 1 sampling-instance NETFLOW-INSTANCE

##### The above statement is valid for MX-240, MX-480 and MX-960 HW, though you will need to specify the fpc you want to use. Also, there are very likely some limitations with regard to the number of sampling instances per fpc that you can create, the total number of instances that can be configured per chassis, and whether any single instance can span multiple fpc.
##### The below statement is valid for MX-80 HW. Given that the MX-80 has a single tfeb, there are almost certainly much stricter limitations governing the number and deployment of sampling instances.

set chassis tfeb0 slot 0 sampling-instance NETFLOW-INSTANCE

##### From here down is the same regardless of MX model, though of course the physical and logical interfaces will vary.
set chassis network-services ip
set services flow-monitoring version9 template LM-V9 option-refresh-rate seconds 25
set services flow-monitoring version9 template LM-V9 template-refresh-rate seconds 15
set services flow-monitoring version9 template LM-V9 ipv4-template
set forwarding-options sampling instance NETFLOW-INSTANCE input rate 1 run-length 0
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server port 2055
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server source source-address
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server version9 template LM-V9
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output inline-jflow source-address
set interfaces ge-1/3/3 unit 2630 family inet sampling input
set interfaces ge-1/3/3 unit 2630 family inet sampling output
  9. We don't really actively do anything with LogicModules besides a few cases of cloning defaults, editing and then re-applying them but I think we'd at least like to be on the radar as far as the exchange goes as we would be interested in the possibility of contributing what little we might be able to. Feel free to contact us anytime.
  10. Hey James, as I said, this takes a while to work through all the moving parts. I just recently completed an upgrade to JUNOS 13.3Rx and will be attempting this soon, and I'm not really looking forward to it. What I pasted previously was from a working configuration that exported IPFIX from an MX240; I don't have my notes with me, but I do recall that IPFIX and v9 were nearly identical procedures. As you reference, the HW difference between the 80 and the 240 does come with slight configuration differences. I expect to post again in the near future once I get it working (or maybe it won't work?), but in the meantime this Juniper link may help. Note also that LogicMonitor's netflow configuration documentation page does have specific caveats with regard to v9 and templates. Good luck, and post details of your results if/when you get the chance. https://www.juniper.net/documentation/en_US/junos13.3/topics/task/configuration/inline-flow-monitoring.html
  11. As a follow-up, see below for ideas that I should have included the first time. A good way to contribute towards improved Collector efficiency and performance, and potentially to moderate the demands that polling exerts on your Juniper devices (or any devices, really): review the datasources that associate with your devices to ensure that they are providing only information that has value, and then customize them accordingly; adjust Discovery schedules and eliminate any datasources or underlying datapoints that aren't providing value. Use caution: don't delete a datapoint supporting a calculation elsewhere! Review datasource Collection intervals (especially multi-instance ones with a high density) and increase them from their default values where possible, either by globally editing the datasource, creating customized versions with different collection intervals, or setting group/device properties specifying the collection interval for that datasource. For example, maybe it's acceptable to poll the 48 10/100/1000 switch interfaces that connect end-user devices every 4-5 minutes, while the 2 fiber uplinks require 2-minute visibility. If you have 100 (or even 50 or 10) switches, this can make a big difference. LogicMonitor gives you many different ways to combine settings to achieve this, so think it through to come up with the way best suited for your environment. Just don't forget that changes in collection intervals will impact alert thresholds, so make sure you account for that.
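The interval math above can be put in back-of-the-envelope form. This sketch assumes a 2-minute baseline interval and a 100-switch fleet purely for illustration (not LogicMonitor defaults): 48 access ports per switch move to a 5-minute interval while the 2 uplinks stay at 2 minutes.

```python
# Sketch of the polling-load arithmetic from the post above.
# The 2-minute baseline and 100-switch fleet are illustrative assumptions.

def polls_per_hour(ports, interval_minutes):
    """Polls per hour for a set of interface instances at one collection interval."""
    return ports * (60 / interval_minutes)

switches = 100
# Baseline: all 50 ports (48 access + 2 uplinks) polled every 2 minutes.
before = switches * (polls_per_hour(48, 2) + polls_per_hour(2, 2))
# Tuned: access ports every 5 minutes, uplinks still every 2 minutes.
after = switches * (polls_per_hour(48, 5) + polls_per_hour(2, 2))
saved = before - after
```

Even with these made-up numbers the fleet-wide reduction in poll volume is large, which is the point the post is making about collector (and device) load.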
  12. I started working on the best practice of implementing Collector dashboards, which revealed a number of undesirable conditions I hadn't previously been aware of, and suggests that I need to go through some Collector tuning exercises. LogicMonitor has an excellent reference page on this topic: http://www.logicmonitor.com/support/settings/collectors/tuning-collector-performance/ For those of you who are monitoring any density of Juniper devices: 1) I feel your pain; 2) I would recommend that you use this Juniper reference to complement LM's instructions: https://www.juniper.net/techpubs/en_US/release-independent/nce/information-products/topic-collections/nce/snmp-best-practices/junos-os-snmp-best-practices.pdf And another note: you can look for my previous thread for details if you want, but I do not at all recommend using SNMPv3 to monitor either Juniper Virtual Chassis (switch stacks) or Juniper routers with multiple Routing Engines. If you've somehow managed to do this consistently and successfully, I'd appreciate a post with your solution.
  13. Working through the best practice of creating collector dashboards. The various data-collecting tasks provide a wealth of info that can be customized as desired into widgets for such a dashboard. But there doesn't seem to be anything that can provide visibility into the underlying collector mechanisms (tasks, processes, threads, CPU, memory, etc.) that support netflow operation. It would probably be nice to be able to see such info, and to be able to put it onto a collector dashboard, particularly since it's best to pipe netflow to a dedicated collector.
  14. Working through best-practice of setting up collector dashboards. Instead of navigating to Settings-->Collector would anyone find value in being able to create a dash widget that displays the collector version? Perhaps even some way to display current running version, other available versions and their status (R-GD, O-GD, EA), and timestamps of version upgrades that occurred?
  15. Hey Mike, despite the differences between my posted <set> commands and the <set> commands you issued to generate your config, you have a perfectly valid configuration and your results confirm that. What your config and results also demonstrate is that <polling-interval> and <sampling-rate> are variables whose values are not "one size fits all" so I highly recommend consulting Juniper documentation, experimenting, and then reviewing the results you see in LM to arrive at what works best for you and your needs.
  16. I also subsequently learned that the indicated patch for ESX 5.5 has been applied to our environment, and it does not appear to have resolved the symptom. Given that the issue is cosmetic and not service-impacting, it's unlikely our VMware team will spend effort to fix it.
  17. I forgot to post that I discovered this appears to be a known issue for multiple ESX releases, one that is only cosmetic. It's interesting that a patch containing the fix was documented as of October 2015 for multiple pre-6.0 ESX releases, but while there have been 4 ESX updates to 6.0 since that time (including 3 so far in 2016), not one of them has included a fix. https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2052917
  18. I've recently discovered a couple of ESX hosts (ESXi 5.5.0) in order to evaluate whether or not to discover and monitor our entire ESX environment. Since day 1, both have been in a yellow error condition for the datapoint "PacketsDroppedRx" under the ESX Host datasource (the rate fluctuates between 25k and 125k per second). However, for the datapoint "DroppedRx" under ESX Host Interface, the total indicated rate for ALL vmnic instances is always less than 10k per second, and unfortunately there is no corresponding "DroppedRx" datapoint under each instance of ESX Virtual Machines (there is a PacketsTx and PacketsRx for each Virtual Machine, but nothing for drops). Strangely, our ESX admin can find no indication of this "PacketsDroppedRx" value anywhere in vSphere, and he is not aware of any ongoing degradation of service that might be expected with excessive dropped packets. So I have 2 questions: 1) does anyone know to what "PacketsDroppedRx" can be attributed, or at least where to look under the ESX hood to find it? 2) does anyone know why there is not a direct relationship between "DroppedRx" on vmnics and "PacketsDroppedRx" on a host? I would expect that the sum of each individual vmnic's "DroppedRx" would equal the value of "PacketsDroppedRx" for that host.
  19. I previously posted about sFlow export from Juniper EX/QFX5100 switches. This time I'll post about exporting Netflow v9 from Juniper MX routers. I'll start by noting that little additional effort is needed as far as LogicMonitor goes, and by strongly reiterating that you should pay close attention to LM's best practices and think carefully through the details of your implementation. Then I'll provide these disclaimers: 1) conceptually, this is very similar to the way lots of features are implemented (create a definition and then specify how and where to apply it). In practice I think it is painfully tedious, and Juniper's documentation contributes just as much to the difficulty as it does to providing an answer. Expect to burn a few hours to get this working. 2) I actually can't put this particular method of flow collection into production because it has both HW & SW constraints, but I am confident that it will work (it mirrors a working IPFIX configuration that was operating with a non-LM collector) as soon as I get to the appropriate Juniper SW release (Junos 13.3Rx). The takeaway here is: you need to know your MX HW & SW in detail. 3) I offer no guarantees, your mileage may vary, and ABSOLUTELY use the <commit confirmed> option when you turn this on. NOTE also that I haven't included any links to Juniper documentation here because, well, there are too many references, and which ones you will need is going to be driven by the MX router HW and SW that you are attempting to implement on.
set chassis fpc 1 sampling-instance NETFLOW-INSTANCE
set chassis network-services ip
set services flow-monitoring version9 template LM-V9 option-refresh-rate seconds 25
set services flow-monitoring version9 template LM-V9 template-refresh-rate seconds 15
set services flow-monitoring version9 template LM-V9 ipv4-template
set forwarding-options sampling instance NETFLOW-INSTANCE input rate 1 run-length 0
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server port 2055
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server source
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output flow-server version9 template LM-V9
set forwarding-options sampling instance NETFLOW-INSTANCE family inet output inline-jflow source-address
set interfaces ge-1/3/3 unit 2630 family inet sampling input
set interfaces ge-1/3/3 unit 2630 family inet sampling output
  20. If I interpret correctly, this is similar to a feature that would be valuable to us as well. For example: it would be great if there was a way to create a grouping of Interface datasource instances (individual interfaces) from disparate routers. This would allow creation of a group that included only router-to-router interfaces, which typically deserve different treatment than the thousands of end-user interfaces we have. I'd love to hear how others may have accomplished this in the meantime.
  21. Resolution reached: I finally walked away in frustration from attempting to use SNMPv3 with Juniper EX Virtual Chassis. After extensive work with Juniper support, I discovered that even advanced JTAC does not know how to make this work. While it is possible that LM SNMP operation might contribute to an interoperability issue, I am not interested in a line-by-line validation of adherence to RFCs. Regardless, I have serious doubts about Juniper's ability to preserve SNMPv3 communication across Virtual Chassis Routing Engine mastership changes. SNMPv3 works very well with stand-alone devices... but I do not recommend it for use in a Virtual Chassis.
  22. Can anyone share how they have solved the problem of maintaining SNMPv3 auth/priv connectivity between LM collectors and a virtual chassis after there is a change in RE mastership? Juniper offers 3 methods to set the local engine ID: 1) enter no config, and the default IP address of the RE at the time of configuration is used automatically (communication will fail as soon as this RE is no longer the master RE); 2) set a value for the local engine ID (this produces some interoperability issue between LM and the virtual chassis: no snmp-discoverable datasources ever get discovered, even though the switch logs no indication of SNMP credential failure); 3) use the MAC of the management ethernet port (well, this one will change too as soon as there is a change in RE mastership). I have an open case with Juniper support, but I am sort of getting the run-around from them, and their brand-new documentation support site is the equivalent of a Byzantine labyrinth. Any ideas, feedback or comments are appreciated. Thanks.
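For context on why methods 1 and 3 above are tied to the current master RE: an RFC 3411-style SNMP engine ID embeds the vendor's enterprise number, a format byte, and then the IP or MAC address itself, so when mastership moves, the embedded address (and therefore the engine ID) changes and the established auth/priv relationship breaks. A sketch of that layout follows; Juniper's enterprise number (2636) is real, but the addresses are invented examples.

```python
# Sketch of the RFC 3411 SnmpEngineID layout: 4 bytes of enterprise number with
# the high bit set, then a format byte (1 = IPv4 address, 3 = MAC address),
# then the address octets. Addresses below are invented examples.

JUNIPER_ENTERPRISE = 2636  # Juniper Networks' IANA enterprise number

def engine_id_from_ipv4(ip):
    head = (0x80000000 | JUNIPER_ENTERPRISE).to_bytes(4, "big")
    return head + bytes([1]) + bytes(int(octet) for octet in ip.split("."))

def engine_id_from_mac(mac):
    head = (0x80000000 | JUNIPER_ENTERPRISE).to_bytes(4, "big")
    return head + bytes([3]) + bytes.fromhex(mac.replace(":", ""))

eid_ip = engine_id_from_ipv4("192.0.2.10")          # changes with the master RE's IP
eid_mac = engine_id_from_mac("00:11:22:33:44:55")   # changes with the mgmt port's MAC
```

Either way the engine ID is derived from something that moves with RE mastership, which is consistent with the failure modes described in the post.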
  23. Further clarification: and BTW, I'm not faulting LM for the seeming interoperability issue; I can't say where the break is occurring yet. I've documented this behavior with both the EX 2200 and EX 3300 families, at JUNOS 11.4R5.x and at 12.3R3 and higher.
  24. LM by default currently includes OSPF adjacencies and BGP peers as datasources when it discovers our Juniper routers--very nice. Could the VRRP status of interfaces on routers be similarly included by default? This would probably also make it possible to have a "VRRP flapping" alert, so OOB we could get alerts for VRRP flapping just like OOB there are alerts for physical ports flapping. vrrpOperState: initialize(1), backup(2), master(3)
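The flap detection being requested could be sketched like this: given a series of sampled vrrpOperState values (using the enumeration above), count state transitions within a window and flag flapping past a threshold. The threshold and sample series here are illustrative, not an existing LogicMonitor datapoint.

```python
# Sketch of VRRP flap detection over sampled vrrpOperState values.
# Enumeration from the MIB, as quoted in the post above:
VRRP_STATES = {1: "initialize", 2: "backup", 3: "master"}

def count_transitions(samples):
    """Number of state changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)

def is_flapping(samples, threshold=3):
    """Flag a window with more transitions than we'd expect from one failover."""
    return count_transitions(samples) >= threshold

stable = [3, 3, 3, 3, 3, 3]   # steady master
flappy = [3, 2, 3, 2, 3, 2]   # bouncing between master and backup
```

A single master/backup failover produces one transition; repeated bouncing within a polling window is what the hypothetical "VRRP flapping" alert would catch.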
  25. Well, after experimenting with the input format of the lat/long coordinates, I've found that a group or device will appear on a map where expected. So that part of my request is done! Step 2, then, is to see if LM can be set up to read the OID where our Juniper devices' lat/long is configured, and allow us to include them in groups dynamically based on lat/long?
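Step 2 could look something like this sketch: parse a lat/long string read back from the device and derive a group name from it. The coordinate format and the grouping scheme (simple hemisphere buckets) are assumptions for illustration; a real setup would map coordinates to whatever site or region groups make sense.

```python
# Sketch: turn a "lat, long" string (format assumed) into a dynamic-group key.

def parse_latlong(raw):
    """Parse 'lat, long' into a (float, float) pair."""
    lat, lon = (float(part.strip()) for part in raw.split(","))
    return lat, lon

def hemisphere_group(raw):
    """Illustrative grouping: bucket a device by hemisphere."""
    lat, lon = parse_latlong(raw)
    ns = "north" if lat >= 0 else "south"
    ew = "east" if lon >= 0 else "west"
    return f"{ns}-{ew}"

group = hemisphere_group("40.7128, -74.0060")   # "north-west"
```

If the coordinates surfaced as a device property, a dynamic group's AppliesTo expression could key off a derived value like this instead of manual group assignment.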