Steve Francis

LogicMonitor Staff
  • Posts

    267

Everything posted by Steve Francis

  1. Yeah - looks like it's simple enough. https://karloluiten.nl/chrony-status-command-ntpq-p-alternative/ We'll change the script to look for chrony as well as ntpq
  2. Is there a specific kind of alert that is not clearing? For example, we do have a known issue where alerts triggered on an instance that is then removed (either manually or via active discovery) while the instance is in alert do not have the clear notification sent to the integration. That is being worked on now. Is this for Windows events, as opposed to datasources or batchjobs? Any more details appreciated!
  3. Hopefully you were watching the release notes and noted this yourself, but PropertySources will allow you to collect serial numbers. There are a few PropertySources in the Core repository that get serial numbers - the NetApp_Product_Info is probably a good one to copy from.
  4. Nope. Not something we cover. We try to be excellent at performance, availability and capacity monitoring, alerting and presentation - which necessarily means we can't cover everything. That's one of the things we don't cover for now, sorry. Tripwire was what sprang to mind for me, too....
  5. For the EqualLogic SANs - Dell does not expose the CPU or memory metrics at all. Dell's response is "No, there's no way to monitor the internal stats like CPU, cache, etc.. Given the design, a simple CPU/MEM graph wouldn't provide you with relevant information. This is especially true in multimember groups. What you care about is how the array(s) are handling the IO load under the various circumstances." Regarding the 7610 - if we aren't monitoring it adequately, contact your account manager to open a datasource request. (And if you get the MIBs from http://en.community.dell.com/techcenter/storage/w/wiki/2691.mib-files-for-equallogic that would help.) Thanks
  6. As Kerry says, you could create a glob in the device field, like prod!(SQL*) which would include all servers with prod in the name, without SQL in the name after it....
  7. Two cases here: In the general case, where you are not setting container memory limits, containers will just get as much memory from the OS as they ask for. Which is fine, so long as the OS has memory to give. So you probably don't care how much memory any container has consumed - only whether the sum of all container memory requests can be satisfied without impact. And yes, as you note above, the way to monitor this is at the OS level - if the OS has to resort to active swapping, then you have exceeded your memory capacity. The standard Linux Memory Usage datasource will monitor this and alert you to it - you don't need (or want) to monitor swap usage and activity per container. If you do have memory issues, you can then use the Docker Containers Overview graphs to see which containers are using the most memory, and maybe limit them; maybe move them to a different container host; maybe just reconfigure the applications in them (reducing the max heap size if they are Java apps, say.) If you do care about the usage of a specific container (say - it's hosting an app known to have memory leaks) you could just set an alert on that instance of the Docker Containers datasource, when mem_working_set exceeds some limit. You probably would not want to express it as a percent of total OS memory - that's just another abstraction you don't need. (And the threshold would possibly need to be recalculated if you move the container to another host.) If you are setting container memory limits - cAdvisor will report those memory limits. The Docker Containers datasource, however, does not currently read the limits. It could be trivially made to do so, however, so comment if you want that. It could then compare used memory to the container's memory limit, and alert if it gets too close - although I'm not sure what your actions would be, other than investigate before the container gets killed by hitting the limit and requesting more...
  8. The current integration is documented at https://www.logicmonitor.com/support/settings/integrations/servicenow-integration/
  9. After Marketo's large outage due to a domain registration expiring, we created a DataSource that monitors the amount of time remaining on a registered domain. https://www.logicmonitor.com/blog/avoid-front-page-news-outage-like-marketo/ Locator ID: HCZPGR
  10. Further details https://www.logicmonitor.com/support/settings/collectors/collector-caching/
  11. The Gauge widget is a bit of an odd case - if in Gauge mode, it will always show the current (most recent) value of a datapoint. In this case, the min, max and average setting doesn't make a difference. If you switch to Sparkline mode, so you see a history - then it matters. (Each point on the graph, which represents multiple data samples, will be the minimum (or max, or average) of all the samples represented by that one graph point.)
  12. Thanks for letting us know. The page I referenced above used to document it - but that token doesn't work in alert messages, so was apparently removed from that page, but not added anywhere else.... It does work for data collection, so has been more appropriately documented at https://www.logicmonitor.com/support/datasources/creating-managing-datasources/tokens-available-for-data-collection/
  13. Yeah - that seems like a silly omission on our part - not being able to manually define ILPs in the UI. Sorry about that - I'll get that fixed.
  14. Hopefully you've noticed that since April, Netscan does allow you to upload a CSV file to add devices.... https://www.logicmonitor.com/support/settings/netscan-policies/creating-netscan-policies/#Script-Scanning- (That anchor link is not exactly on the CSV part, so scroll down a little...)
  15. You can now do this, using Dashboard sharing. https://www.logicmonitor.com/support/dashboards-and-widgets/managing-dashboards/sharing-dashboards/
  16. Two-factor authentication is now available in the current release, even without SAML. https://www.logicmonitor.com/blog/2016/12/15/enhancing-security-two-factor-authentication/
  17. It means it's on the list of things to do - not currently at the top of the list, though....
  18. Currently, the only way to do this is to associate the device with a dead collector..... An inelegant hack, we know....
  19. Fair enough - we are redoing the alert escalation chain/rule flow (adding things like allowing alerts to flow through and match multiple rules) - so we'll be looking at a way to address this issue too.
  20. Generally, this means your vCenter is overloaded, or underconfigured. "-1" is vCenter shorthand for "I can't answer this right now". It can also be returned when querying a VM that has been powered off. You'll get -1 for a few samples, then NaN (Not a Number). Anyway, to correct this, as noted at https://www.logicmonitor.com/support/monitoring/os-virtualization/esx-servers-vsphere/ : You must ensure your vCenter server is sized appropriately for the numbers of hosts and VMs it is managing. See VMware's documentation for size requirements. (Note that the vCenter database requirements must be added to the vCenter server requirements if they are on the same machine.) If you are using vCenter 5.5 or lower, it is also often necessary to tune the vCenter memory configuration, as documented in VMware's knowledgebase here and here. (vCenter 6 will dynamically adjust the memory allocation to the services if the VM running vCenter is allocated more memory and rebooted.) If vCenter does not have enough resources, it may refuse some connections to the API (HTTPS) port (easily seen in the HTTPS- datasource in LogicMonitor), or it may report a value of "-1" to some performance queries. Both of these situations will cause gaps in graphs.
  21. Nothing wrong - you just need to know the SNMP community you have configured on your UCS devices, so that you can provide it to LogicMonitor, so we can monitor them. For info on configuring SNMP on UCS, see http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/gui/config/guide/2-0/b_UCSM_GUI_Configuration_Guide_2_0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_0110.html#topic_EBD2F0ECC71441B59BA6A8E155D7C470
  22. From the Devices page, just click Add Device (wizard). It should walk you through adding them in. For the Cisco devices, you'll need SNMP credentials. For the NetApp, you will need both SNMP and API credentials.
  23. You can do this now. Clicking the cog to the left of your Dashboard’s name will take you to a configuration window in which you can change the Dashboard’s status from public to private (or vice versa).
  24. You shouldn't need to do anything - if you aren't seeing extra data and monitoring once you've imported the data sources, it probably means you also need to import the UCS OID mappings. (We're making the importing of objects with dependencies like this easier, but right now it's a two-step process.) Under Settings...Oids..Import.
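
Regarding post 7 above (container memory limits): the idea of comparing a container's used memory to its configured limit, and alerting when it gets too close, could be sketched roughly as below. This is only an illustration, not how the Docker Containers datasource works. The function name, the 90% default threshold, and the assumption that a missing or non-positive limit means "no limit set" are all hypothetical.

```python
from typing import Optional


def memory_limit_alert(
    working_set_bytes: float,
    limit_bytes: Optional[float],
    warn_ratio: float = 0.9,  # hypothetical threshold: alert at 90% of the limit
) -> Optional[str]:
    """Return an alert message if usage is close to the limit, else None."""
    # Assumption: None or a non-positive value means no limit is configured,
    # in which case there is nothing to compare against.
    if limit_bytes is None or limit_bytes <= 0:
        return None

    ratio = working_set_bytes / limit_bytes
    if ratio >= warn_ratio:
        return f"container at {ratio:.0%} of its memory limit"
    return None
```

Because the check is a ratio against the container's own limit rather than against total OS memory, the threshold would not need recalculating if the container moves to another host.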