Vitor Santos

Posts posted by Vitor Santos

  1. Not sure if you have the same need we had in my organization, but we developed a simple DS a while ago that fetches some power-related metrics.

    You can find it here

    Not sure if it helps you out. Thanks!

  2. I believe we might be doing what you want in our environment (it uses the service name & not its display name).
    Another reason for creating this custom data source was to avoid consuming so many WMI resources on each box (the out-of-box DS was doing pretty much 1 query per service/instance).
    We came up with a custom one that does a single WMI query (via Groovy) to get all the services & then processes the output...

    This reduced the WMI load substantially since we discover all the services running on a server (as an MSP this is way easier in terms of management).
    We can then use AD filters to filter stuff globally and/or make use of the token to filter stuff individually (group level/resource level).
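
As a rough illustration of the approach described above (one bulk query, then filtering during discovery), here is a minimal Python sketch. The actual DataSource is written in Groovy and is not reproduced here: the record keys, the `build_instances` helper, and the exclude pattern are all hypothetical, and the `wildvalue##wildalias` line format is an assumption about what an Active Discovery script prints.

```python
import re

def build_instances(services, exclude_pattern=None):
    """Turn the result of ONE bulk service query into discovery lines.

    `services` is a list of dicts with the hypothetical keys "name"
    (service name) and "display_name". The service name is used as the
    wildvalue; the display name is only the human-readable alias.
    """
    exclude = re.compile(exclude_pattern) if exclude_pattern else None
    lines = []
    for svc in services:
        if exclude and exclude.search(svc["name"]):
            continue  # filtered out, e.g. by a group- or resource-level setting
        lines.append(f'{svc["name"]}##{svc["display_name"]}')
    return lines

# Stand-in for the output of a single bulk query (illustrative data only).
sample = [
    {"name": "Spooler", "display_name": "Print Spooler"},
    {"name": "W32Time", "display_name": "Windows Time"},
    {"name": "wuauserv", "display_name": "Windows Update"},
]

instances = build_instances(sample, exclude_pattern=r"^wuauserv$")
for line in instances:
    print(line)
```

The point of the design is that the expensive call happens once per poll, and everything after it is cheap string handling on the collector.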

    Not sure if it helps you out but, here you go

  3. Just now, Stuart Weenig said:

    For Groovy based datapoints, when SNMP is the collector, there is a variable set in the script called "output". You can parse through that output variable to get the data. If it's counters, you'd need to use collector script caching to get the previous poll value to run the delta manually.

     

    We were already doing that. However, we wanted values from other datapoints (gauge, derive, etc.), which as per the documentation isn't possible (validated on Friday).
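
For readers unfamiliar with the "run the delta manually" step mentioned in the quoted reply, a minimal Python sketch of the arithmetic follows. It assumes the previous raw counter value has already been retrieved from whatever cache is in use (the caching itself is not shown), and the function names and the 32-bit default are illustrative assumptions, not part of any LogicMonitor API.

```python
def counter_delta(previous, current, counter_bits=32):
    """Increase between two raw counter readings, allowing for one wrap."""
    if current >= previous:
        return current - previous
    # The counter rolled over between polls: add the distance to the
    # maximum value and the distance from zero to the new reading.
    return (2 ** counter_bits - previous) + current

def per_second_rate(previous, current, elapsed_seconds, counter_bits=32):
    """Convert a counter delta into a per-second rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return counter_delta(previous, current, counter_bits) / elapsed_seconds

print(counter_delta(100, 250))
print(counter_delta(4294967290, 5))
print(per_second_rate(100, 400, 60))
```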

  4. 21 minutes ago, mnagel said:

    Seems like a no-brainer to support that, right?  The only way you could as it stands is via API calls, which, without library support, is a non-starter for me.  It could be done if you want to maintain many copies of the same code across modules (which seems to be the norm now, except they are all slightly different based on who writes them as you would expect).

     

    Totally understand you...

    But for what we want, it would really need access to the actual datapoint values... Unfortunately, we'll need to leverage other options.
    I appreciate your reply anyway mate :) 

  5. Hello,

    We have several instance groups within the same DS (Ping, URL, etc.).
    Some of those profiles need to have common properties (per instance group), for example because of the integration we have with our ticketing system.

    It would be GREAT if we could add properties at the Instance Group level (that way they would be inherited by the instances & we wouldn't need to map them individually per instance).
    I tried that & didn't find that possibility... correct me if I'm being a noob :(

    Thank you!

  6. 20 minutes ago, Stuart Weenig said:

    Yeah, well, the instance would still be present for up to 30 days in case the process comes back (with the same wildvalue/wildalias). However, it wouldn't be included in polling since it's not showing up in active discovery results. So, the flow would be:

    instance is up and monitoring, process goes down, instance alerts until the next active discovery when polling for it is disabled. if the process comes back up with the same wildvalue/wildalias within 30 days, polling would resume. Otherwise, it would get deleted at 30 days, along with historical data.


    Gotcha! I guess for what we want here that's not good (since AD runs every 15 minutes)...

    Thank you for the help anyway Stuart!

  7. Yeah, I really think that's the most reliable option to pursue.

    However, I've coded a DS just to see how it behaves (WinProcessStats_Responsiveness).
    Just in case you want to have a look. The only problem I'm having with that DS is that Active Discovery isn't deleting the instances (so they'll stay there alarming if that process is no longer running, which is kind of what we want, but not perpetually).

    This is why we're really leaning towards just expecting a number of processes & alarming if the count is lower than that.

  8. 13 minutes ago, Stuart Weenig said:

    Have you looked at mine? https://github.com/sweenig/lmcommunity/tree/master/ProcessMonitoring/Linux_SSH_Processes_Select

    Duplicate processes with the same command line and same name will be a problem if you're ignoring PID. Under manual circumstances, how would you differentiate between the two between sessions? I mean, if you logged in once and looked and saw process A and process B. Then if you logged in again 1 minute later and saw the same list of processes, how would you know which one you had previously called A and which one was previously called B?

    The answer is, of course, to run them in separate containers, but that's a different discussion.

     

    Yeah, I looked into your example, but that has the PID as the wildvalue.

    That's exactly the tricky part of this, because nowadays our probe is able to catch those processes (even with the same name) & alarm if they get stopped. I just don't know how to replicate this in LM.
    I was wondering if anyone might come up with a workaround that I'm not seeing.

    I've tried to come up with something that takes the cmdline & then enumerates the matches as cmdline#1, cmdline#2, etc. But if there are 3 instances of the process running now and later there are only 2, the third instance will raise an alarm (because we don't want to delete it, since we want to keep the historical data).
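
To make the drawback concrete, here is a small Python sketch of the enumeration idea described above. It is only an illustration: the `enumerate_labels` helper, the sample command line, and the PIDs are hypothetical, and a real DataSource would also have to make sure the resulting label is a valid wildvalue.

```python
import re
from collections import defaultdict

def enumerate_labels(processes, pattern):
    """Label matching processes as 'cmdline#1', 'cmdline#2', ... ignoring PID.

    `processes` is a list of (pid, cmdline) tuples.
    """
    regex = re.compile(pattern)
    seen = defaultdict(int)
    labels = []
    for _pid, cmdline in sorted(processes):
        if regex.search(cmdline):
            seen[cmdline] += 1
            labels.append(f"{cmdline}#{seen[cmdline]}")
    return labels

CMD = "/opt/agent/OpswiseAgent --daemon"  # hypothetical command line
PATTERN = r".*OpswiseAgent.*"

earlier = [(101, CMD), (102, CMD), (103, CMD)]  # three copies running
later = [(201, CMD), (202, CMD)]                # only two after a restart

# Labels that existed before but have no matching process now.
missing = sorted(set(enumerate_labels(earlier, PATTERN)) - set(enumerate_labels(later, PATTERN)))
print(missing)
```

Because the labels carry no identity beyond their ordinal, it is always the highest-numbered instance that disappears, regardless of which process actually stopped.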

    I guess our only solution would be to ask the client how many processes should be running with that cmdline & alert if the count is lower than expected.
    But this would be a downgrade from the monitoring we're doing for them nowadays :(
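
The expected-count fallback mentioned above could look roughly like this in Python. It is a sketch under stated assumptions: the helper names, the sample command lines, and the expected value of 3 are all hypothetical, and in practice the expected count would come from the client.

```python
import re

def count_matching(cmdlines, pattern):
    """Number of running command lines that match the regular expression."""
    regex = re.compile(pattern)
    return sum(1 for cmdline in cmdlines if regex.search(cmdline))

def below_expected(cmdlines, pattern, expected):
    """True when fewer matching processes are running than expected."""
    return count_matching(cmdlines, pattern) < expected

running = [
    "/opt/agent/OpswiseAgent --daemon",  # hypothetical command lines
    "/opt/agent/OpswiseAgent --daemon",
    "/usr/sbin/sshd -D",
]

print(count_matching(running, r"OpswiseAgent"))
print(below_expected(running, r"OpswiseAgent", expected=3))
```

This trades per-process history for a single stable instance per pattern, which is the downgrade the post refers to.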

     

  9. Hello,

    In our current monitoring tool we monitor Linux process profiles using a regular expression that matches one or more processes.
    For example, we have a profile that looks for processes that contain /.*OpswiseAgent.*/ in their cmdline path.

    Once those are running, the probe picks them up automatically and monitors their state (without relying on the PID, because that might change).
    In LM we would also not rely on the PID as the wildvalue, since it might change. There can also be more than 1 process (with different PIDs) running with the exact same cmdline (so those need to be picked up as different instances).

    I'm just unsure how to have a working solution having in mind all of this (unique wildvalue & wildalias).
    Can anyone assist here perhaps?

    Regards,

     

  10. 6 minutes ago, Michael Rodrigues said:

    Hey @Vitor Santos, it was changed in v140 as part of a bugfix. The page didn't support pagination, so users with enough Collector Groups weren't able to see them. We reused another similar component that already had paging, but it also changed the behavior to still show those empty groups. We can look into hiding those empty groups again when filtering.

     

    Thank you for the feedback. Is there any plan to implement a similar feature on the collector(s) page?
    It's really annoying having to scroll up/down and/or change pages in order to reach the collector we want.

    Thank you!

  11. On 9/30/2020 at 6:08 AM, JDV said:

    Hi Kyle, 

    Thanks I found these on exchange and implemented them and have tested if against our O365 lab, when I test the script, get an error - "There would be no instances discovered for the selected device" anything you can think of I may have adapted correctly to our environment. Has anyone else got these working? 

    Regards, 

    Justin 

     

    Yeah, I actually ran into those issue(s) but then noticed that we need to do some additional things.

    • Have the specific PowerShell module installed at the collector level:
      • Install-Module PSWinDocumentation.O365HealthService -Force
      • In my case there was a TLS issue at the collector level, so I had to run the command below before installing the module:
      • [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    • Enterprise Application needs to have the following role:
      [image]

    I've also changed the token(s) used in the script(s), since they don't match the LM out-of-the-box ones.

    It's working for me :) 

    Hope it helps you!

  12. Also to add here... 

    According to the documentation the user account roles should be:

    [image]

    However, the DS below doesn't work with only those permissions:

    [image]

    I had to add the 'SharePoint admin' role in order to make it work:

    [image]

    I tried all the 'Reader' roles & none of them makes it work :(

    Is there any other way to get it working without granting 'Admin' privileges to the user in question?
    This is very important to us because we have clients that will not give us that type of privilege (100% sure).

    Please advise here folks.

    Regards,

  13. Hello,

    This is something we noticed a while ago, but I'm only sharing it now because I forgot about it in the meantime.

    When adding MS Office 365 env. into monitoring following the official documentation, at some point it's stated that one of the properties that should be set at the resource level is:

    [image]

    I've done this, but the PropertySource for the Office365Reports category fails when it runs (PropertySource details below):

    [image]

    The actual error states that the 'tenant name' in question doesn't exist.
    This gets solved if we pass the tenant ID instead of the tenant name.

    I just want to bring this to your attention so you can check whether the documentation needs an update.

    Thank you!

    Regards,

  14. Hello, 

    Starting a few weeks ago, we've lost the ability to dynamically filter Collectors using the search box (top right corner).
    I mean, we can still search for collector(s), but the results display ALL the collector group(s) anyway...

    This is kind of annoying & not that useful since we have >100 groups. I know this was working smoothly in the past, and I'm not entirely sure why it got removed (maybe there's an explanation, but I don't recall seeing it in the Release Notes - correct me if I'm wrong :( ).

    Thank you!

  15. Hello,

    I created a suite for Tegile API monitoring a few weeks ago.
    One of the DS captures the events from the Tegile Event table (using the API). Maybe that's useful to you, since you don't need to configure any event source, traps, etc.

    I've been trying to publish them in the exchange but, for some reason, my page always stays stuck at the loading phase (I've already tried different browsers, but it doesn't work).
    Use them at your own risk (in case they're helpful to you), since they weren't 'officially' approved by LM.

    In order to use them you just need to map the props 'tegile.user' & 'tegile.pass' (make sure the user in question has API rights).

    Also, I've disabled all the SNMP Tegile DS (since these do pretty much the same thing via API), just to avoid having duplicates in monitoring.

    Link Here

    Regards,
