SeanC

Members
  • Posts

    10

  • Days Won

    1

Reputation

3 Neutral

About SeanC

  • Rank
    Observer


  1. We had an incident where instances didn't fail over as expected, so I raised a ticket. Support staff were the ones who advised me that collectors won't take instances over the threshold during a failover. I questioned the advice at the time, as it was contrary to the way I interpreted the documentation, but apparently the advice was verified with another engineer and is accurate. I guess I'll have to set up some tests and verify.
  2. Currently, if a collector goes down, the devices it was monitoring will only fail over to other collectors in the group if doing so does not put those collectors over the Rebalance Threshold defined on the ABCG. This means you can't specify a Rebalance Threshold that lets instances rebalance evenly across your collectors AND lets instances fail over. It's one or the other, which defeats the whole purpose and takes the Auto out of Auto Balance.

     To run collectors in an n-1 highly available configuration, you have to specify a Rebalance Threshold that is higher than the total instance count divided by (the number of collectors minus one). Then, after a failover, you have to adjust the Rebalance Threshold to a number close to the total instance count divided by the number of collectors, trigger a rebalance, wait, and then set the Rebalance Threshold back to the total instance count divided by (the number of collectors minus one). Madness!

     I propose keeping the Rebalance Threshold field but making it actually behave like its name: when a collector's instance count is over the value, the collector tries to offload instances to other collectors in the group. BUT, allow instances from a failed collector to be offloaded to collectors past that threshold value so that failover continues to work.

     To address the concern of collectors being pushed past their capacity, add an additional field called Max Allowed Instances. Only refuse to offload instances to a collector if doing so would push its instance count past that value, and trigger an alert when that happens. This would allow HA configurations AND auto balancing of instances to work at the same time, and would alert us when instances have not failed over so that steps can be taken to increase the size or number of collectors.
  3. We desperately need audit logs that include:

       • The object name
       • The object ID
       • What item(s) changed on that object
       • What the new values are
       • If applicable, what the previous values were

     This is needed ASAP, as there is currently no way to audit changes: the current audit logs are almost useless. They contain entries like this:

     "Update the datasource instances, disable monitoring of instances : [<instance names>] "

     Great, now I need to go into each and every one of our thousands of devices and check to see which one had its instances disabled incorrectly.

     Also, changes anywhere to monitoring (enabling/disabling instances etc.) should trigger a prompt to enter a reason for the change, like threshold changes do! LogicMonitor is great, but there are so many inconsistencies in the way different parts of it work.
  4. There is a token, ##RESOURCEGROUP##, that shows just the name of the device group a device is in, but is there a way to show that device group's parent? Or that device group's full path? I have a device group structure that organizes things by location, then by type. E.g.:

       Customer A/Site A/ESXi Hosts
       Customer A/Site B/ESXi Hosts

     If I use just ##RESOURCEGROUP##, then I end up with NOC widgets that group by resource group just showing a whole bunch of 'ESXi Hosts' as the name of each grouping. I would like to be able to name the grouping with the parent or, if that's not possible, with the full path.

     There was a feature request made by Mosh several years ago, but I don't have permissions to view it, so I can't tell whatever happened to it: Did that ever get any traction? Pulling my hair out here over these widgets; none of them seem to have consistent behaviour either....
  5. Please let us create properties that are arrays so that we can use the contains() method.
  6. Thanks Kerry, turns out my problem was er... me. I didn't realise that someone else had already added them and I wasn't aware they no longer appear in the repository if they're already added XD
  7. Hi Kerry,

     I had a look there first, and the only ones starting with 'Linux' in my list are:

       Linux Disk Inodes
       Linux_Sensors_Fans
       Linux_Sensors_Temperature
       Linux_Sensors_Voltage

     Do I need to look at a different server than v128.core.logicmonitor.com?
  8. I can't find any of these except the one you gave the code for. Does anyone have the codes for the other Linux SSH datasources?
  9. Hi,

     Can we please get the 'Threshold History' and a 'Last Modified' date as optional columns in the Alert Thresholds report? This would save a huge amount of time when reviewing modified thresholds. Currently we have to run the report, which results in hundreds of entries, each of which then requires someone to go to the corresponding device, locate the alert, and open the threshold editor in order to view the history and the time the change was made. This method is not practical now, when we've just started using LogicMonitor, let alone when we add hundreds more devices.
  10. Hi guys,

      I've noticed that the audit logs don't capture all events all the time. E.g., when changing a threshold on an alert, the accompanying note is only captured by auditing some of the time. This makes me worry about what else is not being captured reliably. Support suggested I log this as a feature request, so please make auditing capture ALL events ALL the time. Thanks.
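
The Rebalance Threshold arithmetic and the proposed acceptance rule from post 2 can be sketched roughly as follows. This is only an illustration of the request, not LogicMonitor's actual behaviour: the function names are hypothetical, Max Allowed Instances is the proposed field rather than an existing setting, and the example figures (3000 instances, 4 collectors) are made up.

```python
import math


def even_share(total_instances: int, collectors: int) -> int:
    """Instances per collector when all collectors are up."""
    if collectors < 1:
        raise ValueError("need at least one collector")
    return math.ceil(total_instances / collectors)


def n_minus_one_share(total_instances: int, collectors: int) -> int:
    """Instances per surviving collector after one collector fails."""
    if collectors < 2:
        raise ValueError("n-1 sizing needs at least two collectors")
    return math.ceil(total_instances / (collectors - 1))


def can_accept(current: int, incoming: int, rebalance_threshold: int,
               max_allowed: int, failover: bool) -> bool:
    """Proposed rule (assumption, taken from the post's wording).

    Ordinary rebalancing is capped by the Rebalance Threshold.
    Failover traffic may exceed it, but never Max Allowed Instances.
    """
    limit = max_allowed if failover else rebalance_threshold
    return current + incoming <= limit


# Hypothetical sizing: 3000 instances spread over 4 collectors.
balanced = even_share(3000, 4)          # per-collector load when healthy
degraded = n_minus_one_share(3000, 4)   # per-collector load with one down
print(balanced, degraded)
```

With a Rebalance Threshold set near the even share and Max Allowed Instances set at or above the n-1 share, a collector in this sketch would refuse ordinary rebalancing beyond the threshold but still accept instances from a failed peer, which is the combination the post asks for.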