
Avoiding duplicate instances



Throwing something out there for advice.

A number of datasources are prone to duplicate instances, because you cannot safely turn on automatic deletion: legitimate instances may be down for longer than 30 days. The duplication appears to happen because these datasources depend heavily on the instance name, so if the customer is able to change that name, Active Discovery creates another instance.

My question is whether anyone has ideas for a complex datapoint (or some other means) that can detect that an instance has a property matching another instance. This could be, say, auto.device.mac_address or even the instance Value, as these always match between the duplicate instances.

This is really bugging me now, so I need to find a proactive way to address it. The real downside is that you lose the historical data even though it is actually the same device. Ideally you could merge the instances, or at least delete the new one and rename the old one to the new one's name so it's not discovered again.



  • 0

I would say if the instance can be uniquely identified with data on hand (as described above), then the datasource should be using that as the instance wildvalue, not some arbitrary other thing that could cause excess instances due to customer action or anything similar.

As far as data retention, I have found that decisions are often made that lead to loss of data and it is distressing.  I just had a case where I pointed out that a datapoint label had a typo.  Fixed, but the fix kills all old data for that datapoint.  Why must the label be the index rather than a label tied to a persistent index?

I see similar problems for DS replacements.  I suggested in a F/R long ago that it be possible at DS load time to upgrade from the previous DS version.  I fully appreciate that new datasources with alternate structures should be created, but if there were a migration function you could select the datapoint mapping to avoid losing data (currently the best option is to run both in parallel until you get enough new data to not look foolish to your clients).  Preferably this would be built in to the new datasource, so it would happen automatically or at least could provide guidance. That sort of mechanism could also handle my typo'ed datapoint issue.

Nuts and bolts stuff like that is hard to market, though :(.

  • 0

Thanks @mnagel. So for now it sounds like I can't do anything smart with renaming or merging. That leaves me with the question of alerting when it has happened. I was thinking of a complex datasource with a small script, but my Groovy is more at butcher level than at the level of starting something from scratch. :)
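As a rough illustration of the check being asked for, here is a minimal sketch in Python (not Groovy, and not tied to any LogicMonitor API). It assumes you can already get a list of instances with their names and a shared identifier such as the instance Value or a MAC address. The function name, the dictionary keys, and the sample data are all hypothetical.

```python
from collections import defaultdict


def find_duplicate_instances(instances, key="value"):
    """Group instances by a shared property and return only the groups
    that contain more than one instance name."""
    groups = defaultdict(list)
    for inst in instances:
        ident = inst.get(key)
        if ident:  # skip instances that do not carry the property
            groups[ident].append(inst["name"])
    return {ident: names for ident, names in groups.items() if len(names) > 1}


# Hypothetical sample data: two names share the same identifier.
instances = [
    {"name": "Switch-A", "value": "00:11:22:33:44:55"},
    {"name": "Switch-A-renamed", "value": "00:11:22:33:44:55"},
    {"name": "Switch-B", "value": "66:77:88:99:aa:bb"},
]

duplicates = find_duplicate_instances(instances)
print(duplicates)
```

Any identifier that maps to more than one name is a candidate duplicate. The count of such groups could be reported as a datapoint and alerted on when it is greater than zero.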

  • 0
  • Administrators

This behavior is a vestige of the original design, which was meant to handle SNMP tables that would reorder the entries:

Value > Alias
.0       > Foo
.1       > Bar

-active discovery-

Value > Alias
.1       > Foo
.0       > Bar

So the alias persists, but the value changes over time. The value is used when reporting the data, because SNMP generally uses those index numbers to look up related info in other tables. This has been carried all the way through to batchscript collection, where the key/value pair output looks like ##WILDVALUE##.key_name. I'm fairly certain ##WILDALIAS##.key_name will fail, even if you build the output correctly.
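To make the consequence concrete, here is a minimal Python sketch of alias-keyed matching as described above. It is an illustration only, not the collector's actual code, and the function and variable names are hypothetical. It shows why a reordered table is handled cleanly while a renamed instance looks brand new.

```python
def rediscover_by_alias(known, discovered):
    """Match discovered (value, alias) pairs to known instances by alias.

    known maps alias -> value. Returns the updated mapping and the list of
    aliases that were treated as new instances.
    """
    updated = dict(known)
    created = []
    for value, alias in discovered:
        if alias not in updated:
            created.append(alias)  # unseen alias, so a new instance appears
        updated[alias] = value     # the value may change between runs
    return updated, created


known = {"Foo": ".0", "Bar": ".1"}

# Case 1: the SNMP table reorders. Aliases persist, values swap.
reordered, created = rediscover_by_alias(known, [(".1", "Foo"), (".0", "Bar")])

# Case 2: "Foo" is renamed to "Foo2". It is treated as new, and the old
# "Foo" entry lingers alongside it with the same value.
renamed, created_after_rename = rediscover_by_alias(
    known, [(".0", "Foo2"), (".1", "Bar")]
)
```

In the second case both "Foo" and "Foo2" end up holding the value ".0", which is the duplicate-instance situation from the original question.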

One nice thing is that modules released within at least the last 4 years generally generate unique wildvalues anyway, so at some point I could see adding a toggle to each DS that just allows us to indicate a behavior change: "Wildvalue is unique/static". This would indicate that the collector should use the wildvalue to uniquely identify a given instance, and would allow us to handle the case where a name was updated (as we could confidently differentiate it from an actually new instance).
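A hedged sketch of what matching on a unique/static wildvalue could look like, written in Python purely for illustration. The toggle does not exist today, so everything here (function name, data shapes, sample MAC addresses) is hypothetical.

```python
def rediscover_by_wildvalue(known, discovered):
    """Match discovered (wildvalue, alias) pairs to known instances by wildvalue.

    known maps wildvalue -> alias. Returns the updated mapping, a list of
    (old_alias, new_alias) renames, and a list of genuinely new wildvalues.
    """
    updated = dict(known)
    renamed = []
    created = []
    for wildvalue, alias in discovered:
        if wildvalue not in updated:
            created.append(wildvalue)                   # never seen: new instance
        elif updated[wildvalue] != alias:
            renamed.append((updated[wildvalue], alias))  # same instance, new name
        updated[wildvalue] = alias
    return updated, renamed, created


known = {"00:11:22:33:44:55": "Switch-A"}
discovered = [
    ("00:11:22:33:44:55", "Core-Switch"),  # renamed by the customer
    ("66:77:88:99:aa:bb", "Switch-B"),     # actually new
]

updated, renamed, created = rediscover_by_wildvalue(known, discovered)
```

Because the rename is recognized as the same instance, its history could stay attached to it rather than being split across two instances.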

Totally understand the frustration about some of the nuts and bolts stuff, and it's stuff that occasionally hamstrung me as a Monitoring Engineer. Maybe surprisingly, we don't hear a lot of complaints about this behavior.

Filling out a quick feedback request in your portal, or via your CSM, will help me make the case to get this improved :)


