Writing Custom Scripts to Clean Duplicates is a Bad Idea

Duplicate records in your CRM are among the most likely sources of data corruption, leading to inaccurate data and misleading reports. Duplicate data can be easily cleaned, and then prevented. But that doesn’t seem to be the reality for many companies. The breakdown starts when processes to prevent the introduction of duplicates into the database are flawed, inconsistently applied or absent. Human error is all too frequently the main culprit. Most software lacks the sophistication to detect any but the most blatant cases of duplication and can be too cumbersome or complex for any but the savviest of users to run. Companies deal with this problem in different ways. Unfortunately, in an effort to save time or money, and often because they underestimate the actual complexity of the problem, companies adopt some solutions that are not effective at fixing their data quality.

The first approach many companies attempt, on finding that their shiny new CRM is already riddled with duplicates, is to create their own custom script. After all, that’s what their IT person is for, right? Such an expedient, immediate and affordable (some will think “free”) solution seems like the perfect fix to a simple problem. That might be true if (a) it were actually a simple problem and (b) that perfect fix didn’t create a whole new set of problems.

Custom Scripts Are A Bad Idea

Writing a script is like creating a recipe. There can be many different ways of getting to the same solution, and some will work better than others. Like recipes, scripts need to be tested to make sure that they work well. So don’t be surprised when, right off the bat, your immediate solution takes more time and uses more of your IT resources than you expected.

In the software world it's common practice to spend significantly more time on testing than actual coding. Imagine the challenges with trying to test all the crazy permutations your scripts need to support!

Like recipes, scripts are often appropriate only at a certain point in time (remember jello molds?) and can be adversely affected by changes in the basic environment. Tinkering with scripts is like doubling or tripling a recipe for a large crowd: it’s not necessarily a linear process.

The biggest problem with scripts is that they are only as good as the person who wrote them. A script will let you know if there is a bug in the code, but the logic behind the program is limited by how thoroughly the programmer understands the nuances of the problem they are trying to solve. Being human, programmers can very easily miss a particular instance where duplication might occur, or make a mistake in their assumptions. These mistakes are hard to catch without a quality assurance team that fully understands the problem and works alongside IT. (So much for fast, cheap and easy.)
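
To make the point concrete, here is a minimal Python sketch. It is not taken from any real CRM script; the sample records and the helper names (exact_key, normalized_key, count_duplicates) are hypothetical. It shows how a matching rule that looks reasonable to its author can still miss duplicates the author never thought of.

```python
import re
from collections import Counter

# Hypothetical sample contacts; all four rows describe the same person.
records = [
    {"name": "Robert Smith", "email": "rsmith@example.com"},
    {"name": "robert smith ", "email": "RSmith@Example.com"},
    {"name": "Smith, Robert", "email": "rsmith@example.com"},
    {"name": "Bob Smith", "email": "bob.smith@example.com"},
]

def exact_key(record):
    """First attempt: two records match only if both fields are identical."""
    return (record["name"], record["email"])

def normalized_key(record):
    """Second attempt: ignore case, punctuation and word order in the name."""
    words = re.findall(r"[a-z]+", record["name"].lower())
    return (" ".join(sorted(words)), record["email"].strip().lower())

def count_duplicates(rows, key_func):
    """Count rows that share a key with an earlier row."""
    counts = Counter(key_func(row) for row in rows)
    return sum(n - 1 for n in counts.values())

print(count_duplicates(records, exact_key))       # 0
print(count_duplicates(records, normalized_key))  # 2
```

The exact-match rule finds nothing. The normalized rule catches two of the three duplicates but still misses the "Bob Smith" row, because nothing in it knows that Bob is a nickname for Robert. Every gap like this is another rule to write and test.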

Scripts are further limited by the fact that they are based on the data and the CRM as they stand when the program is written. With the inevitable release of an update or upgrade to the CRM system, the script might fail, and now you’re back to square one. And as the data sets grow, the script can break if it wasn’t written to scale.
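
As a rough illustration of the scaling issue, suppose the script compares every record against every other record. That is an assumption made for this sketch, not a description of any particular script, and the function name pairwise_comparisons is hypothetical. Under that assumption, the number of comparisons grows with the square of the record count:

```python
from math import comb

def pairwise_comparisons(record_count):
    """Number of distinct record pairs an all-against-all check must examine."""
    return comb(record_count, 2)

for record_count in (1_000, 10_000, 100_000):
    pairs = pairwise_comparisons(record_count)
    print(f"{record_count:>7,} records -> {pairs:>13,} comparisons")
```

Ten times the data means roughly a hundred times the work, so a script that runs comfortably on a small database can become impractical as the CRM fills up.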

In addition, your scripts will be hard-coded to the specific task at hand at this moment. Tomorrow, there will no doubt be a twist to your needs. Scripts can be very hard to generalize in any significant way without raising the complexity! The devil is in those things you don’t realize you’ll need tomorrow. Soon your scripts are out of hand and very difficult and time-consuming to maintain.

Finally, scripts are time-consuming to create, test and run. If your IT guru goes on vacation (you do let them take a vacation, right?) or, even worse, leaves the company, will anyone else be able to revise or execute the script properly?

A simple in-house script to clean and prevent duplicates will not save you time. It will not save you money. In the long run it will not keep duplicates out of your CRM data. Instead, you’ll be caught in a never-ending loop, fixing the same problem over and over.

Fix the Problem Once: Come to an Expert

ActivePrime is dedicated to data quality, with solutions that attack dirty data from every angle.

*CleanImport scrubs your list before you import new leads.
*CleanEnter checks your new entry against existing records to prevent you from creating a new record when, in fact, it already exists.
*CleanCRM catches duplicates already in your CRM.

They all work on the same logic. They all work in perfect harmony, and they will make your data clean.

Try them yourself for free with a 14-day trial at www.activeprime.com.