CleanCSV Help: Oracle CRM On Demand

CleanCSV is a CSV-file cleansing and deduping tool embedded in Oracle CRM On Demand. CleanCSV can be launched from your Oracle CRM On Demand main Admin homepage under Data Management Tools. Click on Cleanse CSV to launch CleanCSV.

CleanCSV can also be launched from the Oracle CRM On Demand My Setup Page under Data & Integration Tools.

To use CleanCSV, first select the type of records you wish to dedupe (Accounts, Contacts, or Leads) from the menu in the upper left corner.


After selecting which type of records to dedupe, you must choose your Settings. CleanCSV will dedupe records based on these Settings.

To define your settings, click Settings in the upper left corner. In this menu, you can select the CSV to dedupe, the merge rule for the master record, your custom match settings, the displayed fields, and concatenated fields. The menu can be collapsed by clicking Settings again or by clicking Close in the corner of the menu.


Select the CSV to dedupe by clicking the Browse for CSV button. Choose a CSV file of records and click Open to upload the list to CleanCSV.

After the file is uploaded, map your CSV fields to associated field types. This step defines the internal logic that CleanCSV will use in analyzing your data. For each field, select the field type that corresponds with the CSV field. To skip mapping a field but include it for deduping, select No Field Type. To remove a field from matching and from the output CSV file, select Do Not Include.
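
The mapping step can be pictured as a simple lookup from CSV column to field type. The sketch below is illustrative only: the column names and the field types `Account Name` and `City` are hypothetical, and the code does not reflect how CleanCSV stores mappings internally. It shows the effect of the two special choices, No Field Type and Do Not Include.

```python
# Hypothetical mapping: CSV column -> field type chosen in the mapping step.
FIELD_MAP = {
    "Company": "Account Name",
    "Town": "City",
    "Internal Notes": "No Field Type",  # kept for deduping, no type-specific logic
    "Row ID": "Do Not Include",         # removed from matching and from the output
}


def included_columns(field_map):
    """Return the columns that stay available for matching and output."""
    return [col for col, field_type in field_map.items()
            if field_type != "Do Not Include"]


def apply_mapping(row, field_map):
    """Keep only the included columns of one CSV row (missing values become '')."""
    return {col: row.get(col, "") for col in included_columns(field_map)}
```

For example, `apply_mapping({"Company": "Acme", "Row ID": "17"}, FIELD_MAP)` keeps `Company`, drops `Row ID`, and fills the other included columns with empty strings.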


Choose a merge rule to define how master records in duplicate sets are chosen.


There is one standard merge rule available, plus any custom merge rules your company may have ordered:

Most Complete Address: The master record in a duplicate set will be the record with the most complete address.

To select a merge rule, simply choose it from the dropdown menu.

The master record determines how the final row of field values is created. Its values populate the final record, and any values it is missing are filled in from the other records in the duplicate set.
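
As a rough illustration of the Most Complete Address rule and of how missing values are filled in, consider the sketch below. The address field names, the tie-breaking behavior (first record wins), and the order in which other records fill gaps are all assumptions made for the example, not documented CleanCSV behavior.

```python
# Hypothetical address fields; the fields CleanCSV actually checks may differ.
ADDRESS_FIELDS = ["Street", "City", "State", "Zip"]


def address_completeness(record):
    """Count how many address fields hold a non-blank value."""
    return sum(1 for f in ADDRESS_FIELDS if (record.get(f) or "").strip())


def choose_master(duplicate_set):
    """Pick the record with the most complete address (first record wins ties)."""
    return max(duplicate_set, key=address_completeness)


def build_final_record(duplicate_set):
    """Start from the master, then fill its blank fields from the other records."""
    master = choose_master(duplicate_set)
    final = dict(master)
    for record in duplicate_set:
        if record is master:
            continue
        for field, value in record.items():
            if not (final.get(field) or "").strip() and (value or "").strip():
                final[field] = value
    return final
```

In this sketch, a master record with a blank phone number would pick up the phone number from another record in the same duplicate set, while its own non-blank values are left unchanged.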

After CleanCSV identifies duplicate sets, you may wish to re-run a different merge rule. To do this, simply choose a different merge rule from the menu and click Run Merge Rule.

Choose match settings to identify duplicates. All fields in your CSV file, except those that you have excluded while mapping, are available for matching. Each field has a sliding scale to select the logic that CleanCSV will use to match your records. The options are:

Don't Match: This field will be ignored for matching.
Exact Match: Data in these fields will be matched only if values are perfectly identical (case sensitive).
Starts With: Data in these fields will be matched if values begin with the same first few characters. You can define the number of characters to match via the Advanced... button.
Fuzzy Match: Similar data in these fields will be matched to each other. CleanCSV applies matching algorithms and domain knowledge to identify misspellings, abbreviations, and phonetic matches. This is the most used option.

Selecting more fields for matching narrows the search for duplicates; selecting fewer fields broadens it. A few trial runs will quickly reveal the best combination of fields for your data. Fuzzy matching, the capability ActivePrime's tools are known for, is usually recommended because it identifies several types of similar matches. Exact matching is typically used on fields whose values must agree character for character, such as email, website, or customer ID. Starts With matching may be valuable for specific deduping needs.

Note: When matching on phone number fields, your results will be most accurate using Exact or Starts With matching. Fuzzy matching on phone number fields may yield unreliable results.
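
To make the difference between the three match types concrete, here is a minimal sketch. It is not CleanCSV's matching logic: the fuzzy test uses Python's `difflib` as a stand-in, the `threshold` value is arbitrary, and treating Starts With as case-insensitive is an assumption.

```python
from difflib import SequenceMatcher


def exact_match(a, b, case_sensitive=True):
    """Match only when the two values are identical."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    return a == b


def starts_with_match(a, b, chars=5):
    """Match when the first `chars` characters agree (case ignored by assumption)."""
    return a[:chars].lower() == b[:chars].lower()


def fuzzy_match(a, b, threshold=0.85):
    """Stand-in similarity test using difflib; not CleanCSV's actual algorithm."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold
```

With these definitions, "Jonathan" and "Jonathon" pass the fuzzy test but fail the exact test, while two phone numbers that share an area code and prefix pass `starts_with_match` with `chars=7`. A generic string-similarity measure has no notion of which digits in a phone number matter, which is one way to see why Exact or Starts With is the safer choice for those fields.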

If you are new to CleanCSV, try these basic match settings for Accounts:

Account Name: Fuzzy
Billing City: Fuzzy
Billing State: Fuzzy

If you are new to CleanCSV, try these basic match settings for Contacts and Leads:

First Name: Fuzzy
Last Name: Fuzzy
City: Fuzzy
State: Fuzzy
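
The starter settings above can be written down as a small table of field and match type. The sketch below assumes that two records count as duplicates only when every selected field matches, which is consistent with more fields giving a narrower search but is not confirmed CleanCSV behavior. The `similar` helper is a `difflib` stand-in for Fuzzy matching, only the Fuzzy and Don't Match types are handled, and blank-field handling is left out.

```python
from difflib import SequenceMatcher

# Starter settings from the lists above, written as field -> match type.
ACCOUNT_SETTINGS = {
    "Account Name": "Fuzzy",
    "Billing City": "Fuzzy",
    "Billing State": "Fuzzy",
}
CONTACT_LEAD_SETTINGS = {
    "First Name": "Fuzzy",
    "Last Name": "Fuzzy",
    "City": "Fuzzy",
    "State": "Fuzzy",
}


def similar(a, b, threshold=0.8):
    """Stand-in for Fuzzy matching; the threshold is an arbitrary choice."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def records_match(rec_a, rec_b, settings):
    """Assumption: records are duplicates only if every selected field matches."""
    for field, match_type in settings.items():
        if match_type == "Don't Match":
            continue
        if not similar(rec_a.get(field, ""), rec_b.get(field, "")):
            return False
    return True
```

Under this assumption, two accounts with near-identical names in different cities are not reported as duplicates, because the Billing City field fails to match.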

Click Advanced... next to any field to choose additional settings:

  • Blank field matching defines how CleanCSV will match empty fields. Blank field matching is set to No by default, meaning that empty fields from two records will not match with each other ("___ Smith" does not match with "___ Smith" with blank field matching for First Name off). You can turn on blank field matching by selecting Yes, so that an empty field will match with another empty field ("___ Smith" will match with "___ Smith" with blank field matching for First Name on). Note that neither option allows you to match an empty field to a non-empty field ("___ Smith" does not match with "John Smith" if you match by First Name). This functionality is important in preventing large duplicate sets resulting from fields with missing data. For example, this happens when matching only on Account Name and several records are missing an Account Name (in this case, it would be preferable to select No for blank field matching).
  • Case sensitivity for Exact matching defines whether the match will be sensitive to casing (upper case vs. lowercase).
  • Characters for Starts With matching defines the number of characters that CleanCSV will use to perform Starts With matching. The default value is 5 characters. Change this number by clicking More or Less. The minimum number of characters required is 3. Note that this option only appears when you have selected Starts With matching for the field.
  • Words to ignore defines which words CleanCSV will ignore when matching your fields. Enter one word per line. These entries are case-insensitive. Note that these words will not be ignored if you use Exact matching.
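
The blank field rule and the words-to-ignore option can be sketched as two small helpers. Function and parameter names here are hypothetical, and the code only mirrors the behavior described in the list above; it is not taken from CleanCSV.

```python
def blank_aware_match(a, b, match_blanks=False):
    """Apply the blank-field rule before any ordinary comparison.

    Returns True or False when the rule decides the outcome, or None when
    both values are non-empty and a normal match test should run instead.
    """
    a_blank, b_blank = not a.strip(), not b.strip()
    if a_blank and b_blank:
        return match_blanks  # blank vs. blank: controlled by the Yes/No setting
    if a_blank or b_blank:
        return False         # blank vs. non-blank never matches
    return None


def strip_ignored_words(value, ignore_words):
    """Drop ignored words (compared case-insensitively) before matching."""
    ignored = {w.lower() for w in ignore_words}
    return " ".join(w for w in value.split() if w.lower() not in ignored)
```

With blank field matching off, two records that are both missing a First Name do not match on that field, which prevents every record with a missing value from collapsing into one large duplicate set.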

Choose the order to display your fields in CleanCSV. All available fields from your CSV file are listed in the Show these fields in this order column.


To reorder displayed fields, select a field from the Show these fields in this order column and use the Move up and Move down buttons. To discard any changes you have made, click Cancel.

After CleanCSV identifies duplicate sets, you may wish to reorder the fields. To do this, simply make your changes in the Displayed Fields menu and click Apply.

There may be some fields for which you want to retain the values from each record in your duplicate set. CleanCSV lets you keep all this data by concatenating the values from each record into the final record field.

To concatenate the data for a field, check off the desired field and click in the dropdown to select the separator of your choice.
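
Concatenation amounts to joining each record's value for the field with the chosen separator. In the sketch below, skipping blank values and repeated values is an assumption, as are the function name and the default separator.

```python
def concatenate_field(duplicate_set, field, separator="; "):
    """Join each record's non-blank value for `field`, skipping repeats."""
    seen = []
    for record in duplicate_set:
        value = (record.get(field) or "").strip()
        if value and value not in seen:
            seen.append(value)
    return separator.join(seen)
```

For a duplicate set whose Notes values are "Met at expo" and "Prefers email", the final record would hold "Met at expo; Prefers email" with the default separator used here.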


You can save your settings by clicking the Save As... button at the top of the Settings menu. Enter a name for your customized settings and click Save.


Your saved settings are listed in the Current Settings menu in the upper left side of the CleanCSV window. To load your saved settings at a later time, click on the menu next to Current Settings and select the saved settings.


Once your settings have been chosen, click Find Duplicates. CleanCSV will find duplicate records from your CSV file. The results will be displayed in the main body of the CleanCSV window.


CleanCSV's progress will be displayed at the top of the screen. Segmenting your data and finding matches may take a few moments, especially with large lists of records.


After CleanCSV has found duplicates, the results will be displayed in the main body of the CleanCSV window. Your results are organized into three categories:

  • Duplicates Requiring Review
  • Duplicates Not Requiring Review
  • Unique CSV Records

You can manipulate the data to fine-tune the duplicate sets before merging duplicate records.

The categories are defined as follows:

Duplicates Requiring Review: Duplicate records with conflicting data in the displayed fields. These records should be reviewed before merging.
Duplicates Not Requiring Review: Duplicate records without conflicting data in the displayed fields. These duplicates are exact matches and require little to no review before merging.
Unique CSV Records: Unmatched or non-duplicate records in your CSV file as defined by your settings.
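
The three categories can be thought of as a simple classification of each group of matched records. The sketch below assumes that a conflict means two different non-blank values in the same displayed field, and that a blank value does not conflict with a non-blank one; both points are assumptions, and all names are hypothetical.

```python
def has_conflict(duplicate_set, displayed_fields):
    """True if any displayed field holds two or more different non-blank values."""
    for field in displayed_fields:
        values = {(record.get(field) or "").strip() for record in duplicate_set}
        values.discard("")
        if len(values) > 1:
            return True
    return False


def categorize(groups, displayed_fields):
    """Sort groups of matched records into the three result categories."""
    result = {"requiring_review": [], "not_requiring_review": [], "unique": []}
    for group in groups:
        if len(group) == 1:
            result["unique"].append(group)
        elif has_conflict(group, displayed_fields):
            result["requiring_review"].append(group)
        else:
            result["not_requiring_review"].append(group)
    return result
```

Because conflicts are evaluated only on the displayed fields, changing which fields are displayed can move a duplicate set from one category to the other.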


Click the arrow to expand the grids and review your duplicate sets. Duplicate sets are grouped and numbered in the grids. The Final record represents the final field level values for the surviving merged record for each set.

You can easily review your duplicate sets before merging:

  • Remove a record from a duplicate set by using the Remove link.
  • Change the master record of the duplicate set by reselecting the Master check box. The master record primarily determines the final record field data.
  • Fine-tune final record data by selecting check boxes in highlighted fields representing conflicting data in a duplicate set. You can also manually edit the final record by clicking in the final record data fields and typing your changes.

The duplicate grids are densely packed with information:

  • Final record fields that appear in green can be edited by clicking into the field and typing your changes.
  • Duplicate sets highlighted in yellow indicate a display warning or error. You can view the warning by clicking view warnings in the Master field of the final record, or by clicking View Errors in the category grid title bar.
  • Your duplicate sets can be sorted by the data in any of your displayed fields. Simply click on the column header to enable ascending or descending sorting. Your duplicate sets will be sorted amongst each other by the final record data in the selected column.

If you are not satisfied with the match results, you can fine-tune your settings and re-run CleanCSV until it finds duplicates just the way you want.

Each category of grids can be exported to a spreadsheet. Exporting may be helpful if you require external review of duplicate sets. Click the Export button in the category grid title bar to choose a file name and type. You can choose whether to export the full grid of duplicate sets, or only the final records. You may filter by owners if desired. Click Save to generate the file.

You can see the statistics of your current CleanCSV run by clicking Statistics in the upper-left corner of the window. You will be shown the number of total duplicates, duplicate sets by category, and unique records (non-duplicates). These values help you do a "health check" of your data to see how much duplication currently exists in your original CSV file.
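
The figures on the Statistics panel can be approximated from the three result categories. The sketch below is a guess at the arithmetic, not the tool's implementation: it assumes "total duplicates" counts every record that belongs to a duplicate set, and the dictionary keys are hypothetical.

```python
def run_statistics(categories):
    """Summarize a run from the three result categories."""
    review = categories["requiring_review"]         # list of duplicate sets
    no_review = categories["not_requiring_review"]  # list of duplicate sets
    unique = categories["unique"]                   # list of unmatched records
    duplicate_sets = review + no_review
    return {
        # Assumption: every record inside a duplicate set counts as a duplicate.
        "total_duplicates": sum(len(s) for s in duplicate_sets),
        "sets_requiring_review": len(review),
        "sets_not_requiring_review": len(no_review),
        "unique_records": len(unique),
    }
```

Comparing the duplicate count with the number of unique records gives a quick sense of how much duplication the original file contains.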


When you are satisfied with duplicate identification and have reviewed your data, you can export your cleaned data to a new CSV file. Click the Get CSV button at the top of the CleanCSV window. Choose your file name and type, then click Save to export the CSV file of all final record data.
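
The exported file is an ordinary CSV containing one row per final record, so it can be produced or post-processed with standard tools. As an illustration only, the sketch below writes final records with Python's `csv` module; the function name and the choice to silently drop fields that are not in `fieldnames` are assumptions.

```python
import csv


def write_final_records(final_records, fieldnames, stream):
    """Write one CSV row per final record, keeping only the listed fields."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(final_records)
```

To write to disk, open the target file with `open(path, "w", newline="")` and pass the file object as `stream`.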


You can also export final records or duplicate sets for each category of grids. Exporting the duplicate sets may be helpful if you require external review of the duplicate sets before cleansing. Click the Export button in the category grid title bar to choose a file name and type. You can choose whether to export the full grid of duplicate sets, or only the final records. You may filter by owners if desired. Click Save to generate the file.


Your Oracle CRM On Demand account administrator is responsible for maintaining your CleanCSV account options. Your account options can be accessed by clicking Options in the upper right corner of the CleanCSV window. Here, you can update your CleanCSV account, manage users, manage saved settings, manage email notifications, and make feature suggestions.


This page is used to update your Oracle CRM On Demand administrator information. This is the account that manages CleanCSV. If the admin username or password for the account changes, please update it here so that CleanCSV can continue to work for your company.

This page allows you to manage the Oracle CRM On Demand users who are able to use CleanCSV. To enable a user, check the AP Enabled box next to that user's name. To disable a user, uncheck this box. You can add additional Admin users by checking the AP Admin box. You can only enable as many CleanCSV users as the number of licenses you have in your subscription. If you need more licenses, click Inquire About Adding More Licenses.

Any new users of your Oracle CRM On Demand account will appear on this list and can be enabled at any time, provided that you have enough user licenses.

For long lists, use the column filters to search for users. Enter your filter criteria and click Apply Filter to refine the list of users on display. You can also press Ctrl+F to find users.

  • Above CRM Status, you can select for active, inactive, or all users.
  • Role, Region, and Sub-Region have case-insensitive search fields. For example, enter "admin" above Role to search for your company's Administrators.
  • After updating users, you must click Save to save your changes. Clicking Apply Filter to search for users will discard any unsaved changes on screen.
  • Click Export CSV to export the CleanCSV user list on screen.

For more information, see this instructional video.

This page allows you to save your CleanCSV settings for future use. Click Save Current Settings to save your custom settings that are currently being used in CleanCSV.

All prior saved settings will also appear on this page. You can load any saved setting by clicking Load. You can change the default setting which CleanCSV will automatically load when it starts. You can also delete and rename settings here.

This page allows you to manage the email addresses that will receive notifications about CleanCSV. Enter one email address per line in the text box, and click Save Emails to save your changes.

Is there anything new you would like to see in CleanCSV? Anything that ActivePrime could do differently?

Click Feature Suggestion to send a feature suggestion to ActivePrime. This link will lead you to our support portal, where you can submit your suggestion to the ActivePrime team.


Visit the CleanCSV FAQs for answers to your questions about CleanCSV.

For further assistance, visit our Support Portal at www.activeprime.com/support/.

Interested in more data quality solutions?
ActivePrime has developed an array of complementary tools including ActivePrime Search, CleanEnter, CleanCRM, CleanVerify, CleanCSV, and CleanImport that are embedded in Oracle CRM On Demand. ActivePrime Search is a real-time, fuzzy, CRM search solution. CleanEnter is a real-time duplicate detection service. CleanCRM is a batch data deduping tool. CleanVerify is a real-time address, phone, and email verification service. CleanCSV dedupes CSV files of records. CleanImport dedupes CSV files against your CRM records and imports them. To learn more about these products, visit www.activeprime.com/learn/oraclecrmod/. Sign up for a free trial through ActivePrime FreeTrial.


Last Revision: 4/15/2014