Author: Monika Sahni

Links: Enterprise SolutionsBusiness Intelligence and Appliance Solutions

EPISODE 9: Removing Duplicate Records from USPTO data


In episode 9, we will focus on removing duplicate records from the staging table.

In episode 8, we covered inserting USPTO data from Dell Boomi into a SQL table. However, the table contains duplicate data. In order to remove the duplicates, you first need to create a staging table. The staging table allows you to remove duplicate data by running a stored procedure before you load the contents of the staging table into destination tables.

Create Stored Procedure to Remove Duplicates

Create Stored Procedure

Create a stored procedure by using the create database script. Using common table expression (CTE), duplicates can be removed from a table. CTE is a feature available in SQL 2005 and later versions of SQL server. This stored procedure removed duplicates from the stage table.

Link to code: SQL data deduplication script


Moving Data from the Staging Table to the Destination Table


This stored procedure has code to move data from the staging table to the destination table.


Calling Stored Procedure from Dell Boomi


Please refer to the final process in episode 8 for importing csv to database. Building on that process, you will need to add a program command.


Program Command


Drag a program command and enter the name of the stored procedure.



Then run the test for importing the csv into the database as shown in episode 8.  Then check the database in the staging table for any duplicates in the table.