Home > SQL Server, SSIS > SSIS Lookup as SQL NOT EXISTS

SSIS Lookup as SQL NOT EXISTS

2013/09/17

First of all: IF NOT EXISTS is not the best optimized TSQL : IF EXISTS is much more efficient.

After i have remembered this (how many stored i’m constrained to read with NOT EXISTS in the text…) a typical job is to import updates from an Excel file or another external source, verifying that are not imported duplicated data.

The external source could be a SQL temp table where some program insert new records.

In a stored procedure using TSQL we can use IF EXISTS , for example we can read from a temp table and insert with a syntax as

INSERT INTO Table
SELECT <fields> FROM TempTable WHERE NOT EXISTS (SELECT <fields> FROM Table WHERE <TableKeyFields> = <ImportTableKeyFields>

If we must use a SSIS package that reads for example from an Excel file we can implement a NOT EXISTS logic with the Lookup transformation.

Sometimes is required some treatment to the Excel file: for example i have an destination field for the Litres field in the database table (for trucks refuels) which is numeric and in the Excel file the corresponding field is padded with zeroes to the left in order to have a fixed length ( for example 0099.50 , 0119.67) and this is a problem with Derived Columns that i will search to resolve.

I have tried in the Advanced Editor of the Excel Source to specify a numeric DataType for the Output column, but there were still the leading zeroes (a bug?).

For the moment i found an simple solution using a macro in the Excel file with a “Sheet1” sheet and 179 rows to import :

Public Sub ChangeLiters()
    Dim i As Long
    Dim objRange As Range
    Dim strToSubst As String
    ' in this case i'm sure to always have values...
    For i = 2 To 180
        Set objRange = Worksheets("Sheet1").Range("E" + CStr(i) + ":E" + CStr(i))
        strToSubst = objRange.Text
        While Mid$(strToSubst, 1, 1) = "0"
            strToSubst = Mid$(strToSubst, 2)
        Wend
        objRange.Value = Trim(strToSubst)
    Next i
End Sub

With the pre-treatment my Excel file is ready for the import and the Liters column causes no more errors at the Derived Column transformation , that we see now.

After placing an Excel source is very useful an Derived Column transformation , for example we need to specify fixed values for some columns and we need to create columns that can be joined with the ones in the real db table having the same type.

For example in my case i created these Derived Columns


After this i placed an Lookup Transformation, which is the core trick.

First of all i joined the Derived columns from Excel to the real db columns (the ones with GTIASG010_ prefix)


And in the General properties i specified this:


This means that the rows from Excel that are NOT present in the final db table(we have joined the fields , in this sample 9 joins, in order to to find the existing rows that are equal to the incoming row from Excel so these are already existing and must be NOT imported) are “redirected” in the SSIS flow to the error output of the Lookup transformation.

The error output in this case has not the meaning of exception (as true runtime errors for data types or communications errors and so on) management, but is simply an alternate routing.

With an Conditional Split we can emulate additional SQL WHERE clauses:


So at the end this is the complete package:


Running the package a first time and then after adding a line in the Excel source, with another run we can see


In the real package there is a OLEDB destination, the Union All that in this post serves only to place the final Data Viewer, in order to show the result in the image.

Advertisements
Categories: SQL Server, SSIS
%d bloggers like this: