SQLTeam.com | Weblogs | Forums

SSIS Script I do not 💘 you


#1

Any other option to running SSIS Script? I want to do it outside of SSIS package cause I am tired of opening that script ever single time I have to debug, seems to be ancient and quirky and irritating ! :rage:


#2

SSIS Script = Data Import Mapping (DIM :slight_smile: )

What are you trying to do? ETL perhaps?

We still use SProcs for Data Loading - albeit they are mechanically written. We've never migrated to DTS ... SSIS ... nor something from a 3rd party

Here's what we do:

"Acquire" data - ideally a either direct SQL query to a remote DB or if that is not possible import-from-file - BCP usually.

If we have to import from file we pull that into an IMPORT table. Exact replica of the table/columns in the import file. We have a BATch file for that, it just cycles around a list of expected filenames and target Import Tables. Pretty good error logging - we MOVE the output files to an ARCHIVE01 folder, and rename (cycle) the last 50 folders - so we have History of the last 50 runs.

We then use a STAGING table. This is freshened up directly if remote DB connection available, otherwise from the IMPORT table. We have two additional columns - Action and ChangeDateTime. Action = Upsert, Delete, Reprocess. So each row in the Staging table is either new/changed data, or a row that is now marked Delete - no longer exists on remote. Staging table only contains columns that are needed downstream, but will also include columns that "might" be needed in future (to avoid having to pull all data again). We expect the remote to have ChangeDateTime column on all tables, but of course some don't - so for those we have to do every-column, every-row comparison to figure out what has changed. In that case we try to do that nearer to the Source - e.g. on the Source server, or on a server physically near the source. Then we can just pull "changed since last time" rows from Remote Staging Table to our Local Staging Table.

So now we have all rows, in original DB layout (tables and columns, but maybe not all columns) with a ChangeDateTime, so we then Transform and Load any changed rows into the Destination DB.

The ChangeDateTime is when WE acquired that record, it might have been changed "some time yesterday", so that is strictly chronological for us to process "New since last time"

For this purpose we create an "Import VIEW" on the Staging Table - all the columns needed with a presentation similar to the Destination Table - target column names, that sort of thing, probably also any TRIM, NullIf(MyColumn, ''), usually any COLLATION changes too.

CREATE VIEW MyTable1_Import
AS
SELECT MyCol1 = TheirCol7
     , MyCol2 = RTrim(TheirCol92) COLLATION MyCollation
     , MyCol3 = NullIf(TheirCol17, '')
     , ...
FROM TheirTable4_Staging
    JOIN TheirTable5_Staging
        on T5_Col21 = T4_PKey

Then we use an SProc to Upsert the actual table - basically just mapping, 1:1, the columns in Import View with the Target table. That Upsert SProc checks for all sorts of easily detected errors, such as column length in Staging Table's varchar column exceeds defined length in Target Table (truncate column to max len); FKey record not found (critical error, that row excluded from import). All errors are logged in a single "Data Validation Errors" table, which the User can then view and sort out.

Any critical errors (which prevent the row being imported) cause the Staging Table to be marked as Action=Reprocess. That row will be reattempted "next time" regardless of its ChangeDate. Action-=Reprocess is reset to Upsert if/when the critical error clears later (or will be overridden to Delete if the row disappears from Source DB))

Upsert Sproc basically does "Update anything that has StagingTable.ChangeDateTime more recent than LastTime", "Insert anything where PKey not found", "Delete anything (more recent than LastTime) marked as Action=Deleted in Staging table"

We can introduce bespoke Transformations in the Upsert Sproc but they are rare (if possible/practical we do it in the Import View because it is much easier to change that - deployment is easier (particularly the QA approval required) than a change to Upsert Sproc. Some things, such as "Column not a valid date" could be handled in Import View but it would be very verbose, so easier to do that in the SProc. Mostly the data we are importing is already correct data type, so our only "simple" validation is "Import data too long" - usually because the Source column DDL has been made longer and no one bothered to tell us!

We have reports to help analyse the Data Validation Errors log. On companies I have done consultancy at, and which use SSIS, they seem to spend all morning trying to figure out why an SSIS Job had failed ... our SProcs all log the fact that they started, and then "update" that log record when they finish successfully, so any "Started but not finished" log record would be an Sproc that failed. We can just run that Sproc (in a transaction, which we rollback) to see what the actual errors was (we have a @Debug parameter in the SProc which enables more detailed output), and then put a "catch" for that particular Bad Data /Whatever so that next time it reports into the Data Validation Errors table and the User can sort it out, not us! The "SProc Run Log" also tells us if something is taking longer than "normal", or if a run terminated, and so on. User has a report to be able to see the last-run-date of each import, so is self-sufficient for "Is my data up to date" ... indeed, some of our APPs say "Data import dd-Mmm-yyyy" so the user can see that, right there on the screen.

I have probably answered a completely different question?!!


#3

Tons. What is it that you're actually trying to do?

And, yeah... I agree with Kristen. You can do all sorts of ETL and other things that many can't even dream of doing in SSIS.


#4

Thanks folks @Kristen, you need to write a book! :grin: ok, what I am trying is extract attachments from emails (using Microsoft.Exchange.WebServices), move the email to another Mail folder. Ingest the content of attachments, archive. All is working well but the part that irritates me is opening that Script box.

Was wondering if I could maybe write my c# .net code as a console app and call it from SSIS.

Thanks


#5

You don't need to do anything so severe. A little VBA on the system that receives the emails should do. You can find some great examples of what folks have already done at the following GOOGLE search.
https://www.google.com/#q=automatically+save+outlook+attachments+to+hard+drive


#6

I thought I just did?!!

Yeah, I definitely answered the wrong question ... we use a Document Management System for that :yak:

FWIW we use iManage on top of Autonomy, may be more relevant for Law Firm type companies than other market sectors