I have a transaction table that contains a large number of rows (more than 40 millions) ,
I want to design SSIS package to transfer data from the transaction table into the data warehouse, which would be an incremental load.
which mean , IF transaction.ID not exists in the destination table --> then --> Insert new row
My question is : what is the best approach for the incremental load? while I need to transfer data every 15 minutes.
Query for the max id that's in the DW and then use that to determine which rows to pull over?
Thank you for your reply .but I can't use max id , the Transaction.ID is not inserted in a sequential order . any suggestion ?
I can't use Lookup transformation ,I have a large number of rows in Transaction table and I need to run the package every 15 minutes .
We have CreateDateTime and UpdateDateTime columns in all (pretty much!) tables. Solved the problem of getting "newer than last time" rows from one place to another.
However, to delete stale rows we have to pull all PKeys and do a NOT IN / NOT EXISTS type check. If this is important to you might be worth storing deleted row PKeys into a history table, specifically to allow a "delete all since last time" operation (being aware of any PKey deleted, and then recreated [with same PKey])
Thank you Kristen . in my case creation date represent transaction date . its different than record date . and some times the creation date not inserted in a sequential order.
Sounds like Transaction date is not the same as Creation date then You can maintain a Create Date using a trigger - and then use that to identify "new" records easily (assuming you do not ALSO need to know Modified Records?)
If you cannot modify the table structure then you could still have a Trigger, on the main table, but just insert the Create Date/Time and PKey into a different "These records need importing" table - even in a different database, if you like