So I have created an SSIS package that will truncate, recreate and modify a staging table and then insert or update to a final destination table. The table has 77 columns and there may only be one of those columns that has a change and needs to be updated. I am wondering what the easiest way is to match the data in my source table against my destination and update only those columns that have changed in the source data. I would like to avoid writing an if statement for all 77 columns that checks to see if there are any changes.
Try the Slowly Changing Dimension transformation
You can generate the CASE statements in the SET part of the UPDATE. But to determine which rows need updated, I'd use CHECKSUM() on all the relevant columns, or perhaps EXCEPT (which I believe may itself internally use some hash value like CHECKSUM to check for differences).
Thank you both - I will look into both options when I am able to get back to work on this.
Scott(or others) - I can use checksum() but here is my situation... I have as I mentioned 77 columns not all of which have data in my source table. I also have some columns where the data may be different but I do not want to overwrite it. I understand I can use the checksum on the row level and if any of the data in that row is different I can overwrite the row. But what if I have a yes or no column and the destination table is 'N' and the source is 'Y' as a default. I don't want the Y to overwrite the 'N'. As I am writing this I am realizing that I may be looking for a simple solution to what may be a more involved process than I was expecting.
Interested to hear your thoughts.
SCD, SCD, SCD, SCD -- you really want to use this transformation