SQLTeam.com | Weblogs | Forums

SSIS Advice


#1

Hi Gurus,

I need advice. Here what I am doing in SSIS.

Simple SSIS Package.

  1. Execute SQL Task (Insert file name in the table)
  2. Data Flow Task (My Source is Flat File, Flat file could be 20k to 100k records or may be more, Couple of Transformation (Derived Column & Data Conversion) and insert the data into SQL Database table as a destination.

Here is my question is, This Package could run 10 to 20 times a day. Destination SQL Table is my Production DB. What methodology should I use for avoid DEAD LOCK on the table,
Should I use LOCK TABLE on OLE DB Destination?
If I use SEQUENCE CONTAINER to roll back if for some reason package file, should be fine or not?
Any Other Advise to Avoid DEAD LOCK on the table would be great appreciate.


#2

If you enable TABLOCK on the OLE destination adapter, your insert will likely be faster. That would reduce the chance of deadlocks. Also you could implement a For Loop container and loop for some number of "tries" if you get errors. Alternatively you could leave your code as is and implement the retries in Agent.


#3

Thanks gbritton for your reply.
Also, you could implement a For Loop container and loop for some number of "tries"
Would you mind if you explain me in detail that part?

Two More questions Please..

  1. I know the best practice is read the data from Flat File and insert into Staging Table and then Staging to Production DB. Do you think that's the best way to go? and I know it will not faster than Reading data from a Flat file and Insert into Production.

  2. I am going to schedule this SSIS Package through SQL Server agent. The first step if the File is in the Folder run the package. Here my question is, It is easy for me to Schedule this job from Dev Server so I can easy control it, Drop my Package to Network Drive and Schedule on Dev Server to Insert the data into Production OR Drop the Package into Production Server and Schedule on Production Server to Insert the data into Production.

Please advice Thank You.


#4

Read up on the for loop container: for loop . The basic idea is, put your tasks in the body of the loop. if it succeeds, set the for loop variable to stop the loop, otherwise try again. if you are on 2012, you can use an expression task in the loop to set the variable, otherwise (2008R2 and below) you'll need a script task to do it.

The alternative is also easy: do it in Agent. Specify that the job step should repeat some number of times upon failure. In your case, that might be easier. You could try them both and go with the one you like best.

  1. yes, so if the production load fails, you don't have to process the flat file again, since you've already staged it. It's a good candidate for checkpoints in your package.
  2. as long as you have a DEV dependency, you'll never be free of the job. If you can promote it to PROD, however, your production team has responsibility and you can move on to other stuff.

#5

Heh... my advice would be to stop using SSIS. :laughing:

What is the clustered index based on? I ask because that's normally the key to high performance batch inserts. Generally speaking, it should be narrow, unique, immutable, and ever-increasing so as to prevent massive page splits when inserting large volumes of data into an OLTP table.

And, yes, I'd definitely use a staging table instead of trying to do any transformations on the fly. That way if something goes wrong, you won't have to do a rollback on a busy table if the data doesn't meet expectations.

I also seriously agree with gbritton. You should never have a Dev job schedule or job running data into production. Dev shouldn't even be able to see production never mind do anything with it. You'll never pass an audit that way and it could be the cause of reading about you and your company in the paper when someone steals your data for sale to the highest bidder.