SSIS vs LinkServer

Why the heck is Link Server so much slower than a connection in SSIS (OLE DB), going to a vendor database? It is the exact same server, database and credentials.

For this reason, I need to do this in SSIS.

The vendor database has about, lets say, 1,000 tables. I need to go through every table and grab the latest records which one thing in common for all tables is that it has a field: Processed_Date

So the basic query will be like this:

SELECT * FROM ?
WHERE Processed_Date = GETDATE()

The result of each table needs to be INSERT into the corresponding table or a SELECT INTO in my local database.

I cannot use OLE DB Source and Destination control because that would require manual mapping. I would like to keep this dynamic by table names.

I was able to achieve this with LinkServer & OpenQuery but I can probably run to Alaska and back by the time it finish.

USE [MyTestDatabase]
SET LOCK_TIMEOUT -1

--PURPOSE: Drop Tables
EXEC sp_MSForEachTable @command1 = 
'
DROP TABLE [MyTestDatabase].?
'
, @whereand = 'AND o.Name LIKE ''zTEMP_%''             '


--PURPOSE: Load Tables
EXEC sp_MSForEachTable @command1 = 
'
DECLARE @TableName			AS VARCHAR(255) = REPLACE( REPLACE( REPLACE(''?'', ''['', ''''),	 '']'', ''''),		''MySchema.'', '''')
DECLARE @MAXModifiedUtc		AS VARCHAR(100) = (SELECT MAX(ModifiedUtc) FROM ?)

DECLARE @SQL		AS VARCHAR(MAX) =
	''SET LOCK_TIMEOUT -1'' + CHAR(13) + CHAR(10) +

''SELECT * INTO [MyTestDatabase].[dbo].zTEMP_'' + @TableName + '' FROM OPENQUERY'' + CHAR(13) + CHAR(10) +
''(MyLinkServer, '' + CHAR(13) + CHAR(10) +
'''''''' + CHAR(13) + CHAR(10) +
	''SELECT * FROM '' + @TableName + CHAR(13) + CHAR(10) +
	''WHERE ModifiedUtc > '' + @MAXModifiedUtc + CHAR(13) + CHAR(10) +
'''''''' + CHAR(13) + CHAR(10) +
'')''

EXEC(@SQL)
'
, @whereand = 'AND o.Name NOT LIKE ''zTEMP_%''             '

it might not be the linked server itself. could be how you are using it after the connection happens?

Does this column have index: Processed_Date ?

Also you might want to do this a different way. Same sproc on each database that pushes out the data you need. Push instead of pull to a certain centralized non vendor database. What is the purpose of this? Analytics?

One reason this takes longer when using a linked server is that it has to be done in a single transaction. Using SSIS with an OLEDB Destination you control the batch and commit size which allows for smaller batches to be committed.

The other problem is how SQL Server chooses to process the query over the linked server. This query can be sent across as a cursor (returns one row at a time to be processed locally), or it can decide to pull all data locally then apply the filter - or it can choose another path.

Not sure why you think doing this dynamically is any better than defining the actual columns you need from the source system and only pulling those columns.

1 Like

Linked servers are a pain. When you put a WHERE clause in, does it pass that WHERE clause across the linked server? Or pull all the rows and process on this side.

Using OPENQUERY is good because you know exactly where every piece runs.

1 Like

Hi,

Yes, Link Server sucks! When I use it with OpenQuery (PassThrough), it basically pass the query to the vendor and get it processed on their server and then returns to our side. At least, that is what I assume. In some cases, the OpenQuery does super well when the result set is small but when it has to return more than a 1,000 records with 100 fields, the performance is just as bad as using the LinkServer naming convention.