I have an SQL query that updates 1 column in ~3000 rows on a linked server. The query takes about 2.5 hours to run, which seems like a really long time. I have drilled into waits on the source server and I see an OLEDB wait with the server name and the spid on the target. I drilled into the waits on the target server and I find FETCH API_CURSOR wait...
Also note that the source server is a standalone SQL Server instance and the target server is an Active/Passive SQL Server Cluster. I did notice that MSDTC services were started on both servers, but MSDTC is NOT a cluster resource on the target...
The linked server is in the same datacenter - gigabit network. Here is the update statement... Please advise as to what is needed regarding table structures... thanks...
UPDATE A
SET TEXTSNAM = CAST(xpx.pxp as CHAR(10))
, NAMECUST = xpx.pxp
, TEXTSTRE1 = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingAddress1
ELSE xpx.AddressLine1
END
, TEXTSTRE2 = CASE WHEN xpx.BillingAddress1 <> ''
THEN ISNULL(xpx.BillingAddress2,'')
ELSE ISNULL(xpx.AddressLine2,'')
END
, NY = CASE WHEN xpx.BillingAddress1 <> ''
THEN [By] ELSE xpx.City
END
, CODESTTE = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingState
ELSE xpx.State
END
, CODEPSTL = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingZip
ELSE LEFT(xpx.Zip, 5)
END
, NAMECTAC = ISNULL(xpx.ContactName,'')
, TEXTPHON1 = ISNULL(xpx.Phone1,'')
, TEXTPHON2 = ISNULL(xpx.Fax,'')
, EMAIL1 = ISNULL(xpx.PEmail,'')
, WEBSITE = xpx.PID
, CTACPHONE = ISNULL(xpx.Phone1,'')
, CTACFAX = ISNULL(xpx.Fax,'')
FROM
LinkedServer.DBNAME.dbo.AEAxjn A
JOIN
CCMX.dbo.xpx xpx on A.IdCt = xpx.CNo
WHERE
xpx.ChId = 1
AND
LEN(xpx.CNo) = 4
AND
xpx.BAct = 1
AND
A.IdGrp = 'LLJ'
AND
xpx.CNo IN
(
SELECT
CNo
FROM
CCMX.dbo.xpx
GROUP BY
CNo
HAVING
COUNT(*) = 1 --5000 rows
);
This is one of the nasty surprises you can get when using linked servers. In this case - SQL Server cannot determine the fastest method for getting the data because of the join across the servers - so instead it is doing the job row by row (hence the FETCH API_CURSOR waits.
To fix this you should put your procedure on the destination server so you are updating data locally. Then - instead of joining directly across the linked server - pull that data across the linked server and join to the local table. Now, whether or not you use a temp table - or a permanent table needs to be determined also...if you are using that data for other processes and do not need up to the minute data then you can build a permanent table and build a separate process to update that table on a scheduled (usually daily works).
If this is the only process requiring that data - then in your procedure pull the data into a temp table and join to the temp table.
Note: if you refresh the data on a daily basis it will be much better to use SSIS, BULK INSERT or BCP to refresh the data instead of a linked server. In all three of these there are options available to control the batch and commit sizes so you don't bloat your transaction logs and the load processes will take less time.
However, the query is a part of an overnight job that runs from the source server. I have considered a CTE or temp table... I am testing, but moving the code to run on the target server is my last resort, due to all of the other dominoes that would fall related to moving the code...
Also, yosiasz: the sub-select is in the query so that records are excluded when more than one CNo exists in the xpx table... thanks
IdCt is a PK with a clustered index of type CHAR(12) - (NO NULLs allowed)
xpx.CNo (NULLS allowed) is a VARCHAR(10) column with a nonclustered index on this column too...
try the following. sometimes it is best to break things down in batches to get some performance improvement and troubleshoot. You add this temp table creation in your stored procedure. if you want you can create a sample sproc to test it out, if you do not want to mess yoru current sproc and/or you do not have a pre-prod environment loaded with same data.
create table #tmp_AEAxjn(IdCt CHAR(12) not null primary key)
insert into #tmp_AEAxjn
select distinct IdCt
from LinkedServer.DBNAME.dbo.AEAxjn A
where A.IdGrp = 'LLJ'
UPDATE A
SET TEXTSNAM = CAST(xpx.pxp as CHAR(10))
, NAMECUST = xpx.pxp
, TEXTSTRE1 = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingAddress1
ELSE xpx.AddressLine1
END
, TEXTSTRE2 = CASE WHEN xpx.BillingAddress1 <> ''
THEN ISNULL(xpx.BillingAddress2,'')
ELSE ISNULL(xpx.AddressLine2,'')
END
, NY = CASE WHEN xpx.BillingAddress1 <> ''
THEN [By] ELSE xpx.City
END
, CODESTTE = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingState
ELSE xpx.State
END
, CODEPSTL = CASE WHEN xpx.BillingAddress1 <> ''
THEN BillingZip
ELSE LEFT(xpx.Zip, 5)
END
, NAMECTAC = ISNULL(xpx.ContactName,'')
, TEXTPHON1 = ISNULL(xpx.Phone1,'')
, TEXTPHON2 = ISNULL(xpx.Fax,'')
, EMAIL1 = ISNULL(xpx.PEmail,'')
, WEBSITE = xpx.PID
, CTACPHONE = ISNULL(xpx.Phone1,'')
, CTACFAX = ISNULL(xpx.Fax,'')
FROM CCMX.dbo.xpx xpx
WHERE
xpx.ChId = 1
AND
LEN(xpx.CNo) = 4
AND
xpx.BAct = 1
AND
xpx.CNo IN
(
SELECT
CNo
FROM
CCMX.dbo.xpx
GROUP BY
CNo
HAVING
COUNT(*) = 1 --5000 rows
)
and exists (select 1 from #tmp_AEAxjn src where src.IdCt = xpx.CNo) ;
So, I thought about the question about datatypes and decided to make the change to convert varchar to char in the UPDATE statement... the job step went from ~2 hours to 6 secs... I will still have to dig into my trace to see what went on during the 6 secs execution. I guess that I took it for granted that char and varchar would not make a difference, but I was wrong!
Thanks to everyone for offering their input and assistance... I will watch this for a few more days, just to make sure that the code change sticks.
An "odd" thing on this morning run of the job step - it took 45 mins to run... SMH... I will have to check to see if there was any blocking... From 2+ hours to 6 secs to 45 mins ???
yosiasz, I think that I will have to explore the code that you posted... thanks
Well, no one yelled at me, but I did some yelling... I was totally surprised by the 45 min execution time. I took a look at the link that you posted. I can only conclude that my daily statistics update could be the problem...
It's hard to tell at this moment... My trace didn't show any blocking at the time that the code was executing.... I checked my WaitsReport and noticed that the ASYNC_IO_COMPLETION wait type was nonexistent on the day that the code executed in 6 secs, but that ASYNC_IO_COMPLETION wait type was at 11755 secs (195 mins) on today's run...
I am pretty sure that this process is covered in todays wait stats... The question is 'what was it' - I am beginning to think that the target linked server may need more attention, too....
I am going to look at implementing the code that you posted also...