Query runs fine locally but very slow as a job

Hey.
Hey.
I have a query that runs relatively OK locally (3-5 minutes), but when it runs from a job it takes about 50 minutes.
From what I can see, SET NOCOUNT is ON, and I haven't touched any parallelism settings (MAXDOP etc.).
The query runs as part of a longer list of queries in one go, so I was wondering whether running it outside the list, individually, would make any difference. Or any other thoughts?

For the record the query is this one:

DECLARE @365DaysAgo DATETIME = DATEADD(DAY, -365, GETDATE());
UPDATE tblDWConcession
SET LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode,
    LoyaltyClubID = tblDWLoyaltyMember.LoyaltyClubID
FROM tblDWConcession
INNER JOIN tblDWPayment
    ON tblDWPayment.CCode = tblDWConcession.Code
    AND tblDWPayment.TransNumber = tblDWConcession.TransNumber
    AND tblDWConcession.SalesTransCDate > @365DaysAgo
    AND PaymentDate > @365DaysAgo
    --AND tblDWPayment.LoyaltyMemberCode IS NOT NULL
INNER JOIN tblDWLoyaltyMember
    ON tblDWLoyaltyMember.LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode
WHERE tblDWConcession.LoyaltyMemberCode IS NULL
    AND (tblDWConcession.IsProcessCompleted = 0 OR tblDWConcession.IsProcessCompleted IS NULL);

Maybe if I put the data of tblDWConcession into a temp table? But that table is 42 GB.

The problem is more likely blocking, not the code being in an SP. You really need to do the update in batches of fewer than 5000 rows. To do this you need to know the PK of tblDWConcession. Of course you cannot be bothered to post DDL etc., so you can google how to do this.
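A minimal sketch of that batching pattern, built from the query posted above (the @BatchSize value and the UPDATE TOP loop are illustrative; a keyset loop over the PK is the more general form when the WHERE clause cannot exclude already-processed rows):

```sql
-- Illustrative batching loop: each pass updates at most @BatchSize rows.
-- The LoyaltyMemberCode IS NULL filter makes each pass skip rows already done.
DECLARE @365DaysAgo DATETIME = DATEADD(DAY, -365, GETDATE());
DECLARE @BatchSize INT = 4000; -- keep under 5000 to avoid lock escalation

WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) C
    SET LoyaltyMemberCode = P.LoyaltyMemberCode,
        LoyaltyClubID = L.LoyaltyClubID
    FROM tblDWConcession C
    INNER JOIN tblDWPayment P
        ON P.CCode = C.Code
        AND P.TransNumber = C.TransNumber
        AND C.SalesTransCDate > @365DaysAgo
        AND P.PaymentDate > @365DaysAgo
    INNER JOIN tblDWLoyaltyMember L
        ON L.LoyaltyMemberCode = P.LoyaltyMemberCode
    WHERE C.LoyaltyMemberCode IS NULL
        AND (C.IsProcessCompleted = 0 OR C.IsProcessCompleted IS NULL);

    IF @@ROWCOUNT < @BatchSize BREAK; -- last batch done
END
```

Small batches keep each transaction short, so blocked readers wait milliseconds instead of minutes and the update never escalates to a table lock.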

1 Like

Yes, but the problem is not really the amount of updated data.
Usually the rows to be updated are well below 5000, so updating in batches makes no sense: the first batch would also be the last.

These are my thoughts:

42GB of that table:
Database table partitioning in SQL Server (sqlshack.com)

Check if SalesTransCDate and PaymentDate have an index.
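If it helps to double-check, the catalog views list which columns each index covers; something like this (table names taken from this thread):

```sql
-- List every index and its key columns for the two tables in the query
SELECT i.name AS index_name, t.name AS table_name,
       c.name AS column_name, ic.key_ordinal
FROM sys.indexes i
JOIN sys.tables t ON t.object_id = i.object_id
JOIN sys.index_columns ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE t.name IN ('tblDWConcession', 'tblDWPayment')
ORDER BY t.name, i.name, ic.key_ordinal;
```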

WHERE tblDWConcession.LoyaltyMemberCode IS NULL AND (tblDWConcession.IsProcessCompleted = 0 OR tblDWConcession.IsProcessCompleted IS NULL);

WHERE tblDWConcession.LoyaltyMemberCode IS NULL AND ISNULL(tblDWConcession.IsProcessCompleted,0)=0;

You can also use a CTE, first you select the rows to update and then update the table.

You can post your plan so we can check it much better:
Instructions - Brent Ozar Unlimited®

1 Like

If you mean to say the job runs a longer list of queries, and that this particular query is embedded in it, could the full list be running in a transaction? Or with a high transaction isolation level? Either of these could possibly cause, or be affected by, blocking with other processes.

1 Like

Hi. Partitioning is a good call; we should have that running as soon as we migrate, since I'm not sure how well 2008 R2 does the job.
The query runs in series inside an SP with hundreds of inserts and updates. We don't touch the isolation level. Let me check to be sure... Yep, there is no special isolation level. I also have a log recording the timings, and all the other queries run in less than a minute. The indexes you mentioned are also in place.
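For what it's worth, the session's isolation level can also be verified directly against a standard DMV instead of reading through the SP; the default READ COMMITTED shows up as 2:

```sql
-- 1 = READ UNCOMMITTED, 2 = READ COMMITTED (default),
-- 3 = REPEATABLE READ, 4 = SERIALIZABLE, 5 = SNAPSHOT
SELECT transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;
```

Run from inside the job step itself if possible, since the job's session may be configured differently from an SSMS session.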

Now, I fixed the names a little in the SP, so this is the new one; just let me see how I can attach images.
The new SP is:
DECLARE @365DaysAgo DATETIME = DATEADD(DAY, -365, GETDATE());
UPDATE tblDWConcession
SET LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode,
LoyaltyClubID = tblDWLoyaltyMember.LoyaltyClubID
FROM tblDWConcession
INNER JOIN tblDWPayment
    ON tblDWPayment.CinemaCode = tblDWConcession.CinemaCode
    AND tblDWPayment.TransNumber = tblDWConcession.TransNumber
    AND tblDWConcession.SalesTransCinemaDate > @365DaysAgo
    AND PaymentDate > @365DaysAgo
    --AND tblDWPayment.LoyaltyMemberCode IS NOT NULL
INNER JOIN tblDWLoyaltyMember ON tblDWLoyaltyMember.LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode
WHERE tblDWConcession.LoyaltyMemberCode IS NULL AND (tblDWConcession.IsProcessCompleted = 0 OR tblDWConcession.IsProcessCompleted IS NULL);

Plan1

Plan2

Plan3

Just note that I did a SELECT to get the plans, because the job is still running against those tables.

so:

DECLARE @365DaysAgo DATETIME = DATEADD(DAY, -365, GETDATE());
SELECT C.LoyaltyMemberCode, tblDWPayment.LoyaltyMemberCode,
    C.LoyaltyClubID, tblDWLoyaltyMember.LoyaltyClubID
FROM tblDWConcession C
INNER JOIN ...

Do you get better results when you use the RECOMPILE hints on individual statements like mentioned in this article?

RECOMPILE Hints and Execution Plan Caching - Brent Ozar Unlimited®

1 Like

Hi.
No, unfortunately I don't get better results.

Hi.
By updating statistics today, the index seek went from 54% to 42% of the plan cost.
That's something, but not optimal.
What I can see is that the estimated number of executions does not change even if I reduce the DATEADD to 3 or 4 days. It will still go through a million rows to bring back 4-5. Is there a way to find out why this is happening and possibly fix it?

Edit: It also took the hash match from 26% down to 3%.
I'm not sure those numbers will stick, though. I don't think I will be changing the query except to fix something, not to pull other columns or data.

Yesterday I updated statistics WITH FULLSCAN.
There is some difference in running time, but 15-20% at most.
So I don't know what else I can do. The only thing that comes to mind is that @365DaysAgo might be interfering: instead of giving a smaller set, it may actually mess up the execution plan.
I can check that tomorrow. Other than that...
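One way to test that hypothesis is to compare the plan against a version with the cutoff hard-coded: with a local variable the optimizer estimates from the statistics' density vector, while a literal lets it use the histogram. A sketch, using the query from this thread (the literal date below is a placeholder for whatever DATEADD(DAY, -365, GETDATE()) currently evaluates to):

```sql
-- Same update, but with the cutoff as a literal instead of a variable,
-- so estimates come from the histogram rather than the density vector.
-- '20240301' is a placeholder; substitute the actual date 365 days back.
UPDATE tblDWConcession
SET LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode,
    LoyaltyClubID = tblDWLoyaltyMember.LoyaltyClubID
FROM tblDWConcession
INNER JOIN tblDWPayment
    ON tblDWPayment.CinemaCode = tblDWConcession.CinemaCode
    AND tblDWPayment.TransNumber = tblDWConcession.TransNumber
    AND tblDWConcession.SalesTransCinemaDate > '20240301'
    AND PaymentDate > '20240301'
INNER JOIN tblDWLoyaltyMember
    ON tblDWLoyaltyMember.LoyaltyMemberCode = tblDWPayment.LoyaltyMemberCode
WHERE tblDWConcession.LoyaltyMemberCode IS NULL
    AND (tblDWConcession.IsProcessCompleted = 0 OR tblDWConcession.IsProcessCompleted IS NULL);
```

If the estimated rows drop sharply with the literal, the variable is indeed what is inflating the estimates.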

If you want another set of eyes, can you execute the script locally after running the statements below and paste the results? The details are lost in the GUI images you posted.

SET STATISTICS TIME ON
SET STATISTICS IO ON
SET STATISTICS PROFILE ON

I'm not sure it will show correctly here. Is there some way to paste them Excel-like?

I tried to squash them into an image, in case that helps:

The last line is EstimateExecutions
NULL
1
1
1
1
1
1
1
1
1
1006741

Press Ctrl+T to switch to text results instead of the data grid. Also, in Options, set the text results maximum column width to 8000. Try pasting the result in a blockquote. But I'm guessing the query you posted the screenshot of is actually performing fine, isn't it?

It performs fine, but it still has the large estimated-number-of-rows issue.
Talking here and there, I ended up with this UPDATE statement that uses CTEs, so that the indexes on each table get used.
It shows better performance in the execution plan, and I will try it in the night job batch to see how it performs:

WITH
C AS (
    SELECT CinemaCode, TransNumber, LoyaltyMemberCode, LoyaltyClubID
    FROM tblDWConcession
    WHERE LoyaltyMemberCode IS NULL
        AND (IsProcessCompleted = 0 OR IsProcessCompleted IS NULL)
        AND SalesTransCinemaDate > (SELECT DATEADD(YEAR, -1, GETDATE()) AS OneYearAgo)
),
P AS (
    SELECT T.LoyaltyMemberCode, T.CinemaCode, T.TransNumber, L.LoyaltyClubID
    FROM tblDWPayment T
    INNER JOIN tblDWLoyaltyMember L
        ON L.LoyaltyMemberCode = T.LoyaltyMemberCode -- index on both LoyaltyMemberCode columns
    WHERE T.PaymentDate > (SELECT DATEADD(YEAR, -1, GETDATE()) AS OneYearAgo)
)
UPDATE U
SET U.LoyaltyMemberCode = P.LoyaltyMemberCode,
    U.LoyaltyClubID = P.LoyaltyClubID
FROM tblDWConcession U
INNER JOIN C ON C.CinemaCode = U.CinemaCode AND C.TransNumber = U.TransNumber
    -- combined index on both columns
INNER JOIN P ON P.CinemaCode = C.CinemaCode AND P.TransNumber = C.TransNumber;

Not many ways to do this: either it will use the index on SalesTransCinemaDate or it will not.
Force it to do so and see what happens:

UPDATE U
    SET U.LoyaltyMemberCode = P.LoyaltyMemberCode,
        U.LoyaltyClubID = P.LoyaltyClubID
    FROM tblDWConcession U  WITH (INDEX ( idxSalesTransCinemaDate )) -- whatever the date index is called
    INNER JOIN tblDWPayment P ON P.CinemaCode = U.CinemaCode AND P.TransNumber = U.TransNumber
    WHERE U.LoyaltyMemberCode IS NULL AND U.IsProcessCompleted IS NULL
      AND SalesTransCinemaDate>(SELECT DateAdd(Year, -1, GetDate()) As OneYearAgo);

Note:
Estimated IO is important; estimated rows is less important. See your screenshot. Row 8 is an index scan of 1 million rows, 24 times. Index scans are fast, and almost certainly cached. A lot, but no problem.
Row 10 is an index seek. That is relatively expensive IO, and it may not be cached, depending on many things. Still, 162,940 seeks, 24 times, is nothing SQL cannot handle. Production will be much different.

If you want to, you can create a new covering index on SalesTransCinemaDate, which may speed up the lookups against tblDWPayment:

CREATE INDEX idxSalesTransCinemaDate_2
ON tblDWConcession (SalesTransCinemaDate)
INCLUDE (CinemaCode, TransNumber);
1 Like

I will try the index as a last test, but I don't see any light in this.
Unfortunately the job has not improved.
I don't think it has to do with the execution plan; something goes wrong when running it as a job.
My last thought is to put the query in its own job step, but that is just a wild guess.

As a test I did a SELECT adding the WITH (INDEX) hint, but the trouble now is that the execution plan complains about missing indexes it did not complain about before. Note the "missing" indexes are actually in place, but using the hint somehow loses them. Also, those missing indexes are the more important ones, I think (CinemaCode and TransNumber), since they form the PK index.