Improve Performance of Trigger

Kristen · January 26, 2016, 1:21pm

In an [AFTER] trigger if I JOIN [inserted] to the actual table, on the PKey (which is also the Clustered Index) columns is there an efficient index usage?

What about if I join other tables?

This is a simplified version of my code. Its been running for 3hrs and I figure its updating about 1M rows ... I can;t figure what is taking so long, and I'm reluctant to stop it (to add debug code and/or analyse the query plan) for a) the time it might take to rollback and b) the time it might take to run it again!!!

CREATE TRIGGER dbo.MyTrigger
ON dbo.TargetTable
AFTER INSERT, UPDATE, DELETE
AS

UPDATE	U
SET
	[TableName] = COALESCE(T1.TableName, T2.TableName, U.TableName),	-- varchar(40)
	[TableGUID] = COALESCE(T1.TableGUID, T2.TableGUID, U.TableGUID)		-- uniqueidentifier
FROM	inserted AS I
	JOIN dbo.TargetTable AS U
		 ON U.PKeyID = I.PKeyID		-- INT			
	LEFT OUTER JOIN dbo.LookupTable AS T1
		 ON T1.TableGUID = I.TableGUID	-- PK_LookupTable(TableGUID) clustered, unique, PKey
	LEFT OUTER JOIN dbo.LookupTable AS T2
		 ON T2.TableName = I.TableName	-- IX_LookupTable(TableName) unique

I wonder if the two Outer Joins would be any different it I change

I.TableGUID

to

U.TableGUID

As this is an AFTER trigger the original table and INSERTED will have the same value, but perhaps the query plan will be better if JOINed to the Original Table, rather than INSERTED?

I'm also thinking that it might be smart to add a WHERE clause so that only changed rows are UPDATEd ... but too late for this particular run.

The Trigger is basically just a Belt & Braces lookup of both the GUID and the NAME of the associated record. Both are unique (in the lookup table) and BOTH are stored in the TargetTable (for legacy reasons ). There is a fair chance that the APP has already correctly filled in both the GUID and TableName columns and, thus, the UPDATE being made by the Trigger may only be updating records to their existing values ...

gbritton · January 26, 2016, 2:11pm

Should be, yes.

About those left joins: What if the result of a left join produces no rows from the right side? Do you have code to handle that situation (eg. something like:

where t1.tableguid is not null

Though that effectively makes it an inner join.

Kristen · January 26, 2016, 2:21pm

Should never happen ... <\THUD!>

But because this is a trigger and I want it to work in all instances, process all rows in INSERTED I opted for an Outer Join.

The SET statement has a COALESCE on it which will take the "best value" from either OUTER JOIN table or, if none, then leave the value as it was provided. This is some very "Belt & Braces" code.

I'm busy trying to figure out how I can get a Query Plan from the Trigger. Being an old dinosaur, and preferring to view it in TEXT I normally use

SET SHOWPLAN_TEXT ON

I can see it (in Result #2, in SSMS ... but all that "percentages" stuff and MouseOver to see a bit more is a PITA to me ... Perhaps I can save it to a .sqlplan fie and then examine that with something or other?

gbritton · January 26, 2016, 2:28pm

If it should never happen, change it to an inner join. Or, add code to handle the situation where there are no matching rows from the right side.

Kristen · January 26, 2016, 2:48pm

I have to allow for the fact that some mishap may occur such that a related row might not exist. It is critical that all [other columns for all] rows in INSERTED are updated, even if the row in the related table does not exist.

My approach to "add code to handle the situation where there are no matching rows from the right side" was the COALESCE in the SET but would welcome suggestions for alternatives.

gbritton · January 26, 2016, 2:54pm

OK -- it's probably the best you can do. but you may get longer processing times for your trigger.

Kristen · January 26, 2016, 3:43pm

I don't know (I need to test it) whether performance is hampered by referencing columns from both INSERTED and [TargetTable] - both will be the same at the point that the [After] trigger fires, and I've been a bit slack about sometimes referencing columns from INSERTED and sometimes from [TargetTable] - so perhaps I should ONLY Join to INSERTED into order to locate appropriate [TargetTable] rows that need processing in the trigger - that should just be a Clustered Index seek - and then only reference columns from [TargetTable]

... but I have had a go at using a CTE to "assemble" the data to hopefully then be able to easily determine which rows have changed and need updating.

I don't know how to get a Query Plan (from a Trigger) in plain text, but I can get the I/O Stats.

The original was:

Table 'TargetTable'. Scan count 1, logical reads 14
   CPU time = 0 ms, elapsed time = 15 ms.

Table 'TargetTable'. Scan count 1, logical reads 797
Table 'Worktable'. Scan count 0, logical reads 0
Table 'LookupTable'. Scan count 2, logical reads 12
   CPU time = 16 ms,  elapsed time = 9 ms.
   CPU time = 16 ms,  elapsed time = 24 ms.

The first result is the simple UPDATE statement (I get set a column to itself). The second is from the Trigger.

After I changed it to use a CTE I got:

Table 'TargetTable'. Scan count 1, logical reads 14
   CPU time = 0 ms, elapsed time = 3 ms.

Table 'TargetTable'. Scan count 2, logical reads 28
Table 'Worktable'. Scan count 1, logical reads 0
Table 'LookupTable'. Scan count 2, logical reads 12
   CPU time = 16 ms,  elapsed time = 5 ms.
   CPU time = 63 ms,  elapsed time = 55 ms.

The reduction on logical reads on TargetTable looks good, but the second CPU time has gone up a lot, not sure if that is absolute or "logical" value though.

Adding tests (for differences) between the re-calculated CTE column values, and the TargetTable and doing a benign update (no changes to actual column values, and trigger's JOIN to Lookup table does not find any different values) I get this:

Table 'TargetTable'. Scan count 1, logical reads 14
   CPU time = 0 ms, elapsed time = 3 ms.

Table 'TargetTable'. Scan count 2, logical reads 28
Table 'Worktable'. Scan count 1, logical reads 578
Table 'LookupTable'. Scan count 2, logical reads 12

   CPU time = 0 ms,  elapsed time = 6 ms.
   CPU time = 47 ms,  elapsed time = 56 ms.

There will have been no physical UPDATES to TargetTable (i.e. by the trigger) but there is significant activity in the Worktable to compare columns to see if anything has changed. Not really apparent from this Stats logging whether there s an improvement in time, I expect I need to run it on a much bigger dataset and time how long it takes. My earlier process, which had been running for 4 hours before I aborted it, is still rolling back ... I might be gone some time!!

ScottPletcher · January 26, 2016, 8:54pm

You can remove "T2.TableName" from the first COALESCE, since it's known to be equal to U.TableName anyway if it's not null.

As for the joins, if it's a loop join, try the effect of forcing a hash join instead:

CREATE TRIGGER dbo.MyTrigger
ON dbo.TargetTable
AFTER INSERT, UPDATE, DELETE
AS

UPDATE	U
SET
	[TableName] = COALESCE(T1.TableName, U.TableName),	-- varchar(40)
	[TableGUID] = COALESCE(T1.TableGUID, T2.TableGUID, U.TableGUID)		-- uniqueidentifier
FROM	inserted AS I
	JOIN dbo.TargetTable AS U
		 ON U.PKeyID = I.PKeyID		-- INT			
	LEFT OUTER /*HASH*/ JOIN dbo.LookupTable AS T1
		 ON T1.TableGUID = I.TableGUID	-- PK_LookupTable(TableGUID) clustered, unique, PKey
	LEFT OUTER JOIN /*HASH*/ dbo.LookupTable AS T2
		 ON T2.TableName = I.TableName	-- IX_LookupTable(TableName) unique

gbritton · January 27, 2016, 2:03pm

On second thought, you should add WHERE clauses. Recall that the WHERE clause is processed well before the SET. It filters the row set processed. You're probably processing a (much) bigger row than required.