Increased physical reads from INSERT...SELECT

Version 1

INSERT INTO table_a (col_1, col_2)
SELECT DISTINCT col_1, col_2
FROM table_b b
WHERE b.col_1 IS NOT NULL
AND b.col_2 IS NOT NULL
AND b.id NOT IN
(
 SELECT b.id
 FROM table_b b
 JOIN table_a a WITH(NOLOCK)
 ON b.col_1 = a.col_1
 AND b.col_2 = a.col_2.
 )

Version 2

SELECT * INTO #temp FROM
(
 SELECT DISTINCT col_1, col_2
 FROM table_b

 WHERE b.col_1 IS NOT NULL
 AND b.col_2 IS NOT NULL
 AND b.id NOT IN
(
 SELECT b.id
 FROM table_b b
 JOIN table_a a WITH(NOLOCK)
 ON b.col_1 = a.col_1
 AND b.col_2 = a.col_2
)) t

INSERT INTO table_a (col_1, col_2) SELECT col_1, col_2 FROM #temp

col_1: varchar
col_2: varchar

table_a: clustered on primary key sequential id, non-clustered unique on col_1 include col_2, non-clustered unique on col_2 include col_1

Version 1 I/O

Table 'table_b'. Scan count 34, logical reads 118, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table_a'. Scan count 0, logical reads 109404, physical reads 8761, read-ahead reads 7761, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(4997 row(s) affected)

Version 2 I/O

Table 'table_b'. Scan count 34, logical reads 118, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table_a'. Scan count 0, logical reads 35454, physical reads 0, read-ahead reads 5435, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#temp_______________________________________________________________________________________________________________00000044D848'. Scan count 0, logical reads 5045, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(4994 row(s) affected)

(1 row(s) affected)
Table 'table_a'. Scan count 0, logical reads 105486, physical reads 331, read-ahead reads 5940, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 2, logical reads 10412, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#temp_______________________________________________________________________________________________________________00000044D848'. Scan count 1, logical reads 52, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(4994 row(s) affected)

Version 1 is significantly slow that version 2, I assume due to the higher physical read count.

Are there any obvious reasons why the physical read count would be so different, superficially it appears as though both versions should be doing the same work.

I have ran these numerous time with hot and cold and cache and seen consistent behavour across both versions.

Query plans attached.

Don't use that. All sorts of disasters waiting to happen. If you have contention problems choose a different solution, such as setting the database to Read Committed Snapshot

I read the higher read count on Table-A as being a combination of the NOT IN ... JOIN table_a and also the INSERT INTO table_a, whereas in Version 2 the read count for the INSERT INTO table_a is in the second step.

There are lots of physical reads in Version 1 - you could clear cache before each test (don't do that on a production server though!!) so that both tests were from an identical starting point, or repeat the test and use 2nd, 3rd etc. times as the benchmark (i.e. once the data was already in the cache)

But on the face of it I agree, just looking at the code, both are doing the same thing and it should be quicker just to store direct into Table-A rather than into #TEMP first and then Table-A

I think you're doing unnecessary I/O in either version. Try this instead;

INSERT INTO table_a (col_1, col_2)

SELECT DISTINCT col_1, col_2
FROM table_b b
WHERE b.col_1 IS NOT NULL
AND b.col_2 IS NOT NULL
AND NOT EXISTS(
 SELECT a.id
 FROM table_a a WITH(NOLOCK)
 WHERE b.col_1 = a.col_1
 AND b.col_2 = a.col_2
 )

Is there any perforamnce benefit in leaving the

WHERE b.col_1 IS NOT NULL
AND b.col_2 IS NOT NULL

out altogether and just relying on the

 FROM table_a a 
 WHERE b.col_1 = a.col_1
 AND b.col_2 = a.col_2

doing that part (i.e. that won't include anything with NULLs).

@Kristen, the WITH(NOLOCK, yeah... (unfortunately not all decisions are my own).

I am quite confident caching is not an issue because I have ran these numerous times with not and cold caches and in different order.

I've been pouring over these two version for a couple of days, including SQL server profier. With out really getting much closer to understanding the issue.

One thing i did notice in the trace, but unsure if it is relevant. the second versions lock escalation happens only for the insert part i.e. definitively after the SELECT has completed.

Interesting, I suppose it needs to because in your second version there is opportunity for table_a or table_b to have changed after the SELECT * INTO #temp was done, and before the INSERT INTO table_a starts - so as a consequence your Version 2 might break with Foreign Key or DUPLICATE error or somesuch.