SQLTeam.com | Weblogs | Forums

How to do conditional deletes

I work with SQL Server 2012 and I'm facing an issue. I need to write a DELETE statement that removes all rows
from the #partattaributes table for a given code and code type whenever the count of rows
for that (code, codetypeid) pair is less than the count of rows for the same pair in table #test.

For example:

The count of rows in table #test for code 8538906000 and code type 849774 is 2.

The count of rows in table #partattaributes for code 8538906000 and code type 849774 is 1.

2 is greater than 1, so delete all rows from #partattaributes for code
8538906000 and code type 849774.

create table #test
 (
 codetypeid int,
 code varchar(20),
 zplid int,
 zfeaturekey int
 )
 insert into #test(codetypeid,code,zplid,zfeaturekey)
 values
 (849774,'8538906000',4123,160001),
 (849774,'8538906000',4123,160003),
 (199987,'8538906077',4125,160020),
 (199987,'8538906077',4125,160050)
    
 ----drop table #partattaributes
 create table #partattaributes
 (
 partid int,
 code varchar(20),
 codetypeid int
 )
 insert into #partattaributes(partid,code,codetypeid)
 values
 (1024,'8538906000',849774),
 (2050,'8538906077',199987),
 (2050,'8538906077',199987)
    
 What I tried:
  
 delete p FROM #partattaributes p 
 LEFT JOIN #test t on t.code=p.code and t.codetypeid=p.codetypeid
 where t.code is null

The row that should be deleted from table #partattaributes is:

partid code codetypeid
1024 8538906000 849774

What have you tried

DELETE FROM pa
FROM #partattaributes pa
INNER JOIN (
    SELECT codetypeid, code, COUNT(*) AS pa_count
    FROM #partattaributes
    GROUP BY codetypeid, code
) AS pa_count ON pa_count.code = pa.code AND pa_count.codetypeid = pa.codetypeid
INNER JOIN (
    SELECT codetypeid, code, COUNT(*) AS t_count
    FROM #test
    GROUP BY codetypeid, code
) AS t_count ON t_count.code = pa_count.code AND t_count.codetypeid = pa_count.codetypeid AND t_count.t_count > pa_count.pa_count
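Because everything here lives in temp tables, the logic is easy to sanity-check. Here's a minimal sketch using Python's sqlite3 as a stand-in for SQL Server (the table and column names mirror the ones above; the DELETE ... FROM join syntax is replaced with a portable two-step equivalent):

```python
import sqlite3

# Rebuild the two sample tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test (codetypeid INT, code TEXT, zplid INT, zfeaturekey INT)")
cur.executemany("INSERT INTO test VALUES (?,?,?,?)", [
    (849774, "8538906000", 4123, 160001),
    (849774, "8538906000", 4123, 160003),
    (199987, "8538906077", 4125, 160020),
    (199987, "8538906077", 4125, 160050),
])
cur.execute("CREATE TABLE partattaributes (partid INT, code TEXT, codetypeid INT)")
cur.executemany("INSERT INTO partattaributes VALUES (?,?,?)", [
    (1024, "8538906000", 849774),
    (2050, "8538906077", 199987),
    (2050, "8538906077", 199987),
])

# Find every (code, codetypeid) group where test has MORE rows than
# partattaributes -- the same comparison as the T-SQL joins above.
doomed = cur.execute("""
    SELECT pa.code, pa.codetypeid
    FROM (SELECT code, codetypeid, COUNT(*) AS n
          FROM partattaributes GROUP BY code, codetypeid) AS pa
    JOIN (SELECT code, codetypeid, COUNT(*) AS n
          FROM test GROUP BY code, codetypeid) AS t
      ON t.code = pa.code AND t.codetypeid = pa.codetypeid
    WHERE t.n > pa.n
""").fetchall()

# Delete every row belonging to a doomed group.
for code, codetypeid in doomed:
    cur.execute("DELETE FROM partattaributes WHERE code = ? AND codetypeid = ?",
                (code, codetypeid))

remaining = cur.execute(
    "SELECT partid, code, codetypeid FROM partattaributes ORDER BY partid").fetchall()
print(remaining)  # the partid 1024 / code 8538906000 row is gone
```

This matches the expected result: only the 1024 / 8538906000 / 849774 row is removed, because #test has 2 rows for that group while #partattaributes has only 1.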

Deleting is painful. Why not just do a SELECT that ignores what you're deleting?
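Following that suggestion, the same group-count comparison can drive a SELECT instead of a DELETE. A sketch, again using sqlite3 as a stand-in, that returns only the rows the DELETE would have kept:

```python
import sqlite3

# Rebuild the two sample tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test (codetypeid INT, code TEXT, zplid INT, zfeaturekey INT)")
cur.executemany("INSERT INTO test VALUES (?,?,?,?)", [
    (849774, "8538906000", 4123, 160001),
    (849774, "8538906000", 4123, 160003),
    (199987, "8538906077", 4125, 160020),
    (199987, "8538906077", 4125, 160050),
])
cur.execute("CREATE TABLE partattaributes (partid INT, code TEXT, codetypeid INT)")
cur.executemany("INSERT INTO partattaributes VALUES (?,?,?)", [
    (1024, "8538906000", 849774),
    (2050, "8538906077", 199987),
    (2050, "8538906077", 199987),
])

# SELECT that "ignores" the rows instead of deleting them: keep a row only
# when its group's count in test does NOT exceed its count in partattaributes.
keepers = cur.execute("""
    SELECT pa.partid, pa.code, pa.codetypeid
    FROM partattaributes AS pa
    JOIN (SELECT code, codetypeid, COUNT(*) AS n
          FROM partattaributes GROUP BY code, codetypeid) AS pc
      ON pc.code = pa.code AND pc.codetypeid = pa.codetypeid
    LEFT JOIN (SELECT code, codetypeid, COUNT(*) AS n
               FROM test GROUP BY code, codetypeid) AS tc
      ON tc.code = pa.code AND tc.codetypeid = pa.codetypeid
    WHERE COALESCE(tc.n, 0) <= pc.n
    ORDER BY pa.partid
""").fetchall()
print(keepers)  # only the 2050 / 8538906077 rows survive
```

The LEFT JOIN plus COALESCE keeps groups that don't appear in #test at all, which mirrors what the DELETE versions above do (they only touch groups present in both tables).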

Isn't it more painful to have to rerun code to ignore the rows every time you SELECT from the table rather than just deleting the rows once and from then on being able to just SELECT all rows from the table?


That's actually a great question. I guess the best way to answer it is this: remembering that someone only needs to leave a single byte on a page for the page to persist, how many scan pages have they actually saved by doing deletes? Usually, zero. With that in mind, how effective are their deletes, other than letting them drop a bit of criteria they shouldn't have needed in the first place, but do, because they probably included the wrong rows in the temp table to begin with?

People will also ask, "Well, what if it's a really big temp table?" My answer is that they should figure out why the temp table is so big to begin with; I'm thinking they did something wrong in populating it. If that can't be avoided, then maybe do a SELECT INTO another temp table with just the rows they actually need, instead of messing around with deletes that are probably ineffective anyway.

There's some real overhead to the processing they need to use here to determine which rows to delete. To me it would become enormous overhead to have to do that every time you SELECT from the table.

Space management issues on the table are a different topic. If they've deleted enough rows for it to matter, they could rebuild the table to make the rows more contiguous again. Or, if it's a heap, just not worry about it; the deleted space will be reused by the next load(s).

You do agree that it's generally better to populate your temp table correctly (nothing to delete) the first time?

As for the other things, there should be several tests.

  1. Coded "ignores", with and without indexes.
  2. Deletes, with and without indexes, plus an extra test where an in-place clustered index has to be rebuilt.
  3. Copy only the "keeper" data to a new table, with and without indexes.
  4. Originally build the table with the correct data, with and without indexes.
  5. Maybe not use a temp table at all, with and without indexes.
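For option 3 above, the "copy only the keepers" approach might be sketched like this (sqlite3 standing in again; in T-SQL this would be a SELECT ... INTO #keepers, and the WHERE clause is the same keep-vs-ignore predicate discussed earlier):

```python
import sqlite3

# Rebuild the two sample tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test (codetypeid INT, code TEXT, zplid INT, zfeaturekey INT)")
cur.executemany("INSERT INTO test VALUES (?,?,?,?)", [
    (849774, "8538906000", 4123, 160001),
    (849774, "8538906000", 4123, 160003),
    (199987, "8538906077", 4125, 160020),
    (199987, "8538906077", 4125, 160050),
])
cur.execute("CREATE TABLE partattaributes (partid INT, code TEXT, codetypeid INT)")
cur.executemany("INSERT INTO partattaributes VALUES (?,?,?)", [
    (1024, "8538906000", 849774),
    (2050, "8538906077", 199987),
    (2050, "8538906077", 199987),
])

# Copy only the "keeper" rows to a new table instead of deleting in place.
cur.execute("""
    CREATE TABLE keepers AS
    SELECT pa.partid, pa.code, pa.codetypeid
    FROM partattaributes AS pa
    JOIN (SELECT code, codetypeid, COUNT(*) AS n
          FROM partattaributes GROUP BY code, codetypeid) AS pc
      ON pc.code = pa.code AND pc.codetypeid = pa.codetypeid
    LEFT JOIN (SELECT code, codetypeid, COUNT(*) AS n
               FROM test GROUP BY code, codetypeid) AS tc
      ON tc.code = pa.code AND tc.codetypeid = pa.codetypeid
    WHERE COALESCE(tc.n, 0) <= pc.n
""")
kept = cur.execute("SELECT COUNT(*) FROM keepers").fetchone()[0]
print(kept)  # 2 rows kept; the deleted-in-place version left the same 2 rows
```

The original table can then be dropped (or simply left behind when the session ends, if it's a temp table), so no delete ever runs against it.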