SQLTeam.com | Weblogs | Forums

Type II table: delete unnecessary records

My table has duplicate, unnecessary data and needs cleanup. Is there an efficient way to do this, please?

CREATE TABLE #TempTable (Number int,value1 bit,value2 bit,startdatetime datetime,enddatetime datetime)

INSERT INTO #TempTable VALUES(111, 1, 0, '2020-06-19 18:17:46', '2020-06-20 23:52:31')
INSERT INTO #TempTable VALUES(111, 1, 0, '2020-06-20 23:52:32', '2020-06-21 11:58:50')
INSERT INTO #TempTable VALUES(111, 1, 0, '2020-06-21 11:58:51', '2020-06-22 16:17:04')
INSERT INTO #TempTable VALUES(111, 1, 0, '2020-06-22 16:17:05', '2020-06-23 11:53:51')
INSERT INTO #TempTable VALUES(111, 1, 0, '2020-06-23 11:53:52', '2020-07-01 00:24:26')
INSERT INTO #TempTable VALUES(111, 1, 0, '2020-07-01 00:24:27', '9999-12-31 00:00:00')

INSERT INTO #TempTable VALUES(222, 1, 0, '2020-06-19 18:17:46', '2020-06-20 23:52:31')
INSERT INTO #TempTable VALUES(222, 0, 0, '2020-06-20 23:52:32', '2020-06-21 11:58:50')
INSERT INTO #TempTable VALUES(222, 0, 1, '2020-06-21 11:58:51', '2020-06-22 16:17:04')
INSERT INTO #TempTable VALUES(222, 0, 1, '2020-06-22 16:17:05', '2020-06-23 11:53:51')
INSERT INTO #TempTable VALUES(222, 0, 0, '2020-06-23 11:53:52', '2020-07-01 00:24:26')
INSERT INTO #TempTable VALUES(222, 1, 0, '2020-07-01 00:24:27', '9999-12-31 00:00:00')

CREATE TABLE #DesiredOutput (Number int,value1 bit,value2 bit,startdatetime datetime,enddatetime datetime)

INSERT INTO #DesiredOutput VALUES(111, 1, 0, '2020-06-19 18:17:46', '9999-12-31 00:00:00')

INSERT INTO #DesiredOutput VALUES(222, 1, 0, '2020-06-19 18:17:46', '2020-06-20 23:52:31')
INSERT INTO #DesiredOutput VALUES(222, 0, 0, '2020-06-20 23:52:32', '2020-06-21 11:58:50')
INSERT INTO #DesiredOutput VALUES(222, 0, 1, '2020-06-21 11:58:51', '2020-06-23 11:53:51')
INSERT INTO #DesiredOutput VALUES(222, 0, 0, '2020-06-23 11:53:52', '2020-07-01 00:24:26')
INSERT INTO #DesiredOutput VALUES(222, 1, 0, '2020-07-01 00:24:27', '9999-12-31 00:00:00')

SELECT * FROM #DesiredOutput

First write a SELECT query to identify the data, then change the SELECT to a DELETE.

For example, with Table1 containing:

ColumnData
1
2
3

select * from Table1 where ColumnData < 2

returns 1. Then:

delete from Table1 where ColumnData < 2

Now Table1 contains:

2
3

Do this kind of thing with BEGIN TRAN, then ROLLBACK or COMMIT:

Begin Tran
do your delete
check the results
if all OK, then Commit
if not OK, then Rollback
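As a sketch of that pattern against the #TempTable from the question (the WHERE clause here is only a placeholder, not the real duplicate rule):

```sql
BEGIN TRAN;

-- placeholder predicate: substitute your real rule for unnecessary rows
DELETE FROM #TempTable
WHERE Number = 111
  AND startdatetime > '2020-06-19 18:17:46';

-- check the remaining rows
SELECT * FROM #TempTable ORDER BY Number, startdatetime;

-- all OK?   COMMIT;
-- not OK?   ROLLBACK;
```

While the transaction is open, the deleted rows are still recoverable; nothing is permanent until COMMIT.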


Thanks for the reply @harishgg1, but I did not get what you were saying...

OK, I will explain it to you clearly, with SQL Server T-SQL screenshots.

Please give me a little time.

Hi, here it is.

If you are not able to understand it, please let me know and I will make it even easier to understand.

(screenshot: the SampleData table)

I want to remove the duplicates. First I write a SELECT to identify them:

select id, age, count(1) as duplicates
from SampleData
group by id, age
having count(1) > 1


Now I use this result to write a DELETE for the duplicates:

delete from SampleData
where id = 1 and age = 20

Now see, the duplicates are deleted.
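Note that this DELETE removes every row with id = 1 and age = 20, not just the extra copy. If the table has no key to distinguish the copies and you want to keep exactly one row per group, a sketch (assuming SQL Server, where you can delete through a CTE) is:

```sql
WITH Numbered AS
(
    SELECT id, age,
           ROW_NUMBER() OVER (PARTITION BY id, age
                              ORDER BY (SELECT NULL)) AS rn
    FROM SampleData
)
DELETE FROM Numbered
WHERE rn > 1;   -- keeps one row per (id, age), deletes the rest
```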

One important thing: sometimes mistakes happen while deleting.
If you deleted and a mistake happened, you want to go back to the beginning.
If you deleted and everything was right, you want to continue.
For this you do:

Begin tran
do the delete
no mistake, continue: commit
mistake, go back to the original: rollback

I will show how this works.

begin tran
delete from SampleData where id in (1,2) -- wrong statement
select * from SampleData -- i check results -- its wrong i want to go back

begin tran
delete from SampleData where id in (1,2) -- wrong statement
select * from SampleData -- i check results -- its wrong i want to go back
rollback
select * from SampleData -- my original data before i deleted is there


Now
begin tran
delete from SampleData where id = 1 and age = 20 -- correct statement
select * from SampleData -- i check results -- its correct


begin tran
delete from SampleData where id = 1 and age = 20 -- correct statement
select * from SampleData -- i check results -- its correct
commit
select * from SampleData -- i want to continue .. thats why commit

Why is only the second one omitted as a duplicate?

INSERT INTO #TempTable 
VALUES(222, 0, 1, '2020-06-21 11:58:51', '2020-06-22 16:17:04')
INSERT INTO #TempTable 
VALUES(222, 0, 1, '2020-06-22 16:17:05', '2020-06-23 11:53:51')

Maybe there is no need to consider the date parts.

Then that breaks the rules for the following, which the OP has included in the desired result:

INSERT INTO #TempTable 
VALUES(222, 0, 0, '2020-06-23 11:53:52', '2020-07-01 00:24:26')

INSERT INTO #TempTable 
VALUES(222, 0, 0, '2020-06-20 23:52:32', '2020-06-21 11:58:50')

The OP needs to provide the rules for the desired output.

Use the difference-in-row-numbers ("gaps and islands") grouping technique:

WITH Grps
AS
(
	SELECT [Number], value1, value2, startdatetime, enddatetime
		,ROW_NUMBER() OVER (PARTITION BY [Number] ORDER BY startdatetime)
			- ROW_NUMBER() OVER (PARTITION BY [Number], value1, value2 ORDER BY startdatetime) AS Grp
	FROM #TempTable
)
SELECT [Number], value1, value2
	,MIN(startdatetime) AS startdatetime
	,MAX(enddatetime) AS enddatetime
FROM Grps
GROUP BY [Number], value1, value2, Grp
ORDER BY [Number], startdatetime;
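The query above only selects the merged ranges. If the goal is to actually clean up #TempTable in place, one possible sketch (my own variation, assuming SQL Server and that the rows of each run are adjacent in time, as in the sample data) is to first stretch the earliest row of each run to cover the whole run, then delete the now-redundant rows:

```sql
-- Step 1: extend the first row of each run to the run's end
WITH Grps AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY [Number] ORDER BY startdatetime)
      - ROW_NUMBER() OVER (PARTITION BY [Number], value1, value2 ORDER BY startdatetime) AS Grp
    FROM #TempTable
), Runs AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY [Number], value1, value2, Grp
                           ORDER BY startdatetime) AS rn,
        MAX(enddatetime) OVER (PARTITION BY [Number], value1, value2, Grp) AS run_end
    FROM Grps
)
UPDATE Runs SET enddatetime = run_end WHERE rn = 1;

-- Step 2: delete every row except the first of its run
WITH Grps AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY [Number] ORDER BY startdatetime)
      - ROW_NUMBER() OVER (PARTITION BY [Number], value1, value2 ORDER BY startdatetime) AS Grp
    FROM #TempTable
), Runs AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY [Number], value1, value2, Grp
                           ORDER BY startdatetime) AS rn
    FROM Grps
)
DELETE FROM Runs WHERE rn > 1;
```

Wrap both statements in a transaction and compare SELECT * FROM #TempTable against #DesiredOutput before committing.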

Yes, the dates should be considered, and the MIN of the first and the MAX of the second are supposed to be in the result set.

This worked perfectly. Thanks!


I always do it this way, whether updating or deleting.
This way I am a hundred percent sure that I am deleting or updating the desired rows.

Another idea which will make it even more safe (if this idea helps, great):

begin transaction
do your select
update or delete
check if you made a mistake or all is OK
commit if all OK
rollback if you made a mistake


Yes!

And safest of all: before running a data-affecting script, take your backup.
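For a single table, one quick option is to copy the rows aside first; for the whole database, take a full backup. A sketch (the table name SampleData_backup, database name, and file path are placeholders):

```sql
-- copy one table aside before the risky script
SELECT * INTO SampleData_backup FROM SampleData;

-- or a full database backup
BACKUP DATABASE MyDatabase
TO DISK = N'C:\Backups\MyDatabase_BeforeCleanup.bak'
WITH INIT, NAME = N'before delete script';
```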

+++++++++++++++++++++++++++++++++++++
If you say backup, other things crop up:

-- millions of rows of data affected

-- lots of primary key / foreign key dependencies
++++++++++++++++++++++++++++++++++++

I get really, really nervous doing these sorts of things!

I mean the data-affecting script and the backup.