Find duplicate in view and delete in correct table

nextaxtion · July 3, 2017, 3:34pm

Hi team ,

I have a view that consist of different tables from various databases like .

create view vw_mifin_mapping
as
select customer from db1.dbo.tbl_mifin_mapping
uniion all
select customer from db2.dbo.tbl_mifin_mapping
union all
select customer from db3.dbo.tbl_mifin_mapping
union all
select customer from db4.dbo.tbl_mifin_mapping

when i fetched this view i got some duplicate entries and after checking the records in tables i find that db1.dbo.tbl_mifin_mapping and db3.dbo.tbl_mifin_mapping both contains customer pr0091

There is a standard mapping table tbl_check_mapping exists that contains customer,dbname where customer pr0091 is mapped in db3.dbo.tbl_mifin_mapping.

My question is there might be some more records in future how can i find and delete the duplicate entries from physical tables automaticaly using sql code.

Please help.

Kristen · July 3, 2017, 3:48pm

How about including the Source in the VIEW , then when you have DUPs you can use that Source to control the DELETE

create view vw_mifin_mapping
as
select [Source]='db1', customer from db1.dbo.tbl_mifin_mapping 
uniion all
select [Source]='db2', customer from db2.dbo.tbl_mifin_mapping 
union all
select [Source]='db3', customer from db3.dbo.tbl_mifin_mapping 
union all 
select [Source]='db4', customer from db4.dbo.tbl_mifin_mapping

nextaxtion · July 3, 2017, 4:02pm

Thanks @Kristen , But I cannot update view right now.

please siuggest.

ScottPletcher · July 3, 2017, 4:43pm

Just use UNION instead of UNION ALL and SQL will automatically remove all dups from the results:

create view vw_mifin_mapping
as
select customer from db1.dbo.tbl_mifin_mapping
union
select customer from db2.dbo.tbl_mifin_mapping
union
select customer from db3.dbo.tbl_mifin_mapping
union
select customer from db4.dbo.tbl_mifin_mapping

nextaxtion · July 3, 2017, 4:50pm

no offence, but i don't want to slow down my queries.please understand i cannot change any structure.

please suggest sql script to find out the culprit and delete. this is only data cleaning task.

Kristen · July 3, 2017, 4:57pm

Fine, just take the code from the VIEW to make a stand-alone query that just uses the guts of it, and add a SOURCE column (to "feed" your delete).

ROW_NUMBER() is the method I use for deleting "all except the first one", then just make sure that the ORDER BY (for the OVER clause) presents the one you want to keep "first"

ScottPletcher · July 3, 2017, 5:06pm

If there are dups across tables, which table do you want to delete the dup row(s) from?

Say the same row appears in db1, db3 and db4. From which dbs would you want to delete the row?