SQLTeam.com | Weblogs | Forums

Sql Query for Count Duplicate


#1

Hi All,
I have one master table(master table has more than 100 tables data) it contains all the tablenames along with columns by specifying primary key and non primary key columns.I have to check duplicate records count for each table based on primary key if any of the table dont have primary key column need to consider all the columns for that tables except loaddate column in groupby .I need a generic query to check the record counts for all tha tables present in the master table instead of checking for each table based on primary key columns and non primary key columns..Please guide me on sql query if it possible.
Master Table:
ObjectName AttributeName Prikey
Emp empid Yes
Emp name No
Emp loc No
Emp address No
Emp loaddate No
dept deptid Yes
dept dname no
dept daddress No
dept loaddate No
Location lid No
Location laddress No
Location loaddate No
Job jobid No
job loaddate no

I know generic query to check for each table for example
-----In the below query consider empid as primarykey
select count*) from emp
group by empid
having count()>1
-----Table when dont have primary key considered all columnnames except loaddate colum in query
select count
) from location
group by lid,laddress
having count(*)>1
Instead of writing multiple queries is there any generic query to get out put like below based on table metadata available in master table
Output:Here emp and dept tables consider prikey ='yes' values in group by and tables dont have any Prikey.i.e all Prikey='No' it should consider all columns in groupby function for the table except loaddate field while checking count.
output format for reference:
ObjectName RecordCount
Emp 0
dept 0
Location 5
job 0

Table scripts for reference
CREATE TABLE table5
(
ObjectName varchar(255),
AttributeName varchar(255),
Prikey varchar(255),
);

Insert scripts

INSERT Into table5 VALUES ('Emp','empid','Yes');
INSERT Into table5 VALUES ('Emp','name','No');
INSERT Into table5 VALUES ('Emp','loc','No');
INSERT Into table5 VALUES ('Emp','address','No');
INSERT Into table5 VALUES ('Emp','loaddate','No');
INSERT Into table5 VALUES ('dept','deptid','Yes');
INSERT Into table5 VALUES ('dept','dname','No');
INSERT Into table5 VALUES ('dept','daddress','No');
INSERT Into table5 VALUES ('dept','loaddate','No');
INSERT Into table5 VALUES ('Location','lid','No');
INSERT Into table5 VALUES ('Location','laddress','No');
INSERT Into table5 VALUES ('Location','loaddate','No');
INSERT Into table5 VALUES ('Job','jobid','No');
INSERT Into table5 VALUES ('job','loaddate','No');


#2

Without data I can't test, but I think the following might get you started:

declare @query nvarchar(max)='';

select @query=case when @query='' then '' else @query+char(13)+char(10)+'union all'+char(13)+char(10) end
             +'select '''+a.ObjectName+''' as tablename'+char(13)+char(10)
             +'      ,count(*) as total_count'+char(13)+char(10)
             +'      ,count(*)*sign(count(*)-1) as duplicate_count'
             +'  from '+quotename(a.ObjectName)+' as db'+char(13)+char(10)
             +' group by '
             +case
                 when sum(case when Prikey='Yes' then 1 else 0 end)>0
                 then stuff((select ','+t.AttributeName
                               from table5 as t
                              where t.ObjectName=a.ObjectName
                                and t.AttributeName<>'loaddate'
                                and t.Prikey='Yes'
                                for xml path('')
                            )
                            ,1
                            ,1
                            ,''
                           )
                 else stuff((select ','+t.AttributeName
                               from table5 as t
                              where t.ObjectName=a.ObjectName
                                and t.AttributeName<>'loaddate'
                                for xml path('')
                            )
                            ,1
                            ,1
                            ,''
                           )
                 end
  from table5 as a
 group by a.ObjectName
;

set @query='select tablename'+char(13)+char(10)
          +'      ,sum(total_count) as total_count'+char(13)+char(10)
          +'      ,sum(duplicate_count) as duplicate_count'+char(13)+char(10)
          +'  into ##temp'+char(13)+char(10)
          +'  from ('+char(13)+char(10)
          +@query
          +'       ) as a'+char(13)+char(10)
          +' group by tablename'+char(13)+char(10)
          ;

execute sp_executesql @query;

select *
  from ##temp
 where duplicate_count>0
;

drop table ##temp;