SQLTeam.com | Weblogs | Forums

Query is very slow!

#1

Hello,

I am attempting to self join a table to see if there are any potential duplicate people based on various criteria (such as matching names, matching ID numbers, etc).

Below is the actual query that I've created to get the initial results. Problem is that it takes approximately 3 hours to run. Do you, or anybody reading this really, have any ideas on how to "fix" my very poorly designed/functioning query?

Note: I realise that I have a previous question very similar to this one. However, it wasn't letting me reply on that topic. Also, excuse the formatting below, site wasn't allowing me to post the actual query.

In my select list I have the following:

list1.Organization_Name
,list1.Person_Reference
,list1.First_Name
,list1.Last_Name
,list1.Date_of_Birth
,list1.Mothers_Birth_Last_Name
,list1.Soundex_First_Name
,list1.Soundex_Last_Name
,list1.Soundex_Birth_Name
,list1.[ID1]
,list1.[ID2]
,list1.[ID3]
,list1.[ID4]
,list1.[ID5]
,list1.Open_Case_Reference
,list2.Open_Case_Reference
,list2.Person_Reference
,list2.First_Name
,list2.Last_Name
,list2.Date_of_Birth
,list2.Mothers_Birth_Last_Name
,list2.Soundex_First_Name
,list2.Soundex_Last_Name
,list2.Soundex_Birth_Name
,list2.[ID1]
,list2.[ID2]
,list2.[ID3]
,list2.[ID4]
,list2.[ID5]

My from statement contains:

[Participants] list1, [Participants] list2

And my where statement includes:

--Only bring in members who are part of open cases
list1.Open_Case_Reference IS NOT NULL
--Only bring in members who do not "match" themselves
AND list1.Person_Reference <> list2.Person_Reference
AND (
--Include members who have a matching first name AND last name
(list1.First_Name = list2.First_Name AND list1.Last_Name = list2.Last_Name)
--Include members who have a matching Last Name and SOUNDEX of their first names match
OR (list1.Soundex_First_Name = list2.Soundex_First_Name AND list1.Last_Name = list2.Last_Name)
--Include members who have a matching First Name and SOUNDEX of their last names match
OR (list1.Soundex_Last_Name = list2.Soundex_Last_Name AND list1.First_Name = list2.First_Name)
--Include members who have a matching SOUNDEX first and last names
OR (list1.Soundex_Last_Name = list2.Soundex_Last_Name AND list1.Soundex_First_Name = list2.Soundex_First_Name)
--Include members who have a matching ID1 number
OR (list1.[ID1] = list2.[ID2])
--Include members who have a matching SIN
OR (list1.[ID2] = list2.[ID2])
--Include members who have a matching OHIP
OR (list1.[ID3] = list2.[ID3])
--Include members who have a matching Birth Certificate
OR (list1.[ID4] = list2.[ID4])
--Include members who have a matching Drivers License
OR (list1.[ID5] = list2.[ID5])
--Include members who have a matching First Name (or SOUNDEX First Name) and Birth Mothers Last Name (or SOUNDEX Birth Mothers Last Name)
OR (list1.First_Name = list2.First_Name AND (list1.Last_Name = list2.Mothers_Birth_Last_Name OR list1.Mothers_Birth_Last_Name = list2.Last_Name))
OR (list1.First_Name = list2.First_Name AND (list1.Soundex_Last_Name = list2.Soundex_Birth_Name OR list1.Soundex_Birth_Name = list2.Soundex_Last_Name))
)
--DO NOT Include members that have already been merged
AND NOT (CONCAT(cast(list1.[Person_Reference] AS varchar(50)),cast(list2.[Person_Reference] AS varchar(50))) IN (SELECT CONCAT(cast([Original_Reference] AS varchar(50)),cast([Duplicate_Reference] AS varchar(50))) FROM [Merged Pairs])
OR CONCAT(cast(list2.[Person_Reference] AS varchar(50)),cast(list1.[Person_Reference] AS varchar(50))) IN (SELECT CONCAT(cast([Original_Reference] AS varchar(50)),cast([Duplicate_Reference] AS varchar(50))) FROM [Merged Pairs]))

Thanks,