SQLTeam.com | Weblogs | Forums

Variation on finding duplicates

Hi Team,
In need of your expertise again. Please see the code below and description of what I am trying to do, which I'm sure is easy for the experts! All help appreciated as always.

DROP TABLE [dbo].[mytable2]

CREATE TABLE [dbo].[mytable2](
[Date] [date] NULL,
[RefNo] [int] NULL,
[Retry] varchar NULL,
[UserID] [int] NULL,
[Status] varchar NULL)

INSERT INTO [dbo].[mytable2]
([Date]
,[RefNo]
,[Retry]
,[UserID]
,[Status])
VALUES
('01/06/2019'
,'1234'
,''
,'12345'
,'Denied'),
('01/08/2019'
,'1234'
,'Retry'
,'12345'
,'Accessed'),
('01/06/2019'
,'5678'
,''
,'12345'
,'Denied'),
('01/06/2019'
,'5678'
,''
,'12345'
,'Denied'),
('01/08/2019'
,'5678'
,'Retry'
,'12345'
,'Denied'),
('01/06/2019'
,'9999'
,''
,'12345'
,'Denied'),
('01/06/2019'
,'9999'
,''
,'12345'
,'Accessed'),
('01/08/2019'
,'9999'
,'Retry'
,'12345'
,'Accessed'),
('01/06/2019'
,'1111'
,''
,'12345'
,'Accessed'),
('01/08/2019'
,'1111'
,'Retry'
,'12345'
,'Accessed'),
('01/08/2019'
,'1111'
,'Retry'
,'12345'
,'Accessed')

GO

SELECT * FROM [mytable2]

/*

Date RefNo Retry UserID Status
2019-01-06 1234 12345 Denied
2019-01-08 1234 Retry 12345 Accessed
2019-01-06 5678 12345 Denied
2019-01-06 5678 12345 Denied
2019-01-08 5678 Retry 12345 Denied
2019-01-06 9999 12345 Denied
2019-01-06 9999 12345 Accessed
2019-01-08 9999 Retry 12345 Accessed
2019-01-06 1111 12345 Accessed
2019-01-08 1111 Retry 12345 Accessed
2019-01-08 1111 Retry 12345 Accessed

Query#1
Get all rows where
RefNo & UserID are the same and Status = Denied
If there is a rows where Status=Accessed then exclude all
rows for this RefNo/UserID
2019-01-06 5678 12345 Denied
2019-01-06 5678 12345 Denied
2019-01-08 5678 Retry 12345 Denied

Query #2
Get all rows where
RefNo & UserID are the same and only one Retry worked
There must only be one 'pair' of Denied & Accessed
otherwise exclude
2019-01-06 1234 12345 Denied
2019-01-08 1234 Retry 12345 Accessed

Query #3
Get all rows where
RefNo & UserID are the same and more than one Accessed
2019-01-06 9999 12345 Denied
2019-01-06 9999 12345 Accessed
2019-01-08 9999 Retry 12345 Accessed
2019-01-06 1111 12345 Accessed
2019-01-08 1111 Retry 12345 Accessed
2019-01-08 1111 Retry 12345 Accessed

Basically I am trying to match Users with a RefNo
where they have at least one Denied and
subsequently one or more either Denied or Accessed
I have over 13,000 rows in total and just need to
bring it down to a manageable level
*/

Below are query#1 and query#2.

Your criteria for query#3 contradict each other, so I'm not sure how to answer that one.

--query#1
SELECT DISTINCT mt.*
FROM dbo.mytable2 mt
INNER JOIN (
    SELECT RefNo, UserID
    FROM dbo.mytable2
    GROUP BY RefNo, UserID
    HAVING 
        SUM(CASE WHEN Status = 'Denied' THEN 1 ELSE 0 END) >= 2 AND
        SUM(CASE WHEN Status = 'Accessed' THEN 1 ELSE 0 END) = 0
) AS query#1 ON query#1.RefNo = mt.RefNo AND query#1.UserID = mt.UserID

--query#2
SELECT mt.*
FROM dbo.mytable2 mt
INNER JOIN (
    SELECT RefNo, UserID
    FROM dbo.mytable2
    GROUP BY RefNo, UserID
    HAVING 
        COUNT(*) = 2 AND
        SUM(CASE WHEN Status = 'Denied' THEN 1 ELSE 0 END) = 1 AND
        SUM(CASE WHEN Status = 'Accessed' AND Retry = 'Retry' THEN 1 ELSE 0 END) = 1
) AS query#2 ON query#2.RefNo = mt.RefNo AND query#2.UserID = mt.UserID

Genius! Thank you so much, that gives me enough to go on to try writing the 3rd one myself.