Novice! Remove duplicates

Hi there,

Firstly, I am a complete novice and JUST starting my training on SQL - so my knowledge is limited.

I am using Microsoft Report Builder 3.0 - SQL2008

I have created a table which has a unique identifier [TENANCY-REF] I am trying to remove the duplicates of this identifier but keep getting an error as I am trying the group by clause?

As mentioned I am a complete novice at this, and still trying to get to grips with it...

SELECT
[CN] = COUNT(*) OVER(),
[FOR].[TENANCY-REF]
,[FOR].[PERSON-REF]
,[CO-PERSON].[FORENAMES]
,[CO-PERSON].[SURNAME]
,[FOR].[RELATION-CODE]
,[FOR].[ARREARS-STAGE]
,[FOR].[CURR-BALANCE]
,[CUR-TNCY-REF] = [CUR].[TENANCY-REF]
,[CUR-BALANCE] = [CUR].[CURR-BALANCE]
FROM [FOR]
LEFT JOIN [CUR] ON [FOR].[PERSON-REF] = [CUR].[PERSON-REF]
LEFT JOIN [CO-PERSON] ON [FOR].[PERSON-REF] = [CO-PERSON].[PERSON-REF]
WHERE
[FOR].[CURR-BALANCE] < 0 AND [CUR].[CURR-BALANCE] > 0
ORDER BY
[CO-PERSON].[SURNAME]

https://www.sqlshack.com/different-ways-to-sql-delete-duplicate-rows-from-a-sql-table/

1 Like

we don't have your schema, so we don't know if the joins are even correct. I know you're knew to sql, but can you provide DDL for the tables? In Management Studio (SSMS), you can right click on each table and select Script Table As then choose Create. That will give us an idea on what the 3 tables look like. My guess is that the person_ref column is not unique and the same person appears in it multiple times causing your dup issues

The first join should be changed to an INNER JOIN as you are filtering on CURR-BALANCE from the outer table which eliminates NULL values (which would appear if there is a row in [FOR] that does not have a corresponding row in [CUR] with a PERSON-REF value.

If you only have a single row in the [FOR] table for each PERSON-REF, but have multiple rows in [CUR] or [CO-PERSON] for that PERSON-REF you will get 'duplicates'. These are not really duplicates though - this is either caused by a missing join clause - or those tables are supposed to have multiple rows for each PERSON-REF.

To determine this - you need to identify the primary key of both [CUR] and [CO-PERSON]. Once you have that identified then you need to determine what row you want returned. Then you will be able to determine how to adjust the JOIN to only include that one row (or modify the joins to INNER JOIN and filter in the WHERE clause).