SQLTeam.com | Weblogs | Forums

Select Distinct on Some Columns only


#1

I want to do a Select Distinct in my query but I only want it to apply to 3 columns. I want to return data from four columns though.

I have four columns that I want to return in my Select statement: Firstname, Lastname, PN, RefNum
I want to do a Select Distinct of only Firstname, Lastname, PN

Anyone know how to do this?


#2

Try GROUP BY Firstname, Lastname, PN.


#3

I tried this and I get the error message:

"Column RefNum is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"


#4

If you have two rows that have the same FirstName, LastName, and PN, but have different RefNums, I think you want to get only one row back. And you do want to get all four columns. Given that there are two choices for the RefNum column, which of those do you want to get?

If you don't care which RefNum you get, you can query like this:

SELECT FirstName, LastName, PN,RefNum
FROM
(
	SELECT 
		*,
		ROW_NUMBER() OVER (PARTITION BY  FirstName, LastName, PN
			ORDER BY (SELECT NULL)) AS N
	FROM
		YourTable
)s
WHERE N = 1	

If you want to pick the highest value of RefNum, change (SELECT NULL) to RefNum DESC. You can apply a different order by clause if you have different rule to pick the RefNum.


#5

Thanks for the reply James.
This is working fine.


#6

Or just use MAX or MIN on RefNum:

SELECT Firstname, Lastname, PN, MAX(RefNum) AS RefNum
FROM table_name
GROUP BY Firstname, Lastname, PN
--ORDER BY Firstname, Lastname, PN

#7

SELECT DISTINCT is done at the row level. What you want when you have data like this:
('Joe', 'Celko' 1, 2)
('Joe', 'Celko' 1, 3)