I have a table named TransList with the following columns:
ID Int (Identity column)
TransNum varchar(10)
TransData1 varchar(50)
TransData2 varchar(50)
TransData3 varchar(50)
The TransNum is not unique and can be repeating the same value many times. Only the ID is unique. I would like to do a query that shows only the rows for the first occurrence of each TransNum. Here is what I did:
SELECT * FROM TransList T
WHERE ID = (SELECT MIN(ID) FROM TransList WHERE TransNum = T.TransNum)
I'm not sure if my query is efficient or not. Could you please comment and maybe suggest a better way that would run faster?
Thank you so much in advance for your much appreciated help!
;WITH cte AS
(
SELECT
*,
N = ROW_NUMBER() OVER (PARTITION BY TransNum ORDER BY ID)
FROM
TransList
)
SELECT
ID,
TransNum,
TransData1,
TransData2,
TransData3
FROM
cte
WHERE
N=1;
Alternative (and almost identical) to solution by @JamesK
select top(1) with ties
id
,transnum
,transdata1
,transdata2
,transdata3
from translist
order by row_number() over(partition by transnum
order by id
)
;
I did some testing on my own system, and to my surprice, "with ties" was slower. I did actually expect it to have equal or better performance. I can think of other scenarios where "with ties" would beat "with cte", but not here.
That is a surprise! Intuitively one would think that with ties would be faster. But, I have seen that when you look at the execution plan (in general - not for this specific query, which I have not tried), sort takes a good percentage of the total. That may be it. Interesting read here.
Hey guys, how do you determine which query runs faster? Do you perform timed benchmarks or do you just analyze the execution plans of the queries in question?