Need Help to Optimize Query

I have a table named TransList with the following columns:

ID Int (Identity column)
TransNum varchar(10)
TransData1 varchar(50)
TransData2 varchar(50)
TransData3 varchar(50)

The TransNum is not unique and can be repeating the same value many times. Only the ID is unique. I would like to do a query that shows only the rows for the first occurrence of each TransNum. Here is what I did:

SELECT * FROM TransList T
WHERE ID = (SELECT MIN(ID) FROM TransList WHERE TransNum = T.TransNum)

I'm not sure if my query is efficient or not. Could you please comment and maybe suggest a better way that would run faster?

Thank you so much in advance for your much appreciated help! :slight_smile:

This may be more efficient

;WITH cte AS
(
	SELECT 
		*,
		N = ROW_NUMBER() OVER (PARTITION BY TransNum ORDER BY ID)
	FROM
		TransList
)
SELECT	
	ID,
	TransNum,
	TransData1,
	TransData2,
	TransData3
FROM
	cte
WHERE
	N=1;
2 Likes

Awesome, that looks great and will do the trick! Thank you so much!

Alternative (and almost identical) to solution by @JamesK

select top(1) with ties
       id
      ,transnum
      ,transdata1
      ,transdata2
      ,transdata3
  from translist
 order by row_number() over(partition by transnum
                                order by id
                           )
;
2 Likes

I like this!! (better than what I posted)

"with ties" is a cool feature :smile: with%20ties

Thanks bitsmed for the "WITH TIES" solution. I didn't know about "WITH TIES" before so I Googled it and now I know. Very cool!

Will it actually run faster than the other solution by JamesK?

I did some testing on my own system, and to my surprice, "with ties" was slower. I did actually expect it to have equal or better performance. I can think of other scenarios where "with ties" would beat "with cte", but not here.

I have found the same thing - it is one of those questions where the answer is: it depends...

In some cases I have found that WITH TIES worked better and performed the same (or better) - and other times it caused the query to never complete.

So - test, test, test...

1 Like

That is a surprise! Intuitively one would think that with ties would be faster. But, I have seen that when you look at the execution plan (in general - not for this specific query, which I have not tried), sort takes a good percentage of the total. That may be it. Interesting read here.

1 Like

It often is

Totally agree

Exactly my thoughts

Hey guys, how do you determine which query runs faster? Do you perform timed benchmarks or do you just analyze the execution plans of the queries in question?