Need Help to Optimize Query

mtl777 · February 19, 2019, 9:17pm

I have a table named TransList with the following columns:

ID Int (Identity column)
TransNum varchar(10)
TransData1 varchar(50)
TransData2 varchar(50)
TransData3 varchar(50)

The TransNum is not unique and can be repeating the same value many times. Only the ID is unique. I would like to do a query that shows only the rows for the first occurrence of each TransNum. Here is what I did:

SELECT * FROM TransList T
WHERE ID = (SELECT MIN(ID) FROM TransList WHERE TransNum = T.TransNum)

I'm not sure if my query is efficient or not. Could you please comment and maybe suggest a better way that would run faster?

Thank you so much in advance for your much appreciated help!

JamesK · February 19, 2019, 10:22pm

This may be more efficient

;WITH cte AS
(
	SELECT 
		*,
		N = ROW_NUMBER() OVER (PARTITION BY TransNum ORDER BY ID)
	FROM
		TransList
)
SELECT	
	ID,
	TransNum,
	TransData1,
	TransData2,
	TransData3
FROM
	cte
WHERE
	N=1;

mtl777 · February 21, 2019, 1:08am

Awesome, that looks great and will do the trick! Thank you so much!

bitsmed · February 21, 2019, 2:07am

Alternative (and almost identical) to solution by @JamesK

select top(1) with ties
       id
      ,transnum
      ,transdata1
      ,transdata2
      ,transdata3
  from translist
 order by row_number() over(partition by transnum
                                order by id
                           )
;

JamesK · February 21, 2019, 3:23pm

I like this!! (better than what I posted)

bitsmed · February 21, 2019, 5:17pm

"with ties" is a cool feature with%20ties

mtl777 · February 21, 2019, 11:47pm

Thanks bitsmed for the "WITH TIES" solution. I didn't know about "WITH TIES" before so I Googled it and now I know. Very cool!

Will it actually run faster than the other solution by JamesK?

bitsmed · February 22, 2019, 4:35pm

I did some testing on my own system, and to my surprice, "with ties" was slower. I did actually expect it to have equal or better performance. I can think of other scenarios where "with ties" would beat "with cte", but not here.

jeffw8713 · February 22, 2019, 7:01pm

I have found the same thing - it is one of those questions where the answer is: it depends...

In some cases I have found that WITH TIES worked better and performed the same (or better) - and other times it caused the query to never complete.

So - test, test, test...

JamesK · February 22, 2019, 7:18pm

That is a surprise! Intuitively one would think that with ties would be faster. But, I have seen that when you look at the execution plan (in general - not for this specific query, which I have not tried), sort takes a good percentage of the total. That may be it. Interesting read here.

bitsmed · February 22, 2019, 8:34pm

It often is

Totally agree

bitsmed · February 22, 2019, 8:35pm

Exactly my thoughts

mtl777 · March 12, 2019, 11:51pm

Hey guys, how do you determine which query runs faster? Do you perform timed benchmarks or do you just analyze the execution plans of the queries in question?