A better way of writing this query?

Kerryman · May 12, 2016, 10:21am

Hi guys, this query works (I think) but it's written in my usual clunky style, can anybody see a more elegant solution. It produces results that show the most used product and supplier on current live contracts using SQL Server 2008:

SELECT a.OrderID, a.OrderDate, a.ContractID, b.ProductID, c.ProductName, c.SupplierID, d.SupplierName
INTO #contracts
FROM [Orders] a
INNER JOIN [Order Details] b
ON a.OrderID = b.OrderID
INNER JOIN [Products] c
ON b.ProductID = c.ProductID
INNER JOIN [Suppliers] d
ON c.SupplierID = d.SuppliersID
WHERE DATEPART(year, a.OrderDate) >= '2016'

SELECT a.ContractID, a.SupplierID, a.ProductID, a.ProductName,
(SELECT COUNT() FROM #contracts WHERE SupplierID = a.SupplierID AND ContractID = a.ContractID) AS 'MostUsedSupplier',
(SELECT COUNT() FROM #contracts WHERE ProductID = a.ProductID AND ContractID = a.ContractID) AS 'MostUsedProduct'
INTO #orders
FROM #contracts a

GROUP BY ContractID, SupplierID, ProductID, ProductName
ORDER BY MostUsedSupplier

SELECT ROW_NUMBER()
OVER (ORDER BY a.ContractID) AS RowNum,
a.ContractID, a.MostUsedSupplier , a.SupplierID, a.MostUsedProduct, a.ProductID, a.ProductName, b.name_, c.SupplierName
INTO #results
FROM #orders a
left JOIN demaa.dbo.jcacc b
ON a.ContractID = b.code_
left JOIN Suppliers c
ON a.SupplierID = c.SuppliersID
WHERE b.co_ = '01'

ORDER BY a.ContractID

SELECT a.*
INTO #results1
FROM #results a
JOIN
(
SELECT ContractID, MAX(MostUsedSupplier) maxSupplier, MAX(MostUsedProduct) maxProduct
FROM #results
GROUP BY ContractID
) b ON a.ContractID = b.ContractID AND
a.MostUsedSupplier = b.maxSupplier AND
a.MostUsedProduct = b.maxProduct

   ORDER BY a.ContractID

SELECT
MAX(RowNum)AS 'RowNum',
ContractID AS 'ContractID',
name_ AS 'Contract',
MAX(ProductName) AS 'MostUsedProduct',
MAX(SupplierName) AS 'MostUsedSupplier'

FROM #results1
GROUP BY ContractID,name_

ORDER BY ContractID

DROP TABLE #contracts
DROP TABLE #orders
DROP TABLE #results
DROP TABLE #results1

Sample Output:
RowNum ContractID Contract MostUsedProduct MostUsedSupplier
133 100638 Sample Contract DULUX TR MATT DULUX -
190 100607 Sample Contract JOHNSTONES COVA PLUS KALON

Kristen · May 13, 2016, 5:41am

I would do

WHERE a.OrderDate >= '20160101'

because using a Function on a column (a.OrderDate in this case) will usually prevent SQL using an index to speed up the query, and thus it will take a table scan instead which is a lot slower.

I would do:

; WITH MostUsedSupplier AS 
(
	SELECT	SupplierID, ContractID, COUNT(*) AS MyCount
	FROM	#contracts
),
MostUsedProduct AS 
(
	SELECT	ProductID, ContractID, COUNT(*) AS MyCount
	FROM	#contracts
)
SELECT	a.ContractID, a.SupplierID, a.ProductID, a.ProductName,
 	MUS.MyCount AS 'MostUsedSupplier',	
 	MUP.MyCount AS 'MostUsedProduct'
INTO #orders
FROM	#contracts a
	LEFT OUTER JOIN MostUsedSupplier AS MUS
		 ON MUS.ContractID = a.ContractID
		AND MUS.SupplierID = a.SupplierID
	LEFT OUTER JOIN MostUsedProduct AS MUP
		 ON MUP.ContractID = a.ContractID
		AND MUP.ProductID = a.ProductID
	
GROUP BY ContractID, SupplierID, ProductID, ProductName
ORDER BY MostUsedSupplier

No idea if SQL will make a better query plan (than your inline COUNT(*) in the SELECT statement). Could probably also use other means of getting running ranking totals like this, but I find that level of sophistication often gets in the way of real-world coding, but that may just be me ...

Dunno if relevant in this case, but where we want to display some sort of sub-totals, within a resultset, we add a "Type" column to the #results table (set to, say, 1 for "normal data") and then insert, into the same #results table additional "Types" of 2, 3, ... for each type of sub-total / additional / different data. So, for example, I might insert some rows with Type=2 with a COUNT of the rows GROUPED BY Supplier and then Type=3 for the COUNT GROUPED BY Product - obvious that would include all the Supplier Counts just added hence the Type column so I can say WHERE Type = 1 to only include the "raw" data

Then I can just SELECT the whole lot from #results which will include the Total rows too. That saves having to process from #results INTO #results1 and so on

Not relevant in this case as you only want the Totals, not Raw Data AND Sub Totals.

You can get what you want using CTE, instead of INTO #results1 FROM #results ... but I have no idea if it is "better". For me "better" would be faster / more efficient - plus easily maintainable. In that guise using a coding style that is familiar to you has a lot of benefit ... but I guess you are also looking at "Can my familiar coding techniques be brought up-to-date" For me there are several more modern coding techniques which I don't use (such as MERGE) because I find that the complexity & readability of the code works against my productivity and code maintainability / "defensive programming" keeps bugs to a minimum. YMMV of course ...

Kerryman · May 13, 2016, 9:08am

Thanks Kristen. I found that useful and certainly an improvement on the original. As you suggest I'm looking to better my familiar coding techniques. Again I appreciate your time. Regards Kerryman