SQLTeam.com | Weblogs | Forums

SQL Subquery Comparison

Hello Community,

As you will have noticed from my previous posts, I'm on steep learning curve with SQL (although the help I received from you guys have made my SQL journey much easier :wink:

I have been taking the advice of member called AndyC by building SQL code by understanding the questions and not focus too much on SQL answers from books.

The question from a book states the following:
Find the sales, costs, and gross and net profit for every car sold.

The SQL code answer from the book is as follows:

  SELECT
  DT.MakeName
 ,DT.ModelName
 ,DT.SaleDate
 ,DT.SalePrice
 ,DT.Cost
 ,DT.SalePrice - DT.DirectCosts AS GrossProfit
FROM (SELECT
    MK.MakeName
   ,MD.ModelName
   ,SA.SaleDate
   ,SD.SalePrice
   ,ST.Cost
   ,SD.LineItemDiscount
   ,ISNULL(ST.RepairsCost, 0) + ST.PartsCost + ST.TransportInCost AS DirectCosts
  FROM Data.Make MK
  INNER JOIN Data.Model MD
    ON MK.MakeID = MD.MakeID
  INNER JOIN Data.Stock ST
    ON ST.ModelID = MD.ModelID
  INNER JOIN Data.SalesDetails SD
    ON ST.StockCode = SD.StockID
  INNER JOIN Data.Sales SA
    ON SA.SalesID = SD.SalesID) DT

The answer from the above code produces 348 records

The following is my SQL code:

SELECT
  Make.MakeName
 ,Model.ModelName
 ,Sales.SaleDate
 ,SalesDetails.SalePrice
 ,Stock.Cost
 ,SalesDetails.SalePrice - SubQuery.DirectCosts AS GrossProfit
FROM (SELECT
         Make.MakeName
        ,Model.ModelName
        ,Sales.SaleDate
        ,Stock.Cost
        ,ISNULL(Stock.RepairsCost, 0) + Stock.PartsCost + Stock.TransportInCost AS DirectCosts
       FROM Data.Make
       INNER JOIN Data.Model
         ON Make.MakeID = Model.MakeID
       INNER JOIN Data.Stock
         ON Model.ModelID = Stock.ModelID
       INNER JOIN Data.SalesDetails
         ON Stock.StockCode = SalesDetails.StockID
       INNER JOIN Data.Sales
         ON SalesDetails.SalesID = Sales.SalesID) SubQuery
    ,Data.Make
     INNER JOIN Data.Model
       ON Model.MakeID = Make.MakeID
     INNER JOIN Data.Stock
       ON Stock.ModelID = Model.ModelID
     INNER JOIN Data.SalesDetails
       ON SalesDetails.StockID = Stock.StockCode
     INNER JOIN Data.Sales
       ON Sales.SalesID = SalesDetails.SalesID

My Code produces over 1000's records.

Can someone take a look at my code and let me know why my code produces so many records.

Once again, please let me know if you need further information to help me with this issue?

Regards

Carlton

This here is creating a Cartesian product - it is a cross join of all results from the derived table (SubQuery) and the Data.Make table.

The SQL from the book is using a derived table - so that the calculation (,SalesDetails.SalePrice - SubQuery.DirectCosts AS GrossProfit) is easier to write.

This could also be done without the derived table using the following:

,SD.SalePrice - (ISNULL(ST.RepairsCost, 0) + ST.PartsCost + ST.TransportInCost) AS GrossProfit

I believe the intent is to show how you can use derived tables to perform calculations and then use the results to simplify the calculation. If you need to show both DirectCosts and GrossProfit you would have to repeat the calculation...using a derived table allows you to calculate the value one time and use the value multiple times in the outer query.

The book is trying to teach you about derived tables - i.e. how the output of a select statement can be used in the from clause of a different select as if it were a table of data. If you run the query from the derived table on it's own, you'll see it has all the data you needed so additional joins are unnecessary in the outer select.

Like a lot of book examples, unfortunately, it's a bit contrived. You don't really need a derived table in this case at all because the one calculated value is only being used once. And even if it were being used multiple times, it would be clearer written using an apply operator, such as:

SELECT
	MK.MakeName
	,MD.ModelName
	,SA.SaleDate
	,SD.SalePrice
	,ST.Cost
	,SD.LineItemDiscount
    ,CALC.DirectCosts
    ,SD.SalePrice - CALC.DirectCosts AS GrossProfit
FROM Data.Make MK
INNER JOIN Data.Model MD
	ON MK.MakeID = MD.MakeID
INNER JOIN Data.Stock ST
	ON ST.ModelID = MD.ModelID
INNER JOIN Data.SalesDetails SD
	ON ST.StockCode = SD.StockID
INNER JOIN Data.Sales SA
	ON SA.SalesID = SD.SalesID
CROSS APPLY
(
	SELECT ISNULL(ST.RepairsCost, 0) + ST.PartsCost + ST.TransportInCost AS DirectCosts
) CALC

As that way you aren't repeating almost the entire query twice just to do one sum. This is always one of the problems with textbooks, they're trying to focus on one concept at a time and a little too often, in the attempt to keep things simple, use an example where the concept they're describing isn't strictly speaking necessary. Sometimes you just kind of have to accept that it's trying to explain a concept and focus on what that is rather than whether what is being shown is the best, or even the only, way to accomplish the end result.

Hi AndyC

Thanks again. I tried to run your query, but I got the following error:

Incorrect syntax near )
Incorrect syntax near calc

Jeff, thanks for responding.

I'm afraid I don't understand what you mean by Cartesian product..

Oops, didn't quite cut of the end of the derived table - fixed in my post.

AndyC

Thanks for the amendments.

I also think the book is trying to describe one of the benefits of Derived tables is the ability to be able re-use the table over and over again.

A Cartesian product is the result of joining every row from one table to every row of another. The usual way to do this in SQL is a CROSS JOIN:

Select * From A Cross Join B

But, for historical reasons, SQL also allows you to use a , as shorthand:

Select * From A,B

Which is what you have in you version of the query. This is mostly frowned upon as it goes against modern language standards and is harder to read.

Yes, and it is absolutely correct to do so - I'm just not sure that's the best way of actually demonstrating that. :wink:

AndyC, / Community,

Following on with this example.
I got the same result by (1) creating the following Derived table using the select:

SELECT
  Make.MakeName
 ,Model.ModelName
 ,AVG(Stock.Cost) AS AveragePurchasePrice
 ,AVG(SalesDetails.SalePrice) AS AverageSalePrice
FROM Data.Make
INNER JOIN Data.Model
  ON Make.MakeID = Model.MakeID
INNER JOIN Data.Stock
  ON Model.ModelID = Stock.ModelID
INNER JOIN Data.SalesDetails
  ON Stock.StockCode = SalesDetails.StockID
GROUP BY Make.MakeName
        ,Model.ModelName

I then saved the results from the above select as table, and called the table DT3

I then used the table in following code to get the final result:

SELECT
  Stock.DateBought
 ,DT3.MakeName
 ,DT3.ModelName
 ,Stock.Color
 ,Stock.Cost
 ,DT3.AvgPurchasePrice
 ,DT3.AvgSalePrice
FROM Data.DT3
INNER JOIN Data.Make
  ON DT3.MakeName = Make.MakeName
INNER JOIN Data.Model
  ON Model.ModelName = DT3.ModelName
INNER JOIN Data.Stock
  ON Stock.ModelID = Model.ModelID
INNER JOIN Data.SalesDetails
  ON Stock.StockCode = SalesDetails.StockID
ORDER BY Stock.DateBought, DT3.MakeName, DT3.ModelName

My question is, is this good or bad code. In one breath it seems good because I can used table derived table D3 over and over again, however, it seems very long-winded.

I will be interested to hear your opinions

You need to use aggregate functions with group by

Thanks jotorre