SQLTeam.com | Weblogs | Forums

T-sql 2012 cross join

I have a question concerning the cross join in the following sql:

SELECT a.SCHOOLYEAR, CASE when SUM(c.UNECNT) > 0 then 9 else max(a.SEMESTER) end as SEMESTER, a.MNUM, a.STUNUNM, a.ABSDT,
'ZZZ' AS UCNT,
SUM(c.UNECNT) as ABSENCES
FROM S.dbo.LtrDetails a WITH (NOLOCK)
CROSS JOIN (
SELECT *
FROM S.dbo.LtrDetails d WITH (NOLOCK)
WHERE d.SCHOOLYEAR = (
SELECT endYear
FROM S.dbo.SchoolYear WITH (NOLOCK)
WHERE active = 1)
)

The person who wrote the sql above, said the query above returned all the rows in the table that applied versus. The inner join did not return all the needed rows.
I do not understand the benefit of using a 'cross join' versus an 'inner join' on the sql above pointing to the same table?

First of all I would replace

CROSS JOIN (SELECT *
	FROM S.dbo.LtrDetails d WITH (NOLOCK)
	WHERE d.SCHOOLYEAR = (	SELECT endYear
				FROM S.dbo.SchoolYear WITH (NOLOCK)
				WHERE active = 1)
			)

by (and add " AS C" at the end).

CROSS JOIN (SELECT d.*
		FROM S.dbo.LtrDetails d WITH (NOLOCK)
			INNER JOIN S.dbo.SchoolYear as SY
				ON d.SCHOOLYEAR = SY.endYear
		WHERE SY.active = 1
		) as C

I try to understand the query:
A CROSS JOIN between two tables gives the Cartesian product.

Here the query makes the Cartesian product of all the records in the table S.dbo.LtrDetails with a subset of the same table S.dbo.LtrDetails, namely the subset with only those records whose school year = the currently active end year.
A consequence of a Cartesian product is the explosion of records in the result set. If the full table contains 10 000 rows and the table's subset contains 1000 records, the result set will contain 1 000 000 records.

Part of the query is missing, making it hard to figure out its meaning. The fact that "AS C" is missing, the explosion of records in the result set if the query does not contain a WHERE clause, aggregate functions (SUM) without a GROUP BY clause, ... all point in that direction.

It's hard to understand the meaning of such a query without knowing the meaning of the table and attributes.
If I must guess, I'd say the query results in a report that compares the courses (lectures?, absentees?) of all the years to the current year.

I do not understand the benefit of using a 'cross join' versus an 'inner join' on the sql above pointing to the same table?

With a CROSS JOIN you can put data side by side that don't match. Suppose this school year you had a "Nanotech" course for the very first time and in the past you had a "How to breed mammoths" course. And you want a report that shows both courses. (A CROSS JOIN has no JOIN condition, like A.course_Id = B.course_id).

With an INNER JOIN you will only see data that are related (in many cases equal) at some point: the columns used in the JOIN condition. If the JOIN condition would be course_id, in the result set the Mammoth course will not show up. The Nanotech course will show up, because you compare the courses of all the years (including the last year) with the last year.

Without the full query and understanding of the attributes it's hard to interpret, though.

You may try to rewrite the query with a FULL OUTER JOIN, they are faster. And maybe, maybe it will also return the desired report.