Is this statement correct?

sqlor · January 9, 2021, 3:23pm

Hello

I have seen the statement below:

SELECT *
FROM Table1
LEFT JOIN Table2 on Table1.ID=Table2.ID and [Cancelled]='Y'

I am puzzled with the and [Cancelled]='Y' bit.

Is this really a valid way to introduce filtering in a selection?

Is this really the same as the below?

SELECT *
FROM Table1
LEFT JOIN Table2 on Table1.ID=Table2.ID
WHERE [Cancelled]='Y'

Thanks!

Ifor · January 9, 2021, 3:59pm

You need to understand the logical order of query processing. eg:

With your examples, if Cancelled is in Table2 (and you should really have written Table2.Cancelled or used an alias) then your second query will effectively convert the LEFT OUTER JOIN to an INNER JOIN as the NULLs from the LEFT OUTER JOIN will never be equal to 'Y'.

Create some test data and see what happens.

sqlor · January 9, 2021, 4:06pm

It doesn't matter if it will be converted into an INNER join.

My question is, are the two statements I wrote the same? You may assume Table1.Cancelled='Y' or Table2.Cancelled='Y'. As I saw it, it didn't distinguish and from my understanding, [Cancelled] exists in both Tables.

My real question is, if you do SELECT * FROM Table1 LEFT JOIN Table2 on [some equation] and [another equation], will the [another equation] act as a filter, i.e. as if you do SELECT * FROM Table1 LEFT JOIN Table2 on [some equation] WHERE [another equation] ?

Ifor · January 9, 2021, 4:12pm

Umm... You have a more fundamental problem.

ScottPletcher · January 10, 2021, 6:30am

It does if you want the query to function like it's a LEFT OUTER JOIN.

If you actually don't need a LEFT OUTER JOIN, then just use an INNER JOIN.

sqlor · January 10, 2021, 10:38am

Thanks, is my exact question answerable? If yes, what is the answer please?
I need that information first, then I will be able to understand if it is converted in an INNER JOIN and how.

To say the same thing in other words in case it makes it more clear:

When joining on two ON equations separated by AND, how is this treated by SQL?
In continuation to the above, what happens if the second ON equation is an equation between a field and a value instead of two fields from the respective tables?

Thanks!

jeffw8713 · January 10, 2021, 4:27pm

You can have many conditions in the ON portion of a join - whether that is to multiple columns or expressions such as your example. The ON clause creates the relationship between the tables identified in that join statement.

The WHERE clause filters the results of all of the joins - after the data has been joined.

In your example, you are looking to get all rows from Table1 - and match only those rows from Table2 that match the key and are canceled. So you would get every row from Table1 and the columns from Table2 would either be NULL (no match found) or have a value (matched on both key and canceled).

Now - if you move that condition to the where clause you will only get all rows that match the join criteria (matching on key only) - and then filter out any rows from the results that are canceled. For all rows from Table1 where there is no match (matched on key value - not canceled) would be excluded because the NULL value returned from the outer join cannot and never will equal 'Y'.

If the column [Cancelled] comes from Table1 and not Table2 then things get trickier...because you would still get every row from Table1...but then only match on rows from Table2 for those rows in Table1 that are canceled. Again - moving the criteria to the where returns different results...because those rows where no match occurs on the join criteria would be eliminated.

This isn't really a question about how this is treated by SQL Server - it is a fundamental question on how an RDBMS uses and defines relations and how those relationships are represented in code.

sqlor · January 14, 2021, 7:36pm

You are basically saying that:
LEFT JOIN Table2 on Table1.ID=Table2.ID AND [Cancelled]='Y' : returns results where [Cancelled] is 'Y' or NULL
And
LEFT JOIN Table2 on Table1.ID=Table2.ID WHERE [Cancelled]='Y' : returns results where [Cancelled] is 'Y' only
Right?

But I get results that are both [Cancelled]='Y' and 'N' when I do LEFT JOIN Table2 on Table1.ID=Table2.ID and [Cancelled]='Y'

This is different from what you are saying, right?

mike01 · January 14, 2021, 9:29pm

if [Cancelled]='Y' comes from Table1, then they will be the same, which they are. If it came from table2, then it would not return the null values

ScottPletcher · January 15, 2021, 9:04am

NO. That returns results from Table1 regardless of what is in the Table2.Cancelled column. If there was not a matching row from Table2, then SQL will force NULL into all the Table2 columns, including Cancelled, ID, etc..

sqlor · January 15, 2021, 8:56pm

Apologies, the FROM Table1 LEFT JOIN Table2 on Table1.ID=Table2.ID AND Table1.[Cancelled]='Y' : returns results where [Cancelled] is 'Y' or 'N', there is no NULL.

I still don't understand what is the function of the AND Table1.[Cancelled]='Y'.
I would also be interested to understand the function of the AND Table2.[Cancelled]='Y'

Any idea please?

ScottPletcher · January 16, 2021, 5:49am

Not from me. I give up. Good luck with this.

jeffw8713 · January 18, 2021, 5:37pm

It changes the relationship - and you get different results

Declare @table1 Table (KeyColumn int, Cancelled char(1), SomeOtherValue varchar(20));
Declare @table2 Table (KeyColumn int, Cancelled char(1), SomeOtherValue varchar(20));

 Insert Into @table1 (KeyColumn, Cancelled, SomeOtherValue)
 Values (1, 'Y', 't1-y')
      , (1, 'N', 't1-n');

 Insert Into @table2 (KeyColumn, Cancelled, SomeOtherValue)
 Values (1, 'Y', 't2-1-y')
      , (1, 'N', 't2-1-n');

 Select *
   From @table1         t1
   Left Join @table2    t2 On t2.KeyColumn = t1.KeyColumn
                          And t1.Cancelled = 'Y';

 Select *
   From @table1         t1
   Left Join @table2    t2 On t2.KeyColumn = t1.KeyColumn
                          And t2.Cancelled = 'Y';

As you can see - we get 3 rows when using Table1.Cancelled in the join but only for one row from Table1. The second row from table1 has NULL values because the Cancelled value of that row is 'N' (thus - no match in the join so no matching row from table2).

We get 2 rows using Table2.Cancelled in the join - and we get that one matching row in table2 for both rows in Table1.

If we remove the check for Cancelled - we get 4 rows returned. Row1 from Table1 matched to both rows in Table2 - and Row2 in Table1 matched to both rows in Table2.