Do I need to table-reference columns in nested SELECT?

Hello

When I do:
SELECT * FROM TABLE1 WHERE COL1 NOT IN (SELECT * FROM TABLE2 WHERE COL2<>0)

Does SQL, keep the column references within each SELECT statement or I have to provide reference for each table and reference accordingly all columns?

Ie do I need to do:
SELECT * FROM TABLE1 a WHERE a.COL1 NOT IN (SELECT * FROM TABLE2 b WHERE b.COL2<>0)

Thanks

It is best to use the aliases.

Your first query will work as expected if Col2 exists in Table2.
If there is no Col2 in Table2 but there is in Table1 the query will run but results will probably not be what you are expecting.

I would not allow any query like your fist one into production.

ps Both of your queries will work only if there is one column in Table2. While SELECT * can be useful for some DBA tasks I do not allow it in application code.

You definitely need to use aliases in the "subquery". And, with a NOT IN, you need to insure that a NULL value is never present, so, for example:

NOT IN (SELECT b.COL1 FROM TABLE2 b WHERE b.COL2<>0 AND b.COL1 IS NOT NULL)

1 Like

none of the above queries will work. You need to specify the corresponding column in the subquery

COL1 NOT IN (select COL1 )

declare @sql4 table(COL1 int)
insert into @sql4
select distinct top 100 column_id from sys.all_columns

declare @sql0 table(COL1 int, COL2 int)
insert into @sql0
select distinct top 100 column_id, object_id from sys.all_columns

SELECT * FROM @sql4 a WHERE a.COL1 NOT IN (SELECT * FROM @sql0 b WHERE b.COL2<>0)

SELECT * 
  FROM @sql4 a 
 WHERE a.COL1 NOT IN (SELECT * FROM @sql0 b WHERE b.COL2<>0)
1 Like

@yosiasz:
I'm guessing you didn't see my post when you made your comment.
As I noted, you should be sure you use a table alias in the NOT IN query and you should make sure a NULL value is not allowed into the result.

@ScottPletcher must have clicked button at same time. this is interesting also forgot why SQL allows it

declare @sql4 table(COL1 int)
insert into @sql4
select distinct top 100 column_id from sys.all_columns

declare @sql0 table( COL2 int)
insert into @sql0
select distinct top 100 column_id from sys.all_columns

SELECT * 
  FROM @sql4 a 
WHERE a.COL1 NOT IN (SELECT * FROM @sql0 b WHERE b.COL2<>0)

SELECT * 
  FROM @sql4 a 
 WHERE a.COL1 NOT IN (SELECT * FROM @sql0 b WHERE b.COL2<>0)
1 Like

IF there is only one column in the "subquery" table, then you could use *, but that's a very bad practice you don't want to start doing.

As everyone has stated - you definitely need to use aliases for your tables. Not only is it good practice - but it can save you from making a huge mistake.

We had someone run a delete with an IN statement where the sub-query was not aliased. The problem was that the column referenced in the sub-query did not exist in that table - but did exist in the outer query. Instead of deleting the 20 items identified in the sub-query - this ended up deleting 454,000+ items.

Simply aliasing the tables would have prevented this issue.

2 Likes