Joining Table A with index to Table B with no Index

catbeasy · May 26, 2020, 6:26pm

So, just as the subject asks. If i have a field, ID, where the field is, for example, a clustered index and i join that field to its partner on another table, ID, where it is not indexed, will the query still either run as fast as if both fields were clustered indexes?

Or perhaps it will run faster, than if neither were indexed, but not as fast as if both were indexed?
Or perhaps the run speed will take exactly the same time as if both were not indexed?

Thanks for any help clearing this up..!

yosiasz · May 26, 2020, 7:31pm

the answer as always will be it depends.

What if you are joining a table with millions of rows (column1 has good index) to another table with only one row but joining column has no index
You are joining one table with million of rows and well indexed to another table with millions of rows also well indexed but you have a where clause that filters on a column that has no index

so many variables, you understand your db, ultimately you have to decide use cases

One way to find out is to test this out with real or near real data.

I would defer to @ScottPletcher, he has depth in this area, we all should really but @ScottPletcher is the person I laser on when he comments on this subject

catbeasy · May 26, 2020, 8:22pm

In my particular case, the indexed table has 50,000 rows and the id is indexed to another id field in another table where the id field is not indexed. This other non-indexed id table has 1.5 billion rows. There are no where clauses..

However, for the sake of getting crazy over 1.5 billion rows versus 50k, and with a desire to get this conceptually, lets assume that both tables have equivalent row counts of 2 million and that the id field on table A is indexed, the id field on table B is not indexed, and there are no where clauses and id is inner join.

btw, thanks for your reference to @ScottPletcher - to ask for his inpout specficially, do i share this with him at: @ScottPletcher?

Thanks for any continued assistance!

ScottPletcher · May 29, 2020, 4:11pm

No. A clustered index on the second table should allow a "merge join", which is the most efficient type of join. If the other table doesn't already have a clus index, it would make perfect sense to cluster it on that join column, if the table is often joined on that column.

For a full join of both tables, a non-clustered index on the other table would help for sure only if that index contained all columns needed in the query. For example, say table2 has cols A thru J. The join is on C and the query also lists columns D and E. A non-clus index on the join column, C, would be useless by itself, because to use that index, SQL would still have to do a lookup back to the main table to get cols D and E. Doing all those lookups would be vastly more overhead than just scanning the entire table once to do the join.

For a join of only a limited number of rows, a non-clus index could help, because far fewer lookups are required. The more rows there are in the table, the more likely SQL will be able to use a non-clus for a limited number of rows.

@yosiasz:
Thanks so much for the favorable reference!