I'm trying to find all related records in a large table and update them based on a smaller recordset, but as soon as I do that, all performance goes out the window and it starts doing full table scans of a massive table.
Below is an example of what the table would look like after importing file3 (there's an index on the VoidID column and FileName as well)
The table is showing that ID #1 is related to ID#2 and vice versa, same for ID #4 and #5. ID #3 is a sale with no void.
After Importing File3, I want to look for all records where there is a sale and void in the same file, but only for the file that was just imported. So I'm using a temp table #Results where I only insert the ID's from File3, but when I add that temp table into the query, it does a full scan of the table and takes forever to run:
DECLARE @Import Table(ID int PRIMARY KEY NOT NULL, TransType varchar(10), VoidID int, FileName varchar(25))
INSERT INTO @Import
VALUES(1,'Sale',2,'File1'),(2,'Void',1,'File1'),(3,'Sale',NULL,'File2'),(4,'Sale',5,'File3'),(5,'Void',4,'File3')
SELECT * FROM @Import
CREATE table #Results(ID integer PRIMARY KEY NOT NULL)
INSERT INTO #Results(ID)
SELECT ID FROM @Import WHERE FileName = 'File3'
select * from #Results
SELECT P1.ID
FROM @Import P1 INNER JOIN #Results R ON P1.ID = R.ID INNER JOIN @Import P2 ON P1.ID = P2.VoidID
WHERE P1.FileName = P2.FileName
DROP TABLE #Results
This works, but the full table scan is still running right now (over an hour) so that's really not acceptable.
How can I improve this query?