SQLTeam.com | Weblogs | Forums

Most efficient way to filter by a text-based list?

I need to run a query that filters by arbitrary parameters supplied in a comma separated list. I use DelimitedSplit8K.

Currently, I am doing something like this. Though it is on a substantially larger dataset.

It works, but when the comma-separated list supplied is greater than 5 parameters it really slows things down. When it is greater than 20 parameters, it is nearly unusable.

  1. Assuming we're talking about millions of rows, is there ever a way to filter by a comma-separated-list efficiently?

  2. Would you change anything about the query itself?

  3. I use TVFs which inherit from other TVFs. Is there a performance difference between passing the raw @skuList from the parent TVF to the child & letting the child TVF filter VS running the child TVF without filters & letting the parent filter at the end?

-- Prepare Temporary table with dummy data
CREATE TABLE #product (id nvarchar(100), dateadded datetime default(getdate()));
INSERT INTO #product (id) VALUES
('AARCONFETTI')
, ('97-9F6J-H7MU')
, ('1592')
, ('ABC')
, ('DEF')
, ('GHI')
, ('JKL')
;

-- Query data
DECLARE @skuList NVARCHAR(MAX)	= 'AARCONFETTI,97-9F6J-H7MU,1592'

SELECT 
	[bp].*
FROM #product bp
FULL JOIN [MyDB].[dbo].[DelimitedSplit8K](@skuList,',') asl
	ON LTRIM(RTRIM([asl].[item])) = LTRIM(RTRIM([bp].[ID]))
WHERE 1=1
	AND [asl].[Item] IS NOT NULL

Any function against the column will kill performance. You have to clean up the list prior to executing the split. Roughly something like below. Try running this command first, then doing the split without any functions, as shown.

Also, for best overall performance, uniquely cluster the #product table on the "id" column (if there really is a temp table in use: if that's just a stand-in for the real non-temp table, then please ignore these comments). It may not help, but it could, and it certainly won't hurt.

DECLARE @skuList NVARCHAR(MAX)	= 'AARCONFETTI,97-9F6J-H7MU,1592,...'

/*remove any spaces in @skuList, ok assuming no SKU has an embedded space (seems
  right to me); but, if any SKU has embedded space, we'll have to change the code!*/
SELECT @skuList = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
    REPLACE(LTRIM(RTRIM(@skuList)), SPACE(33), SPACE(1)), SPACE(17), 
    SPACE(1)), SPACE(9), SPACE(1)), SPACE(5), SPACE(1)), SPACE(3), SPACE(1)), 
    SPACE(2), SPACE(1)), SPACE(2), SPACE(1)), SPACE(1), '')

SELECT 
	[bp].*
FROM #product bp
FULL JOIN [MyDB].[dbo].[DelimitedSplit8K](@skuList,',') asl
	ON [asl].[item] = [bp].[ID]
WHERE 1=1
	AND [asl].[Item] IS NOT NULL
1 Like