SQLTeam.com | Weblogs | Forums

SQL sub query counting - is there a better way?

sql2014

#1

I'm struggling with a large dataset (10 million+ rows) and a slow sub-query counting records; Added as many relevant indexes as I can see, looking through the execution plan etc.

Is there a better way of doing this? CTE or using a join on the aggregate query

declare @campaign_ids varchar(50) = '17830,17811'
declare @startdate varchar(50) = '2017/Jul/29'
declare @enddate varchar(50) = '2018/Jan/29'

SELECT 	DISTINCT 
			cet.p_campaign_email_tracking_id,
			cet.link, 
			c.name,
			(
				-- clicks
				SELECT COUNT(*) 
				FROM history_click 
				WHERE history_click.f_campaign_id = hc.f_campaign_id 
				AND history_click.f_campaign_email_tracking_id = hc.f_campaign_email_tracking_id		
				AND	convert(datetime, history_click.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112) 				
			), 
			(
				-- opens
				SELECT COUNT(*) 
				FROM history_open ho
				WHERE ho.f_campaign_id = hc.f_campaign_id 
				AND	convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112) 
			),
			(
				-- sent
				SELECT COUNT(*) 
				FROM history_sent hs
				WHERE hs.f_campaign_id = hc.f_campaign_id 
				AND	convert(datetime, hs.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112) 
			)
FROM 	history_click hc
INNER	JOIN campaign c ON hc.f_campaign_id = c.p_campaign_id
INNER	JOIN campaign_email_tracking cet ON cet.p_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
WHERE 	hc.f_campaign_id in (select list from CSVToTable(@campaign_ids))
AND		convert(datetime, hc.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)

#2

Try to avoid using sub selects in the select list. Instead get those values by joining whatever is required in the from clause.


#3

Avoid using functions on columns in where clauses - and avoid using BETWEEN for datetime.

Change this: convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)

To this: ho.date_created >= @startdate And ho.date_created < dateadd(day, 1, @enddate)

Also - avoid using DISTINCT unless absolutely necessary. If you are getting duplicate rows then make sure your joins are correct and the relationships are valid.


#4

One other thing - SELECT COUNT(1) is faster than COUNT(*). I used this change in production and actually got a thank you for the performance boost.


#5

I would also get rid of that splitter function in the WHERE clause. That can really cause the optimizer grief (depending on how it's written). Since you have DECLARE statements I take this code to be something where you could create a TEMP TABLE and then INSERT the results from the splitter function then JOIN to that temp table later.

This is an example using a splitter function I use but you get the idea. The PK value on the splitter values helps too especially if you have lots of rows.

DECLARE @CampaignIDs VARCHAR(256) = '17830,17811';

IF OBJECT_ID('tempdb..#CampaignIDs') IS NOT NULL
DROP TABLE #CampaignIDs;

CREATE TABLE #CampaignIDs
(
CampaignID INT NOT NULL PRIMARY KEY
);
INSERT INTO #CampaignIDs (CampaignID)
SELECT CSVValues FROM dbo.Split (@CampaignIDs, ',');

SELECT DISTINCT
cet.p_campaign_email_tracking_id,
cet.link,
c.name,
(
-- clicks
SELECT COUNT()
FROM history_click
WHERE history_click.f_campaign_id = hc.f_campaign_id
AND history_click.f_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
AND convert(datetime, history_click.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- opens
SELECT COUNT(
)
FROM history_open ho
WHERE ho.f_campaign_id = hc.f_campaign_id
AND convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- sent
SELECT COUNT(*)
FROM history_sent hs
WHERE hs.f_campaign_id = hc.f_campaign_id
AND convert(datetime, hs.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
)
FROM history_click hc
INNER JOIN campaign c ON hc.f_campaign_id = c.p_campaign_id
INNER JOIN campaign_email_tracking cet ON cet.p_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
INNER JOIN #CampaignIDs As cid ON hc.f_campaign_id = cid.CampaignID
WHERE convert(datetime, hc.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)