I'm struggling with a large dataset (10 million+ rows) and a slow sub-query counting records; Added as many relevant indexes as I can see, looking through the execution plan etc.
Is there a better way of doing this? CTE or using a join on the aggregate query
SELECT DISTINCT
cet.p_campaign_email_tracking_id,
cet.link,
c.name,
(
-- clicks
SELECT COUNT(*)
FROM history_click
WHERE history_click.f_campaign_id = hc.f_campaign_id
AND history_click.f_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
AND convert(datetime, history_click.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- opens
SELECT COUNT(*)
FROM history_open ho
WHERE ho.f_campaign_id = hc.f_campaign_id
AND convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- sent
SELECT COUNT(*)
FROM history_sent hs
WHERE hs.f_campaign_id = hc.f_campaign_id
AND convert(datetime, hs.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
)
FROM history_click hc
INNER JOIN campaign c ON hc.f_campaign_id = c.p_campaign_id
INNER JOIN campaign_email_tracking cet ON cet.p_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
WHERE hc.f_campaign_id in (select list from CSVToTable(@campaign_ids))
AND convert(datetime, hc.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
Avoid using functions on columns in where clauses - and avoid using BETWEEN for datetime.
Change this: convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
To this: ho.date_created >= @startdate And ho.date_created < dateadd(day, 1, @enddate)
Also - avoid using DISTINCT unless absolutely necessary. If you are getting duplicate rows then make sure your joins are correct and the relationships are valid.
I would also get rid of that splitter function in the WHERE clause. That can really cause the optimizer grief (depending on how it's written). Since you have DECLARE statements I take this code to be something where you could create a TEMP TABLE and then INSERT the results from the splitter function then JOIN to that temp table later.
This is an example using a splitter function I use but you get the idea. The PK value on the splitter values helps too especially if you have lots of rows.
IF OBJECT_ID('tempdb..#CampaignIDs') IS NOT NULL
DROP TABLE #CampaignIDs;
CREATE TABLE #CampaignIDs
(
CampaignID INT NOT NULL PRIMARY KEY
);
INSERT INTO #CampaignIDs (CampaignID)
SELECT CSVValues FROM dbo.Split (@CampaignIDs, ',');
SELECT DISTINCT
cet.p_campaign_email_tracking_id,
cet.link,
c.name,
(
-- clicks
SELECT COUNT()
FROM history_click
WHERE history_click.f_campaign_id = hc.f_campaign_id
AND history_click.f_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
AND convert(datetime, history_click.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- opens
SELECT COUNT()
FROM history_open ho
WHERE ho.f_campaign_id = hc.f_campaign_id
AND convert(datetime, ho.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
),
(
-- sent
SELECT COUNT(*)
FROM history_sent hs
WHERE hs.f_campaign_id = hc.f_campaign_id
AND convert(datetime, hs.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)
)
FROM history_click hc
INNER JOIN campaign c ON hc.f_campaign_id = c.p_campaign_id
INNER JOIN campaign_email_tracking cet ON cet.p_campaign_email_tracking_id = hc.f_campaign_email_tracking_id
INNER JOIN #CampaignIDs As cid ON hc.f_campaign_id = cid.CampaignID
WHERE convert(datetime, hc.date_created, 112) between convert(datetime, @startdate, 112) and convert(datetime, @enddate, 112)