SQLTeam.com | Weblogs | Forums

Combining tables with overlapping date ranges

Hi all,
Thanks for looking,

I have multiple tables with timelines on, my hope is to be able to merge these into one to create a timeline across multiple topics, essentially to make later querying easier. I've changed the topics for easier understanding but kept the principle.

We get examples with three, four or more tables with start and end dates (or nulls) - and perhaps 80,000 records! Though I've presented a target table below ideas outside that for handling this sort of thing are also very welcome.

Dave C

** BASE PRICE TABLE **

+PRODUCT+BASEPRICE+PRICESTART+PRICEEND
--------------------------------------
|SHOES1 |23.55   |10/05/20 |31/12/21
|SHOES1 |26.99   |01/01/22 |(NULL)

** DISCOUNT TABLE **

+PRODUCT+DISCOUNT+DISCSTART+DISCEND
-----------------------------------
|SHOES1 |0.10    |10/05/20 |15/02/22
|SHOES1 |0.20    |16/02/22 |(NULL)

** SOMEHOW TO GET TO **

+PRODUCT+BASEPRICE+DISCOUNT+START   +END
------------------------------------------
|SHOES1 |23.55    |0.10    |10/05/20|31/12/21
|SHOES1 |26.99    |0.10    |01/01/22|15/02/22
|SHOES1 |26.99    |0.20    |16/02/22|(NULL)

Thanks again,
DC

In future, please post consumable test data:

SET ANSI_NULLS, QUOTED_IDENTIFIER, ANSI_PADDING ON;
GO
CREATE TABLE #ProductPrices
(
	Product varchar(20) NOT NULL
	,BasePrice decimal(19,4) NOT NULL
	,PriceStart date NOT NULL
	,PriceEnd date NULL
	,PRIMARY KEY (Product, PriceStart)
);
INSERT INTO #ProductPrices
VALUES ('SHOES1', 23.55, '20200510', '20211231')
	,('SHOES1', 26.99, '20220101', NULL);


CREATE TABLE #ProductDiscounts
(
	Product varchar(20) NOT NULL
	,Discount decimal(19,4) NOT NULL
	,DiscStart date NOT NULL
	,DiscEnd date NULL
	,PRIMARY KEY (Product, DiscStart)
);
INSERT INTO #ProductDiscounts
VALUES ('SHOES1', 0.1, '20200510', '20220215')
	,('SHOES1', 0.2, '20220216', NULL);
GO

The simple, but expensive, approach is to unwrap the the dates with a number/tally table and then wrap them up again. With:

Try something like:

WITH Prices
AS
(
	SELECT P.Product, P.BasePrice
		,DATEADD(day, T.N, P.PriceStart) AS PriceDay
	FROM #ProductPrices P
		JOIN dbo.fnTally(0,8000) T
			ON T.N <= DATEDIFF(day, P.PriceStart, COALESCE(P.PriceEnd, CURRENT_TIMESTAMP))
)
, Discounts
AS
(
	SELECT D.Product, D.Discount
		,DATEADD(day, T.N, D.DiscStart) AS DiscDay
	FROM #ProductDiscounts D
		JOIN dbo.fnTally(0,8000) T
			ON T.N <= DATEDIFF(day, D.DiscStart, COALESCE(D.DiscEnd, CURRENT_TIMESTAMP))
)
SELECT P.Product, P.BasePrice, D.Discount
	,MIN(PriceDay) AS DayStart
	,NULLIF(MAX(PriceDay), DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), 0)) AS DayEnd
FROM Prices P
	JOIN Discounts D
		ON P.Product = D.Product
			AND P.PriceDay = D.DiscDay
GROUP BY P.Product, P.BasePrice, D.Discount
ORDER BY Product, DayStart;

It should be possible to produce a more efficient solution but it would required more test data to account for edge/boundary conditions.

@crazydavec

I know you say you want that 3rd table in your example but, I have to ask, what's the ultimate goal of having such a table?

I know... Well duh! Obvious question. For pricing and discounts by day. But, since you have start and end dates to begin with, what will the queries look like that will play against that table and will they be single row lookups or comparisons of dates in another table?

The reason why I'm asking is because, except for using NULL for a "has no end date" value, you've got a couple of pretty nice "Temporal Tables" where it should be pretty easy to do either singleton lookups or table lookups.

So, checking the obvious, how will that desired table be used in a query?