I have a table that tracks changes in inventory by date. But I need to do a calculation that takes into account how many days the inventory was at that value.
I am able to generate a list of all dates within a given range & have merged the two tables to identify which rows do not have data. I also think that I should be using LAG to identify the value of the day before. However, I cannot get it to work.
The idea being that if a day has a null value, I would fill in the inventory value of the day before. Can anyone help me finish this? Alternatively is there a better way to do this?
-- Create Dummy data
CREATE TABLE #tmp_inv ([ActivityDate] date, [ProductID] nvarchar(max), [TotalBinQty] int);
INSERT INTO #tmp_inv ([ActivityDate], [ProductID], [TotalBinQty]) VALUES
('2022-01-13', 'MyProduct', '2')
, ('2022-01-11', 'MyProduct', '0')
, ('2022-01-07', 'MyProduct', '302')
, ('2022-01-06', 'MyProduct', '302')
, ('2022-01-04', 'MyProduct', '5832')
, ('2022-01-03', 'MyProduct', '5833')
;
-- Declare parameters
DECLARE @StartDate datetime = '2022-01-01'
,@EndDate datetime = '2022-01-31'
;
-- Create list of all dates in range
WITH date_list AS
(SELECT CAST( @StartDate AS DATE) as [Date]
UNION ALL
SELECT CAST( DATEADD(day, 1, [Date]) AS DATE )
FROM date_list
WHERE DATEADD(day, 1, [Date]) <= @EndDate
)
-- Merge Available data with all dates
SELECT
[dli].[Date]
, [gbd].[ProductID]
, [gbd].[TotalBinQty]
, (LAG([TotalBinQty]) OVER (
PARTITION BY [ProductID], [ActivityDate]
ORDER BY [ActivityDate])
) AS [YesterdaysTotalBinQty]
FROM date_list dli
FULL JOIN #tmp_inv gbd
ON [dli].[Date] = [gbd].[ActivityDate]
ORDER BY [dli].[Date] DESC
LAG wouldn't work in this case because LAG is not recursive. You could do something like shown below.
WITH date_list AS
(
SELECT CAST( @StartDate AS DATE) as [Date]
UNION ALL
SELECT CAST( DATEADD(day, 1, [Date]) AS DATE )
FROM date_list
WHERE DATEADD(day, 1, [Date]) <= @EndDate
)
SELECT
[dli].[Date]
, [gbd].[ProductID]
, gbd.TotalBinQty
FROM date_list dli
OUTER APPLY
(
SELECT TOP (1)
gbd.ProductID,
gbd.TotalBinQty
FROM
#tmp_inv gbd
WHERE
gbd.ActivityDate < dli.Date
AND gbd.TotalBinQty IS NOT NULL
ORDER BY
gbd.ActivityDate DESC
) AS gbd
ORDER BY
dli.Date;
Couple of things to note though:
- This is not very efficient. For each row, it is effectively doing a subquery. You can do this much more efficiently using windowing functions. Search for Islands and Gaps in SQL. There are examples on the web. Itzik Ben-Gan has some good articles on it.
- The way you are generating calendar table is inefficient. Would be useful to have a calendar table in your database.
/* For best performance, be SURE to create this index BEFORE running the query:
CREATE NONCLUSTERED INDEX ActivityDate ON #tmp_inv ( ActivityDate )
INCLUDE ( TotalBinQty ) WITH ( FILLFACTOR = 98, SORT_IN_TEMPDB = ON );
/* If listing multiple ProductIDs at once, I would think you might also need to check that the ProductID matches; if so, create this index INSTEAD */
--CREATE NONCLUSTERED INDEX ActivityDate ON #tmp_inv ( ActivityDate, ProductID )
--INCLUDE ( TotalBinQty ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON );
*/
;WITH
cte_tally10 AS (
SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
),
cte_tally1000 AS (
SELECT 0 AS number UNION ALL
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cte_tally10 c1 CROSS JOIN cte_tally10 c2 CROSS JOIN cte_tally10 c3
)
-- Merge Available data with all dates
SELECT
DATEADD(DAY, t.number, @StartDate) AS [Date]
, [gbd].[ProductID]
, [gbd].[TotalBinQty]
FROM cte_tally1000 t
OUTER APPLY (
SELECT TOP (1) *
FROM #tmp_inv gbd
WHERE [gbd].[ActivityDate] <= DATEADD(DAY, t.number, @StartDate) AND gbd.TotalBinQty IS NOT NULL
--AND ProductID = 'MyProduct'
ORDER BY [gbd].[ActivityDate] DESC
) AS gbd
WHERE t.number BETWEEN 0 AND DATEDIFF(DAY, @StartDate, @EndDate)
ORDER BY DATEADD(DAY, t.number, @StartDate) DESC