SQLTeam.com | Weblogs | Forums

Filling in the blanks for missing days

I have a table that tracks changes in inventory by date. But I need to do a calculation that takes into account how many days the inventory was at that value.

I am able to generate a list of all dates within a given range & have merged the two tables to identify which rows do not have data. I also think that I should be using LAG to identify the value of the day before. However, I cannot get it to work.

The idea being that if a day has a null value, I would fill in the inventory value of the day before. Can anyone help me finish this? Alternatively is there a better way to do this?

-- Create Dummy data
CREATE TABLE #tmp_inv ([ActivityDate] date, [ProductID] nvarchar(max), [TotalBinQty] int);
INSERT INTO #tmp_inv ([ActivityDate], [ProductID], [TotalBinQty]) VALUES 
	 ('2022-01-13', 'MyProduct', '2')
	, ('2022-01-11', 'MyProduct', '0')
	, ('2022-01-07', 'MyProduct', '302')
	, ('2022-01-06', 'MyProduct', '302')
	, ('2022-01-04', 'MyProduct', '5832')
	, ('2022-01-03', 'MyProduct', '5833')
	;

-- Declare parameters
DECLARE @StartDate datetime = '2022-01-01'
       ,@EndDate   datetime = '2022-01-31'
;

-- Create list of all dates in range
WITH date_list AS
	(SELECT CAST( @StartDate AS DATE)  as [Date]
	UNION ALL
	SELECT CAST( DATEADD(day, 1, [Date]) AS DATE )
	FROM date_list
	WHERE DATEADD(day, 1, [Date]) <= @EndDate
	)
-- Merge Available data with all dates
SELECT
	[dli].[Date]
	, [gbd].[ProductID]
	, [gbd].[TotalBinQty]
	, (LAG([TotalBinQty]) OVER (
			PARTITION BY [ProductID], [ActivityDate]
			ORDER BY [ActivityDate])
			)  AS [YesterdaysTotalBinQty]
FROM date_list dli
FULL JOIN #tmp_inv gbd
	ON [dli].[Date] = [gbd].[ActivityDate]
ORDER BY [dli].[Date] DESC 

LAG wouldn't work in this case because LAG is not recursive. You could do something like shown below.

WITH date_list AS
(
SELECT CAST( @StartDate AS DATE)  as [Date]
	UNION ALL
	SELECT CAST( DATEADD(day, 1, [Date]) AS DATE )
	FROM date_list
	WHERE DATEADD(day, 1, [Date]) <= @EndDate
	)
SELECT
	[dli].[Date]
	, [gbd].[ProductID]
    , gbd.TotalBinQty
FROM date_list dli
    OUTER APPLY
    (
        SELECT TOP (1) 
            gbd.ProductID,
            gbd.TotalBinQty
        FROM    
            #tmp_inv gbd
        WHERE
            gbd.ActivityDate < dli.Date
            AND gbd.TotalBinQty IS NOT NULL
        ORDER BY
            gbd.ActivityDate DESC
    ) AS gbd
ORDER BY
    dli.Date;

Couple of things to note though:

  1. This is not very efficient. For each row, it is effectively doing a subquery. You can do this much more efficiently using windowing functions. Search for Islands and Gaps in SQL. There are examples on the web. Itzik Ben-Gan has some good articles on it.
  2. The way you are generating calendar table is inefficient. Would be useful to have a calendar table in your database.

/* For best performance, be SURE to create this index BEFORE running the query:
   CREATE NONCLUSTERED INDEX ActivityDate ON #tmp_inv ( ActivityDate ) 
        INCLUDE ( TotalBinQty ) WITH ( FILLFACTOR = 98, SORT_IN_TEMPDB = ON );
   /* If listing multiple ProductIDs at once, I would think you might also need to check that the ProductID matches; if so, create this index INSTEAD */
   --CREATE NONCLUSTERED INDEX ActivityDate ON #tmp_inv ( ActivityDate, ProductID ) 
        --INCLUDE ( TotalBinQty ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON );
*/

;WITH
cte_tally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
),
cte_tally1000 AS (
    SELECT 0 AS number UNION ALL
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cte_tally10 c1 CROSS JOIN cte_tally10 c2  CROSS JOIN cte_tally10 c3
)
-- Merge Available data with all dates
SELECT
	DATEADD(DAY, t.number, @StartDate) AS [Date]
	, [gbd].[ProductID]
	, [gbd].[TotalBinQty]
FROM cte_tally1000 t
OUTER APPLY (
    SELECT TOP (1) *
    FROM #tmp_inv gbd
	WHERE [gbd].[ActivityDate] <= DATEADD(DAY, t.number, @StartDate) AND gbd.TotalBinQty IS NOT NULL
        --AND ProductID = 'MyProduct'
    ORDER BY [gbd].[ActivityDate] DESC
) AS gbd
WHERE t.number BETWEEN 0 AND DATEDIFF(DAY, @StartDate, @EndDate)
ORDER BY DATEADD(DAY, t.number, @StartDate) DESC