SQLTeam.com | Weblogs | Forums

T-SQL to fill in gaps in table with averages determined in another table


#1

Hi,

Very new to T-SQL and SQL Server but already have come across a complex problem even my more experienced colleagues don't know how to solve.

Basically, we have meter readings taken from devices every day over the internet, and from this we can determine the daily volume. But occasionally the reading data is 'spotty' and some devices will be offline for a day or sometimes many days at a time. When the device comes back online we can tell what volume has been done across those days, but obviously not the amount done on each individual day.

In order to plug these gaps, I want to fill in the missing days with the average per day. A chart will end up having a straight line for a single device, but I'd rather that than have it drop off altogether. Charting will generally display the sum of many hundreds of devices anyway so the averaging of a few sporadically won't be too noticeable.

My problem is how! I have a table of every device with every date, and where we know the volume it's already filled in. But how do I fill in the nulls?

This screenshot of a mock-up might explain all of this a lot better:

What do I have to run against Table A (which determines the gaps between readings, and determines the average per day when the gaps in readings is multiple days) to populate Table B (which displays the volume per day)?

Many thanks in advance for any assistance on this.


#2

I would do:

Scheduled Task (say, once a day)

Find any adjacent rows that are more than 1 day apart
Use TALLY table to create a set of days that match the gap
Use the rows before / after the gap for the Start / End amounts, and divide by the number of missing days and, thus, the average-amount-per-day

Also: mark the rows as Derived, rather than Actual (as per your screen-shot)


#3

Don't know what some of it means - but you've given me something to go on and something to Google!

Thanks


#4
SELECT 
    a.Item, 
    DATEADD(DAY, t.N, FromDate) AS UsageDate, 
    a.Volume / ca1.Days AS Volume, 
    CASE WHEN Days = 1 THEN 'Actual' ELSE 'Derived' END AS Type
FROM TABLE_A a
CROSS APPLY (
    SELECT DATEDIFF(DAY, FromDate, ToDate) AS Days
) AS ca1
INNER JOIN dbo.tally t ON t.N BETWEEN 1 AND ca1.Days
/*ORDER BY Item, UsageDate*/

Btw, just to let you know, actual sample tables and data would be vastly more useful than a picture. I could then actually run the code, thus being able to refine it and debug it for you :slight_smile: .


#5

Hi Scott,

Thanks for the response. Can I ask what dbo.tally is? Is that simply a list of dates covering the entire period of time I'm interested in, as per Kristen's response? What does the field "N" contain?

Thanks for the forum tip too.


#6

Its just a table of consecutive numbers. You can select from the tally table from, say, 1-to-100 and add it to, say, "today's date" to get dates for the next 100 days (and other jobs like that).

Jeff has a function that will dynamically generate the tally-table numbers in real time (and is efficient enough that in most cases you don't need to go to the trouble of making sure that you have a table of numbers available). I'll find the link in a mo ...

That will be the numbers 1, 2, 3, ...


#7

Didn't find a specific article with a canned solution, but this may help with background, and a CTE that you could use

http://www.sqlservercentral.com/blogs/dwainsql/2014/03/27/tally-tables-in-t-sql/


#8

Excellent - thanks to you both.. This code and explanation has worked in exactly the way I needed.