Earliest Continuous Effective Date

ace_friends22 · January 12, 2023, 6:23pm

Hi All,
I have 3 columns. ID, Effective Date and Term Date in the source

I have 4 columns ID, Effective Date, Term Date and Earliest Continuous Date in the target

ID, Effective date and term date will remain same between source and target, but for Earliest Continuous Date, I need to populate min effective date which have continuous eff and term date.

For example. I have uploaded file with expected target data. If you notice continuation of Effective Date and Term Date

record 1, 2 & 3 are continuous from 02/01/2016 to 01/01/2017 hence earliest continuous date for these 3 records is 02/01/2016

record 4 has gap of 5 months and 06/01/2017 to 06/01/2017, hence earliest continuous date for the record is 06/01/2017

records 5,6,7,8 are continuous from 09/01/2017 to 02/01/2019, hence earliest continuous date for these 3 record is 09/01/2017.

I have tried in google to find solution, but not getting it. Can somebody please help me to solve this?

ID,EFFECTIVE_DATE, TERM_DATE, EFFECTIVE_CONTINUOUS_DATE
123,02/01/2016,02/29/2016,02/01/2016
123,03/01/2016,12/31/2016,02/01/2016
123,01/01/2017,01/01/2017,02/01/2016
123,06/01/2017,06/01/2017,06/01/2017
123,09/01/2017,12/31/2017,09/01/2017
123,01/01/2018,12/31/2018,09/01/2017
123,01/01/2019,01/31/2019,09/01/2017
123,02/01/2019,02/01/2019,09/01/2017

JeffModen · January 13, 2023, 5:53am

Thanks for posting some aligned text instead of a useless graphic. It would be helpful if you took the time to take it the next level and provide the table and some readily consumable data on future posts. Here's how that could look...

--===== If the test table already exists, 
     -- drop it to make code reruns in SSMS easier.
   DROP TABLE IF EXISTS #TestTable
;
--===== Show the data in a "Readily Consumable" bit of code
     -- and store it in a temp table to play with.
 SELECT v.ID
        ,EFFECTIVE_DATE = CONVERT(DATE,v.EFFECTIVE_DATE)
        ,TERM_DATE      = CONVERT(DATE,v.TERM_DATE)
   INTO #TestTable
   FROM (VALUES
         (123,'02/01/2016','02/29/2016') --'02/01/2016'
        ,(123,'03/01/2016','12/31/2016') --'02/01/2016'
        ,(123,'01/01/2017','01/01/2017') --'02/01/2016'
        ,(123,'06/01/2017','06/01/2017') --'06/01/2017'
        ,(123,'09/01/2017','12/31/2017') --'09/01/2017'
        ,(123,'01/01/2018','12/31/2018') --'09/01/2017'
        ,(123,'01/01/2019','01/31/2019') --'09/01/2017'
        ,(123,'02/01/2019','02/01/2019') --'09/01/2017'
        -- Added this second ID to prove the survival of the code when it happens
        ,(456,'02/01/2016','02/29/2016') --'02/01/2016'
        ,(456,'03/01/2016','12/31/2016') --'02/01/2016'
        ,(456,'01/01/2017','01/01/2017') --'02/01/2016'
        ,(456,'06/01/2017','06/01/2017') --'06/01/2017'
        ,(456,'09/01/2017','12/31/2017') --'09/01/2017'
        ,(456,'01/01/2018','12/31/2018') --'09/01/2017'
        ,(456,'01/01/2019','01/31/2019') --'09/01/2017'
        ,(456,'02/01/2019','02/01/2019') --'09/01/2017'
        )v(ID,EFFECTIVE_DATE,TERM_DATE)
;
--===== Show the data in the test table
 SELECT * FROM #TestTable ORDER BY ID,EFFECTIVE_DATE
;
GO

Here's one fairly easy (once you understand the concatenation for the MAX) way to accomplish the task using a trick of converting two columns to Binaries to easily find a MAX/OVER and then pluck the date back out from that using a CONVERTed SUBSTRING. Details are in the comments in the code.

--===== This wonderful bit of PFM isn't my original idea. I got it from the post by Drew Allen 
     -- at the following URL and he got the idea from Itzik Ben-Gan.
     -- https://www.sqlservercentral.com/forums/topic/query-help-299#post-4072505
   WITH 
cteMarkStart AS
(--==== Find the starting row of each group based on being more than 1 day since the previous rows
     -- The group starting rows are marked with a "1" and the rest are marked with a "0".
 SELECT *
        ,GrpStart = SIGN(DATEDIFF(dd,LAG(TERM_DATE,1,DATEADD(dd,-2,EFFECTIVE_DATE)) 
                                     OVER (PARTITION BY ID ORDER BY ID,EFFECTIVE_DATE),EFFECTIVE_DATE)-1)
   FROM #TestTable
)
,cteGrpDate AS
(--==== Add a sort by ID and EFFECTIVE_DATE to keep things simple and convert the GrpStart markers to EFFECTIVE_DATE.
 SELECT *
        ,SortOrder = ROW_NUMBER() OVER (ORDER BY ID,EFFECTIVE_DATE)
        ,GrpDate   = IIF(GrpStart = 1,EFFECTIVE_DATE,NULL)
   FROM cteMarkStart
)--==== This concatenates the BINARYs of the SortOrder and the GrpDate so we can easily determine the previous or
     -- current MAX of the two columns, then SUBSTRINGS the date BINARY back out and converts it back to a date.
 SELECT gd.*
        ,EFFECTIVE_CONTINUOUS_DATE = CAST(SUBSTRING(MAX(CAST(gd.SortOrder AS BINARY(5)) + CAST(gd.GrpDate AS BINARY(3))) 
                                                    OVER(ORDER BY gd.SortOrder ROWS UNBOUNDED PRECEDING), 6, 3) AS DATE)
FROM cteGrpDate gd
GO

And, here are the results. I left the extra "working" columns in the output so you could see what was going on. You can, of course, change the "*" in the final output to just the columns you want to see in the output.

p.s. I call this type of thing a "Data Smear" because you're "smearing" existing data down to fill NULLs until it hits another piece of NON-NULL data. Some people call this the "Last Previous Non-Null Value" problem.

It IS possible to combine the first two CTEs to eliminate the need for the GrpStart column but the code is already "different enough" to blow the top's off of people's heads and so I keep them separate so it's a wee bit more obvious and the comments explain it better. I don't really do that for other people... I do it for myself, so that when I read it six months from now, I can remember what the hell it was that I did.

JeffModen · January 13, 2023, 6:53am

As a bit of a sidebar, if you're using SQL Server 2017 or above, you can use the FIRST_VALUE() function. Here's an example on Stack Overflow. It does NOT, however, detect where each group starts like this problem does and so you'll still need the first CTE in the solution I posted above. Also, the question is tagged as "postgresql" but it looks like it should work in SQL Server 2017 and above.

Nemesis3 · January 14, 2023, 3:38am

ace_friends22:

I have 3 columns. ID, Effective Date and Term Date in the source

I have 4 columns ID, Effective Date, Term Date and Earliest Continuous Date in the target

ID, Effective date and term date will remain same between source and target, but for Earliest Continuous Date, I need to populate min effective date which have continuous eff and term date.

For example. I have uploaded file with expected target data. If you notice continuation of Effective Date and Term Date

record 1, 2 & 3 are continuous from 02/01/2016 to 01/01/2017 hence earliest continuous date for these 3 records is 02/01/2016

record 4 has gap of 5 months and 06/01/2017 to 06/01/2017, hence earliest continuous date for the record is 06/01/2017

records 5,6,7,8 are continuous from 09/01/2017 to 02/01/2019, hence earliest continuous date for these 3 record is 09/01/2017.

I have tried in google to find solution, but not getting it. Can somebody please help me to solve this?

Let me know if this one works for you!

WITH cte AS (
    SELECT 
        ID, 
        Effective_Date, 
        Term_Date, 
        LAG(Term_Date) OVER (PARTITION BY ID ORDER BY Effective_Date) AS prev_term, 
        LEAD(Effective_Date) OVER (PARTITION BY ID ORDER BY Effective_Date) AS next_effective
    FROM source
)
SELECT 
    ID, 
    Effective_Date, 
    Term_Date, 
    (SELECT MIN(Effective_Date) FROM cte WHERE ID = cte.ID AND prev_term = cte.Effective_Date - 1 AND next_effective = cte.Term_Date + 1) AS Earliest_Continuous_Date
FROM cte;

Nemesis3 · January 14, 2023, 3:44am

In case of Oracle the above. SQL server? -->

WITH cte AS (
    SELECT 
        ID, 
        Effective_Date, 
        Term_Date, 
        LAG(Term_Date) OVER (PARTITION BY ID ORDER BY Effective_Date) AS prev_term, 
        LEAD(Effective_Date) OVER (PARTITION BY ID ORDER BY Effective_Date) AS next_effective
    FROM source
)
SELECT 
    ID, 
    Effective_Date, 
    Term_Date, 
    (SELECT MIN(Effective_Date) FROM cte WHERE ID = cte.ID AND prev_term = DATEADD(day, -1, cte.Effective_Date) AND next_effective = DATEADD(day, 1, cte.Term_Date)) AS Earliest_Continuous_Date
FROM cte;

JeffModen · January 14, 2023, 5:03am

@Nemesis3 ,

In the absence of more info from the OP about the original test data posted, I ran your code against the test table that I created above and got the following error message when I ran your code against it (after changing the name of the source table).

Msg 206, Level 16, State 2, Line 36
Operand type clash: date is incompatible with int

I fixed that simply by changing the DATE datatypes to DATETIME but the expected result isn't appearing correctly...