Pivot Data based on the date range

Hi All,

I have a dataset which has information about students and their registered courses. following is a temp table with a sample structure and data.

DROP TABLE IF EXISTS #StudentData

CREATE TABLE #StudentData (StudentID INT,CourseType VARCHAR(50),CourseID INT,SubscriptionStartDate DATE, SubscriptioneEndDate DATE)

INSERT INTO #StudentData VALUES (1,'Basic',50,'10-15-2016','10-15-2017')

INSERT INTO #StudentData VALUES (1,'Advanced',72,'10-15-2016','2018-12-31')

INSERT INTO #StudentData VALUES (1,'Basic',50,'11-13-2017','12-31-2018')

INSERT INTO #StudentData VALUES (1,'Intermediate',66,'01-10-2017','07-30-2017')

INSERT INTO #StudentData VALUES (1,'Intermediate',68,'08-01-2017','12-30-2017')

INSERT INTO #StudentData VALUES (1,'Advanced',85,'01-10-2019',NULL)

INSERT INTO #StudentData VALUES (1,'Basic',57,'10-15-2020','10-15-2021')

SELECT * FROM #StudentData

ORDER BY CourseType,SubscriptionStartDate

I am trying to create a query that can group the data for different bands(based on the start and end dates), folloiwng is the desired output for the sample provided above.

DROP TABLE IF EXISTS #DesiredOutput

CREATE TABLE #DesiredOutput (StudentID INT,CourseType VARCHAR(50),[16-17] VARCHAR(50),[17-18] VARCHAR(50),[18-19] VARCHAR(50),[19-20] VARCHAR(50),[20-21] VARCHAR(50))

INSERT INTO #DesiredOutput VALUES (1,'Basic',50,50,null,null,57)

INSERT INTO #DesiredOutput VALUES (1,'Intermediate',null,'66,68',null,null,null)

INSERT INTO #DesiredOutput VALUES (1,'Advanced',72,72,null,85,85)

SELECT * FROM #DesiredOutput

I tried Pivot and was trying to use case whens but had no luck and hoping someone can help me. any help is highly appreciated.

Thanks in advance.

2 Likes

Welcome and great sample data!!

Those ranges 16-17 etc. What ranges are they? date ranges within a year or ..year ranges?

Never mind I see it is years...i think

Agreed, good sample data!

Will you always partition the results by StudentId? I assume you will, but before I actually write code, if you have had Stu 1 and Stu 2, would they appear in different 'Basic' / 'Intermediate' / ... rows or the same row?

Also there is data that "spans" more than year. How do you want to handle that?
Some are in the same year 17-17

SELECT StudentID, [CourseType], [16-17], [16-18],[17-18], [18-19], 
[19-20], [20-21] 
from (                  
		select [CourseType],  
		       StudentID, 
			   CourseID , 
			   right(year(SubscriptionStartDate),2) + '-'  
+ right(year(isnull(SubscriptioneEndDate,'')),2) as year_ranges             
		from #StudentData             
	  ) x              
 pivot (                   
	MAX(CourseID) for year_ranges in ([16-17], [16-18], 
[17-18], [18-19], [19-20], [20-21])
	   ) p 

@yosiasz - Thank you so much for the attempt. this is very similar to what I had. the issue with this approach is that there will be new Bands created in the result set like 16-18 as an example. the output bands are fixed as in the #DesiredResults temp table in the description.

Also, just noticed that none of the intermediate courses were picked up.

@ScottPletcher - Very good question.

the partition/group will be by StudentID, Course Type. that means there will be new rows in the result for Stu 2 and the different course types

@ScottPletcher - correct those are Years (which are standard [2016-2017], [2017-2018] etc..), like enrollment years.

if the end date is null then the assumption is that the subscription never ends, in other words we can consider the end date to be getdate() if it is null.

dynamic then. WHat do you want to do with multi year enrollments and same year start end?

declare @cols varchar(1500);

SELECT  @cols = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM (  SELECT  N'[' + a.yy  + '], '
        FROM (  SELECT  DISTINCT top 5  column_id, right(year(DATEADD(yy,column_id,'2015-01-01') ),2) +
 '-' + right(year(DATEADD(yy,column_id,'2016-01-01') ),2) as yy
					FROM sys.all_columns r 
				    order by column_id
				  ) a
        ORDER BY a.yy
        FOR XML PATH ( '' ) ) b ( Piv );

select @cols

declare @query varchar(max);

set @query = 'SELECT StudentID, [CourseType], ' + @cols + ' from 
            (
				select [CourseType],  
							   StudentID, 
							   CourseID , 
							   right(year(SubscriptionStartDate),2) + ''-''  
+ right(year(isnull(SubscriptioneEndDate,getdate())),2) as year_ranges             
						from #StudentData  
           ) x
            pivot 
            (
                 MAX(CourseID) for year_ranges in (' + @cols + ')
            ) p '

exec(@query)

Here is what I have managed to do so far.

  1. changed the bands to use simple years assuming 2016 is [16-17], 2017 as [17-18], etc..
  2. I converted the student data to be more granular from the original source by using CTEs to create one row per StudentID, CourseType, CourseID and Year the course is active for. as following

; WITH CTE AS
(
SELECT studentID,coursetype,courseid, datepart(year, SubscriptionStartDate) as yr,SubscriptioneEndDate,SubscriptionStartDate
FROM #StudentData
UNION ALL
SELECT studentID,coursetype,courseid, yr + 1,SubscriptioneEndDate,SubscriptionStartDate
FROM CTE
WHERE yr < datepart(year, isnull(SubscriptioneEndDate,getdate()))
)
SELECT * FROM (
SELECT studentID,coursetype,yr,STRING_AGG(courseid,',') AS Courses
FROM (
SELECT DISTINCT studentID,coursetype,yr,courseid
FROM (
select studentID,coursetype,courseid,yr
from CTE
) AS a
) AS B
GROUP BY studentID,coursetype,yr
) AS C
PIVOT
(
MIN(Courses)
for yr in ([2016],[2017],[2018],[2019],[2020],[2021])
) as pvt

After the granularity change i used Pivot and it worked great!

1 Like