with cte
as (select dateandtime
,[group]
,row_number() over(order by dateandtime)
-row_number() over(order by [group]) as rn
from log_table
)
select min(dateandtime) as [from]
,max(dateandtime) as [to]
,[group]
,count(*) as logons
from cte
group by [group]
,rn
order by [from]
;
over(order by dateandtime) - row_number() over(order by [group])
returns the same and unique value for all members of the same timeframe. I can understand that the values really are the same, but are they unique, too??? It would be a really cool and elegant way to build an unique indicator for members of the same bundle. Is this reliable? I'm puzzling over the logic.
instinctivly I would expect (at least):
over(order by dateandtime) - row_number() over(order by [group], dateandtime)
I belive you're right - should have added dateandtime to the "order by" in the last "over".
As to if it is reliable I think it is in this situation, but it's up to you to test on a larger scale of data. I would be very careful as to where I'd use this technique.
use tempdb
go
drop table data
go
create table data
(
rn int identity(1,1) not null,
DateAndTime datetime ,
Group1 varchar(100)
)
go
insert into data SELECT '2018-10-26 08:00','GroupA'
insert into data SELECT '2018-10-26 08:10','GroupA'
insert into data SELECT '2018-10-27 08:05','GroupB'
insert into data SELECT '2018-10-28 08:05','GroupB'
insert into data SELECT '2018-10-28 10:00','GroupB'
insert into data SELECT '2018-10-29 08:20','GroupA'
go
select * from data
go
SQL recursive CTE
; WITH reccte
AS (SELECT *,
1 AS grp
FROM data
WHERE rn = 1
UNION ALL
SELECT a.*,
CASE
WHEN a.group1 = b.group1 THEN b.grp + 1
WHEN a.group1 <> b.group1 THEN 1
END AS grp
FROM data a
JOIN reccte b
ON a.rn = b.rn + 1)
SELECT *
FROM reccte
go
I tend to avoid recursive CTE whenever I can, so here's a couple of alternatives:
with cte
as (select dateandtime as [from]
,lead(dateandtime,1,'9999-12-31 23:59:59.998') over(order by dateandtime) as [to]
from (select top(1) with ties
dateandtime
from log_table
order by case when [group]=lag([group],1,null) over(order by dateandtime) then 1 else 0 end
) as a
)
select a.[from]
,max(b.dateandtime) as [to]
,min([group]) as [group]
,count(*) as logons
from cte as a
inner join log_table as b
on b.dateandtime>=a.[from]
and b.dateandtime<a.[to]
group by a.[from]
;
with cte1
as (select top(1) with ties
dateandtime
from log_table
order by case when [group]=lag([group],1,null) over(order by dateandtime) then 1 else 0 end
)
,cte2
as (select a.dateandtime
,a.[group]
,sum(1) as group_num
from log_table as a
left outer join cte1 as b
on b.dateandtime<=a.dateandtime
group by a.dateandtime
,a.[group]
)
select min(dateandtime) as [from]
,max(dateandtime) as [to]
,min([group]) as [group]
,count(*) as logons
from cte2
group by group_num
;
Mostly for performance reasons, but also for readability and to avoid infinite loops. Actually I convinced myself to avoid cursors for the same reasons.
Ps.: I'm not saying anyone else should avoid recursive cte's and cursors as they were invented for a reason, it's just a little "tick" of mine