Hi, i have a list of million customers. each customers might have multiple encounters and each customer might have multiple categories in a given year. see below:
I want to be able to pull unique customers and their last encounter based on the date and pull last category the customer had that is associated with that last date. so for the example above i want to pull
Yes! But using "top(1) with ties" will give you all records with row number 1.
The "row_number" function will assign each row with a number starting from 1 and counting upwards. The "partition" part will start over from 1 every time "partition by field" changes (in this case, the customer field). And it will sort by the date field in descending order.
This will give you:
customer encounter date category rownumber
R123 1234 3/12/2017 A 3
R123 1235 7/15/2017 B 1
R123 1236 3/14/2017 C 2
and with the "top(1) with ties" in effect, it should give you only row number 1:
customer encounter date category rownumber
R123 1235 7/15/2017 B 1
so in this case i want to pull their last category in 2017 and last category in 2018 as well. so what I am trying to do is to pull all my customers each year, pull their last encounter and their last category of the year.
so if they are new in 2018, i want to pull their last encounter along their last category in 2018 and if they don't come back in 2018, i still want to pull their last encounter in 2017 and category.if they only have one encounter each year, i still want to pull it as well
select top(1) with ties
customer
,encounter
,[date]
,category
from yourtable
order by row_number() over(partition by customer
,datepart(year,[date])
order by [date] desc
)
;
SELECT TOP (1) WITH TIES customer ID, Year, category, date
FROM encounters LEFT OUTER JOIN
Date ON encounters_Date_Key = Date.Date_Key LEFT OUTER JOIN
category ON Encounter.category_Key = category_Key RIGHT OUTER JOIN
order by row_number() over(partition by customer iD order by Date desc)
WHERE Year IN (2017, 2018))
It says there is an issue near the where statement.
great, one last question, what about if i want to pull each customer and last their last encounter in each year. so if my customers only had encounters in 2017 , i want to pull their last encounter in that year but if they have multiple encounters in 2017 and 2018, i want to pull their last encounter in 2017 and their last encounter in 2018 along with the category on each encounter.
so for the example above, for my customer R123, i want to pull his last encounter in 2017 and last encounter in 2018. i like my results to look like this:
select top(1) with ties
customer
,encounter
,[date]
,category
,datepart(year,[date]) as [year]
from yourtable
where [date]>=cast('2017-01-01' as date)
and [date]<cast('2019-01-01' as date)
order by row_number() over(partition by customer
,datepart(year,[date])
order by [date] desc
)
;
use tempdb
go
drop table data
go
create table data
(
customer varchar(100),
encounter int ,
datein date,
category varchar(1) )
go
insert into data select 'R123',1234,'3/12/2017','A'
insert into data select 'R123',1235,'7/15/2017','B'
insert into data select 'R123',1236,'3/14/2017','C'
insert into data select 'R123',1237,'3/14/2018','D'
insert into data select 'R123',1238,'4/14/2018','E'
go
SQL
SELECT b.customer,
b.encounter,
b.datein,
b.category
FROM data b
JOIN (SELECT customer,
Max(encounter) AS maxencounter,
Year(datein) AS groupyear
FROM data
GROUP BY customer,
Year(datein)) a
ON a.maxencounter = b.encounter
AND a.customer = b.customer
DROP TABLE customers
create table customers
(
Customer varchar(10),
encounter bigint,
DDaate VARCHAR(50),
Category varchar(10),
Year int
)
insert into customers values('R123',1234,'3/12/2017','A',2017)
insert into customers values('R123',1235,'7/15/2017','B',2017)
insert into customers values('R123',1236,'3/14/2017','C',2017)
insert into customers values('R123',1237,'3/14/2018','D',2018)
insert into customers values('R123',1234,'4/14/2018','E',2018)
select * from(
SELECT *,row_number() over(partition by Year order by DDaate desc) as rnk FROM customers
) a
where rnk=1