SQL query for custoemr retention

Analyst Exercise
July 01, 2019
Overview:
The ability for a company to retain its customers for a period of time is called
customer retention . Customer retention metrics are a set of measures that tells
how well a company is doing in retaining its customer base. There are various
measures of customer retention and every business/function chooses one that's most suitable for their
business needs. The rider CRM team in Uber would want to find the retention of a customer cohort on a
weekly-rolling basis, so that they can take necessary intervention if a rider has not taken a ride for 28
days.
Exercise 1:
Write a Query in SQL which will give me the following Columns city wise and date-wise. The
following data needs to be found ‘ 2019-06-01 ’. You could choose any sql language you are
comfortable with. Please use the sample data to create a table and write the query.
Definitions of Columns in the Output Query (Single Query):

  1. date: the date in respect to which the below metric will be computed.
  2. city_id: the id of the city
  3. mau_28: Count of distinct riders who have completed min one trip in the last 28 days
    with respect to the date in column ‘date’
  4. previous_mau: count of distinct riders who have taken min one trip between the last 56
    to last 29 days with respect to the date in column ‘date’
  5. retained: intersection of distinct riders in the previous_mau and mau_28 phase
  6. resurrect: count of distinct riders who were inactive in the previous_mau phase but were
    active in the mau_28 phase.
    Active : If a rider has completed at least one trip in the respective period
    Inactive : If a rider has not taken a single trip in the respective period
    Tables Schema:
    Table 1: Trips_Data which stores the data for all the trips
    Columns:
  7. date : The date on which the trip was requested
  8. rider_id: Id of the rider requesting the trip
  9. trip_id: id of the trip request
  10. city_id: Id of the city where the trip was taken
  11. status: Status can be ‘completed’ , ‘cancelled’ or ‘unfulfilled’
    Special Considerations:
  12. A rider may take a trip from multiple cities, which might lead to counting the riders active
    or inactive in multiple cities. Hence to solve this problem a rider needs to be mapped to
    one city only. A rider should be mapped to a city from which they have taken the
    maximum trips considering only their recent 20 trips.
  13. For all calculations related to a city it is important to consider the city mapped to a rider
    instead of the city where the trip occurred.
  14. Our database system does not have a standard mode function so the rider city mapping
    needs to be derived.
  15. Ensure that each city has only one row for a respective date.
  16. Please give enough comments/explanations of the logic in the query so we could assess
    how/what you are trying to achieve while writing the query. Please put a comment for all
    your filters, joins and join conditions, aggregate statement, case statements, etc.
  17. Use “with” clauses or create temporary tables to write the query and avoid using
    subqueries as it impact readability of your code.
    Exercise 2:
    The Marketing team uses this definition measuring Retention and Reactivations from the above
    query result:
    ● Retention: retained / mau_28
    ● Reactivation : resurrect / previous_mau
    Could you assess if this is a correct way to measure retention and reactivation? Else could you
    improve on these definitions with explanations.

I have been trying to solve but unable to solve pls advise ?

what issues are you having ????
please let me know

/*

People in this forum are busy and will not have time to do this ...( create sample data)
we want to directly work on the SQL part ...

We can also help you understand how to write SQL ..

Please post sample data ...for your questions
like this ...

drop table #sampledata
go

create table #sampledata
(ts datetime2,
item varchar(20),
sellerid varchar(20)
)
go

insert into #sampledata
values
('2019-09-05 12:02:55.533','b123 ','dfjk'),
('2019-09-03 12:02:55.533','b123 ','TV12'),
('2019-09-03 12:02:55.533','b123 ','uiop')
go

*/

Very clearly this is some type of homework / class assignment. We can give you general assistance but most don't really want to write it for you as you will learn nothing from that.

Hi Harsih
Sample data

date rider_id trip_id city_id status
6/1/2019 348 1 8 completed
6/1/2019 1729 2 5 completed
6/1/2019 5265 3 4 completed
6/1/2019 2098 4 4 completed
6/1/2019 4942 5 8 completed
6/1/2019 5424 6 11 completed
6/1/2019 4269 7 7 completed
6/1/2019 5649 8 1 completed
6/1/2019 2385 9 6 completed
6/1/2019 5161 10 8 completed
6/1/2019 571 11 8 completed
6/1/2019 5072 12 9 completed
6/1/2019 1233 13 5 completed
6/1/2019 2490 14 5 completed
6/1/2019 5665 15 9 completed
6/1/2019 1400 16 2 completed
6/1/2019 3324 17 4 completed
6/1/2019 2533 18 13 completed
6/1/2019 5314 19 11 completed
6/1/2019 4773 20 12 completed
6/1/2019 5544 21 2 completed
6/1/2019 1232 22 5 completed
6/1/2019 4993 23 9 completed
6/1/2019 5073 24 9 completed
6/1/2019 2788 25 12 completed
6/1/2019 384 26 1 completed
6/1/2019 3510 27 9 completed
6/1/2019 4012 28 2 completed
6/1/2019 5544 29 2 completed
6/1/2019 5037 30 11 completed
6/1/2019 1472 31 1 completed
6/1/2019 826 32 5 completed
6/1/2019 4941 33 8 completed
6/1/2019 1604 34 10 completed
6/1/2019 623 35 10 completed
6/1/2019 3969 36 9 completed
6/1/2019 3810 37 2 completed
6/1/2019 4022 38 1 completed
6/1/2019 376 39 2 completed
6/1/2019 841 40 2 completed
6/1/2019 2622 41 10 completed
6/1/2019 1598 42 5 completed
6/1/2019 607 43 9 completed
6/1/2019 3207 44 13 completed
6/1/2019 3555 45 13 completed
6/1/2019 5264 46 4 completed
6/1/2019 3091 47 4 completed
6/1/2019 5543 48 2 completed
6/1/2019 2156 49 10 completed
6/1/2019 472 50 9 completed
6/1/2019 3898 51 6 completed
6/1/2019 2260 52 7 completed
6/1/2019 1075 53 2 completed
6/1/2019 5102 54 1 completed
6/1/2019 242 55 4 completed
6/1/2019 4762 56 3 completed
6/1/2019 385 57 1 completed
6/1/2019 401 58 1 completed
6/1/2019 2733 59 6 completed
6/1/2019 944 60 6 completed
6/1/2019 3850 61 8 completed
6/1/2019 4575 62 1 completed
6/1/2019 4069 63 9 completed
6/1/2019 4461 64 11 completed
6/1/2019 4667 65 6 completed
6/1/2019 520 66 10 completed
6/1/2019 1399 67 2 completed
6/1/2019 4332 68 9 completed
6/1/2019 3266 69 1 completed
6/1/2019 2161 70 13 completed
6/1/2019 2699 71 12 completed
6/1/2019 4910 72 6 completed
6/1/2019 4879 73 6 completed
6/1/2019 3245 74 6 completed
6/1/2019 1931 75 1 completed
6/1/2019 1970 76 7 completed
6/1/2019 4903 77 10 completed
6/1/2019 4041 78 7 completed
6/1/2019 139 79 8 completed
6/1/2019 2017 80 11 completed
6/1/2019 3947 81 1 completed
6/1/2019 3207 82 13 completed
6/1/2019 3899 83 4 completed
6/1/2019 1391 84 6 completed
6/1/2019 4248 85 12 completed
6/1/2019 942 86 8 completed
6/1/2019 216 87 11 completed
6/1/2019 1233 88 5 completed
6/1/2019 2656 89 13 completed
6/1/2019 3863 90 4 completed

i got your data ...

r u interested in learning how to do the SQL ...!!!!
i know it will be very very tough for new people who dont have any idea !!!!

or if you just want the answers I am ok also ( becuase you are in a rush )

Let me know
:slight_smile:

I know little bit of sql but i need to submit my assignment by tom eod ,so due to shortage of time for me to solve , hence requesting help form the people who are experts in sql queries.

exercise one query required ? Sure i can do it but due to less time i have psoted it in the community for faster resolution,

i will assure u that i will try the same in my own way and agin post the answer which i have got after few days.

Regards

viren

its my bedtime

no access to my computer

tomorrow when i get up in the morning
i will do it

good night

:slight_smile:

hi virin

i tried to do this
its not understandable who has given this way of telling things !!!
i mean .. its not easy to understand what to do !! its not making sense
here in this forum we like to work on the SQL part ... NOT THIS kind of making sense of what somebody has written POORLY

i have worked a little bit please see

drop create data ... click arrow mark
drop table #sampledata
go

create table #sampledata
(
[date] date,
rider_id int,
trip_id int,
city_id int,
status varchar(100)
)
go

insert into #sampledata select'6/1/2019',348,1,8,'completed'
insert into #sampledata select'6/1/2019',1729,2,5,'completed'
insert into #sampledata select'6/1/2019',5265,3,4,'completed'
insert into #sampledata select'6/1/2019',2098,4,4,'completed'
insert into #sampledata select'6/1/2019',4942,5,8,'completed'
insert into #sampledata select'6/1/2019',5424,6,11,'completed'
insert into #sampledata select'6/1/2019',4269,7,7,'completed'
insert into #sampledata select'6/1/2019',5649,8,1,'completed'
insert into #sampledata select'6/1/2019',2385,9,6,'completed'
insert into #sampledata select'6/1/2019',5161,10,8,'completed'
insert into #sampledata select'6/1/2019',571,11,8,'completed'
insert into #sampledata select'6/1/2019',5072,12,9,'completed'
insert into #sampledata select'6/1/2019',1233,13,5,'completed'
insert into #sampledata select'6/1/2019',2490,14,5,'completed'
insert into #sampledata select'6/1/2019',5665,15,9,'completed'
insert into #sampledata select'6/1/2019',1400,16,2,'completed'
insert into #sampledata select'6/1/2019',3324,17,4,'completed'
insert into #sampledata select'6/1/2019',2533,18,13,'completed'
insert into #sampledata select'6/1/2019',5314,19,11,'completed'
insert into #sampledata select'6/1/2019',4773,20,12,'completed'
insert into #sampledata select'6/1/2019',5544,21,2,'completed'
insert into #sampledata select'6/1/2019',1232,22,5,'completed'
insert into #sampledata select'6/1/2019',4993,23,9,'completed'
insert into #sampledata select'6/1/2019',5073,24,9,'completed'
insert into #sampledata select'6/1/2019',2788,25,12,'completed'
insert into #sampledata select'6/1/2019',384,26,1,'completed'
insert into #sampledata select'6/1/2019',3510,27,9,'completed'
insert into #sampledata select'6/1/2019',4012,28,2,'completed'
insert into #sampledata select'6/1/2019',5544,29,2,'completed'
insert into #sampledata select'6/1/2019',5037,30,11,'completed'
insert into #sampledata select'6/1/2019',1472,31,1,'completed'
insert into #sampledata select'6/1/2019',826,32,5,'completed'
insert into #sampledata select'6/1/2019',4941,33,8,'completed'
insert into #sampledata select'6/1/2019',1604,34,10,'completed'
insert into #sampledata select'6/1/2019',623,35,10,'completed'
insert into #sampledata select'6/1/2019',3969,36,9,'completed'
insert into #sampledata select'6/1/2019',3810,37,2,'completed'
insert into #sampledata select'6/1/2019',4022,38,1,'completed'
insert into #sampledata select'6/1/2019',376,39,2,'completed'
insert into #sampledata select'6/1/2019',841,40,2,'completed'
insert into #sampledata select'6/1/2019',2622,41,10,'completed'
insert into #sampledata select'6/1/2019',1598,42,5,'completed'
insert into #sampledata select'6/1/2019',607,43,9,'completed'
insert into #sampledata select'6/1/2019',3207,44,13,'completed'
insert into #sampledata select'6/1/2019',3555,45,13,'completed'
insert into #sampledata select'6/1/2019',5264,46,4,'completed'
insert into #sampledata select'6/1/2019',3091,47,4,'completed'
insert into #sampledata select'6/1/2019',5543,48,2,'completed'
insert into #sampledata select'6/1/2019',2156,49,10,'completed'
insert into #sampledata select'6/1/2019',472,50,9,'completed'
insert into #sampledata select'6/1/2019',3898,51,6,'completed'
insert into #sampledata select'6/1/2019',2260,52,7,'completed'
insert into #sampledata select'6/1/2019',1075,53,2,'completed'
insert into #sampledata select'6/1/2019',5102,54,1,'completed'
insert into #sampledata select'6/1/2019',242,55,4,'completed'
insert into #sampledata select'6/1/2019',4762,56,3,'completed'
insert into #sampledata select'6/1/2019',385,57,1,'completed'
insert into #sampledata select'6/1/2019',401,58,1,'completed'
insert into #sampledata select'6/1/2019',2733,59,6,'completed'
insert into #sampledata select'6/1/2019',944,60,6,'completed'
insert into #sampledata select'6/1/2019',3850,61,8,'completed'
insert into #sampledata select'6/1/2019',4575,62,1,'completed'
insert into #sampledata select'6/1/2019',4069,63,9,'completed'
insert into #sampledata select'6/1/2019',4461,64,11,'completed'
insert into #sampledata select'6/1/2019',4667,65,6,'completed'
insert into #sampledata select'6/1/2019',520,66,10,'completed'
insert into #sampledata select'6/1/2019',1399,67,2,'completed'
insert into #sampledata select'6/1/2019',4332,68,9,'completed'
insert into #sampledata select'6/1/2019',3266,69,1,'completed'
insert into #sampledata select'6/1/2019',2161,70,13,'completed'
insert into #sampledata select'6/1/2019',2699,71,12,'completed'
insert into #sampledata select'6/1/2019',4910,72,6,'completed'
insert into #sampledata select'6/1/2019',4879,73,6,'completed'
insert into #sampledata select'6/1/2019',3245,74,6,'completed'
insert into #sampledata select'6/1/2019',1931,75,1,'completed'
insert into #sampledata select'6/1/2019',1970,76,7,'completed'
insert into #sampledata select'6/1/2019',4903,77,10,'completed'
insert into #sampledata select'6/1/2019',4041,78,7,'completed'
insert into #sampledata select'6/1/2019',139,79,8,'completed'
insert into #sampledata select'6/1/2019',2017,80,11,'completed'
insert into #sampledata select'6/1/2019',3947,81,1,'completed'
insert into #sampledata select'6/1/2019',3207,82,13,'completed'
insert into #sampledata select'6/1/2019',3899,83,4,'completed'
insert into #sampledata select'6/1/2019',1391,84,6,'completed'
insert into #sampledata select'6/1/2019',4248,85,12,'completed'
insert into #sampledata select'6/1/2019',942,86,8,'completed'
insert into #sampledata select'6/1/2019',216,87,11,'completed'
insert into #sampledata select'6/1/2019',1233,88,5,'completed'
insert into #sampledata select'6/1/2019',2656,89,13,'completed'
insert into #sampledata select'6/1/2019',3863,90,4,'completed'
go 

select * from #sampledata
go
what i tried .. click arrow mark
/*
Exercise 1:
*/
-------------------------------------------------------------------------------------
1
date: the date in respect to which the below metric will be computed.
--------------------------------------------------------------------------------------
2 
city_id: the id of the city
---------------------------------------------------------------------------------------
3
mau_28: Count of distinct riders who have completed min one trip in the last 28 days
with respect to the date in column ‘date’

count(distinct rider_id) having count(trip_) >= 1
where [date] between  '2019-06-01' and '2019-06-01'-28 days 
group by rider_id 
---------------------------------------------------------------------------------------
4
previous_mau: count of distinct riders who have taken min one trip between the last 56
to last 29 days with respect to the date in column ‘date’

count(distinct rider_id) having count(trip_) >= 1
where [date] between  '2019-06-01'-56 days  and '2019-06-01'-29 days 
group by rider_id
--------------------------------------------------------------------------------------
5
retained: intersection of distinct riders in the previous_mau and mau_28 phase
select rider_id from mau_28
intersect
select rider_id from previous_mau:
--------------------------------------------------------------------------------------
6
resurrect: count of distinct riders who were inactive in the previous_mau phase but were
active in the mau_28 phase.
Active : If a rider has completed at least one trip in the respective period
Inactive : If a rider has not taken a single trip in the respective period
Tables Schema:
Table 1: Trips_Data which stores the data for all the trips
Columns:
date : The date on which the trip was requested
rider_id: Id of the rider requesting the trip
trip_id: id of the trip request
city_id: Id of the city where the trip was taken
status: Status can be ‘completed’ , ‘cancelled’ or ‘unfulfilled’
Special Considerations:
--------------------------------------------------------------------------------------

Thanks buddy

Dear Viren

Did you get the solution? If so, can you please share it to me @ravish.ranjan@gmail.com

1 Like

Can you send me this on anindakar1993@gmail.com

Can hou send me here?
anindakar1993@gmail.com

hi Aninda

Please let me know what to send !! what are you looking for !!!

to anindakar1993@gmail.com

I will definitely send !!

Sure, I can help you with that. Here's an example query that will give you the required output for Exercise 1:
WITH rider_trips AS (
SELECT rider_id, city_id, date,
ROW_NUMBER() OVER (PARTITION BY rider_id ORDER BY date DESC) AS rn
FROM trips_data
WHERE status = 'completed'
),
rider_cities AS (
SELECT rider_id, city_id
FROM rider_trips
WHERE rn <= 20
GROUP BY rider_id, city_id
),
active_riders AS (
SELECT city_id, date, COUNT(DISTINCT rider_id) AS mau_28
FROM rider_trips
WHERE date BETWEEN DATE_SUB('2019-06-01', INTERVAL 27 DAY) AND '2019-06-01'
GROUP BY city_id, date
),
previous_mau AS (
SELECT city_id, DATE_SUB(date, INTERVAL 28 DAY) AS date,
COUNT(DISTINCT rider_id) AS previous_mau
FROM rider_trips
WHERE date BETWEEN DATE_SUB('2019-06-01', INTERVAL 56 DAY) AND DATE_SUB('2019-06-01', INTERVAL 29 DAY)
GROUP BY city_id, date
),
retained AS (
SELECT a.city_id, a.date, COUNT(DISTINCT a.rider_id) AS retained
FROM rider_trips a
JOIN active_riders b ON a.city_id = b.city_id AND a.date = b.date
JOIN previous_mau c ON a.city_id = c.city_id AND c.date = DATE_SUB(b.date, INTERVAL 28 DAY)
GROUP BY a.city_id, a.date
),
resurrect AS (
SELECT a.city_id, a.date, COUNT(DISTINCT a.rider_id) AS resurrect
FROM rider_trips a
JOIN active_riders b ON a.city_id = b.city_id AND a.date = b.date
JOIN previous_mau c ON a.city_id = c.city_id AND c.date = DATE_SUB(b.date, INTERVAL 28 DAY)
LEFT JOIN retained d ON a.city_id = d.city_id AND a.date = d.date
WHERE d.city_id IS NULL
GROUP BY a.city_id, a.date
)
SELECT a.date, a.city_id,
COALESCE(b.mau_28, 0) AS mau_28,
COALESCE(c.previous_mau, 0) AS previous_mau,
COALESCE(d.retained, 0) AS retained,
COALESCE(e.resurrect, 0) AS resurrect
FROM (SELECT DISTINCT date, city_id FROM trips_data WHERE date = '2019-06-01') a
LEFT JOIN active_riders b ON a.city_id = b.city_id AND a.date = b.date
LEFT JOIN previous_mau c ON a.city_id = c.city_id AND a.date = c.date
LEFT JOIN retained d ON a.city_id = d.city_id AND a.date = d.date
LEFT JOIN resurrect e ON a.city_id = e.city_id AND a.date = e.date;

Explanation of the query:

  1. The first common table expression (CTE) rider_trips selects all completed trips for each rider, along with the corresponding city and date. The ROW_NUMBER() function is used to assign a row number to each trip for each rider, in descending order of date.
  2. The second CTE rider_cities maps each rider to a single city based on the city where they have taken the maximum number of trips in their recent 20 trips.