问题
nnnn下面是某游戏公司记录的用户每日登录数据, 计算每个用户最大的连续登录天数,定义连续登录时可以间隔一天。举例:如果一个用户在 1,3,5,6,9 登录了游戏,则视为连续 6 天登录。
nnnnuser_id dtn1001 2021-12-12n1002 2021-12-12n1001 2021-12-13n1001 2021-12-14n1001 2021-12-16n1002 2021-12-16n1001 2021-12-19n1002 2021-12-17n1001 2021-12-20
nnnn解答
nnnn这是个连续问题的升级版,当满足某种要求时我们也是算作连续的,所以不能使用传统的连续编号,然后做差值的解法了。
nnnn核心思路解析如下:
nnnn登录日期 | 第一步:上一个日期 | 第二步:判断登录日期与上一个日期差值是否在2之内 | 第三步:然后根据标记开窗做sum |
---|---|---|---|
1 | 1 | 0 | 0 |
3 | 1 | 0 | 0 |
5 | 3 | 0 | 0 |
6 | 5 | 0 | 0 |
9 | 6 | 1 | 1 |
11 | 9 | 0 | 1 |
with temp as ( nselect '1001' as user_id , '2021-12-12' as dtnunion all nselect '1002' as user_id , '2021-12-12' as dtnunion allnselect '1001' as user_id , '2021-12-13' as dtnunion allnselect '1001' as user_id , '2021-12-14' as dtnunion allnselect '1001' as user_id , '2021-12-16' as dtnunion allnselect '1002' as user_id , '2021-12-16' as dtnunion allnselect '1001' as user_id , '2021-12-19' as dtnunion allnselect '1002' as user_id , '2021-12-17' as dtnunion allnselect '1001' as user_id , '2021-12-20' as dtn)nnselectnuser_idn,max(diff) as max_login_daysnfromn( selectn user_id n ,user_groupn ,datediff(max(dt),min(dt))+1 as diff --拿到每个用户下,连续时间里面最大日期与最小日期的差值加1就得到来连续天数n fromn (nselectn user_idn ,dtn -- 如果当前日期与上一个日期的差值在2之内,那么就给0,否则给1n ,sum(if(datediff(dt,last_dt)<=2,0,1)) over(partition by user_id order by dt) as user_groupn fromn (nselectn user_idn ,dtn ,lag(dt,1,dt) over(partition by user_id order by dt) as last_dt --根据user_id分组,拿到当前行的上一个日期,没有上一个就给自己本身的值n from tempn)t1n)t1ngroup by user_id ,user_groupn)t1ngroup by user_id n;
n
Comments NOTHING