问题
nnnn如下为某平台的商品促销数据,字段含义分别为品牌名称、打折开始日期、打折结束日期,现在要计算每个品牌的打折销售天数(注意其中的交叉日期)。比如vivo的打折销售天数就为17天。
nnnnbrand start_date end_datennxiaomi 2021-06-05 2021-06-09nxiaomi 2021-06-11 2021-06-21nvivo 2021-06-05 2021-06-15nvivo 2021-06-09 2021-06-21 nhonor 2021-06-05 2021-06-21 nhonor 2021-06-09 2021-06-15nredmi 2021-06-17 2021-06-26nhuawei 2021-06-05 2021-06-26nhuawei 2021-06-09 2021-06-15nhuawei 2021-06-17 2021-06-21
nnnn解答
nnnn方法一,对每行数据打折范围,枚举出所有打折日期,对品牌+打折日期去重后,得到每个品牌的打折日期都是哪几天,每天一行
nnnnwith temp as (n select 'xiaomi' as brand ,'2021-06-05' as start_date,'2021-06-09' as end_daten union alln select 'xiaomi' as brand ,'2021-06-11' as start_date,'2021-06-21' as end_daten union alln select 'vivo' as brand ,'2021-06-05' as start_date,'2021-06-15' as end_daten union alln select 'vivo' as brand ,'2021-06-09' as start_date,'2021-06-21' as end_daten union all n select 'honor' as brand ,'2021-06-05' as start_date,'2021-06-21' as end_daten union all n select 'honor' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_daten union alln select 'honor' as brand ,'2021-06-17' as start_date,'2021-06-26' as end_daten union alln select 'huawei' as brand ,'2021-06-05' as start_date,'2021-06-26' as end_daten union alln select 'huawei' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_daten union alln select 'huawei' as brand ,'2021-06-17' as start_date,'2021-06-21' as end_daten)nnselectnbrandn,count(distinct dt) as dtsnfrom (nselectn brandn ,start_daten ,end_daten ,date_add(start_date,tmp.col_idx) as dtnfrom tempnlateral VIEW posexplode(split(repeat("#,",datediff(date(end_date), date(start_date))),'#')) tmp AS col_idx,col_valn) tt ngroup by brandn;
nnnn方法二,每个品牌下,打折开始时间升序,当前开始时间 对比 上一行结束打折时间是否有交集(start_date<=max_date),有的话使用上一行结束打折日期后一天,来规避日期范围交集
nnnnwith temp as (n select 'xiaomi' as brand ,'2021-06-05' as start_date,'2021-06-09' as end_daten union alln select 'xiaomi' as brand ,'2021-06-11' as start_date,'2021-06-21' as end_daten union alln select 'vivo' as brand ,'2021-06-05' as start_date,'2021-06-15' as end_daten union alln select 'vivo' as brand ,'2021-06-09' as start_date,'2021-06-21' as end_daten union all n select 'honor' as brand ,'2021-06-05' as start_date,'2021-06-21' as end_daten union all n select 'honor' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_daten union alln select 'honor' as brand ,'2021-06-17' as start_date,'2021-06-26' as end_daten union alln select 'huawei' as brand ,'2021-06-05' as start_date,'2021-06-26' as end_daten union alln select 'huawei' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_daten union alln select 'huawei' as brand ,'2021-06-17' as start_date,'2021-06-21' as end_daten)nnselectnbrandn,sum(datediff(date(end_date),date(start_date))+1)nfromn(nselectn brandn ,casen when start_date<=max_date then date_add(date(max_date),1)n else start_date endn as start_daten ,end_daten from(n selectn brandn ,start_daten ,end_daten ,max(end_date) over(partition by brand order by start_date rows between UNBOUNDED PRECEDING and 1 PRECEDING ) as max_date --获取同一品牌内按开始日期排序后,取第一行到前一行的最大结束时间n from tempn )t1n )t1nwhere end_date>=start_datengroup by brandn;
n
Comments NOTHING