力扣题
1、题目地址
2701. 连续递增交易
2、模拟表
表:Transactions
字段名 | 类型 |
---|---|
transaction_id | int |
customer_id | int |
transaction_date | date |
amount | int |
- transaction_id 是该表的主键。
- 每行包含有关交易的信息,包括唯一的 (customer_id, transaction_date),以及相应的 customer_id 和 amount。
3、要求
编写一个 SQL 查询,找出至少连续三天 amount 递增的客户。
并包括 customer_id 、连续交易期的起始日期和结束日期。
一个客户可以有多个连续的交易。
返回结果并按照 customer_id 升序 排列。
4、示例
输入:
Transactions 表:
transaction_id | customer_id | transaction_date | amount |
---|---|---|---|
1 | 101 | 2023-05-01 | 100 |
2 | 101 | 2023-05-02 | 150 |
3 | 101 | 2023-05-03 | 200 |
4 | 102 | 2023-05-01 | 50 |
5 | 102 | 2023-05-03 | 100 |
6 | 102 | 2023-05-04 | 200 |
7 | 105 | 2023-05-01 | 100 |
8 | 105 | 2023-05-02 | 150 |
9 | 105 | 2023-05-03 | 200 |
10 | 105 | 2023-05-04 | 300 |
11 | 105 | 2023-05-12 | 250 |
12 | 105 | 2023-05-13 | 260 |
13 | 105 | 2023-05-14 | 270 |
输出:
customer_id | consecutive_start | consecutive_end |
---|---|---|
101 | 2023-05-01 | 2023-05-03 |
105 | 2023-05-01 | 2023-05-04 |
105 | 2023-05-12 | 2023-05-14 |
解释:
- customer_id 为 101 的客户在 2023年5月1日 至 2023年5月3日 期间进行了连续递增金额的交易。
- customer_id 为 102 的客户没有至少连续三天的交易。
- customer_id 为 105 的客户有两组连续交易:从 2023年5月1日 至 2023年5月4日,以及 2023年5月12日 至 2023年5月14日。
- 结果按 customer_id 升序排序
5、代码编写
我的代码
1、将不同客户的时间正序排序,将序号排出来。
2、使用 lag 窗口函数数据往后推,往后推之后第一条数据默认值设为0保证有值,由此来判断交易金额递增。
3、将金额差值负数两种情况的数据剔除(因为存在 1,-1,1,1 的情况第二位不能剔除,我们需要把 -1,-1,1 将第一位剔除,-1,1,-1 将第一位剔除),并将序号标上。
4、重新对数据剔除后的数据进行编号,分组根据不同客户和交易时间减去序号天数,然后时间正序查询出来,这是因为原先的序号因为数据剔除的缘故不准需要再次编号。
5、我们只需要根据不同客户和交易时间减去序号天数进行分组,然后查询数量是大于等于3的数据,开始时间(因为存在两种情况 1,1,-1,1,1 和 1,-1,1,1,等下还需进行处理),结束时间直接用开始时间加数量减1天数去推算就行
6、我们只需要将时间加1天和时间加2天查询出来如果是负数,直接取时间最大值就行,将上面开始时间替换就行
with tmp as (select *, row_number() over(partition by customer_id order by transaction_date) AS rnfrom Transactions
), tmp2 as (select customer_id,transaction_date,amount - lag(amount, 1, 0) over(partition by customer_id, date_sub(transaction_date, interval rn day) order by transaction_date) AS diffAmountfrom tmp
), tmp3 as (select customer_id,transaction_date,row_number() over(partition by customer_id order by transaction_date) AS rnfrom tmpwhere not exists (select customer_id, transaction_datefrom tmp2where customer_id = tmp.customer_idand transaction_date = tmp.transaction_dateand diffAmount < 0and (not exists (select * from tmp2 where customer_id = tmp.customer_id and transaction_date in(date_add(tmp.transaction_date, interval 1 day))and diffAmount > 0 ) OR not exists(select * from tmp2 where customer_id = tmp.customer_id and transaction_date in(date_add(tmp.transaction_date, interval 1 day))and diffAmount < 0UNION ALLselect * from tmp2 where customer_id = tmp.customer_id and transaction_date in(date_add(tmp.transaction_date, interval 2 day))and diffAmount > 0)))
), tmp4 as (select customer_id,transaction_date,row_number() over(partition by customer_id, date_sub(transaction_date, interval rn day) order by transaction_date) AS rnfrom tmp3
), tmp5 as (select customer_id, transaction_date AS consecutive_start, date_add(transaction_date, interval count(*)-1 day) AS consecutive_end,count(*) AS numfrom tmp4group by customer_id, date_sub(transaction_date, interval rn day)having count(*) >= 3
)
select tmp5.customer_id, max(if(tmp2.diffAmount<0, tmp2.transaction_date, tmp5.consecutive_start)) AS consecutive_start, tmp5.consecutive_end
from tmp5, tmp2
where tmp2.customer_id = tmp5.customer_id
and tmp2.transaction_date in (date_add(tmp5.consecutive_start, interval 1 day) , date_add(tmp5.consecutive_start, interval 2 day) )
group by 1, 3
网友代码第一种(巧解)
1、确定其下一个日期金额 > 当前日期的行
2、利用 [当前日期-分组row_number]分连续日期的组 date_group
3、组内数据超过 2 条则为满足至少连续 3 天(因为最后一条日期关联不到下一个日期,在一开始就过滤掉了),然后给 max 日期+1
select customer_id,min(transaction_date) as consecutive_start,date_add(max(transaction_date), interval 1 day) as consecutive_end
from (select customer_id,transaction_date,date_sub(transaction_date, interval row_number() over(partition by customer_id order by transaction_date) day) as date_groupfrom (select t1.customer_id,t1.transaction_datefrom Transactions t1left join Transactions t2 on t1.customer_id = t2.customer_id and t1.transaction_date = date_sub(t2.transaction_date, interval 1 day) where t1.amount < t2.amount) a
) b
group by customer_id,date_group
having count(customer_id) >= 2
order by customer_id
网友代码第二种(和我的思路比较接近)
# table1用于筛选出连续日期
with table1 as (select *,datediff(transaction_date, '1970-01-01') - row_number() over (partition by customer_id order by transaction_date) flagfrom transactions
),
# table2用于筛选出table1条件下的连续递增标志
table2 as (select *,if(amount > lag(amount, 1, amount) over (partition by customer_id, flag order by transaction_date), 1, 0) isIncfrom table1
),
# table3用于筛选出table2条件下的连续日期
table3 as (select *,datediff(transaction_date, '1970-01-01') - row_number() over (partition by customer_id, flag order by transaction_date) newFlagfrom table2where isInc = 1
)
select customer_id,# 因为判断递增是比较上一条,则默认递增时第一条忽略,所以这里减一,同时下面having里为2min(transaction_date) - interval 1 day consecutive_start,max(transaction_date) consecutive_end
from table3
group by customer_id, newFlag
having count(*) >= 2
order by customer_id;