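To keep things light, I'm starting a series on SQL interview questions, and today's opener is the consecutive-login problem. It is without doubt one of the most frequently asked questions in data engineering interviews, although if I were your interviewer you probably would not see it. I usually write SQL questions based on real development scenarios and design them to probe a candidate's SQL skills from the basics upward. Strong candidates end up extending the scenario all the way to reading SQL execution plans. Anyone who clears that bar can generally be judged a strong developer whose code quality needs little oversight on the job. A single-scenario SQL exercise, by contrast, cannot fully show whether you can develop independently.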
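Consecutive login scenario
In many applications, analyzing login behavior is essential for understanding user stickiness, activity levels, and how the system is used. Consecutive logins are a key metric here, because they help identify loyal users who visit the system regularly. In this scenario we want to find the users who logged in on at least N consecutive days within a given period, and count how many consecutive days they logged in.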
Table structure
Suppose we have a table named tbl_user_login that records each user's login history. Its structure is as follows:
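create table tbl_user_login (
    user_id    int  not null,
    login_date date not null,
    -- one row per user per day; a key on user_id alone would reject the sample data
    primary key (user_id, login_date)
);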
Where:
user_id is the user's unique identifier.
login_date is the date on which the user logged in.
Sample data
| user_id | login_date |
|---------|------------|
| 1 | 2023-01-01 |
| 1 | 2023-01-02 |
| 1 | 2023-01-04 |
| 2 | 2023-01-01 |
| 2 | 2023-01-03 |
| 2 | 2023-01-04 |
| 3 | 2023-01-01 |
| 3 | 2023-01-02 |
| 3 | 2023-01-03 |
| 4 | 2023-01-02 |
| 4 | 2023-01-04 |
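Before writing any SQL, it helps to know what answer the sample data should produce. The following is a minimal brute-force sketch in Python that walks each user's sorted dates and tracks the longest run of consecutive days. The names `ROWS` and `longest_streak` are hypothetical helpers introduced for illustration only.

```python
from datetime import date, timedelta

# Sample rows copied from the table above: (user_id, login_date)
ROWS = [
    (1, "2023-01-01"), (1, "2023-01-02"), (1, "2023-01-04"),
    (2, "2023-01-01"), (2, "2023-01-03"), (2, "2023-01-04"),
    (3, "2023-01-01"), (3, "2023-01-02"), (3, "2023-01-03"),
    (4, "2023-01-02"), (4, "2023-01-04"),
]


def longest_streak(rows):
    """Return {user_id: length of the longest run of consecutive login dates}."""
    by_user = {}
    for user_id, day in rows:
        # A set collapses repeated logins on the same day.
        by_user.setdefault(user_id, set()).add(date.fromisoformat(day))

    result = {}
    for user_id, days in by_user.items():
        best = current = 0
        previous = None
        for day in sorted(days):
            if previous is not None and day - previous == timedelta(days=1):
                current += 1   # the run continues
            else:
                current = 1    # a gap starts a new run
            best = max(best, current)
            previous = day
        result[user_id] = best
    return result


streaks = longest_streak(ROWS)
qualifying = sorted(u for u, n in streaks.items() if n >= 3)
print(streaks)     # {1: 2, 2: 2, 3: 3, 4: 1}
print(qualifying)  # [3]
```

With N = 3, only user 3 qualifies, so that is the result a correct query should return for this data.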
Final SQL
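The complete query below finds the users who logged in on at least N consecutive days, here with N = 3. The sample data is inlined as a CTE so the query can be run on its own: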
-- Find users who logged in on 3 consecutive days
with tbl_user_login as (
    select 1 as user_id, '2023-01-01' as login_date union all
    select 1 as user_id, '2023-01-02' as login_date union all
    select 1 as user_id, '2023-01-04' as login_date union all
    select 2 as user_id, '2023-01-01' as login_date union all
    select 2 as user_id, '2023-01-03' as login_date union all
    select 2 as user_id, '2023-01-04' as login_date union all
    select 3 as user_id, '2023-01-01' as login_date union all
    select 3 as user_id, '2023-01-02' as login_date union all
    select 3 as user_id, '2023-01-03' as login_date union all
    select 4 as user_id, '2023-01-02' as login_date union all
    select 4 as user_id, '2023-01-04' as login_date
),
tmp_user_rank as (
    select user_id,
           login_date,
           -- assign a row number to each login of each user, ordered by date
           row_number() over (partition by user_id order by login_date) as rn
    from tbl_user_login
),
tmp_user_continuously_group as (
    select a.user_id,
           a.login_date,
           a.rn,
           -- days since an arbitrary reference date, minus the row number:
           -- this value stays constant within a run of consecutive dates
           -- (two-argument datediff(end, start) as in MySQL, Hive and Spark SQL)
           datediff(a.login_date, '1970-01-01') - a.rn as day_diff
    from tmp_user_rank a
)
select distinct user_id
from tmp_user_continuously_group
group by user_id, day_diff
having count(1) >= 3
;
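The idea behind the query is that, within a run of consecutive dates, the date and the row number both advance by one per row, so the date minus the row number stays the same. A gap makes that value jump, which starts a new group. The Python sketch below mirrors this grouping step by step and also reports each run's start date, end date and length, which the SQL above does not return. The function name `login_islands` and the `min_days` parameter are hypothetical, and the set-based deduplication assumes at most one meaningful login per user per day.

```python
from collections import defaultdict
from datetime import date, timedelta


def login_islands(rows, min_days=3):
    """Return (user_id, start, end, days) for every run of at least min_days days."""
    by_user = defaultdict(set)
    for user_id, day in rows:
        by_user[user_id].add(date.fromisoformat(day))

    islands = []
    for user_id in sorted(by_user):
        groups = defaultdict(list)
        # enumerate(..., start=1) plays the role of
        # row_number() over (partition by user_id order by login_date)
        for rn, day in enumerate(sorted(by_user[user_id]), start=1):
            anchor = day - timedelta(days=rn)  # constant within one consecutive run
            groups[anchor].append(day)
        # equivalent of: group by user_id, day_diff having count(1) >= min_days
        for days in groups.values():
            if len(days) >= min_days:
                islands.append(
                    (user_id, days[0].isoformat(), days[-1].isoformat(), len(days))
                )
    return islands


rows = [
    (1, "2023-01-01"), (1, "2023-01-02"), (1, "2023-01-04"),
    (2, "2023-01-01"), (2, "2023-01-03"), (2, "2023-01-04"),
    (3, "2023-01-01"), (3, "2023-01-02"), (3, "2023-01-03"),
    (4, "2023-01-02"), (4, "2023-01-04"),
]

print(login_islands(rows))
# [(3, '2023-01-01', '2023-01-03', 3)]
print(login_islands(rows, min_days=2))
# [(1, '2023-01-01', '2023-01-02', 2), (2, '2023-01-03', '2023-01-04', 2), (3, '2023-01-01', '2023-01-03', 3)]
```

For user 2, for example, the anchors are 2022-12-31, 2023-01-01 and 2023-01-01, so the last two logins fall into one group of length 2, which is below the threshold of 3.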