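1. Background
Given the data below (columns: id, name, age, address, ct_time, up_time), how can we take, for every column, the most recent non-null value ordered by update time?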
1 a a null 202301 202301
1 b b null null 202302
1 null c null null 202303
1 d null null null 202304
How do we produce the following merged row?
1 d c null 202301 202301
2. Using the last_value function
select last_value(age) over(partition by a order by b,c desc)
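-- Fill every column with its latest non-null value per id.
-- last_value(col, TRUE) is the two-argument form found in Hive and Spark SQL; TRUE means "ignore nulls".
-- The ORDER BY and full-partition frame inside each window make the result independent of
-- the order in which rows reach the window, so no ORDER BY is needed in the subquery.
SELECT id,
       last_value(name, TRUE)    OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS name,
       last_value(age, TRUE)     OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS age,
       last_value(address, TRUE) OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS address,
       last_value(ct_time, TRUE) OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ct_time,
       up_time
FROM
(select 1 as id,'a' as name ,'a' as age,null as address,202301 as ct_time,202301 as up_time
union all
select 1 as id,'b' as name ,'b' as age,null as address,null as ct_time, 202302 as up_time
union all
select 1 as id,null as name,'c' as age,null as address,null as ct_time, 202303 as up_time
union all
select 1 as id,'d' as name ,null as age,null as address,null as ct_time, 202304 as up_time
) t;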
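In the SQL above, last_value is applied to each column with the rows partitioned by the primary key id, so that every column takes its latest value within the group. The second argument TRUE tells the function to skip null values. This still leaves one output row per input row, so the last step (not included in the query above) is to number the rows in each group with the row_number window function and keep only the most recent one, which yields the merged record.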
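The final row_number step could look something like the sketch below. It assumes a Spark SQL or Hive style engine that accepts last_value(expr, TRUE); the CTE names src and filled, the column alias rn, and the CAST on address are illustrative choices, not part of the original query.

```sql
-- Sample rows from the Background section (src is a hypothetical name).
WITH src AS (
    SELECT 1 AS id, 'a' AS name, 'a' AS age, CAST(NULL AS STRING) AS address, 202301 AS ct_time, 202301 AS up_time
    UNION ALL
    SELECT 1, 'b', 'b', NULL, NULL, 202302
    UNION ALL
    SELECT 1, NULL, 'c', NULL, NULL, 202303
    UNION ALL
    SELECT 1, 'd', NULL, NULL, NULL, 202304
),
-- Every row gets the latest non-null value of each column within its id,
-- plus a row number that ranks rows from newest to oldest up_time.
filled AS (
    SELECT id,
           last_value(name, TRUE)    OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS name,
           last_value(age, TRUE)     OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS age,
           last_value(address, TRUE) OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS address,
           last_value(ct_time, TRUE) OVER (PARTITION BY id ORDER BY up_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ct_time,
           up_time,
           row_number() OVER (PARTITION BY id ORDER BY up_time DESC) AS rn
    FROM src
)
-- Keep one row per id: the one with the newest up_time.
SELECT id, name, age, address, ct_time, up_time
FROM filled
WHERE rn = 1;
```

Because the filled columns carry the same values on every row of a partition, the choice of which row to keep only affects up_time; ordering row_number by up_time DESC keeps the value from the most recent update.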