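1. Expanding an array into rows (flattening)
Data preparation: table aa
1.1 CROSS JOIN UNNEST
In Dremio, the UNNEST function expands ("explodes") the elements of an array (LIST) column into multiple rows, one row per element.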
with aa as (
select '上海' as city, ARRAY['浦东新区','黄浦区'] as area
union
select '北京' as city, ARRAY['朝阳区','海淀区','昌平区'] as area
)
select city,area_b
from aa
CROSS JOIN UNNEST(aa.area) AS t(area_b)
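1.2 FLATTEN
FLATTEN breaks a compound value into multiple rows. It takes a LIST column and produces a lateral view (that is, an inline view that can contain correlations referring to other tables that precede it in the FROM clause).
The data type of the expression must be LIST.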
with aa as (
select '上海' as city,ARRAY['浦东新区', '黄浦区'] as areas
union
select '北京' as city,ARRAY['朝阳区', '海淀区','昌平区'] as areas
)
SELECT city,FLATTEN(areas) AS area
FROM aa;
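Both approaches expand each array element into its own row and return the same rows.
However, the source data may be a delimited string rather than an array.
Dremio provides the regular-expression function REGEXP_SPLIT() to convert a string into an array. For example: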
with aa as(
select '上海' as id,'浦东新区,黄浦区' as name
union
select '北京' as id,'朝阳区,海淀区,昌平区' as name
)
SELECT REGEXP_SPLIT(name, ',', 'ALL', 1) AS "list"
from aa
Result:
For more on recognizing Dremio data type icons, see: Dremio data type icon identification
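The two steps above (split the string, then expand the array) can be chained. The following is a minimal sketch, assuming that FLATTEN accepts the LIST returned by REGEXP_SPLIT when it is exposed as a column of a subquery; the alias area_list is a hypothetical name introduced for illustration.

```sql
with aa as (
  select '上海' as id, '浦东新区,黄浦区' as name
  union all
  select '北京' as id, '朝阳区,海淀区,昌平区' as name
),
split as (
  -- Step 1: turn the comma-separated string into a LIST
  select id, REGEXP_SPLIT(name, ',', 'ALL', 1) as area_list
  from aa
)
-- Step 2: expand the LIST so that each area gets its own row
select id, FLATTEN(area_list) as area
from split
```

Keeping the split in its own CTE makes it easy to inspect the intermediate array before flattening it.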
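2. Aggregating rows into a single value
Data preparation
2.1 LISTAGG
Concatenates a group of rows into a single string, placing a delimiter between the values. Returns a character type.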
with aa as (
select '上海' as city,'浦东新区' as area union select '上海' as city,'黄浦区' as area
union
select '北京' as city,'朝阳区' as area union select '北京' as city,'海淀区' as area
)
select city,LISTAGG(DISTINCT area, ' | ')
WITHIN GROUP (ORDER BY area) "city_list"
FROM aa
group by city
Result:
2.2 ARRAY_AGG
Aggregates the provided expression into an array. Returns an array type.
with aa as(
select '上海' as id,'浦东新区' as name union select '上海' as id,'黄浦区' as name
union
select '北京' as id,'朝阳区' as name union select '北京' as id,'海淀区' as name
union
select '北京' as id,'昌平区' as name
)
SELECT id, ARRAY_AGG(name)
FROM aa
GROUP BY id
Result:
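ARRAY_AGG and FLATTEN are inverse operations in practice: one collects rows into an array, the other expands the array back into rows. The sketch below illustrates this round trip, assuming the array produced by ARRAY_AGG can be passed to FLATTEN as a LIST; the CTE name grouped and the alias names are hypothetical.

```sql
with aa as (
  select '上海' as id, '浦东新区' as name
  union all
  select '上海' as id, '黄浦区' as name
  union all
  select '北京' as id, '朝阳区' as name
),
grouped as (
  -- Collect all names of each id into one array
  select id, ARRAY_AGG(name) as names
  from aa
  group by id
)
-- Expand the array again: one row per (id, name) pair
select id, FLATTEN(names) as name
from grouped
```

The element order inside the aggregated array is not guaranteed by this query, so the expanded rows may come back in a different order than the input.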
3. Pivot and unpivot
Data preparation:
3.1 PIVOT
SELECT *
FROM aa
PIVOT (SUM(sales) FOR region IN ('North', 'South', 'East', 'West'))
order by product
Result:
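For a self-contained run of the PIVOT query, the table aa can be replaced by an inline CTE. The data below is hypothetical: the product names and sales figures are invented for illustration and are not the original sample table.

```sql
-- Hypothetical sample data (invented values), one row per product and region
with aa as (
  select 'A' as product, 'North' as region, 100 as sales
  union all
  select 'A' as product, 'South' as region, 150 as sales
  union all
  select 'B' as product, 'North' as region, 200 as sales
  union all
  select 'B' as product, 'West' as region, 50 as sales
)
-- Turn each listed region value into its own column holding SUM(sales)
SELECT *
FROM aa
PIVOT (SUM(sales) FOR region IN ('North', 'South', 'East', 'West'))
ORDER BY product
```

With this data the query should return one row per product; a region with no matching rows for a product (for example East) is expected to come back as NULL.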
3.2 UNPIVOT
Data preparation:
SELECT product, region, sales
FROM aa
UNPIVOT (sales FOR region IN (North, South, East, West))
order by product
Result:
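UNPIVOT expects the opposite layout: one column per region. A minimal sketch with a hypothetical wide table follows; the values are invented for illustration, and all four region columns share the same integer type so that they can be merged into the single sales column.

```sql
-- Hypothetical wide table (invented values), one column per region
with aa as (
  select 'A' as product, 100 as North, 150 as South, 0 as East, 0 as West
  union all
  select 'B' as product, 200 as North, 0 as South, 0 as East, 50 as West
)
-- Fold the four region columns back into (region, sales) pairs
SELECT product, region, sales
FROM aa
UNPIVOT (sales FOR region IN (North, South, East, West))
ORDER BY product
```

Each input row is expected to yield one output row per listed column, with the column name in region and its value in sales.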