Big Data Project in Practice: Data Warehouse. E-commerce Data Warehouse System, Chapter 10: Data Warehouse Development, DWS Layer

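Contents

  • Chapter 10: Data Warehouse Development, DWS Layer
    • 10.1 Last-1-Day Summary Tables
      • 10.1.1 Trade Domain User-SKU Granularity Order Last-1-Day Summary Table
      • 10.1.2 Trade Domain User-SKU Granularity Order Refund Last-1-Day Summary Table
      • 10.1.3 Trade Domain User Granularity Order Last-1-Day Summary Table
      • 10.1.4 Trade Domain User Granularity Add-to-Cart Last-1-Day Summary Table
      • 10.1.5 Trade Domain User Granularity Payment Last-1-Day Summary Table
      • 10.1.6 Trade Domain Province Granularity Order Last-1-Day Summary Table
      • 10.1.7 Trade Domain User Granularity Order Refund Last-1-Day Summary Table
      • 10.1.8 Traffic Domain Session Granularity Page View Last-1-Day Summary Table
      • 10.1.9 Traffic Domain Visitor-Page Granularity Page View Last-1-Day Summary Table
      • 10.1.10 Data Loading Scripts
    • 10.2 Last-n-Day Summary Tables
      • 10.2.1 Trade Domain User-SKU Granularity Order Last-n-Day Summary Table
      • 10.2.2 Trade Domain User-SKU Granularity Order Refund Last-n-Day Summary Table
      • 10.2.3 Trade Domain User Granularity Order Last-n-Day Summary Table
      • 10.2.4 Trade Domain User Granularity Add-to-Cart Last-n-Day Summary Table
      • 10.2.5 Trade Domain User Granularity Payment Last-n-Day Summary Table
      • 10.2.6 Trade Domain Province Granularity Order Last-n-Day Summary Table
      • 10.2.7 Trade Domain Coupon Granularity Order Last-n-Day Summary Table
      • 10.2.8 Trade Domain Activity Granularity Order Last-n-Day Summary Table
      • 10.2.9 Trade Domain User Granularity Order Refund Last-n-Day Summary Table
      • 10.2.10 Traffic Domain Visitor-Page Granularity Page View Last-n-Day Summary Table
      • 10.2.11 Data Loading Scripts
    • 10.3 History-to-Date Summary Tables
      • 10.3.1 Trade Domain User Granularity Order History-to-Date Summary Table
      • 10.3.2 Trade Domain User Granularity Payment History-to-Date Summary Table
      • 10.3.3 User Domain User Granularity Login History-to-Date Summary Table
      • 10.3.4 Data Loading Scripts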

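Chapter 10: Data Warehouse Development, DWS Layer

Design points:

(1) The design of the DWS layer is driven by the metric system.

(2) DWS-layer data is stored in the ORC columnar format with snappy compression.

(3) DWS-layer tables are named dws_<data domain>_<statistical granularity>_<business process>_<statistical period>, where the period is one of 1d/nd/td.

Note: 1d means the last 1 day, nd means the last n days, and td means history to date.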
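
As a quick illustration of the naming rule in point (3), the sketch below assembles a table name from its four parts and rejects any period other than 1d, nd, or td. The function name dws_table_name is hypothetical and does not appear in the project scripts.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original project):
# build a DWS table name as dws_<domain>_<granularity>_<process>_<period>.
dws_table_name() {
    local domain=$1 grain=$2 process=$3 period=$4
    # Only the three periods named in the convention are accepted.
    case "$period" in
        1d|nd|td) ;;
        *) echo "unknown period: $period" >&2; return 1 ;;
    esac
    echo "dws_${domain}_${grain}_${process}_${period}"
}

dws_table_name trade user_sku order 1d
dws_table_name traffic session page_view 1d
```

The two calls print dws_trade_user_sku_order_1d and dws_traffic_session_page_view_1d, which match the table names used in section 10.1.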

10.1 Last-1-Day Summary Tables

10.1.1 Trade Domain User-SKU Granularity Order Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_sku_order_1d;
CREATE EXTERNAL TABLE dws_trade_user_sku_order_1d
(
    `user_id`                   STRING COMMENT 'user id',
    `sku_id`                    STRING COMMENT 'sku id',
    `sku_name`                  STRING COMMENT 'sku name',
    `category1_id`              STRING COMMENT 'level-1 category id',
    `category1_name`            STRING COMMENT 'level-1 category name',
    `category2_id`              STRING COMMENT 'level-2 category id',
    `category2_name`            STRING COMMENT 'level-2 category name',
    `category3_id`              STRING COMMENT 'level-3 category id',
    `category3_name`            STRING COMMENT 'level-3 category name',
    `tm_id`                     STRING COMMENT 'brand id',
    `tm_name`                   STRING COMMENT 'brand name',
    `order_count_1d`            BIGINT COMMENT 'order count in the last 1 day',
    `order_num_1d`              BIGINT COMMENT 'items ordered in the last 1 day',
    `order_original_amount_1d`  DECIMAL(16, 2) COMMENT 'original order amount in the last 1 day',
    `activity_reduce_amount_1d` DECIMAL(16, 2) COMMENT 'activity discount amount in the last 1 day',
    `coupon_reduce_amount_1d`   DECIMAL(16, 2) COMMENT 'coupon discount amount in the last 1 day',
    `order_total_amount_1d`     DECIMAL(16, 2) COMMENT 'final order amount in the last 1 day'
) COMMENT 'Trade domain user-SKU granularity order last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_sku_order_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dws_trade_user_sku_order_1d partition(dt)
select
    user_id,
    -- take sku_id from the fact side so rows without a dimension match keep their key
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_count_1d,
    order_num_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d,
    dt
from
(
    select
        dt,
        user_id,
        sku_id,
        count(*) order_count_1d,
        sum(sku_num) order_num_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0.0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0.0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d
    from dwd_trade_order_detail_inc
    group by dt,user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from dim_sku_full
    where dt='2020-06-14'
)sku
on od.sku_id=sku.id;

(2) Daily load

insert overwrite table dws_trade_user_sku_order_1d partition(dt='2020-06-15')
select
    user_id,
    -- take sku_id from the fact side so rows without a dimension match keep their key
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_count,
    order_num,
    order_original_amount,
    activity_reduce_amount,
    coupon_reduce_amount,
    order_total_amount
from
(
    select
        user_id,
        sku_id,
        count(*) order_count,
        sum(sku_num) order_num,
        sum(split_original_amount) order_original_amount,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount,
        sum(split_total_amount) order_total_amount
    from dwd_trade_order_detail_inc
    where dt='2020-06-15'
    group by user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from dim_sku_full
    where dt='2020-06-15'
)sku
on od.sku_id=sku.id;

10.1.2 Trade Domain User-SKU Granularity Order Refund Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_sku_order_refund_1d;
CREATE EXTERNAL TABLE dws_trade_user_sku_order_refund_1d
(
    `user_id`                    STRING COMMENT 'user id',
    `sku_id`                     STRING COMMENT 'sku id',
    `sku_name`                   STRING COMMENT 'sku name',
    `category1_id`               STRING COMMENT 'level-1 category id',
    `category1_name`             STRING COMMENT 'level-1 category name',
    `category2_id`               STRING COMMENT 'level-2 category id',
    `category2_name`             STRING COMMENT 'level-2 category name',
    `category3_id`               STRING COMMENT 'level-3 category id',
    `category3_name`             STRING COMMENT 'level-3 category name',
    `tm_id`                      STRING COMMENT 'brand id',
    `tm_name`                    STRING COMMENT 'brand name',
    `order_refund_count_1d`      BIGINT COMMENT 'refund count in the last 1 day',
    `order_refund_num_1d`        BIGINT COMMENT 'items refunded in the last 1 day',
    `order_refund_amount_1d`     DECIMAL(16, 2) COMMENT 'refund amount in the last 1 day'
) COMMENT 'Trade domain user-SKU granularity order refund last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_sku_order_refund_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dws_trade_user_sku_order_refund_1d partition(dt)
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_refund_count,
    order_refund_num,
    order_refund_amount,
    dt
from
(
    select
        dt,
        user_id,
        sku_id,
        count(*) order_refund_count,
        sum(refund_num) order_refund_num,
        sum(refund_amount) order_refund_amount
    from dwd_trade_order_refund_inc
    group by dt,user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from dim_sku_full
    where dt='2020-06-14'
)sku
on od.sku_id=sku.id;

(2) Daily load

insert overwrite table dws_trade_user_sku_order_refund_1d partition(dt='2020-06-15')
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_refund_count,
    order_refund_num,
    order_refund_amount
from
(
    select
        user_id,
        sku_id,
        count(*) order_refund_count,
        sum(refund_num) order_refund_num,
        sum(refund_amount) order_refund_amount
    from dwd_trade_order_refund_inc
    where dt='2020-06-15'
    group by user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from dim_sku_full
    where dt='2020-06-15'
)sku
on od.sku_id=sku.id;

10.1.3 Trade Domain User Granularity Order Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_order_1d;
CREATE EXTERNAL TABLE dws_trade_user_order_1d
(
    `user_id`                   STRING COMMENT 'user id',
    `order_count_1d`            BIGINT COMMENT 'order count in the last 1 day',
    `order_num_1d`              BIGINT COMMENT 'items ordered in the last 1 day',
    `order_original_amount_1d`  DECIMAL(16, 2) COMMENT 'original order amount in the last 1 day',
    `activity_reduce_amount_1d` DECIMAL(16, 2) COMMENT 'activity discount amount on orders in the last 1 day',
    `coupon_reduce_amount_1d`   DECIMAL(16, 2) COMMENT 'coupon discount amount on orders in the last 1 day',
    `order_total_amount_1d`     DECIMAL(16, 2) COMMENT 'final order amount in the last 1 day'
) COMMENT 'Trade domain user granularity order last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_order_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

insert overwrite table dws_trade_user_order_1d partition(dt)
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_original_amount),
    sum(nvl(split_activity_amount,0)),
    sum(nvl(split_coupon_amount,0)),
    sum(split_total_amount),
    dt
from dwd_trade_order_detail_inc
group by user_id,dt;

(2) Daily load

insert overwrite table dws_trade_user_order_1d partition(dt='2020-06-15')
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_original_amount),
    sum(nvl(split_activity_amount,0)),
    sum(nvl(split_coupon_amount,0)),
    sum(split_total_amount)
from dwd_trade_order_detail_inc
where dt='2020-06-15'
group by user_id;
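
The user-SKU load in 10.1.1 counts rows with count(*), while the user-level load here uses count(distinct(order_id)). Assuming dwd_trade_order_detail_inc holds one row per order line (one SKU within one order), which its sku_id, sku_num and split_* columns suggest, a user who buys several SKUs in one order contributes several rows but only one order. The sketch below reproduces that distinction with awk over a few made-up rows of the form "user_id order_id sku_num"; the function name and the sample data are illustrative assumptions, not project code.

```shell
#!/bin/bash
# Hypothetical illustration: per user, count distinct orders and sum item quantities.
# Input rows (made up for this sketch): user_id order_id sku_num
user_order_summary() {
    awk '
        {
            # Count an order only the first time this (user, order) pair is seen.
            if (!seen[$1 SUBSEP $2]++) orders[$1]++
            items[$1] += $3
        }
        END { for (u in orders) print u, orders[u], items[u] }
    ' | sort
}

printf 'u1 o1 2\nu1 o1 1\nu1 o2 5\nu2 o3 1\n' | user_order_summary
```

User u1 has three detail rows but only two distinct orders, so the sketch reports 2 orders and 8 items for u1, where a plain row count would have reported 3.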

10.1.4 Trade Domain User Granularity Add-to-Cart Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_cart_add_1d;
CREATE EXTERNAL TABLE dws_trade_user_cart_add_1d
(
    `user_id`           STRING COMMENT 'user id',
    `cart_add_count_1d` BIGINT COMMENT 'add-to-cart count in the last 1 day',
    `cart_add_num_1d`   BIGINT COMMENT 'items added to cart in the last 1 day'
) COMMENT 'Trade domain user granularity add-to-cart last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_cart_add_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

insert overwrite table dws_trade_user_cart_add_1d partition(dt)
select
    user_id,
    count(*),
    sum(sku_num),
    dt
from dwd_trade_cart_add_inc
group by user_id,dt;

(2) Daily load

insert overwrite table dws_trade_user_cart_add_1d partition(dt='2020-06-15')
select
    user_id,
    count(*),
    sum(sku_num)
from dwd_trade_cart_add_inc
where dt='2020-06-15'
group by user_id;

10.1.5 Trade Domain User Granularity Payment Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_payment_1d;
CREATE EXTERNAL TABLE dws_trade_user_payment_1d
(
    `user_id`           STRING COMMENT 'user id',
    `payment_count_1d`  BIGINT COMMENT 'payment count in the last 1 day',
    `payment_num_1d`    BIGINT COMMENT 'items paid for in the last 1 day',
    `payment_amount_1d` DECIMAL(16, 2) COMMENT 'payment amount in the last 1 day'
) COMMENT 'Trade domain user granularity payment last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_payment_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

insert overwrite table dws_trade_user_payment_1d partition(dt)
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_payment_amount),
    dt
from dwd_trade_pay_detail_suc_inc
group by user_id,dt;

(2) Daily load

insert overwrite table dws_trade_user_payment_1d partition(dt='2020-06-15')
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_payment_amount)
from dwd_trade_pay_detail_suc_inc
where dt='2020-06-15'
group by user_id;

10.1.6 Trade Domain Province Granularity Order Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_province_order_1d;
CREATE EXTERNAL TABLE dws_trade_province_order_1d
(
    `province_id`               STRING COMMENT 'province id',
    `province_name`             STRING COMMENT 'province name',
    `area_code`                 STRING COMMENT 'area code',
    `iso_code`                  STRING COMMENT 'legacy ISO-3166-2 code',
    `iso_3166_2`                STRING COMMENT 'current ISO-3166-2 code',
    `order_count_1d`            BIGINT COMMENT 'order count in the last 1 day',
    `order_original_amount_1d`  DECIMAL(16, 2) COMMENT 'original order amount in the last 1 day',
    `activity_reduce_amount_1d` DECIMAL(16, 2) COMMENT 'activity discount amount on orders in the last 1 day',
    `coupon_reduce_amount_1d`   DECIMAL(16, 2) COMMENT 'coupon discount amount on orders in the last 1 day',
    `order_total_amount_1d`     DECIMAL(16, 2) COMMENT 'final order amount in the last 1 day'
) COMMENT 'Trade domain province granularity order last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_province_order_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dws_trade_province_order_1d partition(dt)
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    order_count_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d,
    dt
from
(
    select
        province_id,
        count(distinct(order_id)) order_count_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d,
        dt
    from dwd_trade_order_detail_inc
    group by province_id,dt
)o
left join
(
    select
        id,
        province_name,
        area_code,
        iso_code,
        iso_3166_2
    from dim_province_full
    where dt='2020-06-14'
)p
on o.province_id=p.id;

(2) Daily load

insert overwrite table dws_trade_province_order_1d partition(dt='2020-06-15')
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    order_count_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d
from
(
    select
        province_id,
        count(distinct(order_id)) order_count_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d
    from dwd_trade_order_detail_inc
    where dt='2020-06-15'
    group by province_id
)o
left join
(
    select
        id,
        province_name,
        area_code,
        iso_code,
        iso_3166_2
    from dim_province_full
    where dt='2020-06-15'
)p
on o.province_id=p.id;

10.1.7 Trade Domain User Granularity Order Refund Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_order_refund_1d;
CREATE EXTERNAL TABLE dws_trade_user_order_refund_1d
(
    `user_id`                STRING COMMENT 'user id',
    `order_refund_count_1d`  BIGINT COMMENT 'refund count in the last 1 day',
    `order_refund_num_1d`    BIGINT COMMENT 'items refunded in the last 1 day',
    `order_refund_amount_1d` DECIMAL(16, 2) COMMENT 'refund amount in the last 1 day'
) COMMENT 'Trade domain user granularity order refund last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_order_refund_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dws_trade_user_order_refund_1d partition(dt)
select
    user_id,
    count(*) order_refund_count,
    sum(refund_num) order_refund_num,
    sum(refund_amount) order_refund_amount,
    dt
from dwd_trade_order_refund_inc
group by user_id,dt;

(2) Daily load

insert overwrite table dws_trade_user_order_refund_1d partition(dt='2020-06-15')
select
    user_id,
    count(*),
    sum(refund_num),
    sum(refund_amount)
from dwd_trade_order_refund_inc
where dt='2020-06-15'
group by user_id;

10.1.8 Traffic Domain Session Granularity Page View Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_traffic_session_page_view_1d;
CREATE EXTERNAL TABLE dws_traffic_session_page_view_1d
(
    `session_id`     STRING COMMENT 'session id',
    `mid_id`         string comment 'device id',
    `brand`          string comment 'phone brand',
    `model`          string comment 'phone model',
    `operate_system` string comment 'operating system',
    `version_code`   string comment 'app version code',
    `channel`        string comment 'channel',
    `during_time_1d` BIGINT COMMENT 'visit duration in the last 1 day',
    `page_count_1d`  BIGINT COMMENT 'pages visited in the last 1 day'
) COMMENT 'Traffic domain session granularity page view last-1-day summary table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_traffic_session_page_view_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_traffic_session_page_view_1d partition(dt='2020-06-14')
select
    session_id,
    mid_id,
    brand,
    model,
    operate_system,
    version_code,
    channel,
    sum(during_time),
    count(*)
from dwd_traffic_page_view_inc
where dt='2020-06-14'
group by session_id,mid_id,brand,model,operate_system,version_code,channel;

10.1.9 Traffic Domain Visitor-Page Granularity Page View Last-1-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_traffic_page_visitor_page_view_1d;
CREATE EXTERNAL TABLE dws_traffic_page_visitor_page_view_1d
(
    `mid_id`         STRING COMMENT 'visitor id',
    `brand`          string comment 'phone brand',
    `model`          string comment 'phone model',
    `operate_system` string comment 'operating system',
    `page_id`        STRING COMMENT 'page id',
    `during_time_1d` BIGINT COMMENT 'view duration in the last 1 day',
    `view_count_1d`  BIGINT COMMENT 'view count in the last 1 day'
) COMMENT 'Traffic domain visitor-page granularity page view last-1-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_traffic_page_visitor_page_view_1d'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_traffic_page_visitor_page_view_1d partition(dt='2020-06-14')
select
    mid_id,
    brand,
    model,
    operate_system,
    page_id,
    sum(during_time),
    count(*)
from dwd_traffic_page_view_inc
where dt='2020-06-14'
group by mid_id,brand,model,operate_system,page_id;

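10.1.10 Data Loading Scripts

1) First-day data loading script

(1) Create dwd_to_dws_1d_init.sh in the /home/atguigu/bin directory on hadoop102

[atguigu@hadoop102 bin]$ vim dwd_to_dws_1d_init.sh

(2) Write the following content

#!/bin/bash
APP=gmall

# The first-day load requires an explicit date as the second argument
if [ -n "$2" ] ;then
    do_date=$2
else
    echo "Please pass in a date argument"
    exit 1
fi

dws_trade_province_order_1d="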
insert overwrite table ${APP}.dws_trade_province_order_1d partition(dt)
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    order_count_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d,
    dt
from
(
    select
        province_id,
        count(distinct(order_id)) order_count_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d,
        dt
    from ${APP}.dwd_trade_order_detail_inc
    group by province_id,dt
)o
left join
(
    select
        id,
        province_name,
        area_code,
        iso_code,
        iso_3166_2
    from ${APP}.dim_province_full
    where dt='$do_date'
)p
on o.province_id=p.id;
"
dws_trade_user_cart_add_1d="
insert overwrite table ${APP}.dws_trade_user_cart_add_1d partition(dt)
select
    user_id,
    count(*),
    sum(sku_num),
    dt
from ${APP}.dwd_trade_cart_add_inc
group by user_id,dt;
"
dws_trade_user_order_1d="
insert overwrite table ${APP}.dws_trade_user_order_1d partition(dt)
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_original_amount),
    sum(nvl(split_activity_amount,0)),
    sum(nvl(split_coupon_amount,0)),
    sum(split_total_amount),
    dt
from ${APP}.dwd_trade_order_detail_inc
group by user_id,dt;
"
dws_trade_user_order_refund_1d="
insert overwrite table ${APP}.dws_trade_user_order_refund_1d partition(dt)
select
    user_id,
    count(*) order_refund_count,
    sum(refund_num) order_refund_num,
    sum(refund_amount) order_refund_amount,
    dt
from ${APP}.dwd_trade_order_refund_inc
group by user_id,dt;
"
dws_trade_user_payment_1d="
insert overwrite table ${APP}.dws_trade_user_payment_1d partition(dt)
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_payment_amount),
    dt
from ${APP}.dwd_trade_pay_detail_suc_inc
group by user_id,dt;
"
dws_trade_user_sku_order_1d="
insert overwrite table ${APP}.dws_trade_user_sku_order_1d partition(dt)
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_count_1d,
    order_num_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d,
    dt
from
(
    select
        dt,
        user_id,
        sku_id,
        count(*) order_count_1d,
        sum(sku_num) order_num_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0.0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0.0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d
    from ${APP}.dwd_trade_order_detail_inc
    group by dt,user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from ${APP}.dim_sku_full
    where dt='$do_date'
)sku
on od.sku_id=sku.id;
"
dws_trade_user_sku_order_refund_1d="
insert overwrite table ${APP}.dws_trade_user_sku_order_refund_1d partition(dt)
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_refund_count,
    order_refund_num,
    order_refund_amount,
    dt
from
(
    select
        dt,
        user_id,
        sku_id,
        count(*) order_refund_count,
        sum(refund_num) order_refund_num,
        sum(refund_amount) order_refund_amount
    from ${APP}.dwd_trade_order_refund_inc
    group by dt,user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from ${APP}.dim_sku_full
    where dt='$do_date'
)sku
on od.sku_id=sku.id;
"
dws_traffic_page_visitor_page_view_1d="
insert overwrite table ${APP}.dws_traffic_page_visitor_page_view_1d partition(dt='$do_date')
select
    mid_id,
    brand,
    model,
    operate_system,
    page_id,
    sum(during_time),
    count(*)
from ${APP}.dwd_traffic_page_view_inc
where dt='$do_date'
group by mid_id,brand,model,operate_system,page_id;
"
dws_traffic_session_page_view_1d="
insert overwrite table ${APP}.dws_traffic_session_page_view_1d partition(dt='$do_date')
select
    session_id,
    mid_id,
    brand,
    model,
    operate_system,
    version_code,
    channel,
    sum(during_time),
    count(*)
from ${APP}.dwd_traffic_page_view_inc
where dt='$do_date'
group by session_id,mid_id,brand,model,operate_system,version_code,channel;
"

case $1 in
    "dws_trade_province_order_1d" )
        hive -e "$dws_trade_province_order_1d"
    ;;
    "dws_trade_user_cart_add_1d" )
        hive -e "$dws_trade_user_cart_add_1d"
    ;;
    "dws_trade_user_order_1d" )
        hive -e "$dws_trade_user_order_1d"
    ;;
    "dws_trade_user_order_refund_1d" )
        hive -e "$dws_trade_user_order_refund_1d"
    ;;
    "dws_trade_user_payment_1d" )
        hive -e "$dws_trade_user_payment_1d"
    ;;
    "dws_trade_user_sku_order_1d" )
        hive -e "$dws_trade_user_sku_order_1d"
    ;;
    "dws_trade_user_sku_order_refund_1d" )
        hive -e "$dws_trade_user_sku_order_refund_1d"
    ;;
    "dws_traffic_page_visitor_page_view_1d" )
        hive -e "$dws_traffic_page_visitor_page_view_1d"
    ;;
    "dws_traffic_session_page_view_1d" )
        hive -e "$dws_traffic_session_page_view_1d"
    ;;
    "all" )
        hive -e "$dws_trade_province_order_1d$dws_trade_user_cart_add_1d$dws_trade_user_order_1d$dws_trade_user_order_refund_1d$dws_trade_user_payment_1d$dws_trade_user_sku_order_1d$dws_trade_user_sku_order_refund_1d$dws_traffic_page_visitor_page_view_1d$dws_traffic_session_page_view_1d"
    ;;
esac

(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x dwd_to_dws_1d_init.sh

(4) Script usage

[atguigu@hadoop102 bin]$ dwd_to_dws_1d_init.sh all 2020-06-14
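
The first-day script exits when no date is passed, but it does not check that the value actually looks like a date before splicing it into the SQL. A minimal sketch of such a check follows, assuming GNU date is available (the daily script's use of date -d suggests it is). The function name is_valid_date is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script:
# succeed only if $1 is a real calendar date in YYYY-MM-DD form.
is_valid_date() {
    local d=$1
    # Shape check first, so inputs such as "yesterday" or 2020-6-14 are rejected.
    [[ $d =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] || return 1
    # Assumption: GNU date, which fails on impossible dates such as 2020-02-30.
    date -d "$d" +%F >/dev/null 2>&1
}

for candidate in 2020-06-14 2020-6-14 2020-02-30; do
    if is_valid_date "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: rejected"
    fi
done
```

In the script, a check of this kind could sit right after do_date is assigned, so a mistyped date is rejected before any hive -e call runs.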

2) Daily data loading script

(1) Create dwd_to_dws_1d.sh in the /home/atguigu/bin directory on hadoop102

[atguigu@hadoop102 bin]$ vim dwd_to_dws_1d.sh

(2) Write the following content

#!/bin/bash
APP=gmall

# Use the date passed in if one is given; otherwise use the day before the current date
if [ -n "$2" ] ;then
    do_date=$2
else
    do_date=`date -d "-1 day" +%F`
fi

dws_trade_province_order_1d="
insert overwrite table ${APP}.dws_trade_province_order_1d partition(dt='$do_date')
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    order_count_1d,
    order_original_amount_1d,
    activity_reduce_amount_1d,
    coupon_reduce_amount_1d,
    order_total_amount_1d
from
(
    select
        province_id,
        count(distinct(order_id)) order_count_1d,
        sum(split_original_amount) order_original_amount_1d,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount_1d,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount_1d,
        sum(split_total_amount) order_total_amount_1d
    from ${APP}.dwd_trade_order_detail_inc
    where dt='$do_date'
    group by province_id
)o
left join
(
    select
        id,
        province_name,
        area_code,
        iso_code,
        iso_3166_2
    from ${APP}.dim_province_full
    where dt='$do_date'
)p
on o.province_id=p.id;
"
dws_trade_user_cart_add_1d="
insert overwrite table ${APP}.dws_trade_user_cart_add_1d partition(dt='$do_date')
select
    user_id,
    count(*),
    sum(sku_num)
from ${APP}.dwd_trade_cart_add_inc
where dt='$do_date'
group by user_id;
"
dws_trade_user_order_1d="
insert overwrite table ${APP}.dws_trade_user_order_1d partition(dt='$do_date')
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_original_amount),
    sum(nvl(split_activity_amount,0)),
    sum(nvl(split_coupon_amount,0)),
    sum(split_total_amount)
from ${APP}.dwd_trade_order_detail_inc
where dt='$do_date'
group by user_id;
"
dws_trade_user_order_refund_1d="
insert overwrite table ${APP}.dws_trade_user_order_refund_1d partition(dt='$do_date')
select
    user_id,
    count(*),
    sum(refund_num),
    sum(refund_amount)
from ${APP}.dwd_trade_order_refund_inc
where dt='$do_date'
group by user_id;
"
dws_trade_user_payment_1d="
insert overwrite table ${APP}.dws_trade_user_payment_1d partition(dt='$do_date')
select
    user_id,
    count(distinct(order_id)),
    sum(sku_num),
    sum(split_payment_amount)
from ${APP}.dwd_trade_pay_detail_suc_inc
where dt='$do_date'
group by user_id;
"
dws_trade_user_sku_order_1d="
insert overwrite table ${APP}.dws_trade_user_sku_order_1d partition(dt='$do_date')
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_count,
    order_num,
    order_original_amount,
    activity_reduce_amount,
    coupon_reduce_amount,
    order_total_amount
from
(
    select
        user_id,
        sku_id,
        count(*) order_count,
        sum(sku_num) order_num,
        sum(split_original_amount) order_original_amount,
        sum(nvl(split_activity_amount,0)) activity_reduce_amount,
        sum(nvl(split_coupon_amount,0)) coupon_reduce_amount,
        sum(split_total_amount) order_total_amount
    from ${APP}.dwd_trade_order_detail_inc
    where dt='$do_date'
    group by user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from ${APP}.dim_sku_full
    where dt='$do_date'
)sku
on od.sku_id=sku.id;
"
dws_trade_user_sku_order_refund_1d="
insert overwrite table ${APP}.dws_trade_user_sku_order_refund_1d partition(dt='$do_date')
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    order_refund_count,
    order_refund_num,
    order_refund_amount
from
(
    select
        user_id,
        sku_id,
        count(*) order_refund_count,
        sum(refund_num) order_refund_num,
        sum(refund_amount) order_refund_amount
    from ${APP}.dwd_trade_order_refund_inc
    where dt='$do_date'
    group by user_id,sku_id
)od
left join
(
    select
        id,
        sku_name,
        category1_id, category1_name,
        category2_id, category2_name,
        category3_id, category3_name,
        tm_id, tm_name
    from ${APP}.dim_sku_full
    where dt='$do_date'
)sku
on od.sku_id=sku.id;
"
dws_traffic_page_visitor_page_view_1d="
insert overwrite table ${APP}.dws_traffic_page_visitor_page_view_1d partition(dt='$do_date')
select
    mid_id,
    brand,
    model,
    operate_system,
    page_id,
    sum(during_time),
    count(*)
from ${APP}.dwd_traffic_page_view_inc
where dt='$do_date'
group by mid_id,brand,model,operate_system,page_id;
"
dws_traffic_session_page_view_1d="
insert overwrite table ${APP}.dws_traffic_session_page_view_1d partition(dt='$do_date')
select
    session_id,
    mid_id,
    brand,
    model,
    operate_system,
    version_code,
    channel,
    sum(during_time),
    count(*)
from ${APP}.dwd_traffic_page_view_inc
where dt='$do_date'
group by session_id,mid_id,brand,model,operate_system,version_code,channel;
"

case $1 in
    "dws_trade_province_order_1d" )
        hive -e "$dws_trade_province_order_1d"
    ;;
    "dws_trade_user_cart_add_1d" )
        hive -e "$dws_trade_user_cart_add_1d"
    ;;
    "dws_trade_user_order_1d" )
        hive -e "$dws_trade_user_order_1d"
    ;;
    "dws_trade_user_order_refund_1d" )
        hive -e "$dws_trade_user_order_refund_1d"
    ;;
    "dws_trade_user_payment_1d" )
        hive -e "$dws_trade_user_payment_1d"
    ;;
    "dws_trade_user_sku_order_1d" )
        hive -e "$dws_trade_user_sku_order_1d"
    ;;
    "dws_trade_user_sku_order_refund_1d" )
        hive -e "$dws_trade_user_sku_order_refund_1d"
    ;;
    "dws_traffic_page_visitor_page_view_1d" )
        hive -e "$dws_traffic_page_visitor_page_view_1d"
    ;;
    "dws_traffic_session_page_view_1d" )
        hive -e "$dws_traffic_session_page_view_1d"
    ;;
    "all" )
        hive -e "$dws_trade_province_order_1d$dws_trade_user_cart_add_1d$dws_trade_user_order_1d$dws_trade_user_order_refund_1d$dws_trade_user_payment_1d$dws_trade_user_sku_order_1d$dws_trade_user_sku_order_refund_1d$dws_traffic_page_visitor_page_view_1d$dws_traffic_session_page_view_1d"
    ;;
esac

(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x dwd_to_dws_1d.sh

(4) Script usage

[atguigu@hadoop102 bin]$ dwd_to_dws_1d.sh all 2020-06-14
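
Unlike the first-day script, the daily script treats the date argument as optional and falls back to yesterday. The sketch below isolates that rule so it can be tried without running Hive. It assumes GNU date, as the script itself does, and the function name resolve_do_date is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper mirroring the daily script's date handling:
# use the date that was passed in, or fall back to yesterday.
resolve_do_date() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        # Assumption: GNU date, the same call the daily script uses.
        date -d "-1 day" +%F
    fi
}

resolve_do_date 2020-06-14   # explicit date wins
resolve_do_date              # no argument: yesterday's date
```

With this default, a scheduler that starts the job shortly after midnight can call dwd_to_dws_1d.sh all with no date and still load the day that has just ended.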

10.2 Last-n-Day Summary Tables

10.2.1 Trade Domain User-SKU Granularity Order Last-n-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_sku_order_nd;
CREATE EXTERNAL TABLE dws_trade_user_sku_order_nd
(
    `user_id`                    STRING COMMENT 'user id',
    `sku_id`                     STRING COMMENT 'sku id',
    `sku_name`                   STRING COMMENT 'sku name',
    `category1_id`               STRING COMMENT 'level-1 category id',
    `category1_name`             STRING COMMENT 'level-1 category name',
    `category2_id`               STRING COMMENT 'level-2 category id',
    `category2_name`             STRING COMMENT 'level-2 category name',
    `category3_id`               STRING COMMENT 'level-3 category id',
    `category3_name`             STRING COMMENT 'level-3 category name',
    `tm_id`                      STRING COMMENT 'brand id',
    `tm_name`                    STRING COMMENT 'brand name',
    `order_count_7d`             BIGINT COMMENT 'order count in the last 7 days',
    `order_num_7d`               BIGINT COMMENT 'items ordered in the last 7 days',
    `order_original_amount_7d`   DECIMAL(16, 2) COMMENT 'original order amount in the last 7 days',
    `activity_reduce_amount_7d`  DECIMAL(16, 2) COMMENT 'activity discount amount in the last 7 days',
    `coupon_reduce_amount_7d`    DECIMAL(16, 2) COMMENT 'coupon discount amount in the last 7 days',
    `order_total_amount_7d`      DECIMAL(16, 2) COMMENT 'final order amount in the last 7 days',
    `order_count_30d`            BIGINT COMMENT 'order count in the last 30 days',
    `order_num_30d`              BIGINT COMMENT 'items ordered in the last 30 days',
    `order_original_amount_30d`  DECIMAL(16, 2) COMMENT 'original order amount in the last 30 days',
    `activity_reduce_amount_30d` DECIMAL(16, 2) COMMENT 'activity discount amount in the last 30 days',
    `coupon_reduce_amount_30d`   DECIMAL(16, 2) COMMENT 'coupon discount amount in the last 30 days',
    `order_total_amount_30d`     DECIMAL(16, 2) COMMENT 'final order amount in the last 30 days'
) COMMENT 'Trade domain user-SKU granularity order last-n-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_sku_order_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_user_sku_order_nd partition(dt='2020-06-14')
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    sum(if(dt>=date_add('2020-06-14',-6),order_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_num_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_num_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from dws_trade_user_sku_order_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by user_id,sku_id,sku_name,category1_id,category1_name,category2_id,category2_name,category3_id,category3_name,tm_id,tm_name;
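
Both windows in this load are anchored on the partition date: the 30-day window starts at date_add('2020-06-14',-29) and the 7-day window at date_add('2020-06-14',-6), so each window includes the anchor day itself. The sketch below computes the same start dates in the shell as a sanity check. It assumes GNU date, and window_start is a hypothetical helper rather than part of the project scripts.

```shell
#!/bin/bash
# Hypothetical helper: first day of an n-day window that ends on (and includes) the anchor date.
# Mirrors date_add(anchor, -(n-1)) from the Hive load above.
window_start() {
    local anchor=$1 days=$2
    # Assumption: GNU date. TZ=UTC keeps daylight-saving shifts out of the arithmetic.
    TZ=UTC date -d "$anchor $((days - 1)) days ago" +%F
}

window_start 2020-06-14 7    # prints 2020-06-08
window_start 2020-06-14 30   # prints 2020-05-16
```

Rows from 2020-05-16 through 2020-06-07 therefore count only toward the 30d columns, while rows from 2020-06-08 onward count toward both the 7d and the 30d columns.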

10.2.2 Trade Domain User-SKU Granularity Order Refund Last-n-Day Summary Table

1) Table creation statement

DROP TABLE IF EXISTS dws_trade_user_sku_order_refund_nd;
CREATE EXTERNAL TABLE dws_trade_user_sku_order_refund_nd
(
    `user_id`                     STRING COMMENT 'user id',
    `sku_id`                      STRING COMMENT 'sku id',
    `sku_name`                    STRING COMMENT 'sku name',
    `category1_id`                STRING COMMENT 'level-1 category id',
    `category1_name`              STRING COMMENT 'level-1 category name',
    `category2_id`                STRING COMMENT 'level-2 category id',
    `category2_name`              STRING COMMENT 'level-2 category name',
    `category3_id`                STRING COMMENT 'level-3 category id',
    `category3_name`              STRING COMMENT 'level-3 category name',
    `tm_id`                       STRING COMMENT 'brand id',
    `tm_name`                     STRING COMMENT 'brand name',
    `order_refund_count_7d`       BIGINT COMMENT 'refund count in the last 7 days',
    `order_refund_num_7d`         BIGINT COMMENT 'items refunded in the last 7 days',
    `order_refund_amount_7d`      DECIMAL(16, 2) COMMENT 'refund amount in the last 7 days',
    `order_refund_count_30d`      BIGINT COMMENT 'refund count in the last 30 days',
    `order_refund_num_30d`        BIGINT COMMENT 'items refunded in the last 30 days',
    `order_refund_amount_30d`     DECIMAL(16, 2) COMMENT 'refund amount in the last 30 days'
) COMMENT 'Trade domain user-SKU granularity order refund last-n-day summary fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_sku_order_refund_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_user_sku_order_refund_nd partition(dt='2020-06-14')
select
    user_id,
    sku_id,
    sku_name,
    category1_id, category1_name,
    category2_id, category2_name,
    category3_id, category3_name,
    tm_id, tm_name,
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_num_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_amount_1d,0)),
    sum(order_refund_count_1d),
    sum(order_refund_num_1d),
    sum(order_refund_amount_1d)
from dws_trade_user_sku_order_refund_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by user_id,sku_id,sku_name,category1_id,category1_name,category2_id,category2_name,category3_id,category3_name,tm_id,tm_name;

10.2.3 Trade-Domain User-Granularity Order Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_order_nd;
CREATE EXTERNAL TABLE dws_trade_user_order_nd
(
    `user_id`                    STRING COMMENT 'User id',
    `order_count_7d`             BIGINT COMMENT 'Number of orders placed in the last 7 days',
    `order_num_7d`               BIGINT COMMENT 'Number of items ordered in the last 7 days',
    `order_original_amount_7d`   DECIMAL(16, 2) COMMENT 'Original order amount in the last 7 days',
    `activity_reduce_amount_7d`  DECIMAL(16, 2) COMMENT 'Activity discount amount on orders in the last 7 days',
    `coupon_reduce_amount_7d`    DECIMAL(16, 2) COMMENT 'Coupon discount amount on orders in the last 7 days',
    `order_total_amount_7d`      DECIMAL(16, 2) COMMENT 'Final order amount in the last 7 days',
    `order_count_30d`            BIGINT COMMENT 'Number of orders placed in the last 30 days',
    `order_num_30d`              BIGINT COMMENT 'Number of items ordered in the last 30 days',
    `order_original_amount_30d`  DECIMAL(16, 2) COMMENT 'Original order amount in the last 30 days',
    `activity_reduce_amount_30d` DECIMAL(16, 2) COMMENT 'Activity discount amount on orders in the last 30 days',
    `coupon_reduce_amount_30d`   DECIMAL(16, 2) COMMENT 'Coupon discount amount on orders in the last 30 days',
    `order_total_amount_30d`     DECIMAL(16, 2) COMMENT 'Final order amount in the last 30 days'
) COMMENT 'Trade-domain user-granularity order summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_order_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

The statement scans the 30 most recent partitions of dws_trade_user_order_1d once. The 30-day metrics sum every row in that range, while the 7-day metrics sum only the rows whose dt falls within the last 7 days.

insert overwrite table dws_trade_user_order_nd partition(dt='2020-06-14')
select
    user_id,
    sum(if(dt>=date_add('2020-06-14',-6),order_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_num_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_num_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from dws_trade_user_order_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by user_id;
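
The 7-day/30-day rollup used by the nd tables can be sanity-checked outside Hive. The following is a minimal sketch, assuming whitespace-separated input rows of the form "dt user_id value"; the function name rollup_nd and the input layout are hypothetical and not part of the project. It mirrors sum(if(dt >= start_7d, x, 0)) next to a plain sum(x) over the 30-day range, and relies on yyyy-MM-dd dates comparing correctly as strings. GNU date is assumed, as in the loading scripts.

```shell
#!/bin/bash
# rollup_nd DO_DATE
# Reads "dt user_id value" rows on stdin and prints "user_id sum_7d sum_30d"
# per user, sorted by user_id. Hypothetical helper for illustration only.
rollup_nd() {
    do_date=$1
    # Same boundaries as date_add(do_date, -6) and date_add(do_date, -29).
    start_7d=$(date -u -d "$do_date 6 days ago" +%F)
    start_30d=$(date -u -d "$do_date 29 days ago" +%F)
    awk -v end="$do_date" -v s7="$start_7d" -v s30="$start_30d" '
        # where dt >= start_30d and dt <= do_date
        $1 >= s30 && $1 <= end {
            sum30[$2] += $3                  # sum(x)
            if ($1 >= s7) sum7[$2] += $3     # sum(if(dt >= start_7d, x, 0))
        }
        END { for (u in sum30) print u, sum7[u] + 0, sum30[u] }
    ' | sort
}
```

Piping a handful of sample rows into rollup_nd 2020-06-14 shows that a row dated 2020-06-07 contributes only to the 30-day column, while a row dated 2020-06-08 contributes to both.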

10.2.4 Trade-Domain User-Granularity Add-to-Cart Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_cart_add_nd;
CREATE EXTERNAL TABLE dws_trade_user_cart_add_nd
(
    `user_id`            STRING COMMENT 'User id',
    `cart_add_count_7d`  BIGINT COMMENT 'Number of add-to-cart actions in the last 7 days',
    `cart_add_num_7d`    BIGINT COMMENT 'Number of items added to cart in the last 7 days',
    `cart_add_count_30d` BIGINT COMMENT 'Number of add-to-cart actions in the last 30 days',
    `cart_add_num_30d`   BIGINT COMMENT 'Number of items added to cart in the last 30 days'
) COMMENT 'Trade-domain user-granularity add-to-cart summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_cart_add_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_user_cart_add_nd partition(dt='2020-06-14')
select
    user_id,
    sum(if(dt>=date_add('2020-06-14',-6),cart_add_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),cart_add_num_1d,0)),
    sum(cart_add_count_1d),
    sum(cart_add_num_1d)
from dws_trade_user_cart_add_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by user_id;

10.2.5 Trade-Domain User-Granularity Payment Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_payment_nd;
CREATE EXTERNAL TABLE dws_trade_user_payment_nd
(
    `user_id`            STRING COMMENT 'User id',
    `payment_count_7d`   BIGINT COMMENT 'Number of payments in the last 7 days',
    `payment_num_7d`     BIGINT COMMENT 'Number of items paid for in the last 7 days',
    `payment_amount_7d`  DECIMAL(16, 2) COMMENT 'Payment amount in the last 7 days',
    `payment_count_30d`  BIGINT COMMENT 'Number of payments in the last 30 days',
    `payment_num_30d`    BIGINT COMMENT 'Number of items paid for in the last 30 days',
    `payment_amount_30d` DECIMAL(16, 2) COMMENT 'Payment amount in the last 30 days'
) COMMENT 'Trade-domain user-granularity payment summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_payment_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_user_payment_nd partition (dt = '2020-06-14')
select
    user_id,
    sum(if(dt >= date_add('2020-06-14', -6), payment_count_1d, 0)),
    sum(if(dt >= date_add('2020-06-14', -6), payment_num_1d, 0)),
    sum(if(dt >= date_add('2020-06-14', -6), payment_amount_1d, 0)),
    sum(payment_count_1d),
    sum(payment_num_1d),
    sum(payment_amount_1d)
from dws_trade_user_payment_1d
where dt >= date_add('2020-06-14', -29)
and dt <= '2020-06-14'
group by user_id;

10.2.6 Trade-Domain Province-Granularity Order Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_province_order_nd;
CREATE EXTERNAL TABLE dws_trade_province_order_nd
(
    `province_id`                STRING COMMENT 'Province id',
    `province_name`              STRING COMMENT 'Province name',
    `area_code`                  STRING COMMENT 'Area code',
    `iso_code`                   STRING COMMENT 'Legacy ISO 3166-2 code',
    `iso_3166_2`                 STRING COMMENT 'Current ISO 3166-2 code',
    `order_count_7d`             BIGINT COMMENT 'Number of orders placed in the last 7 days',
    `order_original_amount_7d`   DECIMAL(16, 2) COMMENT 'Original order amount in the last 7 days',
    `activity_reduce_amount_7d`  DECIMAL(16, 2) COMMENT 'Activity discount amount on orders in the last 7 days',
    `coupon_reduce_amount_7d`    DECIMAL(16, 2) COMMENT 'Coupon discount amount on orders in the last 7 days',
    `order_total_amount_7d`      DECIMAL(16, 2) COMMENT 'Final order amount in the last 7 days',
    `order_count_30d`            BIGINT COMMENT 'Number of orders placed in the last 30 days',
    `order_original_amount_30d`  DECIMAL(16, 2) COMMENT 'Original order amount in the last 30 days',
    `activity_reduce_amount_30d` DECIMAL(16, 2) COMMENT 'Activity discount amount on orders in the last 30 days',
    `coupon_reduce_amount_30d`   DECIMAL(16, 2) COMMENT 'Coupon discount amount on orders in the last 30 days',
    `order_total_amount_30d`     DECIMAL(16, 2) COMMENT 'Final order amount in the last 30 days'
) COMMENT 'Trade-domain province-granularity order summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_province_order_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_province_order_nd partition(dt='2020-06-14')
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    sum(if(dt>=date_add('2020-06-14',-6),order_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from dws_trade_province_order_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by province_id,province_name,area_code,iso_code,iso_3166_2;

10.2.7 Trade-Domain Coupon-Granularity Order Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_coupon_order_nd;
CREATE EXTERNAL TABLE dws_trade_coupon_order_nd
(
    `coupon_id`                STRING COMMENT 'Coupon id',
    `coupon_name`              STRING COMMENT 'Coupon name',
    `coupon_type_code`         STRING COMMENT 'Coupon type code',
    `coupon_type_name`         STRING COMMENT 'Coupon type name',
    `coupon_rule`              STRING COMMENT 'Coupon rule',
    `start_date`               STRING COMMENT 'Release date',
    `original_amount_30d`      DECIMAL(16, 2) COMMENT 'Original amount of orders that used the coupon',
    `coupon_reduce_amount_30d` DECIMAL(16, 2) COMMENT 'Discount amount of orders that used the coupon'
) COMMENT 'Trade-domain coupon-granularity order summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_coupon_order_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

Because the coupon dimension is the left side of the join, a coupon released in the last 30 days that was never used still produces a row, with NULL amounts.

insert overwrite table dws_trade_coupon_order_nd partition(dt='2020-06-14')
select
    id,
    coupon_name,
    coupon_type_code,
    coupon_type_name,
    benefit_rule,
    start_date,
    sum(split_original_amount),
    sum(split_coupon_amount)
from
(
    select
        id,
        coupon_name,
        coupon_type_code,
        coupon_type_name,
        benefit_rule,
        date_format(start_time,'yyyy-MM-dd') start_date
    from dim_coupon_full
    where dt='2020-06-14'
    and date_format(start_time,'yyyy-MM-dd')>=date_add('2020-06-14',-29)
)cou
left join
(
    select
        coupon_id,
        order_id,
        split_original_amount,
        split_coupon_amount
    from dwd_trade_order_detail_inc
    where dt>=date_add('2020-06-14',-29)
    and dt<='2020-06-14'
    and coupon_id is not null
)od
on cou.id=od.coupon_id
group by id,coupon_name,coupon_type_code,coupon_type_name,benefit_rule,start_date;

10.2.8 Trade-Domain Activity-Granularity Order Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_activity_order_nd;
CREATE EXTERNAL TABLE dws_trade_activity_order_nd
(
    `activity_id`                STRING COMMENT 'Activity id',
    `activity_name`              STRING COMMENT 'Activity name',
    `activity_type_code`         STRING COMMENT 'Activity type code',
    `activity_type_name`         STRING COMMENT 'Activity type name',
    `start_date`                 STRING COMMENT 'Release date',
    `original_amount_30d`        DECIMAL(16, 2) COMMENT 'Original amount of orders that took part in the activity',
    `activity_reduce_amount_30d` DECIMAL(16, 2) COMMENT 'Discount amount of orders that took part in the activity'
) COMMENT 'Trade-domain activity-granularity order summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_activity_order_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_activity_order_nd partition(dt='2020-06-14')
select
    act.activity_id,
    activity_name,
    activity_type_code,
    activity_type_name,
    date_format(start_time,'yyyy-MM-dd'),
    sum(split_original_amount),
    sum(split_activity_amount)
from
(
    select
        activity_id,
        activity_name,
        activity_type_code,
        activity_type_name,
        start_time
    from dim_activity_full
    where dt='2020-06-14'
    and date_format(start_time,'yyyy-MM-dd')>=date_add('2020-06-14',-29)
    group by activity_id, activity_name, activity_type_code, activity_type_name,start_time
)act
left join
(
    select
        activity_id,
        order_id,
        split_original_amount,
        split_activity_amount
    from dwd_trade_order_detail_inc
    where dt>=date_add('2020-06-14',-29)
    and dt<='2020-06-14'
    and activity_id is not null
)od
on act.activity_id=od.activity_id
group by act.activity_id,activity_name,activity_type_code,activity_type_name,start_time;

10.2.9 Trade-Domain User-Granularity Order Refund Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_order_refund_nd;
CREATE EXTERNAL TABLE dws_trade_user_order_refund_nd
(
    `user_id`                 STRING COMMENT 'User id',
    `order_refund_count_7d`   BIGINT COMMENT 'Number of order refunds in the last 7 days',
    `order_refund_num_7d`     BIGINT COMMENT 'Number of items refunded in the last 7 days',
    `order_refund_amount_7d`  DECIMAL(16, 2) COMMENT 'Refund amount in the last 7 days',
    `order_refund_count_30d`  BIGINT COMMENT 'Number of order refunds in the last 30 days',
    `order_refund_num_30d`    BIGINT COMMENT 'Number of items refunded in the last 30 days',
    `order_refund_amount_30d` DECIMAL(16, 2) COMMENT 'Refund amount in the last 30 days'
) COMMENT 'Trade-domain user-granularity order refund summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_order_refund_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_trade_user_order_refund_nd partition(dt='2020-06-14')
select
    user_id,
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_count_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_num_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),order_refund_amount_1d,0)),
    sum(order_refund_count_1d),
    sum(order_refund_num_1d),
    sum(order_refund_amount_1d)
from dws_trade_user_order_refund_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by user_id;

10.2.10 Traffic-Domain Visitor-Page-Granularity Page View Summary Table (Last n Days)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_traffic_page_visitor_page_view_nd;
CREATE EXTERNAL TABLE dws_traffic_page_visitor_page_view_nd
(
    `mid_id`          STRING COMMENT 'Visitor id',
    `brand`           STRING COMMENT 'Phone brand',
    `model`           STRING COMMENT 'Phone model',
    `operate_system`  STRING COMMENT 'Operating system',
    `page_id`         STRING COMMENT 'Page id',
    `during_time_7d`  BIGINT COMMENT 'Browsing duration in the last 7 days',
    `view_count_7d`   BIGINT COMMENT 'Number of visits in the last 7 days',
    `during_time_30d` BIGINT COMMENT 'Browsing duration in the last 30 days',
    `view_count_30d`  BIGINT COMMENT 'Number of visits in the last 30 days'
) COMMENT 'Traffic-domain visitor-page-granularity page view summary fact table, last n days'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_traffic_page_visitor_page_view_nd'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dws_traffic_page_visitor_page_view_nd partition(dt='2020-06-14')
select
    mid_id,
    brand,
    model,
    operate_system,
    page_id,
    sum(if(dt>=date_add('2020-06-14',-6),during_time_1d,0)),
    sum(if(dt>=date_add('2020-06-14',-6),view_count_1d,0)),
    sum(during_time_1d),
    sum(view_count_1d)
from dws_traffic_page_visitor_page_view_1d
where dt>=date_add('2020-06-14',-29)
and dt<='2020-06-14'
group by mid_id,brand,model,operate_system,page_id;

10.2.11 Data Loading Script

1) Daily data loading script

(1) On hadoop102, create dws_1d_to_dws_nd.sh in the /home/atguigu/bin directory

[atguigu@hadoop102 bin]$ vim dws_1d_to_dws_nd.sh

(2) Write the following content

#!/bin/bash
APP=gmall

# Use the date passed as the second argument if given; otherwise use the day before the current date
if [ -n "$2" ] ;then
    do_date=$2
else
    do_date=$(date -d "-1 day" +%F)
fi

dws_trade_activity_order_nd="
insert overwrite table ${APP}.dws_trade_activity_order_nd partition(dt='$do_date')
select
    act.activity_id,
    activity_name,
    activity_type_code,
    activity_type_name,
    date_format(start_time,'yyyy-MM-dd'),
    sum(split_original_amount),
    sum(split_activity_amount)
from
(
    select
        activity_id,
        activity_name,
        activity_type_code,
        activity_type_name,
        start_time
    from ${APP}.dim_activity_full
    where dt='$do_date'
    and date_format(start_time,'yyyy-MM-dd')>=date_add('$do_date',-29)
    group by activity_id, activity_name, activity_type_code, activity_type_name,start_time
)act
left join
(
    select
        activity_id,
        order_id,
        split_original_amount,
        split_activity_amount
    from ${APP}.dwd_trade_order_detail_inc
    where dt>=date_add('$do_date',-29)
    and dt<='$do_date'
    and activity_id is not null
)od
on act.activity_id=od.activity_id
group by act.activity_id,activity_name,activity_type_code,activity_type_name,start_time;
"
dws_trade_coupon_order_nd="
insert overwrite table ${APP}.dws_trade_coupon_order_nd partition(dt='$do_date')
select
    id,
    coupon_name,
    coupon_type_code,
    coupon_type_name,
    benefit_rule,
    start_date,
    sum(split_original_amount),
    sum(split_coupon_amount)
from
(
    select
        id,
        coupon_name,
        coupon_type_code,
        coupon_type_name,
        benefit_rule,
        date_format(start_time,'yyyy-MM-dd') start_date
    from ${APP}.dim_coupon_full
    where dt='$do_date'
    and date_format(start_time,'yyyy-MM-dd')>=date_add('$do_date',-29)
)cou
left join
(
    select
        coupon_id,
        order_id,
        split_original_amount,
        split_coupon_amount
    from ${APP}.dwd_trade_order_detail_inc
    where dt>=date_add('$do_date',-29)
    and dt<='$do_date'
    and coupon_id is not null
)od
on cou.id=od.coupon_id
group by id,coupon_name,coupon_type_code,coupon_type_name,benefit_rule,start_date;
"
dws_trade_province_order_nd="
insert overwrite table ${APP}.dws_trade_province_order_nd partition(dt='$do_date')
select
    province_id,
    province_name,
    area_code,
    iso_code,
    iso_3166_2,
    sum(if(dt>=date_add('$do_date',-6),order_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from ${APP}.dws_trade_province_order_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by province_id,province_name,area_code,iso_code,iso_3166_2;
"
dws_trade_user_cart_add_nd="
insert overwrite table ${APP}.dws_trade_user_cart_add_nd partition(dt='$do_date')
select
    user_id,
    sum(if(dt>=date_add('$do_date',-6),cart_add_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),cart_add_num_1d,0)),
    sum(cart_add_count_1d),
    sum(cart_add_num_1d)
from ${APP}.dws_trade_user_cart_add_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by user_id;
"
dws_trade_user_order_nd="
insert overwrite table ${APP}.dws_trade_user_order_nd partition(dt='$do_date')
select
    user_id,
    sum(if(dt>=date_add('$do_date',-6),order_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_num_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_num_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from ${APP}.dws_trade_user_order_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by user_id;
"
dws_trade_user_order_refund_nd="
insert overwrite table ${APP}.dws_trade_user_order_refund_nd partition(dt='$do_date')
select
    user_id,
    sum(if(dt>=date_add('$do_date',-6),order_refund_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_refund_num_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_refund_amount_1d,0)),
    sum(order_refund_count_1d),
    sum(order_refund_num_1d),
    sum(order_refund_amount_1d)
from ${APP}.dws_trade_user_order_refund_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by user_id;
"
dws_trade_user_payment_nd="
insert overwrite table ${APP}.dws_trade_user_payment_nd partition (dt = '$do_date')
select
    user_id,
    sum(if(dt >= date_add('$do_date', -6), payment_count_1d, 0)),
    sum(if(dt >= date_add('$do_date', -6), payment_num_1d, 0)),
    sum(if(dt >= date_add('$do_date', -6), payment_amount_1d, 0)),
    sum(payment_count_1d),
    sum(payment_num_1d),
    sum(payment_amount_1d)
from ${APP}.dws_trade_user_payment_1d
where dt >= date_add('$do_date', -29)
and dt <= '$do_date'
group by user_id;
"
dws_trade_user_sku_order_nd="
insert overwrite table ${APP}.dws_trade_user_sku_order_nd partition(dt='$do_date')
select
    user_id,
    sku_id,
    sku_name,
    category1_id,
    category1_name,
    category2_id,
    category2_name,
    category3_id,
    category3_name,
    tm_id,
    tm_name,
    sum(if(dt>=date_add('$do_date',-6),order_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_num_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_original_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),activity_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),coupon_reduce_amount_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_total_amount_1d,0)),
    sum(order_count_1d),
    sum(order_num_1d),
    sum(order_original_amount_1d),
    sum(activity_reduce_amount_1d),
    sum(coupon_reduce_amount_1d),
    sum(order_total_amount_1d)
from ${APP}.dws_trade_user_sku_order_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by user_id,sku_id,sku_name,category1_id,category1_name,category2_id,category2_name,category3_id,category3_name,tm_id,tm_name;
"
dws_trade_user_sku_order_refund_nd="
insert overwrite table ${APP}.dws_trade_user_sku_order_refund_nd partition(dt='$do_date')
select
    user_id,
    sku_id,
    sku_name,
    category1_id,
    category1_name,
    category2_id,
    category2_name,
    category3_id,
    category3_name,
    tm_id,
    tm_name,
    sum(if(dt>=date_add('$do_date',-6),order_refund_count_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_refund_num_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),order_refund_amount_1d,0)),
    sum(order_refund_count_1d),
    sum(order_refund_num_1d),
    sum(order_refund_amount_1d)
from ${APP}.dws_trade_user_sku_order_refund_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by user_id,sku_id,sku_name,category1_id,category1_name,category2_id,category2_name,category3_id,category3_name,tm_id,tm_name;
"
dws_traffic_page_visitor_page_view_nd="
insert overwrite table ${APP}.dws_traffic_page_visitor_page_view_nd partition(dt='$do_date')
select
    mid_id,
    brand,
    model,
    operate_system,
    page_id,
    sum(if(dt>=date_add('$do_date',-6),during_time_1d,0)),
    sum(if(dt>=date_add('$do_date',-6),view_count_1d,0)),
    sum(during_time_1d),
    sum(view_count_1d)
from ${APP}.dws_traffic_page_visitor_page_view_1d
where dt>=date_add('$do_date',-29)
and dt<='$do_date'
group by mid_id,brand,model,operate_system,page_id;
"

case $1 in
    "dws_trade_activity_order_nd" )
        hive -e "$dws_trade_activity_order_nd"
    ;;
    "dws_trade_coupon_order_nd" )
        hive -e "$dws_trade_coupon_order_nd"
    ;;
    "dws_trade_province_order_nd" )
        hive -e "$dws_trade_province_order_nd"
    ;;
    "dws_trade_user_cart_add_nd" )
        hive -e "$dws_trade_user_cart_add_nd"
    ;;
    "dws_trade_user_order_nd" )
        hive -e "$dws_trade_user_order_nd"
    ;;
    "dws_trade_user_order_refund_nd" )
        hive -e "$dws_trade_user_order_refund_nd"
    ;;
    "dws_trade_user_payment_nd" )
        hive -e "$dws_trade_user_payment_nd"
    ;;
    "dws_trade_user_sku_order_nd" )
        hive -e "$dws_trade_user_sku_order_nd"
    ;;
    "dws_trade_user_sku_order_refund_nd" )
        hive -e "$dws_trade_user_sku_order_refund_nd"
    ;;
    "dws_traffic_page_visitor_page_view_nd" )
        hive -e "$dws_traffic_page_visitor_page_view_nd"
    ;;
    "all" )
        hive -e "$dws_trade_activity_order_nd$dws_trade_coupon_order_nd$dws_trade_province_order_nd$dws_trade_user_cart_add_nd$dws_trade_user_order_nd$dws_trade_user_order_refund_nd$dws_trade_user_payment_nd$dws_trade_user_sku_order_nd$dws_trade_user_sku_order_refund_nd$dws_traffic_page_visitor_page_view_nd"
    ;;
esac

(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x dws_1d_to_dws_nd.sh

(4) Script usage

[atguigu@hadoop102 bin]$ dws_1d_to_dws_nd.sh all 2020-06-14
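
Since the script takes a single date, reloading several days means calling it once per day. The following is a minimal sketch of a date generator for that purpose; the function name date_range is hypothetical and not part of the project, and GNU date is assumed, as in the script itself.

```shell
#!/bin/bash
# date_range START END
# Prints every date from START to END inclusive, one per line, in
# yyyy-MM-dd form. Prints nothing if START is after END.
# Hypothetical helper for illustration only.
date_range() {
    cur=$1
    end_s=$(date -u -d "$2" +%s)
    # Compare as epoch seconds so the loop stops right after END.
    while [ "$(date -u -d "$cur" +%s)" -le "$end_s" ]; do
        echo "$cur"
        cur=$(date -u -d "$cur 1 day" +%F)
    done
}
```

A backfill could then look like: for d in $(date_range 2020-06-14 2020-06-16); do dws_1d_to_dws_nd.sh all "$d"; done. Each run overwrites only its own dt partition, so the runs do not interfere with each other.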

10.3 History-to-Date Summary Tables

10.3.1 Trade-Domain User-Granularity Order Summary Table (History to Date)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_order_td;
CREATE EXTERNAL TABLE dws_trade_user_order_td
(
    `user_id`                   STRING COMMENT 'User id',
    `order_date_first`          STRING COMMENT 'First order date',
    `order_date_last`           STRING COMMENT 'Last order date',
    `order_count_td`            BIGINT COMMENT 'Number of orders placed',
    `order_num_td`              BIGINT COMMENT 'Number of items purchased',
    `original_amount_td`        DECIMAL(16, 2) COMMENT 'Original amount',
    `activity_reduce_amount_td` DECIMAL(16, 2) COMMENT 'Activity discount amount',
    `coupon_reduce_amount_td`   DECIMAL(16, 2) COMMENT 'Coupon discount amount',
    `total_amount_td`           DECIMAL(16, 2) COMMENT 'Final amount'
) COMMENT 'Trade-domain user-granularity order summary fact table, history to date'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_order_td'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

The first-day load aggregates every partition of the 1d table, so min(dt) and max(dt) give each user's first and last order dates.

insert overwrite table dws_trade_user_order_td partition(dt='2020-06-14')
select
    user_id,
    min(dt) order_date_first,
    max(dt) order_date_last,
    sum(order_count_1d) order_count,
    sum(order_num_1d) order_num,
    sum(order_original_amount_1d) original_amount,
    sum(activity_reduce_amount_1d) activity_reduce_amount,
    sum(coupon_reduce_amount_1d) coupon_reduce_amount,
    sum(order_total_amount_1d) total_amount
from dws_trade_user_order_1d
group by user_id;

(2) Daily load

The daily load merges the previous day's td partition with the current day's 1d partition through a full outer join, so it keeps users with no new orders and adds users who ordered for the first time.

insert overwrite table dws_trade_user_order_td partition(dt='2020-06-15')
select
    nvl(old.user_id,new.user_id),
    if(new.user_id is not null and old.user_id is null,'2020-06-15',old.order_date_first),
    if(new.user_id is not null,'2020-06-15',old.order_date_last),
    nvl(old.order_count_td,0)+nvl(new.order_count_1d,0),
    nvl(old.order_num_td,0)+nvl(new.order_num_1d,0),
    nvl(old.original_amount_td,0)+nvl(new.order_original_amount_1d,0),
    nvl(old.activity_reduce_amount_td,0)+nvl(new.activity_reduce_amount_1d,0),
    nvl(old.coupon_reduce_amount_td,0)+nvl(new.coupon_reduce_amount_1d,0),
    nvl(old.total_amount_td,0)+nvl(new.order_total_amount_1d,0)
from
(
    select
        user_id,
        order_date_first,
        order_date_last,
        order_count_td,
        order_num_td,
        original_amount_td,
        activity_reduce_amount_td,
        coupon_reduce_amount_td,
        total_amount_td
    from dws_trade_user_order_td
    where dt=date_add('2020-06-15',-1)
)old
full outer join
(
    select
        user_id,
        order_count_1d,
        order_num_1d,
        order_original_amount_1d,
        activity_reduce_amount_1d,
        coupon_reduce_amount_1d,
        order_total_amount_1d
    from dws_trade_user_order_1d
    where dt='2020-06-15'
)new
on old.user_id=new.user_id;
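
The full outer join merge in the daily load can be traced on small text files. The following is a minimal sketch, assuming whitespace-separated rows and a single count metric; the function name merge_td and both file layouts are hypothetical simplifications, not the project's actual data format.

```shell
#!/bin/bash
# merge_td OLD_FILE NEW_FILE DO_DATE
#   OLD_FILE rows: user_id first_date last_date count_td  (previous td partition)
#   NEW_FILE rows: user_id count_1d                        (current 1d partition)
# Prints the merged td rows sorted by user_id.
# Hypothetical helper for illustration only.
merge_td() {
    awk -v d="$3" '
        # Rows from the previous td partition are carried over as-is.
        FILENAME == ARGV[1] { first[$1] = $2; last[$1] = $3; cnt[$1] = $4; next }
        # Rows from the current 1d partition update or create a user.
        {
            if (!($1 in first)) first[$1] = d   # user seen for the first time
            last[$1] = d                        # active today
            cnt[$1] += $2                       # nvl(old, 0) + nvl(new, 0)
        }
        END { for (u in first) print u, first[u], last[u], cnt[u] + 0 }
    ' "$1" "$2" | sort
}
```

A user present only in OLD_FILE keeps its row unchanged, a user present only in NEW_FILE gets DO_DATE as both first and last date, and a user present in both keeps its first date while the last date and count advance.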

10.3.2 Trade-Domain User-Granularity Payment Summary Table (History to Date)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_trade_user_payment_td;
CREATE EXTERNAL TABLE dws_trade_user_payment_td
(
    `user_id`            STRING COMMENT 'User id',
    `payment_date_first` STRING COMMENT 'First payment date',
    `payment_date_last`  STRING COMMENT 'Last payment date',
    `payment_count_td`   BIGINT COMMENT 'Cumulative number of payments',
    `payment_num_td`     BIGINT COMMENT 'Cumulative number of items paid for',
    `payment_amount_td`  DECIMAL(16, 2) COMMENT 'Cumulative payment amount'
) COMMENT 'Trade-domain user-granularity payment summary fact table, history to date'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_trade_user_payment_td'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

insert overwrite table dws_trade_user_payment_td partition(dt='2020-06-14')
select
    user_id,
    min(dt) payment_date_first,
    max(dt) payment_date_last,
    sum(payment_count_1d) payment_count,
    sum(payment_num_1d) payment_num,
    sum(payment_amount_1d) payment_amount
from dws_trade_user_payment_1d
group by user_id;

(2) Daily load

insert overwrite table dws_trade_user_payment_td partition(dt='2020-06-15')
select
    nvl(old.user_id,new.user_id),
    if(old.user_id is null and new.user_id is not null,'2020-06-15',old.payment_date_first),
    if(new.user_id is not null,'2020-06-15',old.payment_date_last),
    nvl(old.payment_count_td,0)+nvl(new.payment_count_1d,0),
    nvl(old.payment_num_td,0)+nvl(new.payment_num_1d,0),
    nvl(old.payment_amount_td,0)+nvl(new.payment_amount_1d,0)
from
(
    select
        user_id,
        payment_date_first,
        payment_date_last,
        payment_count_td,
        payment_num_td,
        payment_amount_td
    from dws_trade_user_payment_td
    where dt=date_add('2020-06-15',-1)
)old
full outer join
(
    select
        user_id,
        payment_count_1d,
        payment_num_1d,
        payment_amount_1d
    from dws_trade_user_payment_1d
    where dt='2020-06-15'
)new
on old.user_id=new.user_id;

10.3.3 User-Domain User-Granularity Login Summary Table (History to Date)

1) CREATE TABLE statement

DROP TABLE IF EXISTS dws_user_user_login_td;
CREATE EXTERNAL TABLE dws_user_user_login_td
(
    `user_id`         STRING COMMENT 'User id',
    `login_date_last` STRING COMMENT 'Last login date',
    `login_count_td`  BIGINT COMMENT 'Cumulative number of logins'
) COMMENT 'User-domain user-granularity login summary fact table, history to date'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dws/dws_user_user_login_td'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

(1) First-day load

The user dimension is the left side of the join, so every current user gets a row. A user with no login record falls back to the registration date as the last login date and to a login count of 1.

insert overwrite table dws_user_user_login_td partition(dt='2020-06-14')
select
    u.id,
    nvl(login_date_last,date_format(create_time,'yyyy-MM-dd')),
    nvl(login_count_td,1)
from
(
    select
        id,
        create_time
    from dim_user_zip
    where dt='9999-12-31'
)u
left join
(
    select
        user_id,
        max(dt) login_date_last,
        count(*) login_count_td
    from dwd_user_login_inc
    group by user_id
)l
on u.id=l.user_id;
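
The left join with nvl defaults in the first-day login load can be mimicked on text files to see which users receive default values. This is a minimal sketch under simplified assumptions: the function name login_td_init and both file layouts are hypothetical, and create_time is assumed to be already truncated to a date.

```shell
#!/bin/bash
# login_td_init USERS_FILE LOGINS_FILE
#   USERS_FILE rows:  user_id create_date   (stands in for dim_user_zip)
#   LOGINS_FILE rows: user_id dt            (one row per login)
# Prints "user_id login_date_last login_count_td" sorted by user_id.
# Hypothetical helper for illustration only.
login_td_init() {
    awk '
        FILENAME == ARGV[1] { create[$1] = $2; next }   # left side: all users
        {
            cnt[$1]++                                   # count(*)
            if ($2 > last[$1]) last[$1] = $2            # max(dt)
        }
        END {
            for (u in create) {
                if (u in cnt) print u, last[u], cnt[u]
                else          print u, create[u], 1     # nvl defaults
            }
        }
    ' "$1" "$2" | sort
}
```

Logins from users missing in USERS_FILE are dropped, which matches the left join: only users present in the dimension appear in the result.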

(2) Daily load

insert overwrite table dws_user_user_login_td partition(dt='2020-06-15')
select
    nvl(old.user_id,new.user_id),
    if(new.user_id is null,old.login_date_last,'2020-06-15'),
    nvl(old.login_count_td,0)+nvl(new.login_count_1d,0)
from
(
    select
        user_id,
        login_date_last,
        login_count_td
    from dws_user_user_login_td
    where dt=date_add('2020-06-15',-1)
)old
full outer join
(
    select
        user_id,
        count(*) login_count_1d
    from dwd_user_login_inc
    where dt='2020-06-15'
    group by user_id
)new
on old.user_id=new.user_id;

10.3.4 Data Loading Scripts

1) First-day data loading script

(1) On hadoop102, create dws_1d_to_dws_td_init.sh in the /home/atguigu/bin directory

[atguigu@hadoop102 bin]$ vim dws_1d_to_dws_td_init.sh

(2) Write the following content

#!/bin/bash
APP=gmall

# The first-day load has no default date: the second argument is required
if [ -n "$2" ] ;then
    do_date=$2
else
    echo "Please pass a date argument"
    exit 1
fi

dws_trade_user_order_td="
insert overwrite table ${APP}.dws_trade_user_order_td partition(dt='$do_date')
select
    user_id,
    min(dt) order_date_first,
    max(dt) order_date_last,
    sum(order_count_1d) order_count,
    sum(order_num_1d) order_num,
    sum(order_original_amount_1d) original_amount,
    sum(activity_reduce_amount_1d) activity_reduce_amount,
    sum(coupon_reduce_amount_1d) coupon_reduce_amount,
    sum(order_total_amount_1d) total_amount
from ${APP}.dws_trade_user_order_1d
group by user_id;
"
dws_trade_user_payment_td="
insert overwrite table ${APP}.dws_trade_user_payment_td partition(dt='$do_date')
select
    user_id,
    min(dt) payment_date_first,
    max(dt) payment_date_last,
    sum(payment_count_1d) payment_count,
    sum(payment_num_1d) payment_num,
    sum(payment_amount_1d) payment_amount
from ${APP}.dws_trade_user_payment_1d
group by user_id;
"
dws_user_user_login_td="
insert overwrite table ${APP}.dws_user_user_login_td partition(dt='$do_date')
select
    u.id,
    nvl(login_date_last,date_format(create_time,'yyyy-MM-dd')),
    nvl(login_count_td,1)
from
(
    select
        id,
        create_time
    from ${APP}.dim_user_zip
    where dt='9999-12-31'
)u
left join
(
    select
        user_id,
        max(dt) login_date_last,
        count(*) login_count_td
    from ${APP}.dwd_user_login_inc
    group by user_id
)l
on u.id=l.user_id;
"

case $1 in
    "dws_trade_user_order_td" )
        hive -e "$dws_trade_user_order_td"
    ;;
    "dws_trade_user_payment_td" )
        hive -e "$dws_trade_user_payment_td"
    ;;
    "dws_user_user_login_td" )
        hive -e "$dws_user_user_login_td"
    ;;
    "all" )
        hive -e "$dws_trade_user_order_td$dws_trade_user_payment_td$dws_user_user_login_td"
    ;;
esac

(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x dws_1d_to_dws_td_init.sh

(4) Script usage

[atguigu@hadoop102 bin]$ dws_1d_to_dws_td_init.sh all 2020-06-14
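
The script only checks that a date argument is present, not that it is a valid date, and a malformed value would end up as the dt partition name. The following is a minimal sketch of a stricter check that could run before any statement is submitted; the function name is_valid_date is hypothetical and not part of the project, and GNU date is assumed.

```shell
#!/bin/bash
# is_valid_date STR
# Succeeds only if STR is a real calendar date written as yyyy-MM-dd.
# Hypothetical helper for illustration only.
is_valid_date() {
    # Round-trip the value through date: an impossible date fails to parse,
    # and any other spelling comes back in a different form than it went in.
    [ "$(date -u -d "$1" +%F 2>/dev/null)" = "$1" ]
}
```

In the script it could be used as: is_valid_date "$2" || { echo "invalid date: $2"; exit 1; }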

2) Daily data loading script

(1) On hadoop102, create dws_1d_to_dws_td.sh in the /home/atguigu/bin directory

[atguigu@hadoop102 bin]$ vim dws_1d_to_dws_td.sh

(2) Write the following content

#!/bin/bash
APP=gmall

# Use the date passed as the second argument if given; otherwise use the day before the current date
if [ -n "$2" ] ;then
    do_date=$2
else
    do_date=$(date -d "-1 day" +%F)
fi

dws_trade_user_order_td="
insert overwrite table ${APP}.dws_trade_user_order_td partition(dt='$do_date')
select
    nvl(old.user_id,new.user_id),
    if(new.user_id is not null and old.user_id is null,'$do_date',old.order_date_first),
    if(new.user_id is not null,'$do_date',old.order_date_last),
    nvl(old.order_count_td,0)+nvl(new.order_count_1d,0),
    nvl(old.order_num_td,0)+nvl(new.order_num_1d,0),
    nvl(old.original_amount_td,0)+nvl(new.order_original_amount_1d,0),
    nvl(old.activity_reduce_amount_td,0)+nvl(new.activity_reduce_amount_1d,0),
    nvl(old.coupon_reduce_amount_td,0)+nvl(new.coupon_reduce_amount_1d,0),
    nvl(old.total_amount_td,0)+nvl(new.order_total_amount_1d,0)
from
(
    select
        user_id,
        order_date_first,
        order_date_last,
        order_count_td,
        order_num_td,
        original_amount_td,
        activity_reduce_amount_td,
        coupon_reduce_amount_td,
        total_amount_td
    from ${APP}.dws_trade_user_order_td
    where dt=date_add('$do_date',-1)
)old
full outer join
(
    select
        user_id,
        order_count_1d,
        order_num_1d,
        order_original_amount_1d,
        activity_reduce_amount_1d,
        coupon_reduce_amount_1d,
        order_total_amount_1d
    from ${APP}.dws_trade_user_order_1d
    where dt='$do_date'
)new
on old.user_id=new.user_id;
"
dws_trade_user_payment_td="
insert overwrite table ${APP}.dws_trade_user_payment_td partition(dt='$do_date')
select
    nvl(old.user_id,new.user_id),
    if(old.user_id is null and new.user_id is not null,'$do_date',old.payment_date_first),
    if(new.user_id is not null,'$do_date',old.payment_date_last),
    nvl(old.payment_count_td,0)+nvl(new.payment_count_1d,0),
    nvl(old.payment_num_td,0)+nvl(new.payment_num_1d,0),
    nvl(old.payment_amount_td,0)+nvl(new.payment_amount_1d,0)
from
(
    select
        user_id,
        payment_date_first,
        payment_date_last,
        payment_count_td,
        payment_num_td,
        payment_amount_td
    from ${APP}.dws_trade_user_payment_td
    where dt=date_add('$do_date',-1)
)old
full outer join
(
    select
        user_id,
        payment_count_1d,
        payment_num_1d,
        payment_amount_1d
    from ${APP}.dws_trade_user_payment_1d
    where dt='$do_date'
)new
on old.user_id=new.user_id;
"
dws_user_user_login_td="
insert overwrite table ${APP}.dws_user_user_login_td partition(dt='$do_date')
select
    nvl(old.user_id,new.user_id),
    if(new.user_id is null,old.login_date_last,'$do_date'),
    nvl(old.login_count_td,0)+nvl(new.login_count_1d,0)
from
(
    select
        user_id,
        login_date_last,
        login_count_td
    from ${APP}.dws_user_user_login_td
    where dt=date_add('$do_date',-1)
)old
full outer join
(
    select
        user_id,
        count(*) login_count_1d
    from ${APP}.dwd_user_login_inc
    where dt='$do_date'
    group by user_id
)new
on old.user_id=new.user_id;
"

case $1 in
    "dws_trade_user_order_td" )
        hive -e "$dws_trade_user_order_td"
    ;;
    "dws_trade_user_payment_td" )
        hive -e "$dws_trade_user_payment_td"
    ;;
    "dws_user_user_login_td" )
        hive -e "$dws_user_user_login_td"
    ;;
    "all" )
        hive -e "$dws_trade_user_order_td$dws_trade_user_payment_td$dws_user_user_login_td"
    ;;
esac

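(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x dws_1d_to_dws_td.sh

(4) Script usage

The daily load reads the previous day's partition of each td table, so run it only for dates after the first-day load (2020-06-14 in this walkthrough). Running it for the first-day date itself would overwrite that partition with a single day of data.

[atguigu@hadoop102 bin]$ dws_1d_to_dws_td.sh all 2020-06-15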
