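Data Warehouse Development: How to Measure Campaign Performance?

Background

Business overview: users are acquired through a low-price lead-in product and then repurchase high-price products; a user may repurchase more than once. Both low-price and high-price orders can be refunded. Because of various complex scenarios, a high-price order may be refunded several times. A low-price order is always refunded in full, so it never has more than one refund.
Business requirement: measure the downstream output of the traffic acquired through low-price orders. The repurchase window is 30 days, and the refund window is also 30 days (refund time minus repurchase time). The final deliverable is a table with one row per acquisition (low-price) order.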
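The 30-day rule can be read as an inclusive comparison against the start time plus 30 days, which is how the queries later in this article apply it. Below is a minimal Python sketch of that rule. The function name `within_window` is hypothetical, and counting the boundary day as inside the window is an assumption taken from the `<=` comparison in those queries.

```python
from datetime import date, timedelta
from typing import Optional

WINDOW_DAYS = 30  # repurchase and refund windows from the requirement


def within_window(start: date, event: Optional[date], days: int = WINDOW_DAYS) -> bool:
    """Return True when `event` is no later than `start` + `days` (inclusive)."""
    if event is None:  # no repurchase or refund happened
        return False
    return event <= start + timedelta(days=days)


# The repurchase window is measured from the low-price payment.
print(within_window(date(2024, 1, 2), date(2024, 2, 1)))  # True: exactly day 30
print(within_window(date(2024, 1, 2), date(2024, 2, 4)))  # False: day 33
# The refund window is measured from the high-price payment.
print(within_window(date(2024, 2, 4), date(2024, 3, 6)))  # False: day 31
```

The two windows start from different timestamps, so a refund can be inside its own window while the order it belongs to is outside the repurchase window.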

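Architecture Design

Because acquisition orders and repurchase orders differ in nature, they were already split into separate tables when the source-aligned (ODS) layer of the warehouse was built; the work here builds on that layer.
The acquisition order records, repurchase order records, and refund records are joined with the acquisition order as the driving table, producing a fact table with no business logic applied: the order-chain table. On top of it, different aggregate tables are built for different business rules.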

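Data Development

Note: for ease of testing, the code below uses MySQL syntax, while production development uses Alibaba Cloud MaxCompute SQL. For this code the difference lies in the date function date_add(). In MaxCompute SQL the form is date_add(<date_col_name>, 30); if the column holds both a date and a time, use dateadd(<datetime_col_name>, 30, 'dd') instead.

The order-chain table is easy to build: join the three tables on the user information. Pseudocode: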

select *
from <acquisition order records>
left join <repurchase order records> on <user info>
left join <refund records> on <user info>
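The pseudocode above can be tried out with a small runnable sketch. It uses Python's standard-library sqlite3 module rather than MySQL or MaxCompute. The table names (`lead_orders`, `repeat_orders`, `refunds`), the `user_id` column, and the choice to join refunds on the high-price order id are all assumptions made for illustration; the article does not specify the real source-layer schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lead_orders   (user_id TEXT, low_price_order_id TEXT, low_price_paid_time TEXT);
CREATE TABLE repeat_orders (user_id TEXT, high_price_order_id TEXT,
                            high_price_paid_time TEXT, high_price_paid_amount REAL);
CREATE TABLE refunds       (order_id TEXT, refund_time TEXT, refund_amount REAL);

INSERT INTO lead_orders   VALUES ('u7', '10007', '2024-01-02');
INSERT INTO repeat_orders VALUES ('u7', '20006', '2024-01-03', 2000);
INSERT INTO refunds       VALUES ('20006', '2024-01-03', 1000),
                                 ('20006', '2024-01-04', 1000);
""")

# The acquisition order is the driving table; everything else is left-joined to it.
rows = conn.execute("""
SELECT l.low_price_order_id, r.high_price_order_id, r.high_price_paid_amount,
       f.refund_time, f.refund_amount
FROM lead_orders l
LEFT JOIN repeat_orders r ON r.user_id = l.user_id
LEFT JOIN refunds f       ON f.order_id = r.high_price_order_id
ORDER BY f.refund_time
""").fetchall()

for row in rows:
    print(row)
```

One high-price order with two refunds comes back as two rows, each repeating the 2000 payment amount. That repetition is harmless in a detail table but matters as soon as the rows are summed.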

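With the order-chain table in place, the next step is to aggregate it into the acquisition-order output table, applying the 30-day repurchase window and the 30-day refund window.
The order-chain table is abstracted into the following 10 fields:

  • low_price_order_id: low-price order ID
  • low_price_paid_time: low-price order payment time
  • low_price_paid_amount: low-price order payment amount
  • low_price_refund_time: low-price order refund time
  • low_price_refund_amount: low-price order refund amount
  • high_price_order_id: high-price order ID
  • high_price_paid_time: high-price order payment time
  • high_price_paid_amount: high-price order payment amount
  • high_price_refund_time: high-price order refund time
  • high_price_refund_amount: high-price order refund amount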

with user_orders as (
-- low-price order only (column names are defined by this first select)
select '10001' as low_price_order_id, '2024-01-01' as low_price_paid_time, 1.0 as low_price_paid_amount, null as low_price_refund_time, 0 as low_price_refund_amount, null as high_price_order_id, null as high_price_paid_time, 0 as high_price_paid_amount, null as high_price_refund_time, 0 as high_price_refund_amount
union all
-- low-price order only, refunded
select '10002', '2024-01-01', 1.0, '2024-01-02', 1.0, null, null, 0, null, 0
union all
-- low-price order + one high-price order
select '10003', '2024-01-01', 1.0, null, 0, '20001', '2024-01-01', 1000, null, 0
union all
-- low-price order + two high-price orders
select '10004', '2024-01-01', 1.0, null, 0, '20002', '2024-01-02', 2000, null, 0
union all
select '10004', '2024-01-01', 1.0, null, 0, '20004', '2024-01-05', 1000, null, 0
union all
-- low-price order + one high-price order within 30 days + one outside 30 days
select '10005', '2024-01-02', 1.0, null, 0, '20003', '2024-01-02', 2000, null, 0
union all
select '10005', '2024-01-02', 1.0, null, 0, '20010', '2024-02-04', 1500, null, 0
union all
-- low-price order + one high-price order with one refund
select '10006', '2024-01-02', 1.0, null, 0, '20005', '2024-01-03', 2000, '2024-01-05', 2000
union all
-- low-price order + one high-price order with two refunds
select '10007', '2024-01-02', 1.0, null, 0, '20006', '2024-01-03', 2000, '2024-01-03', 1000
union all
select '10007', '2024-01-02', 1.0, null, 0, '20006', '2024-01-03', 2000, '2024-01-04', 1000
union all
-- low-price order + one high-price order with one refund within 30 days and one outside
select '10008', '2024-01-03', 1.0, null, 0, '20007', '2024-01-03', 2000, '2024-01-05', 100
union all
select '10008', '2024-01-03', 1.0, null, 0, '20007', '2024-01-03', 2000, '2024-02-05', 1900
union all
-- low-price order + one high-price order within 30 days + one outside 30 days,
-- where the latter has one refund within 30 days and one outside
select '10009', '2024-01-03', 1.0, null, 0, '20008', '2024-01-03', 1500, null, 0
union all
select '10009', '2024-01-03', 1.0, null, 0, '20009', '2024-02-04', 2000, '2024-02-05', 100
union all
select '10009', '2024-01-03', 1.0, null, 0, '20009', '2024-02-04', 2000, '2024-03-06', 1900
)
select *
from user_orders;

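The data covers the following cases:

  • Some users have only a low-price order, possibly refunded; the refund may fall within or outside 30 days.
  • Some have a high-price order, which may fall within or outside 30 days.
  • Some have several high-price orders, each of which may fall within or outside 30 days.
  • Some high-price orders have a refund, which may fall within or outside 30 days.
  • Some high-price orders have several refunds, each of which may fall within or outside 30 days.

[Figure: sample rows of user_orders]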

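The target table has the following fields:

  • low_price_order_id: low-price order ID
  • low_price_paid_time: low-price order payment time
  • low_price_paid_amount: low-price order payment amount
  • low_price_refund_amount: low-price order refund amount within 30 days
  • high_price_paid_amount: high-price order payment amount within 30 days
  • high_price_refund_amount: high-price order refund amount within 30 days

Now for the aggregation into the target table.
The goal is one record per low-price order, so why not simply group by the low-price order?
Logically that sounds fine, so let's try aggregating directly.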

Without any date restriction, the query has this structure:

select uos.low_price_order_id,
       uos.low_price_paid_time,
       uos.low_price_paid_amount,
       sum(uos.low_price_refund_amount)  as low_price_refund_amount,
       sum(uos.high_price_paid_amount)   as high_price_paid_amount,
       sum(uos.high_price_refund_amount) as high_price_refund_amount
from user_orders uos
group by uos.low_price_order_id, uos.low_price_paid_time, uos.low_price_paid_amount

With the 30-day windows, each aggregated field gets its own boundary check:

select uos.low_price_order_id,
       uos.low_price_paid_time,
       uos.low_price_paid_amount,
       -- low-price refunds within 30 days of the low-price payment
       sum(case when uos.low_price_refund_time <= date_add(uos.low_price_paid_time, interval 30 day)
                then uos.low_price_refund_amount else 0 end) as low_price_refund_amount,
       -- high-price payments within 30 days of the low-price payment
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, interval 30 day)
                then uos.high_price_paid_amount else 0 end) as high_price_paid_amount,
       -- refunds within 30 days of a high-price payment that is itself inside the repurchase window
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, interval 30 day)
                 and uos.high_price_refund_time <= date_add(uos.high_price_paid_time, interval 30 day)
                then uos.high_price_refund_amount else 0 end) as high_price_refund_amount
from user_orders uos
group by uos.low_price_order_id, uos.low_price_paid_time, uos.low_price_paid_amount

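A look at the result shows this clearly does not work. When a high-price order has several refunds, its rows fan out: the high-price payment amount is repeated on every refund row, so a direct aggregation counts it multiple times.
[Figure: result of the direct aggregation]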
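The inflation is easy to reproduce in isolation. The sketch below loads only the two rows of order 10007 from the sample data into an in-memory SQLite table (sqlite3 is used here purely as a convenient test harness; the table is trimmed to the four columns that matter) and sums them in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user_orders (
    low_price_order_id TEXT, high_price_order_id TEXT,
    high_price_paid_amount REAL, high_price_refund_amount REAL)""")
# Order 10007: one high-price order of 2000 with two refunds of 1000 each.
conn.executemany("INSERT INTO user_orders VALUES (?, ?, ?, ?)", [
    ("10007", "20006", 2000, 1000),
    ("10007", "20006", 2000, 1000),
])

paid, refunded = conn.execute("""
    SELECT SUM(high_price_paid_amount), SUM(high_price_refund_amount)
    FROM user_orders
    GROUP BY low_price_order_id
""").fetchone()
print(paid, refunded)  # 4000.0 2000.0: the payment is counted once per refund row
```

The refund total is right because each refund really is a separate event, but the payment of 2000 shows up as 4000 because it rides along on both refund rows.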

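Since the high-price orders fan out, the aggregation has to be done in two stages to keep records unique: first aggregate the refund data by low-price order and high-price order, then aggregate the high-price payment and refund amounts by low-price order. The data flow is shown below.
[Figure: data flow of the two-stage aggregation]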

Without any time restriction, it looks like this:

select uos.low_price_order_id,
       uos.low_price_paid_amount,
       uos.low_price_paid_time,
       -- the low-price refund repeats on every high-price row, so take it once with max() instead of summing
       max(uos.low_price_refund_amount)  as low_price_refund_amount,
       sum(uos.high_price_paid_amount)   as high_price_paid_amount,
       sum(uos.high_price_refund_amount) as high_price_refund_amount
from (
    -- stage 1: one row per (low-price order, high-price order), refunds summed
    select uos.low_price_order_id,
           uos.low_price_paid_amount,
           uos.low_price_paid_time,
           uos.low_price_refund_amount,
           uos.low_price_refund_time,
           uos.high_price_order_id,
           uos.high_price_paid_time,
           uos.high_price_paid_amount,
           sum(uos.high_price_refund_amount) as high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time,
             uos.low_price_refund_amount, uos.low_price_refund_time,
             uos.high_price_order_id, uos.high_price_paid_time, uos.high_price_paid_amount
) uos
group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time

Next, add the restrictions.
First, aggregate the refund data by low-price order and high-price order, summing only the refunds whose refund time is within 30 days of the high-price payment time.
Then aggregate by low-price order: sum the payment and refund amounts of the high-price orders whose payment time is within 30 days of the low-price payment time, and at the same step keep the low-price refund amount when the low-price refund time is within 30 days of the low-price payment time. This yields the final target table.

select uos.low_price_order_id,
       uos.low_price_paid_amount,
       uos.low_price_paid_time,
       -- the low-price refund repeats on every high-price row, so take it once with max() instead of summing
       max(case when uos.low_price_refund_time <= date_add(uos.low_price_paid_time, interval 30 day)
                then uos.low_price_refund_amount else 0 end) as low_price_refund_amount,
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, interval 30 day)
                then uos.high_price_paid_amount else 0 end) as high_price_paid_amount,
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, interval 30 day)
                then uos.high_price_refund_amount else 0 end) as high_price_refund_amount
from (
    -- stage 1: one row per (low-price order, high-price order), keeping refunds inside the 30-day refund window
    select uos.low_price_order_id,
           uos.low_price_paid_amount,
           uos.low_price_paid_time,
           uos.low_price_refund_amount,
           uos.low_price_refund_time,
           uos.high_price_order_id,
           uos.high_price_paid_time,
           uos.high_price_paid_amount,
           sum(case when uos.high_price_refund_time <= date_add(uos.high_price_paid_time, interval 30 day)
                    then uos.high_price_refund_amount else 0 end) as high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time,
             uos.low_price_refund_amount, uos.low_price_refund_time,
             uos.high_price_order_id, uos.high_price_paid_time, uos.high_price_paid_amount
) uos
group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time

The result is shown below and matches what the requirement expects.
[Figure: result of the two-stage aggregation]
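As a cross-check, the two-stage logic can be run against a few of the sample orders in SQLite. This is a sketch under stated assumptions, not the production query: column names are shortened for readability, only the high-price columns are kept, and SQLite's `date(x, '+30 days')` stands in for MySQL's `date_add(x, interval 30 day)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user_orders (
    low_id TEXT, low_paid_time TEXT,
    high_id TEXT, high_paid_time TEXT, high_paid REAL,
    high_refund_time TEXT, high_refund REAL)""")
conn.executemany("INSERT INTO user_orders VALUES (?, ?, ?, ?, ?, ?, ?)", [
    # 10007: one high-price order, two refunds inside the refund window
    ("10007", "2024-01-02", "20006", "2024-01-03", 2000, "2024-01-03", 1000),
    ("10007", "2024-01-02", "20006", "2024-01-03", 2000, "2024-01-04", 1000),
    # 10008: one refund inside the refund window, one outside
    ("10008", "2024-01-03", "20007", "2024-01-03", 2000, "2024-01-05", 100),
    ("10008", "2024-01-03", "20007", "2024-01-03", 2000, "2024-02-05", 1900),
    # 10009: the second high-price order is outside the repurchase window
    ("10009", "2024-01-03", "20008", "2024-01-03", 1500, None, 0),
    ("10009", "2024-01-03", "20009", "2024-02-04", 2000, "2024-02-05", 100),
])

TWO_STAGE = """
SELECT low_id,
       SUM(CASE WHEN high_paid_time <= date(low_paid_time, '+30 days')
                THEN high_paid ELSE 0 END)        AS high_paid_30d,
       SUM(CASE WHEN high_paid_time <= date(low_paid_time, '+30 days')
                THEN refund_in_window ELSE 0 END) AS high_refund_30d
FROM (
    -- stage 1: one row per (low-price order, high-price order)
    SELECT low_id, low_paid_time, high_id, high_paid_time, high_paid,
           SUM(CASE WHEN high_refund_time <= date(high_paid_time, '+30 days')
                    THEN high_refund ELSE 0 END) AS refund_in_window
    FROM user_orders
    GROUP BY low_id, low_paid_time, high_id, high_paid_time, high_paid
) per_high_order
GROUP BY low_id
ORDER BY low_id
"""
result = conn.execute(TWO_STAGE).fetchall()
for row in result:
    print(row)
```

Order 10007 now reports a payment of 2000 instead of 4000, order 10008 keeps only the 100 refund that falls inside the window, and order 10009 counts only the 1500 order placed inside the repurchase window.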

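Visualization

Note: the real business table has many other dimensions; only the time dimension is abstracted here.

With the target table available, metrics such as revenue and refund rate can be aggregated from it by month, week, or day.
Because the sample data set is small, the example here is a daily line chart of revenue and refund rate, to see how the two move together.

  • Revenue: low-price payment amount - low-price refund amount + high-price payment amount - high-price refund amount
  • Refund rate: number of low-price orders with a high-price refund amount > 0, divided by the total number of low-price orders. One low-price order can be treated as one user, so the refund rate is computed per user.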

select low_price_paid_time,
       sum(low_price_paid_amount - low_price_refund_amount + high_price_paid_amount - high_price_refund_amount) as revenue,
       sum(if(high_price_refund_amount > 0, 1, 0)) / count(*) as refund_rate
from <acquisition-order output table>
group by low_price_paid_time
order by low_price_paid_time
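The two metrics can also be checked by hand on a few rows. The sketch below is plain Python; the `target_rows` list is an illustrative subset of the target table, with the values that the 30-day rules produce for sample orders 10005 to 10009.

```python
from collections import defaultdict

# (low_price_order_id, low_price_paid_time,
#  low_price_paid_amount, low_price_refund_amount,
#  high_price_paid_amount, high_price_refund_amount)
target_rows = [
    ("10005", "2024-01-02", 1.0, 0.0, 2000.0, 0.0),
    ("10006", "2024-01-02", 1.0, 0.0, 2000.0, 2000.0),
    ("10007", "2024-01-02", 1.0, 0.0, 2000.0, 2000.0),
    ("10008", "2024-01-03", 1.0, 0.0, 2000.0, 100.0),
    ("10009", "2024-01-03", 1.0, 0.0, 1500.0, 0.0),
]

daily = defaultdict(lambda: {"revenue": 0.0, "orders": 0, "refunded": 0})
for _, day, low_paid, low_refund, high_paid, high_refund in target_rows:
    stats = daily[day]
    stats["revenue"] += low_paid - low_refund + high_paid - high_refund
    stats["orders"] += 1                              # one low-price order = one user
    stats["refunded"] += 1 if high_refund > 0 else 0  # user has a high-price refund

# day -> (revenue, refund rate)
metrics = {
    day: (s["revenue"], s["refunded"] / s["orders"])
    for day, s in sorted(daily.items())
}
print(metrics)
```

On 2024-01-02 two of the three users have a high-price refund, so the refund rate is 2/3 while revenue is 2003; on 2024-01-03 the rate is 0.5 and revenue is 3402.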

[Figure: daily line chart of revenue and refund rate]

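Summary

This article showed how to use the low-price acquisition order as the base grain and, following the 30-day business windows, aggregate low-price refunds, high-price order amounts, and high-price refund amounts.

The design uses two layers of tables: a fact table, and a layer of aggregates built for each business definition.

For the business logic, the structure of the fact table calls for staged aggregation: first aggregate by [low-price order + high-price order], then aggregate by [low-price order] to get the target table.

Once the target table is built, business metrics are aggregated from it and visualized, then delivered to the business team.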

Appendix

The MaxCompute SQL version of the logic is attached below; for this case, only the date_add() syntax needs to change.

select uos.low_price_order_id,
       uos.low_price_paid_amount,
       uos.low_price_paid_time,
       -- the low-price refund repeats on every high-price row, so take it once with max() instead of summing
       max(case when uos.low_price_refund_time <= date_add(uos.low_price_paid_time, 30)
                then uos.low_price_refund_amount else 0 end) as low_price_refund_amount,
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, 30)
                then uos.high_price_paid_amount else 0 end) as high_price_paid_amount,
       sum(case when uos.high_price_paid_time <= date_add(uos.low_price_paid_time, 30)
                then uos.high_price_refund_amount else 0 end) as high_price_refund_amount
from (
    -- stage 1: one row per (low-price order, high-price order), keeping refunds inside the 30-day refund window
    select uos.low_price_order_id,
           uos.low_price_paid_amount,
           uos.low_price_paid_time,
           uos.low_price_refund_amount,
           uos.low_price_refund_time,
           uos.high_price_order_id,
           uos.high_price_paid_time,
           uos.high_price_paid_amount,
           sum(case when uos.high_price_refund_time <= date_add(uos.high_price_paid_time, 30)
                    then uos.high_price_refund_amount else 0 end) as high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time,
             uos.low_price_refund_amount, uos.low_price_refund_time,
             uos.high_price_order_id, uos.high_price_paid_time, uos.high_price_paid_amount
) uos
group by uos.low_price_order_id, uos.low_price_paid_amount, uos.low_price_paid_time
