User Segmentation Based on Purchase History

Introduction

The goal of this analysis was to identify distinct groups of users of a discount app based on the deals they have availed, so that they can be re-targeted with offers similar to those they availed in the past.

The K-means machine learning algorithm was used to identify user segments based on their purchase behavior. Here is a 3-D illustration of what the algorithm extracted.

Figure: 3D image of the four user segments produced by K-means from users' purchase history, by Muffaddal

Terminology

Before going deeper into the analysis, let's define some of the key terms used.

Deal Avail: When a user avails a discount using the app.
Spent: The discounted price a user pays when buying an item.
Saved: The amount a user saved through the app.
Brands: Vendors for which discounts are offered, such as Pizza Hut and GreenO.
Deals: Discounts offered to users at different outlets and brands.

Analysis

Data sets

The behavior data set was extracted from Mixpanel using JQL. The following data was used for this analysis:

Figure: Mixpanel Data Set, by Muffaddal

userId: unique id of the user
saveAmount: amount saved by the user on a deal avail
spentAmount: amount spent by the user on a deal avail
brandName: brand for which the deal was availed
count: number of deals availed by the user

Using the above data set, averageSpentAmount, averageSavedAmount and dealAvailCount were calculated for each user, as seen below.

Figure: Average Deal Availed Data set, by Muffaddal
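The per-user aggregation described above could be sketched roughly as follows. This is a plain-Python illustration, not the query or code used in the original analysis. The sample rows are made up, and it assumes that each row's saveAmount and spentAmount are totals over that row's count of deals, so the averages are per deal; the source does not spell this out.

```python
from collections import defaultdict

# Hypothetical rows in the shape of the data set described above:
# (userId, saveAmount, spentAmount, brandName, count)
rows = [
    ("u1", 200.0, 800.0, "Pizza Hut", 1),
    ("u1", 100.0, 400.0, "GreenO", 2),
    ("u2", 50.0, 150.0, "GreenO", 1),
]

def per_user_features(rows):
    """Return {userId: (averageSpentAmount, averageSavedAmount, dealAvailCount)}.

    Assumption: amounts on a row are totals over that row's `count` deals,
    so averages are computed per deal.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # spent, saved, deals
    for user_id, save_amount, spent_amount, _brand, count in rows:
        t = totals[user_id]
        t[0] += spent_amount
        t[1] += save_amount
        t[2] += count
    return {
        user: (spent / deals, saved / deals, deals)
        for user, (spent, saved, deals) in totals.items()
    }

features = per_user_features(rows)
```

Each user ends up as one three-valued row, which is the shape the clustering step works on.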

Machine Learning: K-means Clustering

The first step of the K-means analysis was to find an optimal number of clusters for segmentation. There are a number of methods for this purpose, one of which is the elbow method using the within-cluster sum of squares (WCSS).

Figure: WCSS for up to 10 clusters, by Muffaddal

Based on the elbow method, solutions with 4, 5, and 6 clusters were explored, and 4 clusters were picked as the best fit for the given data set.
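To make the elbow method concrete, here is a minimal sketch in plain Python. It is not the author's R code. It runs a basic Lloyd-style K-means for 1 to 10 clusters and records the WCSS of each run. The data is synthetic: the four centers, the spread, the seed and the helper names are invented for illustration, and no feature scaling is applied.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Basic Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels

def wcss(points, centroids, labels):
    """Within-cluster sum of squares for a given clustering."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))

# Synthetic users: (averageSpentAmount, averageSavedAmount, dealAvailCount).
# The centers below are made up for this sketch.
rng = random.Random(42)
centers = [(900, 850, 1.3), (800, 300, 1.4), (250, 100, 1.5), (300, 120, 9.0)]
points = [
    tuple(rng.gauss(m, 0.1 * m) for m in center)
    for center in centers
    for _ in range(30)
]

# WCSS for k = 1..10; plotting this curve and looking for the bend
# is the elbow method.
curve = {k: wcss(points, *kmeans(points, k)) for k in range(1, 11)}
for k, value in curve.items():
    print(k, round(value, 1))
```

The elbow is read off by eye from the resulting curve, which is why several candidate values of k (4, 5 and 6 in the analysis above) may need to be inspected before settling on one.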

Code listing: R code for K-Means clustering (embedded gist not reproduced here)

I would recommend these courses on DataCamp and Coursera if you want to learn more about user clustering and user segmentation.

What Segments Did K-means Extract?

The following were the average stats of the four identified segments:

Figure: Average stats of each segment
Figure: Segments Characteristics
Figure: Graphical Representation of Segments Characteristics, by Muffaddal

Users in segments 1 and 2 were high-paying users, and segment 1 users also saved an equally high amount per deal (they probably availed buy 1 get 1 offers). However, the number of deals availed per user in these segments was less than 2 (1.3 and 1.4 respectively).

On the other hand, segment 3 and segment 4 users spent less and hence saved less as well. However, segment 4 had the highest number of deals availed per user of all 4 segments (on average, more than 9 deals per user). It was the most converted cohort of users.
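Per-segment averages like the ones discussed above can be computed by grouping the per-user rows by their cluster label. The following is a small sketch with made-up numbers, not the figures from the analysis; the function name and the sample users are hypothetical.

```python
from collections import defaultdict

def segment_averages(features, labels):
    """Mean of each feature column per segment label."""
    groups = defaultdict(list)
    for row, label in zip(features, labels):
        groups[label].append(row)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }

# Hypothetical users: (averageSpentAmount, averageSavedAmount, dealAvailCount)
user_features = [
    (1000, 900, 1),
    (800, 1000, 2),
    (200, 80, 10),
    (300, 120, 8),
]
user_labels = [1, 1, 4, 4]  # segment assigned to each user

averages = segment_averages(user_features, user_labels)
```

Reading the averages side by side is what allows statements such as "high spend, few deals" for one segment and "low spend, many deals" for another.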

What were the total number of users and the number of deals availed in each segment?

Here are the total number of users in each segment and the number of deals they availed.

Figure: Number of users in segments, by Muffaddal
Figure: Number of deals availed, by Muffaddal

57% of users belonged to segment 3, and only 3% of users were from the most converted segment (i.e. segment 4).
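The share of users per segment follows directly from the cluster labels. In the sketch below the label list is fabricated so that segments 3 and 4 match the 57% and 3% reported above; the split between segments 1 and 2 is an arbitrary assumption, since the source does not state it.

```python
from collections import Counter

def segment_shares(labels):
    """Percentage of users falling into each segment."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {segment: 100.0 * n / total for segment, n in counts.items()}

# Made-up labels for 100 users; only the segment 3 and 4 shares
# come from the article, the rest is an assumption.
labels = [3] * 57 + [1] * 20 + [2] * 20 + [4] * 3
shares = segment_shares(labels)
```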

What was users' overall spending?

Here is the spread of spending within each segment.

Figure: Spending of users in each segment, by Muffaddal

Some of the users in segment 4 had high spending (the yellow dots in segment 4), similar to segments 1 and 2, but segment 3 (which comprises 57% of the users) didn't go for high-spending deals and/or brands at all.

What type of brand did each segment's users prefer?

Let's look at what types of brands the users in each segment availed, to understand any distinctions between them.

Figure: Brands users availed, by Muffaddal

Segment 1 users had availed a mix of burger, pizza and fun time brands, segment 2 users had availed pizza, and segment 3 users had preferred burgers, while segment 4 users (the most converted users) preferred juices and other types of brands.

What brands did each segment avail?

Here are the top 10 brands that users in each segment had availed.

Figure: Top 10 Brands Availed by Each Segment, by Muffaddal

Looking at the brands, we can understand what types of brands and deals the users in each segment would prefer. Segment 1 and 2 users (high-paying users) had availed premium brands such as Sajjad, kababi, Charcoal, California, etc., while segment 3 and 4 users (low-paying users) had mostly opted for medium- to low-tier brands.
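A top-brands-per-segment list of this kind can be derived by counting deals per brand within each segment. The sketch below uses invented deal rows and a hypothetical user-to-segment mapping; only the brand names are taken from the article.

```python
from collections import Counter, defaultdict

def top_brands(deals, user_segment, n=10):
    """Top-n brands by number of deals availed, per segment."""
    per_segment = defaultdict(Counter)
    for user_id, brand, count in deals:
        per_segment[user_segment[user_id]][brand] += count
    return {
        segment: [brand for brand, _ in counter.most_common(n)]
        for segment, counter in per_segment.items()
    }

# Hypothetical deal rows: (userId, brandName, count)
deals = [
    ("u1", "Sajjad", 2),
    ("u1", "Charcoal", 1),
    ("u2", "Sajjad", 1),
    ("u3", "GreenO", 5),
]
# Hypothetical mapping of each user to the segment K-means assigned.
user_segment = {"u1": 1, "u2": 1, "u3": 4}

ranking = top_brands(deals, user_segment)
```

The same per-segment ranking could serve as a simple starting point for deciding which brands to show to which segment.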

How can these results be employed?

Based on the different user segments, we can do the following:

1- Targeted Ads: Personalized ads for each segment would increase the conversion rate, as users are more likely to convert on specific brands and offers. For example, show Sajjad's ads to users with higher paying power rather than to users with low paying power.

2- In-app Recommendations: Optimize the app to recommend the deals and discounts that users in each segment would be more interested in.

Summary

To sum up, with data and proper effort we were able to identify interesting information about users and their preferences, and to strategize how to engage users more based on those preferences.

Translated from: https://towardsdatascience.com/user-segmentation-based-on-purchase-history-490c57402d53
