Optimizing product price using regression

Applied data science

Price and quantity are two fundamental measures that determine the bottom line of every business, and setting the right price is one of the most important decisions a company can make. Under-pricing hurts the company’s revenue if consumers are willing to pay more and, on the other hand, over-pricing can hurt in a similar fashion if consumers are less inclined to buy the product at a higher price.

So given the tricky relationship between price and sales, where is the sweet spot: the optimum price that balances sales volume against margin and earns the most profit?

The purpose of this article is to answer this question by combining basic economic theory with a regression model in Python.

1. Data

We are optimizing a future price based on the relationship between historical price and sales, so the first thing we need is past data on these two indicators. For this exercise, I’m using time series data on historical beef sales and the corresponding unit prices.

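# load data
import pandas as pd

# NOTE: the data source is missing from this listing; "beef.csv" is a
# placeholder for a file with "Price" and "Quantity" columns
beef = pd.read_csv("beef.csv")

# view the last few rows
beef.tail(5)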

The dataset contains a total of 91 observations of quantity-price pairs reported on a quarterly basis.

It is customary in data science to do exploratory data analysis (EDA), but I’m skipping that part to focus on modeling. Nevertheless, I strongly encourage taking this extra step to make sure you understand the data before building models.

2. Libraries

We need to import libraries for three reasons: manipulating data, building the model, and visualizing the functions.

We are importing numpy and pandas for creating and manipulating tables, matplotlib and seaborn for visualization, and the statsmodels formula API to build and run the regression model.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.formula.api import ols

# Jupyter magic: render plots inline (notebook only)
%matplotlib inline

3. Defining the profit function

We know that revenue depends on the quantity sold and the unit price of products.

We also know that profit is calculated by netting out costs from revenue.

Putting these two together we get the following equations:

# revenue
revenue = quantity * price # eq (1)

# profit
profit = revenue - cost # eq (2)

We can rewrite the profit function by substituting eq. (1) into eq. (2):

# revised profit function
profit = quantity * price - cost # eq (3)

Eq. (3) tells us that we need three pieces of information to calculate profit: quantity, price and cost.
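As a minimal sketch, eq. (3) can be written as a small Python function. The function name and the sample numbers are illustrative assumptions, not part of the original analysis, and cost is treated as a single total, as eq. (3) states.

```python
def profit_from(quantity, price, cost):
    """Profit per eq. (3): revenue (quantity * price) minus total cost."""
    revenue = quantity * price  # eq. (1)
    return revenue - cost       # eq. (2)


# Hypothetical numbers, for illustration only
example_profit = profit_from(quantity=10, price=5, cost=20)
print(example_profit)  # 10 * 5 - 20 = 30
```

Price and cost can be chosen or looked up directly, but quantity depends on price, which is why the next step is to estimate that dependence.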

4. Defining the demand function

We first need to establish the relationship between quantity and price, that is, the demand function. We estimate it from a “demand curve”, assuming a linear relationship between price and quantity.

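# demand curve: scatter plot of quantity against price with a fitted regression line
# (`height` replaces the older `size` argument in seaborn 0.9 and later)
sns.lmplot(x = "Price", y = "Quantity",
           data = beef, fit_reg = True, height = 4)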

To find that demand curve we will fit an Ordinary Least Squares (OLS) regression model.

# fit OLS model
model = ols("Quantity ~ Price", data = beef).fit()

# print model summary
print(model.summary())

The regression summary reports the coefficients needed for further analysis: the intercept and the price coefficient.
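To make the fit less of a black box, here is a minimal sketch of what OLS computes when there is a single predictor: the slope is the sum of co-deviations of price and quantity divided by the sum of squared deviations of price, and the intercept follows from the two means. The function name and the data are made up for illustration (this is not the beef dataset), and the sketch uses plain Python rather than statsmodels. With the fitted statsmodels model, the same two numbers should be available as `model.params["Intercept"]` and `model.params["Price"]`.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = sum of co-deviations / sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up, perfectly linear data: quantity = 30 - 0.05 * price
prices = [300, 320, 340, 360]
quantities = [15.0, 14.0, 13.0, 12.0]

intercept, slope = fit_line(prices, quantities)
print(round(intercept, 4), round(slope, 4))
```

Because the toy data lie exactly on a line, the fit recovers an intercept of 30 and a slope of -0.05 (up to floating-point rounding). Real data will scatter around the line, and the summary's standard errors describe that uncertainty.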

5. Finding the profit-maximizing price

The coefficients we need come from the regression model above: the intercept and the price coefficient, which together give the expected sales quantity at any price. We can now plug these values into equation (3).

# plugging regression coefficients
quantity = 30.05 - 0.0465 * price # eq (5)

# the profit function in eq (3) becomes
profit = (30.05 - 0.0465 * price) * price - cost # eq (6)
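The fitted demand function can be wrapped in small helpers, sketched below. The names `quantity_demanded` and `profit_at` are illustrative assumptions. One point to flag: eq. (6) subtracts cost as a lump sum, whereas the loop later in this section multiplies quantity by (price - cost), which treats cost as a per-unit cost. The sketch follows the per-unit reading so that its numbers match the article's results.

```python
INTERCEPT = 30.05  # intercept reported in the article
SLOPE = -0.0465    # price coefficient reported in the article


def quantity_demanded(price):
    """Demand function, eq. (5): expected quantity sold at a given price."""
    return INTERCEPT + SLOPE * price


def profit_at(price, unit_cost):
    """Profit assuming cost is incurred per unit sold: margin * quantity."""
    return (price - unit_cost) * quantity_demanded(price)


print(round(quantity_demanded(360), 2))        # 13.31
print(round(profit_at(360, unit_cost=80), 1))  # 3726.8
```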

The next step is to find the best price from a range of candidates. For each candidate price, the code below computes the quantity demanded and the resulting value of (price - cost) * quantity. Note that although the list is named Revenue, this quantity is profit with cost treated as a per-unit cost of 80.

# a range of different prices to find the optimum one
Price = [320, 330, 340, 350, 360, 370, 380, 390]

# assuming a fixed cost of 80 per unit
cost = 80

Revenue = []
for i in Price:
    # quantity demanded at price i, from the fitted demand function
    quantity_demanded = 30.05 - 0.0465 * i

    # profit function: unit margin times quantity
    Revenue.append((i - cost) * quantity_demanded)

# create data frame of price and revenue
profit = pd.DataFrame({"Price": Price, "Revenue": Revenue})

# plot revenue against price
plt.plot(profit["Price"], profit["Revenue"])

Plotting revenue against price lets us see the peak of the curve and read off the price at which revenue is highest.

So, among the price levels tested, the maximum of about $3,726 is reached when the price is set at $360.

# price at which revenue is maximum
profit[profit['Revenue'] == profit['Revenue'].max()]
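The grid above only tests prices in steps of 10. Because the demand function is linear, the objective (price - cost) * quantity is a quadratic in price, so its peak can also be found in closed form by setting the derivative to zero. The sketch below is an illustrative cross-check, not part of the original analysis; the function names are hypothetical, and it assumes the same per-unit cost of 80 used in the loop.

```python
def optimal_price(intercept, slope, unit_cost):
    """Price maximizing (price - unit_cost) * (intercept + slope * price).

    Assumes a downward-sloping linear demand curve (slope < 0).
    """
    if slope >= 0:
        raise ValueError("slope must be negative for a profit maximum to exist")
    # d/dp [(p - c)(a + s*p)] = a + 2*s*p - s*c = 0  ->  p = (s*c - a) / (2*s)
    return (slope * unit_cost - intercept) / (2 * slope)


def profit_at_price(price, intercept, slope, unit_cost):
    """Objective used in the grid search: unit margin times quantity demanded."""
    return (price - unit_cost) * (intercept + slope * price)


best = optimal_price(30.05, -0.0465, 80)
print(round(best, 2))  # 363.12
```

The closed-form optimum of roughly 363 sits between the grid points 360 and 370, which is consistent with the grid search picking 360; the gain over pricing at 360 is small.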

Summary and conclusions

The purpose of this article was to demonstrate how to find the price at which the revenue or profit is maximized using a combination of economic theory and statistical modeling. In the initial steps we defined the demand and profit functions, and then ran a regression to find the parameter values needed to feed into the profit/revenue function. And finally, we checked revenues under different price levels to get the price for the corresponding maximum revenue.

Translated from: https://towardsdatascience.com/optimizing-product-price-using-regression-2c17688e65ea
