How to Implement a Polynomial Regression Model in Python

Let’s start with an example. We want to predict the Price of a home from its Area and Age. The function below was used to generate the home prices; we can pretend this is “real-world data” and that our “job” is to build a model that predicts Price from Area and Age:

Price = -3*Area - 10*Age + 0.033*Area² - 0.0000571*Area³ + 500
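
As a minimal sketch, the price function above could be used to generate a synthetic dataset like this. The sample size, random seed, and the Area and Age ranges are assumptions chosen for illustration; the original post does not state them.

```python
import numpy as np


def home_price(area, age):
    """Price function quoted in the article."""
    return -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500


# Assumed for illustration only: sample size, seed, and value ranges.
rng = np.random.default_rng(0)
area = rng.uniform(50, 300, size=500)   # hypothetical Area range
age = rng.uniform(0, 30, size=500)      # hypothetical Age range, in years

X = np.column_stack([area, age])        # feature matrix, shape (500, 2)
y = home_price(area, age)               # target vector, shape (500,)

print(X.shape, y.shape)
```

With these assumed ranges the generated prices stay positive, which matters later because the mean relative error divides by the true price.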

Figure: Home Prices vs Area & Age

Linear Model

Let’s suppose we just want a very simple Linear Regression model that predicts the Price using the slope coefficients c1 and c2 and the y-intercept c0:

Price = c1*Area + c2*Age + c0

We’ll load the data and fit Scikit-Learn’s Linear Regression. Behind the scenes, the model coefficients (c0, c1, c2) are computed by minimizing the sum of squared errors between the target variable y and the model predictions.
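
The linear fit could look roughly like the sketch below. It regenerates synthetic data from the article's price function (the ranges, sample size, and seed are assumptions), and the `mean_relative_error` helper is a hypothetical definition of the metric, since the original post does not show its formula.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data from the article's price function.
# Ranges, sample size, and seed are assumptions for illustration.
rng = np.random.default_rng(0)
area = rng.uniform(50, 300, size=500)
age = rng.uniform(0, 30, size=500)
X = np.column_stack([area, age])
y = -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500

# Ordinary least squares: minimizes the sum of squared errors.
linear_model = LinearRegression()
linear_model.fit(X, y)

c1, c2 = linear_model.coef_
c0 = linear_model.intercept_
y_pred = linear_model.predict(X)


def mean_relative_error(y_true, y_hat):
    """Hypothetical helper: mean of |error| / |true value|.

    Undefined if any true value is 0.
    """
    return np.mean(np.abs(y_true - y_hat) / np.abs(y_true))


linear_mre = mean_relative_error(y, y_pred)
print(f"Price = {c1:.3f}*Area + {c2:.3f}*Age + {c0:.3f}")
print(f"Mean relative error: {linear_mre:.1%}")
```

The numbers printed here depend on the assumed data, so they will not necessarily reproduce the error reported in the article.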

But as you can see, this model does not fit the data very well.

Figure: Simple Linear Regression Model (Mean Relative Error: 9.5%)

Polynomial Regression Model

Next, let’s implement a Polynomial Regression model, which is the right tool for the job because the price function is a cubic polynomial in Area. Rewriting the function used to generate the home prices with x1 = Area and x2 = Age, we get:

Price = -3*x1 - 10*x2 + 0.033*x1² - 0.0000571*x1³ + 500

So instead of fitting the Linear model (Price = c1*x1 + c2*x2 + c0) directly, Polynomial Regression requires that we first transform the variables x1 and x2. For example, to fit a 2nd-degree polynomial, the input variables are transformed as follows:

1, x1, x2, x1², x1x2, x2²

Our function is 3rd-degree, so the transformed features will be:

1, x1, x2, x1², x1x2, x2², x1³, x1²x2, x1x2², x2³
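
One way to check this expansion is with Scikit-Learn's `PolynomialFeatures`, sketched below. The article does not show which transformer it uses, so treating it as `PolynomialFeatures` is an assumption, and the single input row (x1 = 2, x2 = 3) is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# degree=3 with the default include_bias=True adds the leading column of 1s.
poly = PolynomialFeatures(degree=3)
sample = np.array([[2.0, 3.0]])          # one made-up row: x1 = 2, x2 = 3
transformed = poly.fit_transform(sample)

# powers_[i, j] is the exponent of input j in output feature i.
names = ["x1", "x2"]
terms = []
for exponents in poly.powers_:
    parts = [
        name if power == 1 else f"{name}^{power}"
        for name, power in zip(names, exponents)
        if power > 0
    ]
    terms.append("*".join(parts) if parts else "1")

print(terms)
print(transformed[0])
```

The printed term list should follow the same order as the ten terms listed above, with one numeric value per term for the sample row.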

Then we can fit the Linear model on the polynomially transformed input features, which gives a Polynomial Regression model of the form:

Price = 0*1 + c1*x1 + c2*x2 + c3*x1² + c4*x1x2 + … + cn*x2³ + c0

(The 0*1 term corresponds to the bias column of 1s; its coefficient is 0 because the intercept c0 is fitted separately.)

After training the model on the data, we can check the coefficients and see whether they match the original function used to generate the home prices.

Original function:

Price = -3*x1 - 10*x2 + 0.033*x1² - 0.0000571*x1³ + 500

The fitted Polynomial Regression model coefficients, shown as images in the original post, do indeed match these values.
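
A comparison like this could be reproduced with the sketch below, which fits `LinearRegression` on degree-3 `PolynomialFeatures` and prints one coefficient per term. The data is regenerated from the article's price function with assumed ranges, sample size, and seed, and the mean relative error formula is likewise an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data from the article's price function.
# Ranges, sample size, and seed are assumptions for illustration.
rng = np.random.default_rng(0)
area = rng.uniform(50, 300, size=500)
age = rng.uniform(0, 30, size=500)
X = np.column_stack([area, age])
y = -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500

# Expand (x1, x2) into the ten degree-3 polynomial terms, then fit OLS.
poly = PolynomialFeatures(degree=3)
X_poly = poly.fit_transform(X)

poly_model = LinearRegression()
poly_model.fit(X_poly, y)

terms = ["1", "x1", "x2", "x1^2", "x1*x2", "x2^2",
         "x1^3", "x1^2*x2", "x1*x2^2", "x2^3"]
for term, coef in zip(terms, poly_model.coef_):
    print(f"{term:>8}: {coef: .7f}")
print(f"intercept: {poly_model.intercept_:.4f}")

# Mean relative error on the training data (assumed definition).
y_pred = poly_model.predict(X_poly)
poly_mre = np.mean(np.abs(y - y_pred) / np.abs(y))
print(f"Mean relative error: {poly_mre:.2%}")
```

Because the data is noise-free and the true function lies inside the degree-3 feature space, the fitted coefficients should land very close to -3, -10, 0.033, and -0.0000571, with the remaining terms near 0 and the intercept near 500.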

The fit is now much better.

Figure: Polynomial Regression Model (Mean Relative Error: 0%)

And there you have it: now you know how to implement a Polynomial Regression model in Python. The entire code is linked from the original post.

Closing remarks

  • If this were a real-world ML task, we should have split the data into training and testing sets and evaluated the model on the testing set.
  • It’s better to use other accuracy metrics such as RMSE, because MRE is undefined if any of the y values is 0.
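
Both remarks can be combined in a short sketch: hold out a test set and report RMSE on it. The 80/20 split, the seed, and the synthetic data ranges are assumptions for illustration, and chaining the two steps with `make_pipeline` is one convenient option rather than the article's own code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data from the article's price function.
# Ranges, sample size, and seed are assumptions for illustration.
rng = np.random.default_rng(0)
area = rng.uniform(50, 300, size=500)
age = rng.uniform(0, 30, size=500)
X = np.column_stack([area, age])
y = -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500

# Hold out 20% of the rows for evaluation (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The pipeline applies the polynomial expansion, then fits OLS.
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(X_train, y_train)

# RMSE stays well-defined even if some target values are 0.
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test RMSE: {rmse:.6f}")
```

Wrapping the transformer and the regressor in one pipeline ensures the same feature expansion is applied to both the training and the test rows.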

Translated from: https://medium.com/@nikola.kuzmic945/how-to-implement-a-polynomial-regression-model-in-python-6250ce96ba61
