python多项式回归_在python中实现多项式回归

python多项式回归

Video Link

影片连结

You can view the code used in this Episode here: SampleCode

您可以在此处查看 此剧 集中使用的代码: SampleCode

导入我们的数据 (Importing our Data)

The first step is to import our data into python.

第一步是将我们的数据导入python。

We can do that by going on the following link: Data

我们可以通过以下链接来做到这一点: 数据

Click on “code” and download ZIP.

单击“代码”并下载ZIP。

Locate WeatherDataP.csv and copy it into your local disc under a new file called ProjectData

找到WeatherDataP.csv并将其复制到本地磁盘下名为ProjectData的新文件下

Note: WeatherData.csv and WeahterDataM.csv were used in Simple Linear Regression and Multiple Linear Regression.

注意:WeatherData.csv和WeahterDataM.csv用于简单线性回归和多重线性回归 。

Now we are ready to import our data into our Notebook:

现在我们准备将数据导入到笔记本中:

How to set up a new Notebook can be found at the start of Episode 4.3

如何设置新笔记本可以在第4.3节开始时找到

Note: Keep this medium post on a split screen so you can read and implement the code yourself.

注意:请将此帖子张贴在分屏上,以便您自己阅读和实现代码。

# Import Pandas Library, used for data manipulation
# Import matplotlib, used to plot our data
# Import numpy for linear algebra operationsimport pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Import our WeatherDataP.csv and store it in the variable rweather_data_pweather_data_p = pd.read_csv("D:\ProjectData\WeatherDataP.csv")
# Display the data in the notebookweather_data_p
Image for post

绘制数据 (Plotting our Data)

In order to check what kind of relationship Pressure forms with Humidity, we plot our two variables.

为了检查压力与湿度之间的关系,我们绘制了两个变量。

# Set our input x to Pressure, use [[]] to convert to 2D array suitable for model inputX = weather_data_p[["Pressure (millibars)"]]
y = weather_data_p.Humidity
# Produce a scatter graph of Humidity against Pressureplt.scatter(X, y, c = "black")
plt.xlabel("Pressure (millibars)")
plt.ylabel("Humidity")
Image for post

Here we see Humidity vs Pressure forms a bowl shaped relationship, reminding us of the function: y = 𝑥² .

在这里,我们看到湿度与压力之间呈碗形关系,使我们想起了函数y =𝑥²。

预处理我们的数据 (Preprocessing our Data)

This is the additional step we apply to polynomial regression, where we add the feature 𝑥² to our Model.

这是我们应用于多项式回归的附加步骤 ,在此步骤中将特征𝑥²添加到模型中。

# Import the function "PolynomialFeatures" from sklearn, to preprocess our data
# Import LinearRegression model from sklearnfrom sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
# Set PolynomialFeatures to degree 2 and store in the variable pre_process
# Degree 2 preprocesses x to 1, x and x^2
# Degree 3 preprocesses x to 1, x, x^2 and x^3
# and so on..pre_process = PolynomialFeatures(degree=2)# Transform our x input to 1, x and x^2X_poly = pre_process.fit_transform(X)# Show the transformation on the notebookX_poly
Image for post

e+.. refers to the position of the decimal place.

e + ..指小数位的位置。

E.g1.0e+00 = 1.0 [ keep the decimal point where it is ]1.0144e+03 = 1014.4 [ Move the decimal place 3 places to the right ]1.02900736e+06 = 1029007.36 [ Move the decimal place 6 to places to the right ]

例如 1.0e + 00 = 1.0 [保留小数点处的位置] 1.0144e + 03 = 1014.4 [将小数位右移3位] 1.02900736e + 06 = 1029007.36 [将小数点6向右移动]

— — — — — — — — — — — — — — — — — —

— — — — — — — — — — — — — — — — — — — —

The code above makes the following Conversion:

上面的代码进行了以下转换:

Image for post

Notice that there is a hidden column of 1’s which can be thought of as the variable associated with θ₀. Since θ₀ × 1 = θ₀ this is often left out.

请注意,有一个隐藏的1列,可以将其视为与θ₀相关的变量。 由于θ₀×1 =θ₀,因此经常被忽略。

— — — — — — — — — — — — — — — — — —

— — — — — — — — — — — — — — — — — — — —

实现多项式回归 (Implementing Polynomial Regression)

The method here remains the same as multiple linear regression in python, but here we are fitting our regression model to the preprocessed data:

此处的方法与python中的多元线性回归相同,但此处我们将回归模型拟合为预处理的数据:

pr_model = LinearRegression()# Fit our preprocessed data to the polynomial regression modelpr_model.fit(X_poly, y)# Store our predicted Humidity values in the variable y_newy_pred = pr_model.predict(X_poly)# Plot our model on our dataplt.scatter(X, y, c = "black")
plt.xlabel("Pressure (millibars)")
plt.ylabel("Humidity")
plt.plot(X, y_pred)
Image for post

We can extract θ₀, θ₁ and θ₂ using the following code:

我们可以使用以下代码提取θ₀,θ₁和θ2

theta0 = pr_model.intercept_
_, theta1, theta2 = pr_model.coef_
theta0, theta1, theta2

A “_” is used to ignore the first value in pr_model.coef as this is given by default as 0. The other two co-efficients are labelled theta1 and theta 2 respectively.

_”用于忽略pr_model.coef中的第一个值,因为默认情况下该值为0。其他两个系数分别标记为theta1和theta 2。

Image for post

Giving our polynomial regression model roughly as:

大致给出我们的多项式回归模型:

Image for post

使用我们的回归模型进行预测 (Using our Regression Model to make predictions)

# Predict humidity for a pressure of 1007 millibars
# Tranform 1007 to 1, 1007, 1007^2 suitable for input, using
# pre_process.fit_transformy_new = pr_model.predict(pre_process.fit_transform([[1007]]))
y_new
Image for post

Here we expect a Humidity value of 0.7164631 for a pressure reading of 1007 millibars.

在这里,对于1007毫巴的压力读数,我们期望的湿度值为0.7164631。

We can plot this point on our data plot using the following code:

我们可以使用以下代码在数据图上绘制该点:

plt.scatter(1007, y_new, c = "red")
Image for post

评估我们的模型 (Evaluating our Model)

To evaluate our model we are going to be using mean squared error (MSE), discussed in the previous episode, the function can easily be imported from sklearn.

为了评估我们的模型,我们将使用上一集中讨论的均方误差(MSE) ,可以轻松地从sklearn导入函数。

from sklearn.metrics import mean_squared_error
mean_squared_error(y, y_pred)
Image for post

The mean squared error for our regression model is given by: 0.003358..

我们的回归模型的均方误差为:0.003358 ..

Image for post

If we want to change our model to include 𝑥³ we can do so by simply changing PolynomialFeatures to degree 3:

如果要更改模型以包括𝑥³ ,可以通过将PolynomialFeatures更改为3级来实现

pre_process = PolynomialFeatures(degree=3)

Let’s check if this has decreased our mean squared error:

让我们检查一下这是否降低了均方误差:

Image for post

Indeed it has.

确实有。

You can change the degree used in PolynomialFeatures to anything you like and see for yourself what effect this has on our MSE.

您可以将PolynomialFeatures中使用的度数更改为您喜欢的任何值,并亲自查看这对我们的MSE有什么影响。

Ideally we want to choose the model that:

理想情况下,我们要选择以下模型:

  • Has the lowest MSE

    MSE最低

  • Does not over-fit our data

    不会过度拟合我们的数据

It is important that we plot our model on our data to ensure we don’t end up with the model shown at the end of Episode 4.6, which had an extremely low MSE but over-fitted our data.

重要的是, 我们需要在数据上绘制模型,以确保最终不会出现第4.6集末显示的模型,该模型的MSE极低,但数据过拟合。

上一集 - 下一集 (Prev Episode — Next Episode)

如有任何疑问,请留在下面! (If you have any questions please leave them below!)

Image for post

翻译自: https://medium.com/ai-in-plain-english/implementing-polynomial-regression-in-python-d9aedf520d56

python多项式回归

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/389738.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

Uboot 命令是如何被使用的?

有什么问题请 发邮件至syyxyoutlook.com, 欢迎交流~ 在uboot代码中命令的模式是这个样子: 这样是如何和命令行交互的呢? 在command.h 中, 我们可以看到如下宏定义 将其拆分出来: #define U_BOOT_CMD(name,maxargs,rep,cmd,usage,help) \ U_…

2029. 石子游戏 IX

2029. 石子游戏 IX Alice 和 Bob 再次设计了一款新的石子游戏。现有一行 n 个石子,每个石子都有一个关联的数字表示它的价值。给你一个整数数组 stones ,其中 stones[i] 是第 i 个石子的价值。 Alice 和 Bob 轮流进行自己的回合,Alice 先手…

大数据可视化应用_在数据可视化中应用种族平等意识

大数据可视化应用The following post is a summarized version of the article accepted to the 2020 Visualization for Communication workshop as part of the 2020 IEEE VIS conference to be held in October 2020. The full paper has been published as an OSF Preprint…

Windows10电脑系统时间校准

有时候新安装电脑系统,系统时间不对,需要主动去校准系统时间。1、点击时间 2、日期和时间设置 3、其他日期、时间和区域设置 4、设置时间和日期 5、Internet 时间 6、点击立即更新,如果更新失败就查电脑是否已联网,重试点击立即更…

webpack问题

Cannot find module webpack/lib/node/NodeTemplatePlugin 全局:npm i webpack -g npm link webpack --save-dev 转载于:https://www.cnblogs.com/doing123/p/8994269.html

397. 整数替换

397. 整数替换 给定一个正整数 n ,你可以做如下操作: 如果 n 是偶数,则用 n / 2替换 n 。 如果 n 是奇数,则可以用 n 1或n - 1替换 n 。 n 变为 1 所需的最小替换次数是多少? 示例 1: 输入:…

pd种知道每个数据的类型_每个数据科学家都应该知道的5个概念

pd种知道每个数据的类型意见 (Opinion) 目录 (Table of Contents) Introduction 介绍 Multicollinearity 多重共线性 One-Hot Encoding 一站式编码 Sampling 采样 Error Metrics 错误指标 Storytelling 评书 Summary 摘要 介绍 (Introduction) I have written about common ski…

td

单元格td设置padding,而不能设置margin。转载于:https://www.cnblogs.com/fpcbk/p/9617629.html

清除浮动的几大方法

对于刚接触到html的一些人经常会用到浮动布局,但对于浮动的使用和清除浮动来说是大为头痛的,在这里介绍几个关于清除浮动的的方法。如果你说你要的就是浮动为什么要清除浮动的话,我就真的无言以对了,那你就当我没说。 关于我们在布…

xgboost keras_用catboost lgbm xgboost和keras预测财务交易

xgboost kerasThe goal of this challenge is to predict whether a customer will make a transaction (“target” 1) or not (“target” 0). For that, we get a data set of 200 incognito variables and our submission is judged based on the Area Under Receiver Op…

2017. 网格游戏

2017. 网格游戏 给你一个下标从 0 开始的二维数组 grid ,数组大小为 2 x n ,其中 grid[r][c] 表示矩阵中 (r, c) 位置上的点数。现在有两个机器人正在矩阵上参与一场游戏。 两个机器人初始位置都是 (0, 0) ,目标位置是 (1, n-1) 。每个机器…

HUST软工1506班第2周作业成绩公布

说明 本次公布的成绩对应的作业为: 第2周个人作业:WordCount编码和测试 如果同学对作业成绩存在异议,在成绩公布的72小时内(截止日期4月26日0点)可以进行申诉,方式如下: 毕博平台的第二周在线答…

币氪共识指数排行榜0910

币氪量化数据在今天的报告中给出DASH的近期买卖信号,可以看出从今年4月中旬起到目前为止,DASH_USDT的价格总体呈现出下降的趋势。 转载于:https://www.cnblogs.com/tokpick/p/9621821.html

走出囚徒困境的方法_囚徒困境的一种计算方法

走出囚徒困境的方法You and your friend have committed a murder. A few days later, the cops pick the two of you up and put you in two separate interrogation rooms such that you have no communication with each other. You think your life is over, but the polic…

2016. 增量元素之间的最大差值

2016. 增量元素之间的最大差值 给你一个下标从 0 开始的整数数组 nums &#xff0c;该数组的大小为 n &#xff0c;请你计算 nums[j] - nums[i] 能求得的 最大差值 &#xff0c;其中 0 < i < j < n 且 nums[i] < nums[j] 。 返回 最大差值 。如果不存在满足要求的…

Zookeeper系列四:Zookeeper实现分布式锁、Zookeeper实现配置中心

一、Zookeeper实现分布式锁 分布式锁主要用于在分布式环境中保证数据的一致性。 包括跨进程、跨机器、跨网络导致共享资源不一致的问题。 1. 分布式锁的实现思路 说明&#xff1a; 这种实现会有一个缺点&#xff0c;即当有很多进程在等待锁的时候&#xff0c;在释放锁的时候会有…

resize 按钮不会被伪元素遮盖

textarea默认有个resize样式&#xff0c;效果就是下面这样 读 《css 揭秘》时发现两个亮点&#xff1a; 其实这个属性不仅适用于 textarea 元素&#xff0c;适用于下面所有元素&#xff1a;elements with overflow other than visible, and optionally replaced elements repre…

平台api对数据收集的影响_收集您的数据不是那么怪异的api

平台api对数据收集的影响A data analytics cycle starts with gathering and extraction. I hope my previous blog gave an idea about how data from common file formats are gathered using python. In this blog, I’ll focus on extracting the data from files that are…

709. 转换成小写字母

709. 转换成小写字母 给你一个字符串 s &#xff0c;将该字符串中的大写字母转换成相同的小写字母&#xff0c;返回新的字符串。 示例 1&#xff1a;输入&#xff1a;s "Hello" 输出&#xff1a;"hello"示例 2&#xff1a;输入&#xff1a;s "here…

前端技术周刊 2018-09-10:Redux Mobx

前端快爆 在 Chrome 10 周年之际&#xff0c;正式发布 69 版本&#xff0c;整体 UI 重新设计&#xff0c;同时iOS 版本重新将工具栏放置在了底部。API 层面&#xff0c;支持了 CSS Scroll Snap、前端资源锁 Web Lock API、WebWorker 里面可以跑的 OffscreenCanvas API、toggleA…