Multiple Linear Regression in Python

Video Link

This episode expands on Implementing Simple Linear Regression In Python. We extend our simple linear regression model to include more variables.

You can view the code used in this Episode here: SampleCode

Instructions for setting up your programming environment can be found in the first section of Ep 4.3.

Importing our Data

The first step is to import our data into Python.

We can get the data from the following link: Data

Click on “code” and download ZIP.

Locate WeatherDataM.csv and copy it to your local disk, into a new folder named ProjectData.

Note: Keep this Medium post on a split screen so you can read and implement the code yourself.

Now we are ready to write our code in our notebook:

# Import the pandas library, used for data manipulation
# Import matplotlib, used to plot our data
# Import numpy for mathematical operations
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Import our WeatherDataM and store it in the variable weather_data_m
# The r prefix makes this a raw string, so the backslashes in the Windows path are kept as-is
weather_data_m = pd.read_csv(r"D:\ProjectData\WeatherDataM.csv")

# Display the data in the notebook
weather_data_m

Here we can see a table with all the variables we will be working with.

Plotting our Data

Each of our inputs X (Temperature, Wind Speed and Pressure) must form a linear relationship with our output y (Humidity) in order for our multiple linear regression model to be accurate.

Let’s plot our variables to confirm this.

Here we follow common Data Science convention, naming our inputs X and output y.

# Set the features of our model, these are our potential inputs
weather_features = ['Temperature (C)', 'Wind Speed (km/h)', 'Pressure (millibars)']
# Set the variable X to be all our input columns: Temperature, Wind Speed and Pressure
X = weather_data_m[weather_features]
# Set y to be our output column: Humidity
y = weather_data_m.Humidity

# plt.subplot enables us to plot multiple graphs
# We produce scatter plots for Humidity against each of our input variables
plt.subplot(2,2,1)
plt.scatter(X['Temperature (C)'],y)
plt.subplot(2,2,2)
plt.scatter(X['Wind Speed (km/h)'],y)
plt.subplot(2,2,3)
plt.scatter(X['Pressure (millibars)'],y)
  • Humidity against Temperature forms a strong linear relationship
  • Humidity against Wind Speed forms a linear relationship
  • Humidity against Pressure forms no linear relationship
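
The visual check can be backed up with a number. The following is a minimal sketch, not part of the original episode, that computes the Pearson correlation between each input and Humidity using pandas `DataFrame.corr`. The data is a small synthetic stand-in (the `demo` name and all values are assumptions made for illustration); to run it on the real file, replace `demo` with `weather_data_m`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the weather data (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
n = 200
temperature = rng.uniform(-5, 35, n)
wind_speed = rng.uniform(0, 40, n)
pressure = rng.uniform(990, 1030, n)
# Humidity is built to depend on temperature and wind speed, but not on pressure
humidity = 1.1 - 0.03 * temperature - 0.004 * wind_speed + rng.normal(0, 0.02, n)

demo = pd.DataFrame({
    "Temperature (C)": temperature,
    "Wind Speed (km/h)": wind_speed,
    "Pressure (millibars)": pressure,
    "Humidity": humidity,
})

# Pearson correlation of every input column with Humidity
correlations = demo.corr()["Humidity"].drop("Humidity")
print(correlations)
```

A value close to 1 or -1 suggests a strong linear relationship, and a value close to 0 suggests none. It does not replace the scatter plots, which can also reveal non-linear patterns.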

Pressure shows no linear relationship with Humidity, so it cannot be used in our model. We remove it with the following code:

X = X.drop("Pressure (millibars)", axis=1)

We specify the name of the column we want to drop: Pressure (millibars)

axis=1 is our axis number: 1 is used for columns and 0 for rows.
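
As a small illustration of the axis argument, the sketch below drops a column and a row from a tiny hand-made DataFrame (the `df` name and its values are assumptions, not rows from the weather file). Recent pandas versions expect `axis` to be passed by keyword, and `columns=` is an equivalent spelling.

```python
import pandas as pd

# Tiny hand-made table (hypothetical values, for illustration only)
df = pd.DataFrame({
    "Temperature (C)": [9.5, 9.4],
    "Wind Speed (km/h)": [14.1, 14.3],
    "Pressure (millibars)": [1015.1, 1015.6],
})

# axis=1: look for the label among the columns
without_pressure = df.drop("Pressure (millibars)", axis=1)

# Equivalent spelling that names the axis directly
same_result = df.drop(columns="Pressure (millibars)")

# axis=0: look for the label among the row index
without_first_row = df.drop(0, axis=0)

print(list(without_pressure.columns))
print(len(without_first_row))
```

Note that `drop` returns a new DataFrame and leaves `df` unchanged, which is why the post assigns the result back to `X`.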

Because we are working with just two input variables we can produce a 3D scatter plot of Humidity against Temperature and Wind speed.

With more variables this would not be possible, as it would require a plot in four or more dimensions, which we as humans cannot visualise.

# Import library to produce a 3D plot
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x1 = X["Temperature (C)"]
x2 = X["Wind Speed (km/h)"]
ax.scatter(x1, x2, y, c='r', marker='o')

# Set axis labels
ax.set_xlabel('Temperature (C)')
ax.set_ylabel('Wind Speed (km/h)')
ax.set_zlabel('Humidity')

Implementing Multiple Linear Regression

To fit our model we need to import the LinearRegression class from the scikit-learn library. Its fit method calculates the parameters for our model (θ₀, θ₁ and θ₂) with one line of code.

from sklearn.linear_model import LinearRegression

# Define the variable mlr_model as our linear regression model
mlr_model = LinearRegression()
mlr_model.fit(X, y)
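
To see what fit estimates, here is a minimal sketch on synthetic data rather than the weather file. We generate an output from known parameters (the values 1.0, -0.03 and -0.004 and all `demo` names are assumptions chosen for illustration) and check that LinearRegression recovers them. The same least-squares problem is also solved with `numpy.linalg.lstsq` as a cross-check; this is one way to obtain the parameters, not a statement about how scikit-learn computes them internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic inputs: 100 samples, two columns (x1 and x2)
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 30, size=(100, 2))

# Noise-free output built from known parameters (hypothetical values)
true_theta0, true_theta1, true_theta2 = 1.0, -0.03, -0.004
y_demo = true_theta0 + true_theta1 * X_demo[:, 0] + true_theta2 * X_demo[:, 1]

# Fit the model exactly as in the post
demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# Cross-check: prepend a column of ones so the first solved value plays the role of theta0
A = np.column_stack([np.ones(len(X_demo)), X_demo])
theta_lstsq, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

print(demo_model.intercept_, demo_model.coef_)
print(theta_lstsq)
```

Both printed results should match the parameters used to generate the data, up to floating-point error. With real, noisy data the fitted values are estimates instead.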

We can then display the values for θ₀, θ₁ and θ₂:

θ₀ is the intercept

θ₁ and θ₂ are what we call the coefficients of the model, as they multiply our X variables.

theta0 = mlr_model.intercept_
theta1, theta2 = mlr_model.coef_
theta0, theta1, theta2

Giving our multiple linear regression model as:

ŷ = 1.14 − 0.031x₁ − 0.004x₂

Using our Regression Model to make predictions

Now that we have fitted our model, it’s time to make predictions for Humidity given a Temperature and Wind Speed value:

y_pred = mlr_model.predict([[15, 21]])
y_pred

So a Temperature of 15 °C and a Wind Speed of 21 km/h are expected to give us a Humidity of 0.587.
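
The prediction is just the model equation evaluated at the given inputs. As a quick check by hand, the sketch below plugs the same inputs into the equation using the rounded parameters shown in this post (the `_rounded` names are illustrative), so the result differs slightly from the 0.587 that predict returns with the unrounded parameters.

```python
# Rounded parameters taken from the model equation in this post
theta0_rounded = 1.14
theta1_rounded = -0.031
theta2_rounded = -0.004

temperature = 15   # degrees C
wind_speed = 21    # km/h

# y-hat = theta0 + theta1 * x1 + theta2 * x2
humidity_by_hand = (theta0_rounded
                    + theta1_rounded * temperature
                    + theta2_rounded * wind_speed)

print(round(humidity_by_hand, 3))  # 0.591
```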

Side note

We passed our inputs as a 2D array by using double square brackets ( [[]] ), because predict expects one row per sample and one column per feature. This is more concise than creating a 1D array and reshaping it.
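
A short sketch of what the double brackets do, using NumPy only (the variable names are illustrative):

```python
import numpy as np

flat = np.array([15, 21])        # 1D array: shape (2,)
nested = np.array([[15, 21]])    # 2D array: one row (sample), two columns (features)
reshaped = flat.reshape(1, -1)   # the same 2D array, built by reshaping the 1D one

print(flat.shape, nested.shape, reshaped.shape)  # (2,) (1, 2) (1, 2)
```

To predict several samples at once, add more rows inside the outer brackets, for example [[15, 21], [20, 10]].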

If you have any questions please leave them below and I hope to see you in the next episode.

Translated from: https://medium.com/ai-in-plain-english/implementing-multiple-linear-regression-in-python-1364fc03a5a8
