Machine Learning: Perceptron Implementation
In this post, we are going to have a look at a program written in Python3 using numpy. We will discuss the basics of what a perceptron is, what the delta rule is, and how to use it to converge the learning of the perceptron.
What is a perceptron?
The perceptron is an algorithm for supervised learning of binary classifiers (let's assume the classes are {1, 0}). We take a linear combination of the weight vector and the input data vector, pass it through an activation function, and then compare it to a threshold value. If the linear combination is greater than the threshold, we predict the class as 1, otherwise 0. Mathematically,
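f(x) = 1 if w · x > 0, and f(x) = 0 otherwise

where w · x is the dot product of the weight vector and the input vector. The code below folds the bias into w by prepending a constant 1.0 feature to every input, so the threshold is 0.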
Perceptrons can only represent linearly separable problems. They fail to converge if the training examples are not linearly separable. This brings the delta rule into the picture.
The delta rule converges towards a best-fit approximation of the target concept. The key idea is to use gradient descent to search the hypothesis space of all possible weight vectors.
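In its batch form, the delta rule adjusts each weight in proportion to the error summed over all training examples:

Δw_i = η Σ_d (t_d − o_d) x_id

where η is the learning rate, t_d is the target output and o_d is the actual output for training example d, and x_id is the corresponding i-th input component.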
Note: This provides the basis for the “Backpropagation” algorithm.
Now, let's discuss the problem at hand. The program will read a dataset (tab-separated file) and treat the first column as the target concept. The values present in the target concept are A and B; we will consider A as the positive class (1) and B as the negative class (0). The program implements the perceptron training rule in batch mode with a constant learning rate and with an annealing learning rate (decreasing as the number of iterations increases), starting with a learning rate of 1.
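In batch mode, every iteration updates the weight vector over the whole training set:

w ← w + η Σ_{x ∈ Y(x, w)} (t − f(x)) x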
where Y(x, w) is the set of samples which are misclassified, η is the learning rate, t is the target class, and f(x) is the predicted class. We will use the count of misclassified points as our error rate (i.e. |Y(x, w)|). The output will also be a tab-separated (tsv) file containing the error for each iteration, i.e. it will have 100 columns. Also, it will have 2 rows, one for the normal learning rate and one for the annealing learning rate.
Now that we understand what a perceptron is, what the delta rule is, and how we are going to use it, let's get started with the Python3 implementation.
In the program, we are providing two inputs from the command line. They are:
1. data — The location of the data file.
2. output — Where to write the tsv solution to
Therefore, the program should be able to start like this:
python3 perceptron.py --data data.tsv --output solution.tsv
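For illustration, the first few rows of data.tsv might look like this (the feature values here are made up; the program only expects this layout: the target label first, followed by tab-separated numeric features):

A	0.42	1.77	3.10
B	2.95	0.13	0.75
A	1.20	2.40	0.01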
The program consists of 8 parts and we are going to have a look at them one at a time.
The import statements
import argparse # to read inputs from command line
import csv # to read and process dataset
import numpy as np # to perform mathematical functions
The code execution initializer block
# initialise argument parser, read the arguments from the command line with the
# respective flags, and then call the main() function
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--data", help="Data File")
    parser.add_argument("-o", "--output", help="output")
    main()
The main() function
def main():
    args = parser.parse_args()
    file, outputFile = args.data, args.output
    learningRate = 1
    # read the tsv dataset: the first column is the target concept (A/B),
    # the rest are features; prepend 1.0 to each row as the bias term
    with open(file) as tsvFile:
        reader = csv.reader(tsvFile, delimiter='\t')
        X = []
        Y = []
        for row in reader:
            X.append([1.0] + row[1:])
            if row[0] == 'A':
                Y.append([1])
            else:
                Y.append([0])
    n = len(X)
    X = np.array(X).astype(float)
    Y = np.array(Y).astype(float)
    # weight vector initialised to zeroes, one weight per feature (incl. bias)
    W = np.zeros(X.shape[1]).astype(float)
    W = W.reshape(X.shape[1], 1).astype(float)
    normalError = calculateNormalBatchLearning(X, Y, W, learningRate)
    annealError = calculateAnnealBatchLearning(X, Y, W, learningRate)
    # write one row of per-iteration errors for each learning-rate scheme
    with open(outputFile, 'w') as tsvFile:
        writer = csv.writer(tsvFile, delimiter='\t')
        writer.writerow(normalError)
        writer.writerow(annealError)
The flow of the main() function is as follows:
- Save the respective command line inputs into variables
- Set the starting learningRate = 1
- Read the dataset using csv and delimiter='\t', storing the independent variables in X and the dependent variable in Y. We add 1.0 to each row of the independent data as the bias term
- The independent and dependent data are converted to float
- The weight vector is initialised with zeroes, with the same dimension as a row of X
- normalError and annealError are calculated by calling their respective methods
- Finally, the output is saved into a tsv file
The calculateNormalBatchLearning() function
def calculateNormalBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e
The flow of calculateNormalBatchLearning() is as follows:
- Initialisation of a variable e to store the error count
- A loop is run for 100 iterations
- The predicted value is computed based on the perceptron rule described earlier, using the calculatePredicatedValue() method
- The error count is calculated using the calculateError() method
- The weights are updated based on the equation above, using the calculateGradient() method
The calculateAnnealBatchLearning() function
def calculateAnnealBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        learningRate = 1 / (i + 1)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e
The flow of calculateAnnealBatchLearning() is as follows:
- Initialisation of a variable e to store the error count
- A loop is run for 100 iterations
- The predicted value is computed based on the perceptron rule described earlier, using the calculatePredicatedValue() method
- The error count is calculated using the calculateError() method
- The learning rate is set to 1 / (i + 1), annealing it as the iteration number grows
- The weights are updated based on the equation above, using the calculateGradient() method
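In other words, the annealing schedule is simply η_k = 1 / k, where k is the iteration number (i + 1 in the code), so the step size shrinks as training progresses.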
The calculatePredicatedValue() function
def calculatePredicatedValue(X, W):
    f_x = np.dot(X, W)
    for i in range(len(f_x)):
        if f_x[i][0] > 0:
            f_x[i][0] = 1
        else:
            f_x[i][0] = 0
    return f_x
As described in the perceptron rule above, if the linear combination of W and X is greater than 0, then we predict the class as 1, otherwise 0.
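As a side note, the explicit loop is not strictly necessary; a minimal equivalent sketch using numpy's vectorised comparison:

def calculatePredicatedValue(X, W):
    # compute w · x for every sample at once, then threshold at 0
    return (np.dot(X, W) > 0).astype(float)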
The calculateError() function
def calculateError(Y, f_x):
    errorCount = 0
    for i in range(len(f_x)):
        if Y[i][0] != f_x[i][0]:
            errorCount += 1
    return errorCount
We count the number of instances where the predicted value and the true value do not match, and this becomes our error count.
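Likewise, this count can be written as a vectorised one-liner:

def calculateError(Y, f_x):
    # count the rows where target and prediction disagree
    return int(np.sum(Y != f_x))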
The calculateGradient() function
def calculateGradient(W, X, Y, f_x, learningRate):
    gradient = (Y - f_x) * X
    gradient = np.sum(gradient, axis=0)
    # gradient = np.array([float("{0:.4f}".format(val)) for val in gradient])
    temp = np.array(learningRate * gradient).reshape(W.shape)
    W = W + temp
    return gradient, W.astype(float)
This method is a translation of the weight update formula mentioned above.
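Spelled out, the code performs the batch update W ← W + η Σ (t − f(x)) x in three steps: (Y - f_x) * X builds the per-sample terms, np.sum(gradient, axis=0) adds them up, and W = W + temp applies the step scaled by the learning rate.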
Now that the whole code is out there, let's have a look at the execution of the program.
Here is what the output looks like: a tsv file with two rows of per-iteration error counts, the first row for the constant learning rate and the second for the annealing learning rate.
The final program
import argparse
import csv
import numpy as np

def main():
    args = parser.parse_args()
    file, outputFile = args.data, args.output
    learningRate = 1
    with open(file) as tsvFile:
        reader = csv.reader(tsvFile, delimiter='\t')
        X = []
        Y = []
        for row in reader:
            X.append([1.0] + row[1:])
            if row[0] == 'A':
                Y.append([1])
            else:
                Y.append([0])
    n = len(X)
    X = np.array(X).astype(float)
    Y = np.array(Y).astype(float)
    W = np.zeros(X.shape[1]).astype(float)
    W = W.reshape(X.shape[1], 1).astype(float)
    normalError = calculateNormalBatchLearning(X, Y, W, learningRate)
    annealError = calculateAnnealBatchLearning(X, Y, W, learningRate)
    with open(outputFile, 'w') as tsvFile:
        writer = csv.writer(tsvFile, delimiter='\t')
        writer.writerow(normalError)
        writer.writerow(annealError)

def calculateNormalBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e

def calculateAnnealBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        learningRate = 1 / (i + 1)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e

def calculateGradient(W, X, Y, f_x, learningRate):
    gradient = (Y - f_x) * X
    gradient = np.sum(gradient, axis=0)
    # gradient = np.array([float("{0:.4f}".format(val)) for val in gradient])
    temp = np.array(learningRate * gradient).reshape(W.shape)
    W = W + temp
    return gradient, W.astype(float)

def calculateError(Y, f_x):
    errorCount = 0
    for i in range(len(f_x)):
        if Y[i][0] != f_x[i][0]:
            errorCount += 1
    return errorCount

def calculatePredicatedValue(X, W):
    f_x = np.dot(X, W)
    for i in range(len(f_x)):
        if f_x[i][0] > 0:
            f_x[i][0] = 1
        else:
            f_x[i][0] = 0
    return f_x

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--data", help="Data File")
    parser.add_argument("-o", "--output", help="output")
    main()
Translated from: https://towardsdatascience.com/machine-learning-perceptron-implementation-b867016269ec