An Introduction to the SVM Machine Learning Algorithm

According to OpenCV's "Introduction to Support Vector Machines", a Support Vector Machine (SVM):

...is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.

The SVM cost function approximates the logistic regression cost with a piecewise linear function (the hinge loss). This machine learning algorithm is used for classification problems and belongs to the family of supervised learning algorithms.

The Cost Function

The cost function is used to train the SVM. Minimizing the value of J(theta) yields the parameters that fit the training data as well as the regularization allows. In the equation for J(theta), the functions cost1 and cost0 refer to the cost for an example where y=1 and the cost for an example where y=0, respectively. When kernels are used, these costs are evaluated on features computed by kernel (similarity) functions rather than on the raw inputs.
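
The equation itself is not reproduced in this text. As a minimal sketch, assume the form commonly taught with the cost1/cost0 notation: J(theta) = C * sum(y * cost1(z) + (1 - y) * cost0(z)) + 1/2 * sum(theta_j^2), with z = theta^T x, cost1(z) = max(0, 1 - z) and cost0(z) = max(0, 1 + z). The function names, the tiny data set, and the choice to leave the bias term unregularized are assumptions made for illustration.

```python
import numpy as np

def cost1(z):
    """Cost for an example with y = 1: zero once z >= 1, linear below that."""
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    """Cost for an example with y = 0: zero once z <= -1, linear above that."""
    return np.maximum(0.0, 1.0 + z)

def svm_cost(theta, X, y, C=1.0):
    """Assumed form: C * sum(y*cost1(z) + (1-y)*cost0(z)) + 0.5 * sum(theta[1:]**2).

    X has shape (m, n+1) with a leading column of ones (bias feature);
    y holds 0/1 labels. theta[0] (the bias) is not regularized in this sketch.
    """
    z = X @ theta
    data_term = np.sum(y * cost1(z) + (1 - y) * cost0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)
    return C * data_term + reg_term

# Two illustrative examples: one positive, one negative.
X = np.array([[1.0, 2.0], [1.0, -2.0]])
y = np.array([1, 0])
theta = np.array([0.0, 1.0])
print(svm_cost(theta, X, y))  # both examples are beyond the margin, so only 0.5 * 1**2 remains
```

Both cost curves are flat at zero once an example is on the correct side by at least 1, which is the piecewise linear behavior that distinguishes them from the smooth logistic cost.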

Kernels

Polynomial features tend to be computationally expensive and can increase runtime on large datasets. Instead of adding more polynomial features, it is often better to choose landmarks and measure how close each data point is to them. Each member of the training set can be treated as a landmark, and a kernel is the similarity function that measures how close an input is to a given landmark.
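
A minimal sketch of landmark-based features, assuming a Gaussian (RBF) kernel as the similarity function. The text does not name a specific kernel, so the kernel choice, the `sigma` parameter, and the function names are illustrative.

```python
import numpy as np

def gaussian_kernel(x, landmark, sigma=1.0):
    """Similarity in (0, 1]: 1 when x equals the landmark, near 0 when far away."""
    sq_dist = np.sum((x - landmark) ** 2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(X, landmarks, sigma=1.0):
    """Map each row of X to its similarities against every landmark.

    Returns an array of shape (len(X), len(landmarks)).
    """
    # Pairwise squared distances via broadcasting: (m, 1, d) - (1, k, d).
    diff = X[:, None, :] - landmarks[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

X_train = np.array([[0.0, 0.0], [3.0, 4.0]])
F = kernel_features(X_train, X_train)  # every training point acts as a landmark
print(F.shape)  # (2, 2)
```

Each row of `F` replaces the original input as the feature vector. A smaller `sigma` makes similarity fall off faster with distance, so each landmark influences a smaller neighborhood.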

Large Margin Classifier

An SVM finds the line or hyperplane that splits the data with the largest possible margin. Outliers can pull the boundary toward them, but a sufficiently small value of C strengthens regularization, so the boundary is less sensitive to individual points.
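
For a linear boundary w.x + b = 0 scaled so that the closest points satisfy |w.x + b| = 1, the margin width is 2 / ||w||. A short sketch of that geometry follows; the weights, bias, and points are hypothetical values chosen for illustration.

```python
import numpy as np

def signed_distance(X, w, b):
    """Signed distance of each row of X from the hyperplane w.x + b = 0."""
    return (X @ w + b) / np.linalg.norm(w)

def margin_width(w):
    """Distance between the planes w.x + b = 1 and w.x + b = -1."""
    return 2.0 / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # illustrative weights with ||w|| = 5
b = -5.0
X = np.array([[3.0, 4.0], [0.0, 0.0]])

print(signed_distance(X, w, b))  # distances 4 and -1: the points lie on opposite sides
print(margin_width(w))           # 0.4
```

Because the width is inversely proportional to ||w||, shrinking the weights widens the margin. This is why stronger regularization (a smaller C) favors a wider margin while tolerating some points that violate it.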

The following Python code trains a linear multiclass SVM, makes predictions, and computes accuracy:

import numpy as np


class Svm(object):
    """Svm classifier."""

    def __init__(self, inputDim, outputDim):
        # Generate a random weight matrix from a standard normal
        # distribution scaled to a standard deviation of 0.01.
        sigma = 0.01
        self.W = sigma * np.random.randn(inputDim, outputDim)

    def calLoss(self, x, y, reg):
        """
        Svm loss function.

        D: Input dimension.
        C: Number of classes.
        N: Number of examples.

        Inputs:
        - x: A numpy array of shape (N, D).
        - y: A numpy array of shape (N,) where value < C.
        - reg: (float) regularization strength.

        Returns a tuple of:
        - loss as a single float.
        - gradient with respect to weights self.W (dW), same shape as self.W.
        """
        numTrain = x.shape[0]
        rows = np.arange(numTrain)

        # Score matrix of shape (N, C).
        s = x.dot(self.W)
        # Score of the correct class for each example.
        s_yi = s[rows, y]
        # Margin of every class relative to the correct class.
        delta = s - s_yi[:, np.newaxis] + 1
        # Hinge loss per example; the correct class contributes nothing.
        loss_i = np.maximum(0, delta)
        loss_i[rows, y] = 0
        loss = np.sum(loss_i) / numTrain
        # Add L2 regularization.
        loss += reg * np.sum(self.W * self.W)

        # Gradient of the loss with respect to the scores.
        ds = np.zeros_like(delta)
        ds[delta > 0] = 1
        ds[rows, y] = 0
        ds[rows, y] = -np.sum(ds, axis=1)
        # Propagate to the weights and add the regularization gradient.
        dW = (1 / numTrain) * x.T.dot(ds)
        dW = dW + 2 * reg * self.W
        return loss, dW

    def train(self, x, y, lr=1e-3, reg=1e-5, iter=100, batchSize=200, verbose=False):
        """
        Train this Svm classifier using stochastic gradient descent.

        Inputs:
        - x: training data of shape (N, D).
        - y: output data of shape (N,) where value < C.
        - lr: (float) learning rate for optimization.
        - reg: (float) regularization strength.
        - iter: (integer) total number of iterations.
        - batchSize: (integer) number of examples in each batch.
        - verbose: (boolean) print a log of the loss.

        Outputs:
        A list containing the value of the loss at each training iteration.
        """
        lossHistory = []
        for i in range(iter):
            # Sample a batch (with replacement) from the training data.
            batchIdx = np.random.choice(x.shape[0], batchSize)
            xBatch = x[batchIdx]   # shape (batchSize, D)
            yBatch = y[batchIdx]   # shape (batchSize,)

            # Gradient descent step on the sampled batch.
            loss, dW = self.calLoss(xBatch, yBatch, reg)
            self.W = self.W - lr * dW
            lossHistory.append(loss)

            # Print the loss every 100 iterations.
            if verbose and i % 100 == 0:
                print('Loop {0} loss {1}'.format(i, loss))
        return lossHistory

    def predict(self, x):
        """
        Predict the y output.

        Inputs:
        - x: input data of shape (N, D).

        Returns:
        - yPred: output data of shape (N,) where value < C.
        """
        # The predicted class is the one with the highest score.
        s = x.dot(self.W)
        yPred = np.argmax(s, axis=1)
        return yPred

    def calAccuracy(self, x, y):
        # Percentage of examples whose predicted label matches y.
        yPred = self.predict(x)
        acc = np.mean(y == yPred) * 100
        return acc
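
One way to gain confidence in a vectorized gradient like the one in calLoss is to compare it with a finite-difference estimate. Below is a minimal sketch, assuming a standalone function `svm_loss` that mirrors the loss above. The function names, the random test data, and the step size `h` are illustrative and not part of the original code.

```python
import numpy as np

def svm_loss(W, x, y, reg):
    """Multiclass hinge loss with L2 regularization; returns (loss, dW)."""
    n = x.shape[0]
    rows = np.arange(n)
    s = x.dot(W)
    delta = s - s[rows, y][:, np.newaxis] + 1
    delta[rows, y] = 0                      # the correct class contributes no loss
    loss = np.sum(np.maximum(0, delta)) / n + reg * np.sum(W * W)
    ds = (delta > 0).astype(float)
    ds[rows, y] = -np.sum(ds, axis=1)
    dW = x.T.dot(ds) / n + 2 * reg * W
    return loss, dW

def numerical_gradient(f, W, h=1e-5):
    """Central-difference estimate of df/dW, one entry at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        plus = f(W)
        W[idx] = old - h
        minus = f(W)
        W[idx] = old                        # restore the original weight
        grad[idx] = (plus - minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 5))            # 20 examples, 5 features
y = rng.integers(0, 3, size=20)             # 3 classes
W = 0.01 * rng.standard_normal((5, 3))

loss, dW = svm_loss(W, x, y, reg=1e-3)
dW_num = numerical_gradient(lambda w: svm_loss(w, x, y, 1e-3)[0], W)
max_err = np.max(np.abs(dW - dW_num))
print(max_err)  # should be tiny if the analytic gradient is right
```

The hinge loss is not differentiable where a margin is exactly zero, so a check like this can report spurious mismatches if a sample happens to sit on such a kink. Small random weights, as used here, keep the margins well away from zero.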