01.神经网络和深度学习 W2.神经网络基础(作业:逻辑回归 图片识别)

文章目录

    • 编程题 1
      • 1. numpy 基本函数
        • 1.1 编写 sigmoid 函数
        • 1.2 编写 sigmoid 函数的导数
        • 1.3 reshape操作
        • 1.4 标准化
        • 1.5 广播机制
      • 2. 向量化
        • 2.1 L1\L2损失函数
    • 编程题 2. 图片🐱识别
      • 1. 导入包
      • 2. 数据预览
      • 3. 算法的一般结构
      • 4. 建立算法
        • 4.1 辅助函数
        • 4.2 初始化参数
        • 4.3 前向后向传播
        • 4.4 更新参数,梯度下降
        • 4.5 合并所有函数到Model
        • 4.6 分析
        • 4.7 用自己的照片测试模型
      • 5. 总结

选择题测试,请参考 链接博文

编程题 1

1. numpy 基本函数

1.1 编写 sigmoid 函数

import mathdef basic_sigmoid(x):"""Compute sigmoid of x.Arguments:x -- A scalarReturn:s -- sigmoid(x)"""### START CODE HERE ### (≈ 1 line of code)s = 1/(1+math.pow(math.e, -x))
# or  s = 1/(1+math.exp(-x))### END CODE HERE ###return s
  • 不推荐使用 math 包,因为深度学习里很多都是向量,math 包不能对向量进行计算
### One reason why we use "numpy" instead of "math" in Deep Learning ###
x = [1, 2, 3]
basic_sigmoid(x) # you will see this give an error when you run it, because x is a vector.
# 会报错!
import numpy as np# example of np.exp
x = np.array([1, 2, 3])
print(np.exp(x)) # result is (exp(1), exp(2), exp(3))
# [ 2.71828183  7.3890561  20.08553692]
# numpy 可以对向量进行操作
  • 使用 numpy 编写的 sigmoid 函数
import numpy as np # this means you can access numpy functions by writing np.function() instead of numpy.function()def sigmoid(x):"""Compute the sigmoid of xArguments:x -- A scalar or numpy array of any sizeReturn:s -- sigmoid(x)"""### START CODE HERE ### (≈ 1 line of code)s = 1/(1+np.exp(-x))### END CODE HERE ###return s
x = np.array([1, 2, 3])
sigmoid(x)
# array([0.73105858, 0.88079708, 0.95257413])

1.2 编写 sigmoid 函数的导数

# GRADED FUNCTION: sigmoid_derivativedef sigmoid_derivative(x):"""Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.You can store the output of the sigmoid function into variables and then use it to calculate the gradient.Arguments:x -- A scalar or numpy arrayReturn:ds -- Your computed gradient."""### START CODE HERE ### (≈ 2 lines of code)s = sigmoid(x)ds = s*(1-s)### END CODE HERE ###return ds
x = np.array([1, 2, 3])
sigmoid_derivative(x)
print ("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
# sigmoid_derivative(x) = [0.19661193 0.10499359 0.04517666]

1.3 reshape操作

将照片的数据展平,不想计算的维,可以置为 -1,会自动计算

# GRADED FUNCTION: image2vector
def image2vector(image):"""Argument:image -- a numpy array of shape (length, height, depth)Returns:v -- a vector of shape (length*height*depth, 1)"""### START CODE HERE ### (≈ 1 line of code)v = image.reshape(-1,1)### END CODE HERE ###return v
# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y,3) where 3 represents the RGB values
image = np.array([[[ 0.67826139,  0.29380381],[ 0.90714982,  0.52835647],[ 0.4215251 ,  0.45017551]],[[ 0.92814219,  0.96677647],[ 0.85304703,  0.52351845],[ 0.19981397,  0.27417313]],[[ 0.60659855,  0.00533165],[ 0.10820313,  0.49978937],[ 0.34144279,  0.94630077]]])print ("image2vector(image) = " + str(image2vector(image)))
# 输出
image2vector(image) = [[0.67826139][0.29380381][0.90714982][0.52835647][0.4215251 ][0.45017551][0.92814219][0.96677647][0.85304703][0.52351845][0.19981397][0.27417313][0.60659855][0.00533165][0.10820313][0.49978937][0.34144279][0.94630077]]

1.4 标准化

标准化通常使得梯度下降收敛更快。

举个例子 x=[034264]x = \begin{bmatrix} 0 & 3 & 4 \\ 2 & 6 & 4 \\ \end{bmatrix}x=[023644]
那么 ∥x∥=np.linalg.norm(x,axis=1,keepdims=True)=[556]\| x\| = np.linalg.norm(x, axis = 1, keepdims = True) = \begin{bmatrix} 5 \\ \sqrt{56} \\ \end{bmatrix}x=np.linalg.norm(x,axis=1,keepdims=True)=[556]
x_normalized=x∥x∥=[03545256656456]x\_normalized = \frac{x}{\| x\|} = \begin{bmatrix} 0 & \frac{3}{5} & \frac{4}{5} \\ \frac{2}{\sqrt{56}} & \frac{6}{\sqrt{56}} & \frac{4}{\sqrt{56}} \\ \end{bmatrix}x_normalized=xx=[05625356654564]

# GRADED FUNCTION: normalizeRowsdef normalizeRows(x):"""Implement a function that normalizes each row of the matrix x (to have unit length).Argument:x -- A numpy matrix of shape (n, m)Returns:x -- The normalized (by row) numpy matrix. You are allowed to modify x."""### START CODE HERE ### (≈ 2 lines of code)# Compute x_norm as the norm 2 of x. Use np.linalg.norm(..., ord = 2, axis = ..., keepdims = True)x_norm = np.linalg.norm(x, axis=1, keepdims=True)# Divide x by its norm.x = x/x_norm### END CODE HERE ###return x
x = np.array([[0, 3, 4],[1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
# normalizeRows(x) = [[0.         0.6        0.8       ]
#					[0.13736056 0.82416338 0.54944226]]

1.5 广播机制

官方文档

对于行向量 x∈R1×n, softmax(x)=softmax([x1x2...xn])=[ex1∑jexjex2∑jexj...exn∑jexj]x \in \mathbb{R}^{1\times n} \text{, } softmax(x) = softmax(\begin{bmatrix} x_1 && x_2 && ... && x_n \end{bmatrix}) = \begin{bmatrix} \frac{e^{x_1}}{\sum_{j}e^{x_j}} && \frac{e^{x_2}}{\sum_{j}e^{x_j}} && ... && \frac{e^{x_n}}{\sum_{j}e^{x_j}} \end{bmatrix}xR1×nsoftmax(x)=softmax([x1x2...xn])=[jexjex1jexjex2...jexjexn]

对于矩阵 xxx ∈Rm×n\in \mathbb{R}^{m \times n}Rm×n
xijx_{ij}xij maps to the element in the ithi^{th}ith row and jthj^{th}jth column of xxx, thus we have

softmax(x)=softmax[x11x12x13…x1nx21x22x23…x2n⋮⋮⋮⋱⋮xm1xm2xm3…xmn]=[ex11∑jex1jex12∑jex1jex13∑jex1j…ex1n∑jex1jex21∑jex2jex22∑jex2jex23∑jex2j…ex2n∑jex2j⋮⋮⋮⋱⋮exm1∑jexmjexm2∑jexmjexm3∑jexmj…exmn∑jexmj]=(softmax(first row of x)softmax(second row of x)...softmax(last row of x))softmax(x) = softmax\begin{bmatrix} x_{11} & x_{12} & x_{13} & \dots & x_{1n} \\ x_{21} & x_{22} & x_{23} & \dots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \dots & x_{mn} \end{bmatrix} = \begin{bmatrix} \frac{e^{x_{11}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{12}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{13}}}{\sum_{j}e^{x_{1j}}} & \dots & \frac{e^{x_{1n}}}{\sum_{j}e^{x_{1j}}} \\ \frac{e^{x_{21}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{22}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{23}}}{\sum_{j}e^{x_{2j}}} & \dots & \frac{e^{x_{2n}}}{\sum_{j}e^{x_{2j}}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{e^{x_{m1}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m2}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m3}}}{\sum_{j}e^{x_{mj}}} & \dots & \frac{e^{x_{mn}}}{\sum_{j}e^{x_{mj}}} \end{bmatrix} = \begin{pmatrix} softmax\text{(first row of x)} \\ softmax\text{(second row of x)} \\ ... \\ softmax\text{(last row of x)} \\ \end{pmatrix} softmax(x)=softmaxx11x21xm1x12x22xm2x13x23xm3x1nx2nxmn=jex1jex11jex2jex21jexmjexm1jex1jex12jex2jex22jexmjexm2jex1jex13jex2jex23jexmjexm3jex1jex1njex2jex2njexmjexmn=softmax(first row of x)softmax(second row of x)...softmax(last row of x)

# GRADED FUNCTION: softmaxdef softmax(x):"""Calculates the softmax for each row of the input x.Your code should work for a row vector and also for matrices of shape (n, m).Argument:x -- A numpy matrix of shape (n,m)Returns:s -- A numpy matrix equal to the softmax of x, of shape (n,m)"""### START CODE HERE ### (≈ 3 lines of code)# Apply exp() element-wise to x. Use np.exp(...).x_exp = np.exp(x)# Create a vector x_sum that sums each row of x_exp. Use np.sum(..., axis = 1, keepdims = True).x_sum = np.sum(x_exp, axis=1, keepdims=True)# Compute softmax(x) by dividing x_exp by x_sum. It should automatically use numpy broadcasting.s = x_exp/x_sum### END CODE HERE ###return s
x = np.array([[9, 2, 5, 0, 0],[7, 5, 0, 0 ,0]])
print("softmax(x) = " + str(softmax(x)))
softmax(x) = [[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-041.21052389e-04][8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-048.01252314e-04]]

2. 向量化

向量化计算更简洁,更高效

2.1 L1\L2损失函数

L1(y^,y)=∑i=0m∣y(i)−y^(i)∣\begin{aligned} & L_1(\hat{y}, y) = \sum_{i=0}^m|y^{(i)} - \hat{y}^{(i)}| \end{aligned}L1(y^,y)=i=0my(i)y^(i)

def L1(yhat, y):"""Arguments:yhat -- vector of size m (predicted labels)y -- vector of size m (true labels)Returns:loss -- the value of the L1 loss function defined above"""### START CODE HERE ### (≈ 1 line of code)loss = np.sum(abs(yhat-y))### END CODE HERE ###return loss
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat,y)))
# L1 = 1.1

L2(y^,y)=∑i=0m(y(i)−y^(i))2\begin{aligned} & L_2(\hat{y},y) = \sum_{i=0}^m(y^{(i)} - \hat{y}^{(i)})^2 \end{aligned}L2(y^,y)=i=0m(y(i)y^(i))2

import numpy as np
a = np.array([1, 2, 3])
np.dot(a, a)
14

# GRADED FUNCTION: L2def L2(yhat, y):"""Arguments:yhat -- vector of size m (predicted labels)y -- vector of size m (true labels)Returns:loss -- the value of the L2 loss function defined above"""### START CODE HERE ### (≈ 1 line of code)loss = np.dot(yhat-y, yhat-y)### END CODE HERE ###return loss
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat,y)))
# L2 = 0.43

编程题 2. 图片🐱识别

使用神经网络识别猫

1. 导入包

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset%matplotlib inline

2. 数据预览

弄清楚数据的维度
reshape 数据
标准化数据

有训练集,标签为 y = 1 是猫,y = 0 不是猫
有测试集,带标签的
每个图片是 3 通道的

  • 读取数据
# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
  • 预览图片
# Example of a picture
index = 24
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
y = [1], it's a 'cat' picture.

cat

  • 数据大小
### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)
  • 将样本图片矩阵展平
# Reshape the training and test examples### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T
test_set_x_flatten = test_set_x_orig.reshape(m_test, -1).T
### END CODE HERE ###print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))
train_set_x_flatten shape: (12288, 209)
train_set_y shape: (1, 209)
test_set_x_flatten shape: (12288, 50)
test_set_y shape: (1, 50)
sanity check after reshaping: [17 31 56 22 33]
  • 图片的矩阵数值为 0 - 255,标准化数据
train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

3. 算法的一般结构

用神经网络的思路,建立一个 Logistic 回归

4. 建立算法

定义模型结构(如,输入的特征个数)
初始化模型参数
循环迭代:

  1. 计算当前损失(前向传播)
  2. 计算当前梯度(后向传播)
  3. 更新参数(梯度下降)

4.1 辅助函数

  • sigmoid 函数
# GRADED FUNCTION: sigmoiddef sigmoid(z):"""Compute the sigmoid of zArguments:z -- A scalar or numpy array of any size.Return:s -- sigmoid(z)"""### START CODE HERE ### (≈ 1 line of code)s = 1/(1+np.exp(-z))### END CODE HERE ###return s

4.2 初始化参数

逻辑回归的参数可以都设置为 0(神经网络不可以)

# GRADED FUNCTION: initialize_with_zerosdef initialize_with_zeros(dim):"""This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.Argument:dim -- size of the w vector we want (or number of parameters in this case)Returns:w -- initialized vector of shape (dim, 1)b -- initialized scalar (corresponds to the bias)"""### START CODE HERE ### (≈ 1 line of code)w = np.zeros((dim, 1))b = 0### END CODE HERE ###assert(w.shape == (dim, 1))assert(isinstance(b, float) or isinstance(b, int))return w, b

4.3 前向后向传播

前向传播:

  • XXX 特征
  • 计算 A=σ(wTX+b)=(a(0),a(1),...,a(m−1),a(m))A = \sigma(w^T X + b) = (a^{(0)}, a^{(1)}, ..., a^{(m-1)}, a^{(m)})A=σ(wTX+b)=(a(0),a(1),...,a(m1),a(m))
  • 计算损失函数: J=−1m∑i=1my(i)log⁡(a(i))+(1−y(i))log⁡(1−a(i))J = -\frac{1}{m}\sum_{i=1}^{m}y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})J=m1i=1my(i)log(a(i))+(1y(i))log(1a(i))

方程:

∂J∂w=1mX(A−Y)T\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^TwJ=m1X(AY)T
∂J∂b=1m∑i=1m(a(i)−y(i))\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})bJ=m1i=1m(a(i)y(i))

# GRADED FUNCTION: propagatedef propagate(w, b, X, Y):"""Implement the cost function and its gradient for the propagation explained aboveArguments:w -- weights, a numpy array of size (num_px * num_px * 3, 1)b -- bias, a scalarX -- data of size (num_px * num_px * 3, number of examples)Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)Return:cost -- negative log-likelihood cost for logistic regressiondw -- gradient of the loss with respect to w, thus same shape as wdb -- gradient of the loss with respect to b, thus same shape as bTips:- Write your code step by step for the propagation. np.log(), np.dot()"""m = X.shape[1]# FORWARD PROPAGATION (FROM X TO COST)### START CODE HERE ### (≈ 2 lines of code)A = sigmoid(np.dot(w.T, X)+b)             # compute activation# w 是列向量, A 行向量,dot 矩阵乘法cost = np.sum(Y*np.log(A)+(1-Y)*np.log(1-A))/(-m)  # compute cost# Y 行向量,* 对应位置相乘### END CODE HERE #### BACKWARD PROPAGATION (TO FIND GRAD)### START CODE HERE ### (≈ 2 lines of code)dw = np.dot(X, (A-Y).T)/mdb = np.sum(A-Y, axis=1)/m### END CODE HERE ###assert(dw.shape == w.shape)assert(db.dtype == float)cost = np.squeeze(cost)assert(cost.shape == ())grads = {"dw": dw,"db": db}return grads, cost
w, b, X, Y = np.array([[1],[2]]), 2, np.array([[1,2],[3,4]]), np.array([[1,0]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))
dw = [[0.99993216][1.99980262]]
db = [0.49993523]
cost = 6.000064773192205

4.4 更新参数,梯度下降

# GRADED FUNCTION: optimizedef optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):"""This function optimizes w and b by running a gradient descent algorithmArguments:w -- weights, a numpy array of size (num_px * num_px * 3, 1)b -- bias, a scalarX -- data of shape (num_px * num_px * 3, number of examples)Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)num_iterations -- number of iterations of the optimization looplearning_rate -- learning rate of the gradient descent update ruleprint_cost -- True to print the loss every 100 stepsReturns:params -- dictionary containing the weights w and bias bgrads -- dictionary containing the gradients of the weights and bias with respect to the cost functioncosts -- list of all the costs computed during the optimization, this will be used to plot the learning curve.Tips:You basically need to write down two steps and iterate through them:1) Calculate the cost and the gradient for the current parameters. Use propagate().2) Update the parameters using gradient descent rule for w and b."""costs = []for i in range(num_iterations):# Cost and gradient calculation (≈ 1-4 lines of code)### START CODE HERE ### grads, cost = propagate(w, b, X, Y)### END CODE HERE #### Retrieve derivatives from gradsdw = grads["dw"]db = grads["db"]# update rule (≈ 2 lines of code)### START CODE HERE ###w = w - learning_rate * dwb = b - learning_rate * db### END CODE HERE #### Record the costsif i % 100 == 0:costs.append(cost)# Print the cost every 100 training examplesif print_cost and i % 100 == 0:print ("Cost after iteration %i: %f" %(i, cost))params = {"w": w,"b": b}grads = {"dw": dw,"db": db}return params, grads, costs
params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
w = [[0.1124579 ][0.23106775]]
b = [1.55930492]
dw = [[0.90158428][1.76250842]]
db = [0.43046207]
  • 可以利用学习到的参数来进行预测

计算预测值 Y^=A=σ(wTX+b)\hat{Y} = A = \sigma(w^T X + b)Y^=A=σ(wTX+b)
根据预测值进行分类,<= 0.5 标记为0,否则为1

# GRADED FUNCTION: predictdef predict(w, b, X):'''Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)Arguments:w -- weights, a numpy array of size (num_px * num_px * 3, 1)b -- bias, a scalarX -- data of size (num_px * num_px * 3, number of examples)Returns:Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X'''m = X.shape[1]Y_prediction = np.zeros((1,m))w = w.reshape(X.shape[0], 1)# Compute vector "A" predicting the probabilities of a cat being present in the picture### START CODE HERE ### (≈ 1 line of code)A = sigmoid(np.dot(w.T, X) + b)### END CODE HERE ###for i in range(A.shape[1]):# Convert probabilities A[0,i] to actual predictions p[0,i]### START CODE HERE ### (≈ 4 lines of code)Y_prediction[0][i] = 0 if A[0][i] <= 0.5 else 1### END CODE HERE ###assert(Y_prediction.shape == (1, m))return Y_prediction
print ("predictions = " + str(predict(w, b, X)))
predictions = [[1. 1.]]

4.5 合并所有函数到Model

# GRADED FUNCTION: modeldef model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):"""Builds the logistic regression model by calling the function you've implemented previouslyArguments:X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)num_iterations -- hyperparameter representing the number of iterations to optimize the parameterslearning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()print_cost -- Set to true to print the cost every 100 iterationsReturns:d -- dictionary containing information about the model."""### START CODE HERE #### initialize parameters with zeros (≈ 1 line of code)w, b = initialize_with_zeros(X_train.shape[0])# Gradient descent (≈ 1 line of code)parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost = print_cost)# Retrieve parameters w and b from dictionary "parameters"w = parameters["w"]b = parameters["b"]# Predict test/train set examples (≈ 2 lines of code)Y_prediction_test = predict(w, b, X_test)Y_prediction_train = predict(w, b, X_train)### END CODE HERE #### Print train/test Errorsprint("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))d = {"costs": costs,"Y_prediction_test": Y_prediction_test, "Y_prediction_train" : Y_prediction_train, "w" : w, "b" : b,"learning_rate" : learning_rate,"num_iterations": num_iterations}return d
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.584508
Cost after iteration 200: 0.466949
Cost after iteration 300: 0.376007
Cost after iteration 400: 0.331463
Cost after iteration 500: 0.303273
Cost after iteration 600: 0.279880
Cost after iteration 700: 0.260042
Cost after iteration 800: 0.242941
Cost after iteration 900: 0.228004
Cost after iteration 1000: 0.214820
Cost after iteration 1100: 0.203078
Cost after iteration 1200: 0.192544
Cost after iteration 1300: 0.183033
Cost after iteration 1400: 0.174399
Cost after iteration 1500: 0.166521
Cost after iteration 1600: 0.159305
Cost after iteration 1700: 0.152667
Cost after iteration 1800: 0.146542
Cost after iteration 1900: 0.140872
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %
  • 模型在训练集上表现的很好,在测试集上一般,存在过拟合现象
# Example of a picture that was wrongly classified.
index = 24
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0,index])].decode("utf-8") +  "\" picture.")
y = 1, you predicted that it is a "cat" picture.

测试结果
更改 index 可以查看 测试集的 预测值和真实值

  • 绘制代价函数、梯度
# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

在这里插入图片描述

  • 增加训练迭代次数为 3000(上面是2000)
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %

在这里插入图片描述

训练集上的准确率上升,但是测试集上准确率下降,这就是过拟合了

4.6 分析

  • 不同学习率下的对比
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:print ("learning rate is: " + str(i))models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)print ('\n' + "-------------------------------------------------------" + '\n')for i in learning_rates:plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))plt.ylabel('cost')
plt.xlabel('iterations')legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
learning rate is: 0.01
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %-------------------------------------------------------learning rate is: 0.001
train accuracy: 88.99521531100478 %
test accuracy: 64.0 %-------------------------------------------------------learning rate is: 0.0001
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %-------------------------------------------------------

在这里插入图片描述

  • 学习率太大的话,容易引起震荡,导致不收敛(本例子0.01,不算太坏,最后收敛了)
  • 低的cost不意味着好的模型,要检查是否过拟合(训练集很好,测试集很差)

4.7 用自己的照片测试模型

## START CODE HERE ## (PUT YOUR IMAGE NAME) 
my_image = "cat1.jpg"   # change this to the name of your image file 
## END CODE HERE ### We preprocess the image to fit your algorithm.
fname = "images/" + my_image
image = Image.open(fname)
my_image = np.array(image.resize((num_px, num_px))).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.")

在这里插入图片描述
在这里插入图片描述

5. 总结

  • 处理数据很重要,数据维度,数据标准化
  • 各个独立的函数,初始化,前后向传播,梯度下降更新参数
  • 组成模型
  • 调节学习率等超参数

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/474200.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

PL/SQL程序设计以及安全管理实验遇到的问题及解决

问题一&#xff1a;当我书写PL/SQL语句调用所创建的函数时&#xff0c;报“此范围不存在名为XXX函数名”的错误。 解决&#xff1a; 我通过查阅相关资料&#xff0c;了解到&#xff1a;这种情况主要是调用的函数的参数或者函数名书写错误&#xff0c; 然而&#xff0c;我经过仔…

PowerDesigner使用教程 —— 概念数据模型 (转)

一、概念数据模型概述 概念数据模型也称信息模型&#xff0c;它以实体&#xff0d;联系(Entity-RelationShip,简称E-R)理论为基础&#xff0c;并对这一理论进行了扩充。它从用户的观点出发对信息进行建模&#xff0c;主要用于数据库的概念级设计。 通常人们先将现实世界抽…

进程的创建-Process⼦类

from multiprocessing import Process&#xff08;P必须大写 import os import time classSubProcess(Process): """创建Process的子类""" def __init__(self, num, a): super(SubProcess, self).__init__() # 执行父类Process默认的初始化方法…

阿里云 超级码力在线编程大赛初赛 第1场(第245名)

文章目录1. 比赛结果2. 题目1. 树木规划2. 正三角形拼接3. 大楼间穿梭4. 对称前后缀1. 比赛结果 通过了 3 题&#xff0c;第245名&#xff0c;进入复赛了&#xff0c;收获 T恤 一件&#xff0c;哈哈。 2. 题目 1. 树木规划 题目链接 描述 在一条直的马路上&#xff0c;…

python中的进程池Pool

初始化Pool时&#xff0c;可以指定⼀个最大进程池&#xff0c;当有新进程提交时&#xff0c;如果池还没有满&#xff0c;那么就会创建新进程请求&#xff1b;但如果池中达到最大值&#xff0c;那么就会等待&#xff0c;待池中有进程结束&#xff0c;新进程来执行。 非阻塞式&a…

AS3.0面向对象的写法,类和实例

package /*package是包路径&#xff0c;例如AS文件在ActionScript文件夹下&#xff0c;此时路径应为package ActionScript。必须有的。package中只能有一个class&#xff0c;在一个AS文件中可以有若干个package*/ {public class hello /*类的名字*/{public var helloString:Str…

小小算法题(CCF)

题目 淘金 题目描述 在一片n*m的土地上&#xff0c;每一块1*1的区域里都有一定数量的金子。这一天&#xff0c;你到这里来淘金&#xff0c;然而当地人告诉你&#xff0c;如果你要挖(x, y)区域的金子&#xff0c;就不能挖(x-1&#xff0c;y),(x1, y)以及横坐标为y-1和y1的金…

01.神经网络和深度学习 W3.浅层神经网络(作业:带一个隐藏层的神经网络)

文章目录1. 导入包2. 预览数据3. 逻辑回归4. 神经网络4.1 定义神经网络结构4.2 初始化模型参数4.3 循环4.3.1 前向传播4.3.2 计算损失4.3.3 后向传播4.3.4 梯度下降4.4 组建Model4.5 预测4.6 调节隐藏层单元个数4.7 更改激活函数4.8 更改学习率4.9 其他数据集下的表现选择题测试…

python中的异步与同步

异步&#xff1a; 多任务&#xff0c; 多个任务之间执行没有先后顺序&#xff0c;可以同时运行&#xff0c;执行的先后顺序不会有什么影响&#xff0c;存在的多条运行主线 同步&#xff1a; 多任务&#xff0c; 多个任务之间执行的时候要求有先后顺序&#xff0c;必须一个先执行…

C语言指针祥讲

前言:复杂类型说明 要了解指针,多多少少会出现一些比较复杂的类型,所以我先介绍一下如何完全理解一个复杂类型,要理解复杂类型其实很简单,一个类型里会出现很多运算符,他们也像普通的表达式一样,有优先级,其优先级和运算优先级一样,所以我总结了一下其原则:从变量名处起,根…

编程挑战(6)

组合算法&#xff1a;开一个数组&#xff0c;其下标表示1到m个数&#xff0c;数组元素的值为1表示其下标代表的数被选中&#xff0c;为0则没有选中。 首先初始化&#xff0c;将数组前n个元素置1&#xff0c;表示第一个组合为前n个数&#xff1b;然后从左到右扫描数组元素值的“…

[编程启蒙游戏] 2. 奇偶数

文章目录1. 游戏前提2. 游戏目的3. python代码1. 游戏前提 孩子知道奇偶数是什么&#xff0c;不知道也没关系 还可以采用掰手指演示&#xff0c;伸出两个手指能配对&#xff0c;所有伸出来的手指都两两配对了&#xff0c;伸出来的手指个数就是偶数如果还有1个没有找到朋友的手…

进程间通信-Queue 消息队列 先进先出

Process之间有时需要通信&#xff0c;操作系统提供了很多机制来实现进程间的通信。 multiprocessing模块的Queue实现多进程之间的数据传递&#xff0c;Queue本身是一个消息列队程序 初始化Queue()对象时&#xff08;例如&#xff1a;qQueue()&#xff09;&#xff0c;若括号中没…

过压保护(1)

征一个简单、可靠的电源过压保护电路 http://www.amobbs.com/thread-5542005-1-1.html 防过压&#xff1a;过压之后TVS导通&#xff0c;电流由正极流经自恢复保险再流经TVS到负极&#xff0c;自恢复保险升温&#xff0c;阻值变大&#xff0c;相当于断开&#xff0c;等电流撤去&…

spring boot+thmyleaf ModelAndView页面传值

如上图所示&#xff0c;当我们从后台通过ModelAndView进行传值的时候&#xff0c; 一定要注意&#xff0c;千万不要向上图那样开头加上反斜杠&#xff0c;开头加反斜杠&#xff0c;系统会默认为相对路径&#xff0c; 虽然也能找到相应的视图&#xff08;html&#xff09;&#…

python中的进程, 线程

进程是操作系统分配资源的最小单位 线程是操作系统调度执行的最小单位 定义的不同 进程是系统进行资源分配的最小单位. 线程是进程的一个实体,是CPU进行调度的基本单位,它是比进程更小的能独立运行的基本单位.线程自己基本上不拥有系统资源,只拥有一点在运行中必不可少的资源(…

LeetCode 214. 最短回文串(字符串哈希)

文章目录1. 题目2. 解题1. 题目 给定一个字符串 s&#xff0c;你可以通过在字符串前面添加字符将其转换为回文串。 找到并返回可以用这种方式转换的最短回文串。 示例 1: 输入: "aacecaaa" 输出: "aaacecaaa"示例 2: 输入: "abcd" 输出: "…

转:c#委托事件实现窗体传值通信

C#实现Winform窗口间数据交互的三种方法介绍 2010-03-15 来自&#xff1a;CNBLOG 字体大小&#xff1a;【大 中 小】摘要&#xff1a;本文分别介绍C#实现Winform窗口间数据交互的三种方法&#xff1a;修改子窗体的构造函数、给窗体添加属性或方法、通过委托的方法&#xff0c…

python中的多线程-threading

python的thread模块是比较底层的模块&#xff0c;python的threading模块是对thread做了一些包装的&#xff0c;可以更加方便的被使用 创建多线程&#xff1a; from threading import Thread import time def sing(): for i in range(3): print("唱歌") time.sleep(…

LeetCode 1566. 重复至少 K 次且长度为 M 的模式

文章目录1. 题目2. 解题1. 题目 给你一个正整数数组 arr&#xff0c;请你找出一个长度为 m 且在数组中至少重复 k 次的模式。 模式 是由一个或多个值组成的子数组&#xff08;连续的子序列&#xff09;&#xff0c;连续 重复多次但 不重叠 。 模式由其长度和重复次数定义。 …