01. Neural Networks and Deep Learning, W4. Deep Neural Networks (Assignments: Building your Deep Neural Network + cat image prediction)

Contents

    • Assignment 1. Building your Deep Neural Network
      • 1. Import packages
      • 2. Outline of the algorithm
      • 3. Initialization
        • 3.1 Two-layer neural network
        • 3.2 L-layer neural network
      • 4. Forward propagation
        • 4.1 Linear module
        • 4.2 Linear-activation module
        • 4.3 L-layer model
      • 5. Cost function
      • 6. Backward propagation
        • 6.1 Linear module
        • 6.2 Linear-activation module
        • 6.3 L-layer model
        • 6.4 Gradient descent: update parameters
    • Assignment 2. Deep Neural Network Application: Image Classification
      • 1. Import packages
      • 2. Dataset
      • 3. Building the model
        • 3.1 Two-layer neural network
        • 3.2 L-layer neural network
        • 3.3 General methodology
      • 4. Two-layer neural network
      • 5. L-layer neural network
      • 6. Results analysis
      • 7. Test with your own image

Quiz: see the companion blog post.

Assignment 1. Building your Deep Neural Network

1. Import packages

import numpy as np
import h5py
import matplotlib.pyplot as plt
from testCases_v2 import *
from dnn_utils_v2 import sigmoid, sigmoid_backward, relu, relu_backward

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

2. Outline of the algorithm

[Figure: outline of the algorithm]

3. Initialization

Lecture notes for this week: 01. Neural Networks and Deep Learning, W4. Deep Neural Networks

3.1 Two-layer neural network

Model structure: LINEAR -> RELU -> LINEAR -> SIGMOID
Weights: np.random.randn(shape) * 0.01
Biases: np.zeros(shape)

# GRADED FUNCTION: initialize_parameters

def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(1)

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert(W1.shape == (n_h, n_x))
    assert(b1.shape == (n_h, 1))
    assert(W2.shape == (n_y, n_h))
    assert(b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
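Not part of the graded notebook: a quick sanity check with toy layer sizes (the numbers are arbitrary, chosen only to confirm the shapes):

params = initialize_parameters(3, 2, 1)                  # 3 inputs, 2 hidden units, 1 output
print(params["W1"].shape, params["b1"].shape)            # (2, 3) (2, 1)
print(params["W2"].shape, params["b2"].shape)            # (1, 2) (1, 1)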

3.2 L-layer neural network

Model structure: [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID

# GRADED FUNCTION: initialize_parameters_deep

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)            # number of layers in the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters
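A similar shape check for the deep version, using made-up dimensions (not from the assignment):

params = initialize_parameters_deep([5, 4, 3])   # 5 inputs, one 4-unit hidden layer, 3 outputs
for name, value in params.items():
    print(name, value.shape)                     # W1 (4, 5), b1 (4, 1), W2 (3, 4), b2 (3, 1)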

4. Forward propagation

4.1 Linear module

The vectorized formula:

$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$

where $A^{[0]} = X$.

Compute $Z$ and cache $A$, $W$, $b$.

# GRADED FUNCTION: linear_forward

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python dictionary containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """
    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b
    ### END CODE HERE ###

    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)

    return Z, cache
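A small shape check with random data (the sizes below are arbitrary, chosen just for illustration):

np.random.seed(2)
A = np.random.randn(3, 5)    # 3 units in the previous layer, 5 examples
W = np.random.randn(4, 3)    # 4 units in the current layer
b = np.random.randn(4, 1)
Z, cache = linear_forward(A, W, b)
print(Z.shape)               # (4, 5): one pre-activation value per unit and per example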

4.2 Linear-activation module

Compute the post-activation output $A$ and cache $Z$ (needed during backpropagation; the activation functions provided with the assignment return both):

$A^{[l]} = g(Z^{[l]}) = g(W^{[l]} A^{[l-1]} + b^{[l]})$

where $g$ is the activation function, either ReLU or Sigmoid.

# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value
    cache -- a python dictionary containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """
    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###
    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###

    assert(A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache
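Another quick check, not from the notebook, on made-up inputs; it only confirms the expected value ranges of the two activations:

np.random.seed(2)
A_prev = np.random.randn(3, 5)                               # arbitrary sizes, for illustration only
W = np.random.randn(1, 3)
b = np.random.randn(1, 1)
A_relu, _ = linear_activation_forward(A_prev, W, b, 'relu')
A_sig, _ = linear_activation_forward(A_prev, W, b, 'sigmoid')
print(A_relu.min() >= 0)                                     # ReLU outputs are non-negative
print(((A_sig > 0) & (A_sig < 1)).all())                     # sigmoid outputs lie strictly in (0, 1)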

4.3 L-layer model

[Figure: the [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID model]
The first $L-1$ layers use ReLU; the final layer uses Sigmoid.

# GRADED FUNCTION: L_model_forward

def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
              every cache of linear_relu_forward() (there are L-1 of them, indexed from 0 to L-2)
              the cache of linear_sigmoid_forward() (there is one, indexed L-1)
    """
    caches = []
    A = X
    L = len(parameters) // 2   # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A
        ### START CODE HERE ### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], 'relu')
        caches.append(cache)   # the cache for this layer: (A, W, b) and Z
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], 'sigmoid')
    caches.append(cache)
    ### END CODE HERE ###

    assert(AL.shape == (1, X.shape[1]))

    return AL, caches
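Not part of the graded notebook: a sanity check on a small made-up network (layer sizes chosen arbitrarily) that confirms the output shape and the number of caches:

np.random.seed(6)
X_demo = np.random.randn(5, 4)                          # 5 input features, 4 examples (made-up)
params_demo = initialize_parameters_deep([5, 4, 3, 1])  # a small 3-layer network
AL, caches = L_model_forward(X_demo, params_demo)
print(AL.shape, len(caches))                            # (1, 4) and 3: one cache per layer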

We now have a complete forward pass; AL holds the predictions, so we can compute the cost.

5. Cost function

Compute the cross-entropy cost:

$-\frac{1}{m} \sum\limits_{i=1}^{m} \bigg( y^{(i)} \log\left(a^{[L](i)}\right) + (1-y^{(i)}) \log\left(1 - a^{[L](i)}\right) \bigg)$

# GRADED FUNCTION: compute_cost

def compute_cost(AL, Y):
    """
    Implement the cost function defined by equation (7).

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (for example: containing 0 if non-cat, 1 if cat), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """
    m = Y.shape[1]

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 lines of code)
    cost = np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / (-m)
    ### END CODE HERE ###

    cost = np.squeeze(cost)   # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    assert(cost.shape == ())

    return cost
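A quick numeric check on made-up predictions (the values are chosen here only for illustration):

AL = np.array([[0.8, 0.9, 0.4]])   # made-up predicted probabilities
Y = np.array([[1, 1, 0]])          # made-up labels
print(compute_cost(AL, Y))         # average cross-entropy of the three examples, roughly 0.28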

6. Backward propagation

Compute the gradients of the cost function:

[Figure: forward and backward passes through the network]

6.1 Linear module

[Figure: linear backward step for layer l]

$dW^{[l]} = \frac{\partial \mathcal{L}}{\partial W^{[l]}} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$
$db^{[l]} = \frac{\partial \mathcal{L}}{\partial b^{[l]}} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}$
$dA^{[l-1]} = \frac{\partial \mathcal{L}}{\partial A^{[l-1]}} = W^{[l]T} dZ^{[l]}$

# GRADED FUNCTION: linear_backward

def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]

    ### START CODE HERE ### (≈ 3 lines of code)
    dW = np.dot(dZ, A_prev.T) / m
    db = 1 / m * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)
    ### END CODE HERE ###

    assert(dA_prev.shape == A_prev.shape)
    assert(dW.shape == W.shape)
    assert(db.shape == b.shape)

    return dA_prev, dW, db
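A shape check with random values (sizes made up, for illustration only); each gradient should match the shape of the quantity it differentiates:

np.random.seed(4)
dZ = np.random.randn(3, 4)
A_prev, W, b = np.random.randn(5, 4), np.random.randn(3, 5), np.random.randn(3, 1)
dA_prev, dW, db = linear_backward(dZ, (A_prev, W, b))
print(dA_prev.shape, dW.shape, db.shape)   # (5, 4) (3, 5) (3, 1): same shapes as A_prev, W, b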

6.2 Linear-activation module

$dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$

# GRADED FUNCTION: linear_activation_backward

def linear_activation_backward(dA, cache, activation):
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.

    Arguments:
    dA -- post-activation gradient for current layer l
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    linear_cache, activation_cache = cache

    if activation == "relu":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###
    elif activation == "sigmoid":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    return dA_prev, dW, db

6.3 L-layer model

[Figure: backward pass for the [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID model]

dAL = - np.divide(Y, AL) + np.divide(1 - Y, 1 - AL)

# GRADED FUNCTION: L_model_backward

def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group

    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat)
    caches -- list of caches containing:
              every cache of linear_activation_forward() with "relu" (it's caches[l], for l in range(L-1) i.e l = 0...L-2)
              the cache of linear_activation_forward() with "sigmoid" (it's caches[L-1])

    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ...
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ...
    """
    grads = {}
    L = len(caches)           # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)   # after this line, Y is the same shape as AL

    # Initializing the backpropagation
    ### START CODE HERE ### (1 line of code)
    dAL = -np.divide(Y, AL) + np.divide(1 - Y, 1 - AL)
    ### END CODE HERE ###

    # Lth layer (SIGMOID -> LINEAR) gradients.
    # Inputs: "AL, Y, caches".
    # Outputs: "grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)]"
    ### START CODE HERE ### (approx. 2 lines)
    current_cache = caches[L-1]
    grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dAL, current_cache, 'sigmoid')
    ### END CODE HERE ###

    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        # Inputs: "grads["dA" + str(l + 2)], caches".
        # Outputs: "grads["dA" + str(l + 1)], grads["dW" + str(l + 1)], grads["db" + str(l + 1)]"
        ### START CODE HERE ### (approx. 5 lines)
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads['dA' + str(l + 2)], current_cache, 'relu')
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
        ### END CODE HERE ###

    return grads
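Not part of the assignment: running a forward pass and then a backward pass on a tiny made-up network shows which keys end up in the gradients dictionary:

np.random.seed(7)
X_demo = np.random.randn(4, 2)                       # made-up data: 4 features, 2 examples
Y_demo = np.array([[1, 0]])
params_demo = initialize_parameters_deep([4, 3, 1])  # a small 2-layer network
AL, caches = L_model_forward(X_demo, params_demo)
grads = L_model_backward(AL, Y_demo, caches)
print(sorted(grads.keys()))                          # ['dA1', 'dA2', 'dW1', 'dW2', 'db1', 'db2']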

6.4 Gradient descent: update parameters

$W^{[l]} = W^{[l]} - \alpha \, dW^{[l]}$
$b^{[l]} = b^{[l]} - \alpha \, db^{[l]}$

# GRADED FUNCTION: update_parameters

def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients, output of L_model_backward

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """
    L = len(parameters) // 2   # number of layers in the neural network

    # Update rule for each parameter. Use a for loop.
    ### START CODE HERE ### (≈ 3 lines of code)
    for l in range(L):
        parameters["W" + str(l+1)] = parameters['W' + str(l+1)] - learning_rate * grads['dW' + str(l+1)]
        parameters["b" + str(l+1)] = parameters['b' + str(l+1)] - learning_rate * grads['db' + str(l+1)]
    ### END CODE HERE ###

    return parameters
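A tiny check with made-up numbers (not from the notebook) that each parameter moves against its gradient:

params = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[0.1, -0.2]]), "db1": np.array([[0.4]])}
print(update_parameters(params, grads, learning_rate=0.1))
# W1 becomes [[0.99, 2.02]] and b1 becomes [[0.46]]: each entry steps opposite to its gradient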

Assignment 2. Deep Neural Network Application: Image Classification

Using the functions built above, we assemble a deep neural network and predict whether an image is a cat.

1. Import packages

import time
import numpy as np
import h5py
import matplotlib.pyplot as plt
import scipy
from PIL import Image
from scipy import ndimage
from dnn_app_utils_v2 import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

2. Dataset

01. Neural Networks and Deep Learning, W2. Neural Network Basics (Assignment: logistic regression for image recognition)

We use the same cat dataset as the 01-W2 assignment, where logistic regression only reached 70% test accuracy.

  • Load the data
train_x_orig, train_y, test_x_orig, test_y, classes = load_data()
  • View an example picture
# Example of a picture
index = 1
plt.imshow(train_x_orig[index])
print ("y = " + str(train_y[0,index]) + ". It's a " + classes[train_y[0,index]].decode("utf-8") +  " picture.")

[Figure: example picture from the training set]

  • Explore the dataset dimensions
# Explore your dataset 
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]
m_test = test_x_orig.shape[0]

print ("Number of training examples: " + str(m_train))
print ("Number of testing examples: " + str(m_test))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_x_orig shape: " + str(train_x_orig.shape))
print ("train_y shape: " + str(train_y.shape))
print ("test_x_orig shape: " + str(test_x_orig.shape))
print ("test_y shape: " + str(test_y.shape))
Number of training examples: 209
Number of testing examples: 50
Each image is of size: (64, 64, 3)
train_x_orig shape: (209, 64, 64, 3)
train_y shape: (1, 209)
test_x_orig shape: (50, 64, 64, 3)
test_y shape: (1, 50)
  • Flatten and standardize the image data
    [Figure: reshaping an image into a single column vector]
# Reshape the training and test examples 
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T   # The "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T

# Standardize data to have feature values between 0 and 1.
train_x = train_x_flatten/255.
test_x = test_x_flatten/255.

print ("train_x's shape: " + str(train_x.shape))
print ("test_x's shape: " + str(test_x.shape))
train_x's shape: (12288, 209) # 12288 = 64 * 64 * 3
test_x's shape: (12288, 50)

3. Building the model

3.1 Two-layer neural network

[Figure: two-layer neural network architecture]

3.2 L-layer neural network

[Figure: L-layer neural network architecture]

3.3 General methodology

  1. Initialize parameters / define hyperparameters
  2. Loop for num_iterations:
    – a. Forward propagation
    – b. Compute the cost
    – c. Backward propagation
    – d. Update parameters (using the parameters and the gradients)
  3. Use the trained parameters to predict labels

4. Two-layer neural network

  • Define the constants
### CONSTANTS DEFINING THE MODEL ####
n_x = 12288     # num_px * num_px * 3
n_h = 7         # number of hidden units
n_y = 1
layers_dims = (n_x, n_h, n_y)
  • Build the model
# GRADED FUNCTION: two_layer_model

def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- If set to True, this will print the cost every 100 iterations

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """
    np.random.seed(1)
    grads = {}
    costs = []                    # to keep track of the cost
    m = X.shape[1]                # number of examples
    (n_x, n_h, n_y) = layers_dims

    # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID.
        # Inputs: "X, W1, b1". Output: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, 'relu')
        A2, cache2 = linear_activation_forward(A1, W2, b2, 'sigmoid')
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y)
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = - np.divide(Y, A2) + np.divide(1 - Y, 1 - A2)

        # Backward propagation. Inputs: "dA2, cache2, cache1".
        # Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, 'sigmoid')
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, 'relu')
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1
        grads['db1'] = db1
        grads['dW2'] = dW2
        grads['db2'] = db2

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per tens)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
  • Train
parameters = two_layer_model(train_x, train_y, layers_dims = (n_x, n_h, n_y), num_iterations = 2500, print_cost=True)
Cost after iteration 0: 0.693049735659989
Cost after iteration 100: 0.6464320953428849
Cost after iteration 200: 0.6325140647912678
Cost after iteration 300: 0.6015024920354665
Cost after iteration 400: 0.5601966311605747
Cost after iteration 500: 0.5158304772764729
Cost after iteration 600: 0.4754901313943325
Cost after iteration 700: 0.43391631512257495
Cost after iteration 800: 0.4007977536203887
Cost after iteration 900: 0.35807050113237976
Cost after iteration 1000: 0.33942815383664127
Cost after iteration 1100: 0.30527536361962654
Cost after iteration 1200: 0.2749137728213016
Cost after iteration 1300: 0.24681768210614846
Cost after iteration 1400: 0.19850735037466097
Cost after iteration 1500: 0.17448318112556657
Cost after iteration 1600: 0.1708076297809689
Cost after iteration 1700: 0.11306524562164715
Cost after iteration 1800: 0.09629426845937145
Cost after iteration 1900: 0.08342617959726863
Cost after iteration 2000: 0.07439078704319078
Cost after iteration 2100: 0.06630748132267933
Cost after iteration 2200: 0.0591932950103817
Cost after iteration 2300: 0.05336140348560554
Cost after iteration 2400: 0.04855478562877016

[Figure: cost curve for the two-layer model]

  • Predict

Training set: Accuracy: 0.9999999999999998

predictions_train = predict(train_x, train_y, parameters)
# Accuracy: 0.9999999999999998

Test set: Accuracy: 0.72, slightly better than the 0.70 from logistic regression.

predictions_test = predict(test_x, test_y, parameters)
# Accuracy: 0.72
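The predict helper comes from dnn_app_utils_v2 and is not shown in this post. As a rough sketch (my own guess at its behavior, not the file's actual code), it presumably runs a forward pass with the trained parameters and thresholds the output probabilities at 0.5:

def predict_sketch(X, y, parameters):
    """Hypothetical re-implementation of predict: forward propagate, threshold at 0.5, report accuracy."""
    m = X.shape[1]
    probas, _ = L_model_forward(X, parameters)   # forward pass with the trained parameters
    p = (probas > 0.5).astype(int)               # convert probabilities to 0/1 predictions
    print("Accuracy: " + str(np.sum(p == y) / m))
    return p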

5. L-layer neural network

  • Define the constants for the 5-layer NN
### CONSTANTS ###
layers_dims = [12288, 20, 7, 5, 1] #  5-layer model
  • Build the model
# GRADED FUNCTION: L_layer_model

def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):  # lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (number of examples, num_px * num_px * 3)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(1)
    costs = []   # keep track of cost

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" % (i, cost))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per tens)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
  • Train
parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations = 2500, print_cost = True)
Cost after iteration 0: 0.771749
Cost after iteration 100: 0.672053
Cost after iteration 200: 0.648263
Cost after iteration 300: 0.611507
Cost after iteration 400: 0.567047
Cost after iteration 500: 0.540138
Cost after iteration 600: 0.527930
Cost after iteration 700: 0.465477
Cost after iteration 800: 0.369126
Cost after iteration 900: 0.391747
Cost after iteration 1000: 0.315187
Cost after iteration 1100: 0.272700
Cost after iteration 1200: 0.237419
Cost after iteration 1300: 0.199601
Cost after iteration 1400: 0.189263
Cost after iteration 1500: 0.161189
Cost after iteration 1600: 0.148214
Cost after iteration 1700: 0.137775
Cost after iteration 1800: 0.129740
Cost after iteration 1900: 0.121225
Cost after iteration 2000: 0.113821
Cost after iteration 2100: 0.107839
Cost after iteration 2200: 0.102855
Cost after iteration 2300: 0.100897
Cost after iteration 2400: 0.092878

[Figure: cost curve for the L-layer model]

  • Predict

Training set: Accuracy: 0.9856459330143539

pred_train = predict(train_x, train_y, parameters)
# Accuracy: 0.9856459330143539

Test set: Accuracy: 0.8, better than both logistic regression (0.70) and the two-layer NN (0.72).

pred_test = predict(test_x, test_y, parameters)
# Accuracy: 0.8

The next course covers systematic hyperparameter tuning to further improve model performance.

6. Results analysis

def print_mislabeled_images(classes, X, y, p):
    """
    Plots images where predictions and truth were different.
    X -- dataset
    y -- true labels
    p -- predictions
    """
    a = p + y
    mislabeled_indices = np.asarray(np.where(a == 1))   # 0+1 or 1+0, the wrongly classified cases
    plt.rcParams['figure.figsize'] = (40.0, 40.0)       # set default size of plots
    num_images = len(mislabeled_indices[0])
    for i in range(num_images):
        index = mislabeled_indices[1][i]
        plt.subplot(2, num_images, i + 1)
        plt.imshow(X[:, index].reshape(64, 64, 3), interpolation='nearest')
        plt.axis('off')
        plt.title("Prediction: " + classes[int(p[0, index])].decode("utf-8") + " \n Class: " + classes[y[0, index]].decode("utf-8"))

print_mislabeled_images(classes, test_x, test_y, pred_test)

[Figure: test images the model mislabeled]
The mislabeled images tend to share these characteristics:

  • The cat's body is in an unusual position
  • The cat appears against a background of a similar color
  • An unusual cat color or breed
  • The camera angle
  • The brightness of the picture
  • Scale variation (the cat is very large or very small in the image)

7. Test with your own image

## START CODE HERE ##
my_image = "my_image.jpg"   # change this to the name of your image file
my_label_y = [1]            # the true class of your image (1 -> cat, 0 -> non-cat)
## END CODE HERE ##

fname = "images/" + my_image
image = Image.open(fname)
my_image = np.array(image.resize((num_px, num_px))).reshape((num_px*num_px*3, 1))
my_image = my_image / 255.  # standardize the pixel values to match the training data
my_predicted_image = predict(my_image, my_label_y, parameters)

plt.imshow(image)
print ("y = " + str(np.squeeze(my_predicted_image)) + ", your L-layer model predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") + "\" picture.")
Accuracy: 1.0
y = 1.0, your L-layer model predicts a "cat" picture.

[Figure: the test image, classified as a cat]


My CSDN blog: https://michael.blog.csdn.net/

Long-press or scan the QR code to follow my WeChat official account (Michael阿明). Let's keep learning and improving together!
