2. Deep Learning Exercise: Logistic Regression with a Neural Network mindset

This post is excerpted from the programming assignments of Andrew Ng's Deep Learning Specialization, with thanks to the course authors.

Course link: https://www.deeplearning.ai/deep-learning-specialization/

You will learn to:

  • Build the general architecture of a learning algorithm, including:
    • Initializing parameters
    • Calculating the cost function and its gradient
    • Using an optimization algorithm (gradient descent)
  • Gather all three functions above into a main model function, in the right order.


Contents

1 - Packages

2 - Overview of the Problem set

3 - General Architecture of the learning algorithm (key section)

4 - Building the parts of our algorithm

4.1 - Helper functions

4.2 - Initializing parameters

4.3 - Forward and Backward propagation (key section)

4.4 - Optimization

4.5 - Predict

5 - Merge all functions into a model


1 - Packages

First, let's run the cell below to import all the packages that you will need during this assignment.

  • numpy is the fundamental package for scientific computing with Python.
  • h5py is a common package to interact with a dataset that is stored on an H5 file.
  • matplotlib is a famous library to plot graphs in Python.
  • PIL and scipy are used here to test your model with your own picture at the end.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset  # load_dataset is defined in lr_utils.py in the same folder

%matplotlib inline
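load_dataset is provided with the assignment rather than written by you. For reference, a minimal sketch of what it typically contains is shown below; the HDF5 file names and dataset keys ("datasets/train_catvnoncat.h5", "train_set_x", "list_classes", ...) are those commonly distributed with this course and may differ in your copy:

import numpy as np
import h5py

def load_dataset():
    # read the training images and labels from the HDF5 file
    train_dataset = h5py.File("datasets/train_catvnoncat.h5", "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # shape (m_train, num_px, num_px, 3)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # shape (m_train,)

    # read the test images and labels
    test_dataset = h5py.File("datasets/test_catvnoncat.h5", "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])

    classes = np.array(test_dataset["list_classes"][:])  # the two class names, as byte strings

    # reshape the label vectors into row vectors of shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes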

2 - Overview of the Problem set

Problem Statement: You are given a dataset ("data.h5") containing:

- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).


You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Let's get more familiar with the dataset. Load the data by running the following code.
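The loading cell is a single call; the variable names below are the ones used throughout the rest of this post (the "_orig" suffix marks the image arrays that will be reshaped later):

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()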

Many software bugs in deep learning come from having matrix/vector dimensions that don't fit. If you can keep your matrix/vector dimensions straight you will go a long way toward eliminating many bugs.

Exercise: Find the values for:

- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)

Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]

For convenience, you should now reshape images of shape (num_px, num_px, 3) into a numpy array of shape (num_px * num_px * 3, 1). After this, our training (and test) dataset is a numpy array where each column represents a flattened image. There should be m_train (respectively m_test) columns.

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px * num_px * 3, 1).

A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b*c*d, a) is to use:

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten  = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
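A quick sanity check on the reshaped arrays is worthwhile here; the expected shapes in the comments assume the standard cat/non-cat dataset (m_train = 209, m_test = 50, num_px = 64, so 64 * 64 * 3 = 12288 features):

print("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))  # expect (12288, 209)
print("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))    # expect (12288, 50)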

Let's standardize our dataset. For image data it is enough to divide every pixel value by 255 (the maximum value of a pixel channel), which scales all features to the range [0, 1].

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

**What you need to remember:**

Common steps for pre-processing a new dataset are:

  • Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, ...)
  • Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)
  • "Standardize" the data


3 - General Architecture of the learning algorithm (key section)

You will build a Logistic Regression classifier, using a Neural Network mindset. Logistic Regression is actually a very simple Neural Network: a single unit that computes a linear function of the input and passes it through a sigmoid activation.

Mathematical expression of the algorithm:

For one example x^{(i)}:

z^{(i)} = w^T x^{(i)} + b

\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)})

\mathcal{L}(a^{(i)}, y^{(i)}) = -y^{(i)} \log(a^{(i)}) - (1 - y^{(i)}) \log(1 - a^{(i)})

The cost is then computed by averaging the loss over all m training examples:

J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})

Intuitively, for an example with y^{(i)} = 1 the loss reduces to -\log(a^{(i)}), which is near 0 when the prediction a^{(i)} is close to 1 and grows without bound as a^{(i)} approaches 0.

Key steps: In this exercise, you will carry out the following steps:

  • Initialize the parameters of the model
  • Learn the parameters for the model by minimizing the cost
  • Use the learned parameters to make predictions (on the test set)
  • Analyse the results and conclude


4 - Building the parts of our algorithm

The main steps for building a Neural Network are:

  1. Define the model structure (such as number of input features)
  2. Initialize the model's parameters
  3. Loop:
    • Calculate current loss (forward propagation)
    • Calculate current gradient (backward propagation)
    • Update parameters (gradient descent)

You often build 1-3 separately and integrate them into one function we call model().

4.1 - Helper functions

Exercise: Using your code from "Python Basics", implement sigmoid().

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    s = 1 / (1 + np.exp(-z))
    return s
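A quick check on an array input (the values are arbitrary):

print("sigmoid([0, 2]) = " + str(sigmoid(np.array([0, 2]))))  # approximately [0.5, 0.881]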

4.2 - Initializing parameters

def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b
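For example:

w, b = initialize_with_zeros(2)
print("w = " + str(w))  # [[0.] [0.]]
print("b = " + str(b))  # 0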

4.3 - Forward and Backward propagation (key section)

Now that your parameters are initialized, you can do the "forward" and "backward" propagation steps for learning the parameters.

Exercise: Implement a function propagate() that computes the cost function and its gradient.

Hints:

Forward Propagation:

  • You get X
  • You compute

A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m)})

  •  You calculate the cost function:

J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right]


Backward Propagation: here are the two formulas you will be using (both follow from the chain rule, using \frac{\partial \mathcal{L}}{\partial z^{(i)}} = a^{(i)} - y^{(i)}):

\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T

\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)} - y^{(i)})

# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = 1/m * np.dot(X, (A - Y).T)
    db = 1/m * np.sum(A - Y)

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
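A small smoke test on hand-made inputs helps confirm the shapes and types before moving on (the numbers here are arbitrary; the original notebook provides a similar test cell):

w = np.array([[1.], [2.]])
b = 2.
X = np.array([[1., 2., -1.], [3., 4., -3.2]])
Y = np.array([[1, 0, 1]])
grads, cost = propagate(w, b, X, Y)
print("dw = " + str(grads["dw"]))  # shape (2, 1), same as w
print("db = " + str(grads["db"]))  # a scalar
print("cost = " + str(cost))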

4.4 - Optimization

Exercise: Write down the optimization function. The goal is to learn w and b by minimizing the cost function J. For a parameter \theta, the update rule is \theta = \theta - \alpha \, d\theta, where \alpha is the learning rate.

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
    1) Calculate the cost and the gradient for the current parameters. Use propagate().
    2) Update the parameters using the gradient descent rule for w and b.
    """
    costs = []

    for i in range(num_iterations):
        # Cost and gradient calculation (≈ 1-4 lines of code)
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 training iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}

    return params, grads, costs
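Continuing with the toy tensors from the propagate() test above:

params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False)
print("w = " + str(params["w"]))
print("b = " + str(params["b"]))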

4.5 - Predict

Exercise: The previous function outputs the learned w and b, which we can now use to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:

1. Calculate \hat{Y} = A = \sigma(w^T X + b)

2. Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector `Y_prediction`. If you wish, you can use an `if`/`else` statement in a `for` loop (though there is also a way to vectorize this).

def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions Y_prediction[0,i]
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0

    assert(Y_prediction.shape == (1, m))

    return Y_prediction
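Reusing the parameters just learned by optimize() on the toy data:

print("predictions = " + str(predict(params["w"], params["b"], X)))  # a (1, 3) array of 0s and 1s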

**What to remember:** You've implemented several functions that:

  • Initialize (w, b)
  • Optimize the loss iteratively to learn parameters (w, b):
    • computing the cost and its gradient
    • updating the parameters using gradient descent
  • Use the learned (w, b) to predict the labels for a given set of examples


5 - Merge all functions into a model

You will now see how the overall model is structured by putting all the building blocks (the functions implemented in the previous parts) together, in the right order.

Exercise: Implement the model function. Use the following notation:

- Y_prediction_test for your predictions on the test set
- Y_prediction_train for your predictions on the train set
- w, costs, grads for the outputs of optimize()

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    Builds the logistic regression model by calling the functions you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """
    # initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])

    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)
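As in the original notebook, you can plot the costs recorded by optimize() to inspect the learning curve:

costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate = " + str(d["learning_rate"]))
plt.show()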
