3. Deep Learning Exercise: Planar data classification with one hidden layer

This post is excerpted from the programming assignments of Andrew Ng's Deep Learning Specialization; many thanks to the course authors.

课程链接:https://www.deeplearning.ai/deep-learning-specialization/

You will learn to:

  • Implement a 2-class classification neural network with a single hidden layer
  • Use units with a non-linear activation function, such as tanh
  • Compute the cross entropy loss
  • Implement forward and backward propagation

Contents

1 - Packages

2 - Dataset

3 - Simple Logistic Regression

4 - Neural Network model (key section)

4.1 - Defining the neural network structure

4.2 - Initialize the model's parameters

4.3 - The Loop

4.4 - Integrate parts 4.1, 4.2 and 4.3 in nn_model()

4.5 - Predictions

4.6 - Tuning hidden layer size (optional/ungraded exercise)

5 - Performance on other datasets


1 - Packages

Let's first import all the packages that you will need during this assignment.

  • numpy is the fundamental package for scientific computing with Python.
  • sklearn provides simple and efficient tools for data mining and data analysis.
  • matplotlib is a library for plotting graphs in Python.
  • testCases provides some test examples to assess the correctness of your functions.
  • planar_utils provides various useful functions used in this assignment.
# Package imports
import numpy as np
import matplotlib.pyplot as plt
from testCases import *
import sklearn
import sklearn.datasets
import sklearn.linear_model
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset, load_extra_datasets

%matplotlib inline

np.random.seed(1) # set a seed so that the results are consistent

2 - Dataset

X, Y = load_planar_dataset()

# Visualize the data:
plt.scatter(X[0, :], X[1, :], c=np.squeeze(Y), s=40, cmap=plt.cm.Spectral);

You have:

- a numpy-array (matrix) X that contains your features (x1, x2)
- a numpy-array (vector) Y that contains your labels (red:0, blue:1).

Let's first get a better sense of what our data looks like.

Exercise: How many training examples do you have? In addition, what is the shape of the variables X and Y?

Hint: How do you get the shape of a numpy array? (help)

shape_X = np.shape(X)
shape_Y = np.shape(Y)
m = shape_X[1]

print ('The shape of X is: ' + str(shape_X))
print ('The shape of Y is: ' + str(shape_Y))
print ('I have m = %d training examples!' % (m))

3 - Simple Logistic Regression

Before building a full neural network, let's first see how logistic regression performs on this problem. You can use sklearn's built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.

# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV();
clf.fit(X.T, Y.T);

# Plot the decision boundary for logistic regression
plot_decision_boundary(lambda x: clf.predict(x), X, np.squeeze(Y))
plt.title("Logistic Regression")

# Print accuracy
LR_predictions = clf.predict(X.T)
print ('Accuracy of logistic regression: %d ' % float((np.dot(Y,LR_predictions) + np.dot(1-Y,1-LR_predictions))/float(Y.size)*100) +'% ' + "(percentage of correctly labelled datapoints)")

Interpretation: The dataset is not linearly separable, so logistic regression doesn't perform well. Hopefully a neural network will do better. Let's try this now!


4 - Neural Network model (key section)

Logistic regression did not work well on the "flower dataset". You are going to train a Neural Network with a single hidden layer.

Here is our model (the network diagram from the original notebook is not reproduced here).

Mathematically, for one example x^{(i)}:

z^{[1] (i)} = W^{[1]} x^{(i)} + b^{[1] (i)}

a^{[1] (i)} = \tanh(z^{[1] (i)})

z^{[2] (i)} = W^{[2]} a^{[1] (i)} + b^{[2] (i)}

\hat{y}^{(i)} = a^{[2] (i)} = \sigma(z^{ [2] (i)})

y^{(i)}_{prediction} = \begin{cases} 1 & \mbox{if } a^{[2](i)} > 0.5 \\ 0 & \mbox{otherwise} \end{cases}

Given the predictions on all the examples, you can also compute the cost J as follows:

J = - \frac{1}{m} \sum\limits_{i = 1}^{m} \left( y^{(i)}\log\left(a^{[2] (i)}\right) + (1-y^{(i)})\log\left(1 - a^{[2] (i)}\right) \right)

Reminder: The general methodology to build a Neural Network is to follow the steps below (a minimal skeleton of how they fit together is sketched right after the list):

1. Define the neural network structure ( # of input units,  # of hidden units, etc). 
2. Initialize the model's parameters
3. Loop:
    - Implement forward propagation
    - Compute loss
    - Implement backward propagation to get the gradients
    - Update parameters (gradient descent)
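
A minimal skeleton of how these pieces fit together (a sketch only; each function is implemented in sections 4.1-4.3 and everything is assembled for real in nn_model() in section 4.4):

# Sketch of the overall training loop (built properly in nn_model below)
num_iterations = 10000                                      # illustrative number of iterations
n_x, n_h, n_y = layer_sizes(X, Y)                           # 1. define the structure
parameters = initialize_parameters(n_x, n_h, n_y)           # 2. initialize parameters
for i in range(num_iterations):                             # 3. the loop
    A2, cache = forward_propagation(X, parameters)          #    forward propagation
    cost = compute_cost(A2, Y, parameters)                  #    compute loss
    grads = backward_propagation(parameters, cache, X, Y)   #    backward propagation
    parameters = update_parameters(parameters, grads)       #    gradient descent update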

4.1 - Defining the neural network structure

Exercise: Define three variables:

- n_x: the size of the input layer
- n_h: the size of the hidden layer (set this to 4) 
- n_y: the size of the output layer

Hint: Use shapes of X and Y to find n_x and n_y. Also, hard code the hidden layer size to be 4.

def layer_sizes(X, Y):
    """
    Arguments:
    X -- input dataset of shape (input size, number of examples)
    Y -- labels of shape (output size, number of examples)

    Returns:
    n_x -- the size of the input layer
    n_h -- the size of the hidden layer
    n_y -- the size of the output layer
    """
    n_x = np.shape(X)[0]
    n_h = 4
    n_y = np.shape(Y)[0]

    return (n_x, n_h, n_y)
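
As a quick sanity check (a minimal sketch, assuming X and Y are the planar dataset loaded in section 2), you can call layer_sizes() on your data:

# Sanity check on the planar dataset loaded above
n_x, n_h, n_y = layer_sizes(X, Y)
print("The size of the input layer is: n_x = " + str(n_x))
print("The size of the hidden layer is: n_h = " + str(n_h))
print("The size of the output layer is: n_y = " + str(n_y))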

4.2 - Initialize the model's parameters

Exercise: Implement the function initialize_parameters().

Instructions:

  • Make sure your parameters' sizes are right. Refer to the neural network figure above if needed.
  • You will initialize the weights matrices with random values.
    • Use: np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
  • You will initialize the bias vectors as zeros.
    • Use: np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.
def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    params -- python dictionary containing your parameters:
              W1 -- weight matrix of shape (n_h, n_x)
              b1 -- bias vector of shape (n_h, 1)
              W2 -- weight matrix of shape (n_y, n_h)
              b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(2)  # we set up a seed so that your output matches ours although the initialization is random.

    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
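
A quick way to check the shapes (a minimal sketch; the sizes (2, 4, 1) are illustrative values that match the planar dataset and the n_h = 4 used later):

# Illustrative shape check; the sizes (2, 4, 1) match the planar dataset and n_h = 4
parameters = initialize_parameters(2, 4, 1)
for name, value in parameters.items():
    print(name, "has shape", value.shape)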

4.3 - The Loop

Question: Implement forward_propagation().

Instructions:

  • Look above at the mathematical representation of your classifier.
  • You can use the function sigmoid(). It is built-in (imported) in the notebook.
  • You can use the function np.tanh(). It is part of the numpy library.
  • The steps you have to implement are:
    1. Retrieve each parameter from the dictionary "parameters" (which is the output of initialize_parameters()) by using parameters[".."].
    2. Implement Forward Propagation. Compute Z^{[1]}, A^{[1]}, Z^{[2]} and A^{[2]} (the vector of all your predictions on all the examples in the training set).
  • Values needed in the backpropagation are stored in "cache". The cache will be given as an input to the backpropagation function.
def forward_propagation(X, parameters):
    """
    Argument:
    X -- input data of size (n_x, m)
    parameters -- python dictionary containing your parameters (output of initialization function)

    Returns:
    A2 -- The sigmoid output of the second activation
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    """
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']

    # Implement Forward Propagation to calculate A2 (probabilities)
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    assert(A2.shape == (1, X.shape[1]))

    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}

    return A2, cache
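
As a rough sanity check (a sketch, reusing X from section 2 and the parameters dictionary initialized above), A2 should be a row of probabilities, one per training example:

# Hypothetical check: one forward pass over the whole training set
A2, cache = forward_propagation(X, parameters)
print("A2 has shape", A2.shape)                       # expected: (1, m)
print("All values in (0, 1):", np.all((A2 > 0) & (A2 < 1)))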

Now that you have computed A^{[2]} (in the Python variable "A2"), which contains a^{[2](i)} for every example, you can compute the cost function as follows:

J = - \frac{1}{m} \sum\limits_{i = 1}^{m} \left( y^{(i)}\log\left(a^{[2] (i)}\right) + (1-y^{(i)})\log\left(1 - a^{[2] (i)}\right) \right)

Exercise: Implement compute_cost() to compute the value of the cost J.

Instructions:

  • There are many ways to implement the cross-entropy loss. To help you, here is how we would implement - \sum\limits_{i=1}^{m} y^{(i)}\log(a^{[2](i)}):
  • logprobs = np.multiply(np.log(A2),Y)
    cost = - np.sum(logprobs)                # no need to use a for loop!
def compute_cost(A2, Y, parameters):
    """
    Computes the cross-entropy cost given above

    Arguments:
    A2 -- The sigmoid output of the second activation, of shape (1, number of examples)
    Y -- "true" labels vector of shape (1, number of examples)
    parameters -- python dictionary containing your parameters W1, b1, W2 and b2

    Returns:
    cost -- the cross-entropy cost
    """
    m = Y.shape[1]  # number of examples

    # Compute the cross-entropy cost
    logprobs = np.multiply(np.log(A2), Y) + np.multiply(np.log(1 - A2), 1 - Y)
    cost = -1 / m * np.sum(logprobs)

    cost = np.squeeze(cost)  # makes sure cost is the dimension we expect, e.g. turns [[17]] into 17
    assert(isinstance(cost, float))

    return cost
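
A minimal usage sketch, continuing from the forward-pass check above (Y is the label vector from section 2):

# Hypothetical usage: cost under the current (randomly initialized) parameters
cost = compute_cost(A2, Y, parameters)
print("cost = " + str(cost))   # with small random weights, A2 is close to 0.5, so this should be near log(2) ≈ 0.693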

Using the cache computed during forward propagation, you can now implement backward propagation.

Question: Implement the function backward_propagation().

Instructions: Backpropagation is usually the hardest (most mathematical) part in deep learning. The original assignment shows a slide from the lecture on backpropagation at this point; that slide is not reproduced in this post, but you'll want to use its six vectorized equations, which are summarized below.
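
For reference, a standard vectorized form of those six equations for this architecture (tanh hidden layer, sigmoid output, cross-entropy cost; using g^{[1]\prime}(z) = 1 - a^2 for tanh, and * for the element-wise product) is:

dZ^{[2]} = A^{[2]} - Y

dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]T}

db^{[2]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[2](i)}

dZ^{[1]} = W^{[2]T} dZ^{[2]} * \left(1 - (A^{[1]})^{2}\right)

dW^{[1]} = \frac{1}{m} dZ^{[1]} X^{T}

db^{[1]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[1](i)}

In numpy, the two bias gradients correspond to np.sum(dZ2, axis=1, keepdims=True) / m and np.sum(dZ1, axis=1, keepdims=True) / m, exactly as in the code below.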

def backward_propagation(parameters, cache, X, Y):
    """
    Implement the backward propagation using the instructions above.

    Arguments:
    parameters -- python dictionary containing our parameters
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    X -- input data of shape (2, number of examples)
    Y -- "true" labels vector of shape (1, number of examples)

    Returns:
    grads -- python dictionary containing your gradients with respect to different parameters
    """
    m = X.shape[1]

    # First, retrieve W1 and W2 from the dictionary "parameters".
    W1 = parameters['W1']
    W2 = parameters['W2']

    # Retrieve also A1 and A2 from dictionary "cache".
    A1 = cache['A1']
    A2 = cache['A2']

    # Backward propagation: calculate dW1, db1, dW2, db2.
    dZ2 = A2 - Y
    dW2 = 1/m * np.dot(dZ2, A1.T)
    db2 = 1/m * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))
    dW1 = 1/m * np.dot(dZ1, X.T)
    db1 = 1/m * np.sum(dZ1, axis=1, keepdims=True)

    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}

    return grads
Question: Implement update_parameters(). Use gradient descent: for each parameter \theta, the update rule is \theta = \theta - \alpha \frac{\partial J}{\partial \theta}, where \alpha is the learning rate.

def update_parameters(parameters, grads, learning_rate = 1.2):
    """
    Updates parameters using the gradient descent update rule given above

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients

    Returns:
    parameters -- python dictionary containing your updated parameters
    """
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']

    # Retrieve each gradient from the dictionary "grads"
    dW1 = grads['dW1']
    db1 = grads['db1']
    dW2 = grads['dW2']
    db2 = grads['db2']

    # Update rule for each parameter
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
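
A minimal sketch of one full gradient-descent step, chaining the two functions above (reusing parameters, cache, X and Y from the earlier checks):

# Hypothetical single training step: compute gradients, then update the parameters
grads = backward_propagation(parameters, cache, X, Y)
parameters = update_parameters(parameters, grads, learning_rate = 1.2)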

4.4 - Integrate parts 4.1, 4.2 and 4.3 in nn_model()

Question: Build your neural network model in nn_model().

Instructions: The neural network model has to use the previous functions in the right order.

def nn_model(X, Y, n_h, num_iterations = 10000, print_cost=False):
    """
    Arguments:
    X -- dataset of shape (2, number of examples)
    Y -- labels of shape (1, number of examples)
    n_h -- size of the hidden layer
    num_iterations -- Number of iterations in gradient descent loop
    print_cost -- if True, print the cost every 1000 iterations

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(3)
    n_x = layer_sizes(X, Y)[0]
    n_y = layer_sizes(X, Y)[2]

    # Initialize parameters, then retrieve W1, b1, W2, b2. Inputs: "n_x, n_h, n_y". Outputs: "W1, b1, W2, b2, parameters".
    parameters = initialize_parameters(n_x, n_h, n_y)
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)
    for i in range(0, num_iterations):
        # Forward propagation. Inputs: "X, parameters". Outputs: "A2, cache".
        A2, cache = forward_propagation(X, parameters)

        # Cost function. Inputs: "A2, Y, parameters". Outputs: "cost".
        cost = compute_cost(A2, Y, parameters)

        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backward_propagation(parameters, cache, X, Y)

        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters = update_parameters(parameters, grads, learning_rate = 1.2)

        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print ("Cost after iteration %i: %f" % (i, cost))

    return parameters

4.5 - Predictions

Question: Use your model to predict by building predict(). Use forward propagation to predict results.

def predict(parameters, X):
    """
    Using the learned parameters, predicts a class for each example in X

    Arguments:
    parameters -- python dictionary containing your parameters
    X -- input data of size (n_x, m)

    Returns:
    predictions -- vector of predictions of our model (red: 0 / blue: 1)
    """
    # Computes probabilities using forward propagation, and classifies to 0/1 using 0.5 as the threshold.
    A2, cache = forward_propagation(X, parameters)
    predictions = (A2 > 0.5)

    return predictions

It is time to run the model and see how it performs on a planar dataset. Run the following code to test your model with a single hidden layer of n_h hidden units.

# Build a model with a n_h-dimensional hidden layer
parameters = nn_model(X, Y, n_h = 4, num_iterations = 10000, print_cost=True)

# Plot the decision boundary
plot_decision_boundary(lambda x: predict(parameters, x.T), X, np.squeeze(Y))
plt.title("Decision Boundary for hidden layer size " + str(4))

# Print accuracy
predictions = predict(parameters, X)
print ('Accuracy: %d' % float((np.dot(Y,predictions.T) + np.dot(1-Y,1-predictions.T))/float(Y.size)*100) + '%')

4.6 - Tuning hidden layer size (optional/ungraded exercise)

Run the following code; you will observe different behaviors of the model for various hidden layer sizes.

# This may take about 2 minutes to run
plt.figure(figsize=(16, 32))
hidden_layer_sizes = [1, 2, 3, 4, 5, 10, 20]
for i, n_h in enumerate(hidden_layer_sizes):
    plt.subplot(5, 2, i+1)
    plt.title('Hidden Layer of size %d' % n_h)
    parameters = nn_model(X, Y, n_h, num_iterations = 5000)
    plot_decision_boundary(lambda x: predict(parameters, x.T), X, np.squeeze(Y))
    predictions = predict(parameters, X)
    accuracy = float((np.dot(Y, predictions.T) + np.dot(1-Y, 1-predictions.T)) / float(Y.size) * 100)
    print ("Accuracy for {} hidden units: {} %".format(n_h, accuracy))

You've learnt to:

  • Build a complete neural network with a hidden layer
  • Make good use of a non-linear activation unit
  • Implement forward propagation and backpropagation, and train a neural network
  • See the impact of varying the hidden layer size, including overfitting


5 - Performance on other datasets

You can rerun the model on several other planar datasets; pick one by changing the dataset variable below.

# Datasets
noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure = load_extra_datasets()

datasets = {"noisy_circles": noisy_circles,
            "noisy_moons": noisy_moons,
            "blobs": blobs,
            "gaussian_quantiles": gaussian_quantiles}

dataset = "gaussian_quantiles"

X, Y = datasets[dataset]
X, Y = X.T, Y.reshape(1, Y.shape[0])

# make blobs binary
if dataset == "blobs":
    Y = Y % 2

# Visualize the data
plt.scatter(X[0, :], X[1, :], c = np.squeeze(Y), s=40, cmap=plt.cm.Spectral);
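
After choosing a dataset above, you can retrain the same network on it and inspect the decision boundary (a minimal sketch reusing nn_model and predict; the hidden layer size of 4 and the iteration count are illustrative choices, not values prescribed by the assignment):

# Hypothetical rerun on the selected extra dataset
parameters = nn_model(X, Y, n_h = 4, num_iterations = 5000, print_cost=True)
plot_decision_boundary(lambda x: predict(parameters, x.T), X, np.squeeze(Y))
plt.title("Decision Boundary on the " + dataset + " dataset")
predictions = predict(parameters, X)
print ('Accuracy: %d' % float((np.dot(Y, predictions.T) + np.dot(1-Y, 1-predictions.T)) / float(Y.size) * 100) + '%')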
