TensorFlow

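• Uses a graph to represent computation.
• Executes graphs in a context called a session (Session).
• Uses tensors to represent data.
• Maintains state through variables (Variable).
• Uses feeds and fetches to assign values to, or retrieve data from, arbitrary operations.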

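Overview

TensorFlow represents computation as a graph. The nodes of the graph are called operations, abbreviated as ops.
An op takes zero or more tensors, performs a computation, and produces zero or more tensors.
A graph must be launched in a session (Session). The session places the graph's ops on devices such as CPUs or GPUs and provides methods to execute them. These methods return the tensors that the ops produce.

Nodes usually represent mathematical operations. Edges describe the relationships between nodes and carry multidimensional data (tensors) from one node to another.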
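The split between describing a computation and running it can be sketched in a few lines of plain Python. This is a toy illustration only, not how TensorFlow is implemented; the names Node, constant, add, mul, and run are invented for this sketch.

```python
# Toy illustration only: this is NOT TensorFlow's implementation.
# Node, constant, add, mul and run are hypothetical names for this sketch.

class Node:
    """A graph node (an "op"): a function plus the nodes that feed it."""

    def __init__(self, fn, inputs=()):
        self.fn = fn
        self.inputs = tuple(inputs)


def constant(value):
    # An op with zero inputs that produces one value.
    return Node(lambda: value)


def add(a, b):
    return Node(lambda x, y: x + y, (a, b))


def mul(a, b):
    return Node(lambda x, y: x * y, (a, b))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    args = [run(n) for n in node.inputs]
    return node.fn(*args)


# Construction: nothing is computed yet, the nodes are only wired together.
total = add(constant(2), constant(3))
product = mul(total, constant(4))

# Execution: the value is computed only when it is asked for.
result = run(product)
print(result)  # 20
```

Here run plays the role that Session.run() plays in TensorFlow: until it is called, product is only a description of a computation.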

Concepts

Tensor

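A tensor can be understood as TensorFlow's representation of a matrix (more generally, an n-dimensional array). There are many ways to create a tensor. The simplest is:

import tensorflow as tf # this line is omitted from all code below and assumed to have been run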
a = tf.zeros(shape=[1,2])

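Note, however, that before the graph is run (before training starts), all data is abstract. At this point a only states that it should be a 1×2 zero matrix. No value has been assigned and no memory has been allocated, so printing it gives: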

print(a)
#===>Tensor("zeros:0", shape=(1, 2), dtype=float32)

The actual value of a is available only once the graph is run in a session:

sess = tf.InteractiveSession()
print(sess.run(a))
#===>[[ 0.  0.]]

For a more detailed explanation of tensors, see the article "Do You Really Understand TensorFlow? What Is a Tensor, and Why Does It Flow?"

Variable

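As the name suggests, a Variable is a variable. It is generally used for the parameters computed in the graph, such as matrices and vectors.
A Variable maintains state while the graph is executed. It holds and updates parameter values that need to be adjusted dynamically.
A variable represents a value in the TensorFlow computation graph that can be used, and even modified, during computation.
For example, to represent the model in the figure above, the expression is

y = ReLU(Wx + b)

(ReLU is an activation function.) Here W and b are the parameters to be trained, so both can be represented as Variables. The Variable constructor has many other options, which are not covered here; passing in a single Tensor is enough.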

W = tf.Variable(tf.zeros(shape=[1,2]))

Note that W is likewise abstract at this point. Unlike a Tensor, a Variable has a concrete value only after it has been initialized.

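tensor = tf.zeros(shape=[1,2])
variable = tf.Variable(tensor)
sess = tf.InteractiveSession()
# print(sess.run(variable))  # raises an error: the variable is not initialized yet
sess.run(tf.initialize_all_variables()) # initialize the variable
print(sess.run(variable))
#===>[[ 0.  0.]]

tf.initialize_all_variables initializes the variables up front. A TensorFlow variable must be initialized before it has a value, whereas a constant tensor needs no such step. (Later releases deprecate this function in favor of tf.global_variables_initializer.)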

placeholder

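A placeholder is likewise an abstract concept. It describes the format of input and output data. It tells the system: there is a value, vector, or matrix here whose contents cannot be given yet but will be supplied when the graph is actually run. The x and y in the expression above are examples. Because there is no concrete value, only the shape needs to be specified.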

x = tf.placeholder(tf.float32,[1, 5],name='input')
y = tf.placeholder(tf.float32,[None, 5],name='input')

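Two forms are shown above. The first, x, declares the input as a [1,5] row vector.
The second declares the input as a [?,5] matrix. When is this useful? When a batch of [1,5] inputs is fed at once. A batch of 10 samples is a [10,5] matrix, and a batch of 5 is a [5,5] matrix. TensorFlow processes the whole batch automatically.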
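Why a free first dimension works can be checked with ordinary matrix multiplication. The following is a minimal NumPy sketch rather than TensorFlow code; the weight matrix W and its 5-by-3 shape are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical weights mapping 5 input features to 3 outputs.
W = np.ones((5, 3))

single = np.zeros((1, 5))    # one sample, like the [1, 5] placeholder
batch = np.zeros((10, 5))    # ten samples, matching [None, 5]

# The same matrix product works for any number of rows.
single_out = single @ W
batch_out = batch @ W
print(single_out.shape)  # (1, 3)
print(batch_out.shape)   # (10, 3)
```

Only the number of columns has to match the weights, which is why the batch size can be left unspecified when the placeholder is declared.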

Session

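A session is what turns the abstract model into something concrete. Why does so much of the earlier code use a session? Because the model is abstract, and concrete values exist only once it is run. Training parameters, making predictions, and even querying the current value of a variable all require a session.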

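# Build a small graph: two constant ops and one matrix multiplication op.
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])
product = tf.matmul(matrix1, matrix2)
# Launch the default graph.
sess = tf.Session()
# Call the session's run() method with 'product' as its argument.
# This triggers the three ops in the graph (two constant ops and one matmul op)
# and tells the method that we want the output of the matmul op back.
result = sess.run(product)
# The return value 'result' is a numpy `ndarray` object.
print(result)
# ==> [[ 12.]]
# Done. Close the session to release its resources.
sess.close()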

A Session should be closed after use to release its resources. Besides calling close explicitly, a "with" block closes it automatically.

with tf.Session() as sess:
    result = sess.run([product])
    print(result)
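The automatic close relies on Python's context manager protocol. The sketch below uses a hypothetical FakeSession class, invented here as a stand-in for a real session, only to show when close() is called.

```python
class FakeSession:
    """Hypothetical stand-in for a session; it only tracks open/closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the block exits, even if the block raised an exception.
        self.close()
        return False  # do not swallow exceptions


with FakeSession() as sess:
    inside = sess.closed   # still open inside the block

print(inside, sess.closed)  # False True
```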

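Interactive use
In the Python API, a Session launches the graph, and Session.run() executes operations.
In interactive environments such as IPython, it is more convenient to use InteractiveSession in place of Session, and Tensor.eval() and Operation.run() in place of Session.run().

InteractiveSession gives you more flexibility in how you structure your code. It lets you interleave operations that build the computation graph with operations that run it, which is very convenient when working interactively, for example in IPython. Without InteractiveSession, you need to build the entire computation graph before starting a session and launching the graph.

Compute 'x' minus 'a':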

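# Enter an interactive TensorFlow session.
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])
# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()
# Add a subtraction op that subtracts 'a' from 'x', then run it and print the result.
# (tf.subtract was named tf.sub in releases before TensorFlow 1.0.)
sub = tf.subtract(x, a)
print(sub.eval())
# ==> [-2. -1.]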

Using a GPU with TensorFlow

with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        ...

Devices are identified by strings. Currently supported devices include:
• "/cpu:0": the machine's CPU.
• "/gpu:0": the machine's first GPU, if it has one.
• "/gpu:1": the machine's second GPU, and so on.

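Computation graph

A TensorFlow workflow has two main steps: building the model and training it.

TensorFlow programs are usually structured into a construction phase and an execution phase. In the construction phase, the steps that the ops perform are described as a graph. In the execution phase, a session executes the ops in the graph.

Building the model

Here we use the classification code for the MNIST dataset from the official tutorial. The formula can be written as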

z = Wx + b
a = softmax(z)
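To make the two formulas concrete, here is a minimal NumPy sketch of the same computation. It is not the TensorFlow code; the sizes (4 features, 3 classes) and the random inputs are assumptions for illustration. Samples are stored as rows, so the linear step is written as x @ W + b, which matches the tf.matmul(x, W) form used in the tutorial code.

```python
import numpy as np

def softmax(z):
    # Subtracting the row maximum does not change the result
    # but avoids overflow in exp for large inputs.
    shifted = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

# Hypothetical tiny model: 4 input features, 3 classes.
rng = np.random.RandomState(0)
x = rng.rand(2, 4)          # batch of 2 samples, one per row
W = rng.rand(4, 3)
b = np.zeros(3)

z = x @ W + b               # linear step
a = softmax(z)              # each row becomes a probability distribution
print(a.sum(axis=1))        # each row sums to 1, up to rounding
```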

The model is then described in code as

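# Build the abstract model
x = tf.placeholder(tf.float32, [None, 784]) # input placeholder; None means the size is not fixed. As the first dimension it stands for the batch size, i.e. the number of x samples may vary.
y = tf.placeholder(tf.float32, [None, 10])  # output placeholder (expected output)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
a = tf.nn.softmax(tf.matmul(x, W) + b)      # a is the model's actual output

# Define the loss function and the training method
# The loss is the cross-entropy. tf.reduce_sum with reduction_indices=[1] sums over the
# classes of each image, and tf.reduce_mean averages over the minibatch, so the value
# describes the whole minibatch.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(a), reduction_indices=[1]))
optimizer = tf.train.GradientDescentOptimizer(0.5) # gradient descent with learning rate 0.5
train = optimizer.minimize(cross_entropy)  # training objective: minimize the loss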
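The cross-entropy expression in the code above can be checked by hand on a tiny batch. This NumPy sketch uses made-up predictions and labels (2 samples, 3 classes) and mirrors the reduce_sum over axis 1 followed by reduce_mean.

```python
import numpy as np

# Hypothetical predictions for a batch of 2 samples and 3 classes.
a = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
# One-hot true labels: class 0 and class 1.
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Summing over classes (axis 1) gives one loss per sample;
# the mean then averages over the batch.
per_sample = -np.sum(y * np.log(a), axis=1)
loss = np.mean(per_sample)

# Because y is one-hot, only the log of the probability assigned
# to the true class contributes to each sample's loss.
expected = -(np.log(0.7) + np.log(0.8)) / 2
print(np.isclose(loss, expected))  # True
```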

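With this, every element of the model (graph structure, loss function, descent method, and training objective) is contained in train, which we can call the training model. We also need a test model: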

correct_prediction = tf.equal(tf.argmax(a, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

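In these two lines, tf.argmax finds the position of the largest value (that is, the predicted class and the actual class). tf.equal then checks whether the two agree, returning true if they do and false otherwise, which yields a boolean array. tf.cast converts the boolean array to a float32 array, and taking the mean gives the classification accuracy.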
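The same argmax, compare, cast, and mean steps can be traced on a small example. This is a NumPy sketch with made-up predictions a and one-hot labels y for four samples, not output from the model.

```python
import numpy as np

a = np.array([[0.7, 0.2, 0.1],   # predicted class 0
              [0.1, 0.3, 0.6],   # predicted class 2
              [0.2, 0.5, 0.3],   # predicted class 1
              [0.6, 0.3, 0.1]])  # predicted class 0
y = np.array([[1, 0, 0],         # true class 0
              [0, 0, 1],         # true class 2
              [1, 0, 0],         # true class 0 (mismatch)
              [1, 0, 0]])        # true class 0

# Compare predicted and true class indices, one boolean per sample.
correct = np.argmax(a, axis=1) == np.argmax(y, axis=1)
# Cast booleans to floats (True -> 1.0, False -> 0.0) and average.
accuracy = np.mean(correct.astype(np.float32))
print(correct.tolist())  # [True, True, False, True]
print(accuracy)          # 0.75
```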

Training

With the training model and the test model in place, we can start the actual training.

sess = tf.InteractiveSession()      # create an interactive session
tf.initialize_all_variables().run() # initialize all variables
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)    # fetch a batch of 100 samples
    train.run({x: batch_xs, y: batch_ys})   # feed inputs and expected outputs to the training model
print(sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels}))

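As you can see, once the model is built, we only need to supply inputs and expected outputs, and the model trains and tests itself. TensorFlow automatically handles the tedious intermediate work such as differentiation, computing gradients, and backpropagation.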
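To give a sense of what is being automated, the following NumPy sketch performs gradient descent for the same kind of softmax model by hand. It is an illustration under stated assumptions, not what TensorFlow executes internally: the data is synthetic (2 features, 2 classes) rather than MNIST, and the gradient formula for softmax with cross-entropy is written out manually.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

def loss_fn(x, y, W, b):
    a = softmax(x @ W + b)
    return np.mean(-np.sum(y * np.log(a), axis=1))

# Synthetic data (an assumption, not MNIST): 2 features, 2 classes,
# with the class decided by which feature is larger.
rng = np.random.RandomState(0)
x = rng.rand(200, 2)
labels = (x[:, 1] > x[:, 0]).astype(int)
y = np.eye(2)[labels]            # one-hot labels

W = np.zeros((2, 2))
b = np.zeros(2)
learning_rate = 0.5

initial_loss = loss_fn(x, y, W, b)
for _ in range(200):
    a = softmax(x @ W + b)
    # Gradient of the mean cross-entropy with respect to z is (a - y) / N.
    dz = (a - y) / x.shape[0]
    W -= learning_rate * (x.T @ dz)
    b -= learning_rate * dz.sum(axis=0)
final_loss = loss_fn(x, y, W, b)

print(initial_loss > final_loss)  # True
```

With optimizer.minimize, the gradient lines inside the loop are derived from the graph automatically, so only the loss has to be written down.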

Example code

In practice, the code also includes fetching the data

"""A very simple MNIST classifier.
See extensive documentation at
http://tensorflow.org/tutorials/mnist/beginners/index.md
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# Import data
from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('data_dir', '/tmp/data/', 'Directory for storing data') # store the data in /tmp/data/

mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)   # read the dataset
# The labels are "one-hot vectors": a one-hot vector is 0 in every dimension except one, which is 1.

# Build the abstract model
# Each input image is 28x28 and is flattened into a vector of length 28x28 = 784.
# How the array is flattened (the order of the numbers) does not matter, as long as
# every image is flattened the same way. Flattening discards the image's 2-D structure.
x = tf.placeholder(tf.float32, [None, 784]) # placeholder
# In the model, a is the prediction and y is the actual label
y = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
a = tf.nn.softmax(tf.matmul(x, W) + b)

# Define the loss function and the training method
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(a), reduction_indices=[1]))  # the loss is the cross-entropy
optimizer = tf.train.GradientDescentOptimizer(0.5) # gradient descent with learning rate 0.5
train = optimizer.minimize(cross_entropy) # training objective: minimize the loss
# The cost function is the "cross-entropy". Cross-entropy arose from information-compression
# coding in information theory, but it has since become an important tool in other fields,
# from game theory to machine learning. Roughly speaking, it measures how inefficient our
# predictions are at describing the truth.

# Test trained model
correct_prediction = tf.equal(tf.argmax(a, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# tf.argmax is a very useful function: it returns the index of the largest value of a tensor
# along a given dimension. Because the label vectors consist of 0s and 1s, the index of the
# 1 is the class label. tf.argmax(a, 1) is the label the model predicts for each input x,
# and tf.argmax(y, 1) is the true label.
# tf.equal checks whether the prediction matches the true label.

# Train
sess = tf.InteractiveSession()      # create an interactive session
tf.initialize_all_variables().run()
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train.run({x: batch_xs, y: batch_ys})
print(sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels}))

Output
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
WARNING:tensorflow:From <ipython-input-16-30d57c355fc3>:39: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
0.9144

The resulting classification accuracy is about 91%.
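The two data conventions used by the script, one-hot labels and images flattened to 784 values, can be illustrated with NumPy. The labels and the all-zero images below are placeholders invented for this sketch, not MNIST data.

```python
import numpy as np

labels = np.array([3, 0, 9])          # hypothetical digit labels
one_hot = np.eye(10)[labels]          # each row has a single 1

# argmax recovers the original label from a one-hot row.
recovered = np.argmax(one_hot, axis=1)
print(recovered.tolist())  # [3, 0, 9]

images = np.zeros((3, 28, 28))        # hypothetical batch of 3 images
flat = images.reshape(-1, 28 * 28)    # each image becomes a 784-vector
print(flat.shape)  # (3, 784)
```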

A simple counter implemented with a variable

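# Create a variable, initialized to the scalar 0.
state = tf.Variable(0, name="counter")

# Create an op that increments state by 1.
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

# After the graph is launched, variables must first be initialized by running an `init` op.
# Only then does initialize_all_variables actually assign the variables their initial values.
init_op = tf.initialize_all_variables()

# Launch the default graph and run the ops.
with tf.Session() as sess:
    # Run the 'init' op.
    sess.run(init_op)
    # Print the initial value of 'state'.
    # To fetch the output of an operation, pass tensors to the run() call of the
    # Session object when executing the graph; their values are returned as results.
    # Here only the single node state is fetched, but several tensors can be
    # fetched in one run, e.g. result = sess.run([mul, intermed]).
    print(sess.run(state))
    # Run the op that updates 'state', then print 'state'.
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))

# Output:
# 0
# 1
# 2
# 3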

The process is: build the graph -> launch the graph -> run it and fetch values.


Deep MNIST for Experts

See the script
input_data.py

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import gzip
import os
import tempfile

import numpy
from six.moves import urllib
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import time
import input_data
import tensorflow as tf

'''
Weight initialization.
Weights start as small random values near 0; biases start as a small positive value.
'''
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

'''
Convolution and pooling. Convolutions use a stride of 1 with zero padding ('SAME').
Pooling is plain max pooling over 2x2 blocks.
'''
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1],
                          strides=[1,2,2,1], padding='SAME')

# Record the start time (time.clock() was removed in Python 3.8)
start = time.perf_counter()
# MNIST data input
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32,[None, 784]) # image input vectors
W = tf.Variable(tf.zeros([784,10]))  # weights, initialized to zeros
b = tf.Variable(tf.zeros([10]))  # biases, initialized to zeros

# First convolutional layer: a convolution followed by max pooling.
# The convolution computes 32 features for each 5x5 patch.
# Its weight tensor has shape [5, 5, 1, 32]: the first two dimensions are the patch size,
# the next is the number of input channels, and the last is the number of output channels.
# Each output channel has its own bias.
W_conv1 = weight_variable([5,5,1,32])
b_conv1 = bias_variable([32])

'''
Reshape x into a 4-D tensor. Its 2nd and 3rd dimensions are the image width and height,
and the last dimension is the number of color channels (1 here because the images are
grayscale; it would be 3 for RGB images).
'''
x_image = tf.reshape(x, [-1,28,28,1])  # the last dimension is the number of channels
# Convolve x_image with the weight tensor, add the bias, apply ReLU, then max-pool
h_conv1 = tf.nn.relu(conv2d(x_image,W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer: each 5x5 patch yields 64 features
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

'''
The image is now 7x7. Add a fully connected layer with 1024 neurons: reshape the pooling
output into a batch of vectors, multiply by the weight matrix, add the bias, apply ReLU.
'''
W_fc1 = weight_variable([7*7*64,1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout, used to reduce overfitting.
# It is placed before the output layer, enabled during training and disabled during testing.
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer with softmax
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc2) + b_fc2)

# Train and evaluate the model
'''
Use the ADAM optimizer in place of plain gradient descent.
feed_dict carries the extra parameter keep_prob, which controls the dropout rate.
'''
y_ = tf.placeholder("float", [None,10])
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))  # cross-entropy
# Use the ADAM optimizer with a learning rate of 0.0001
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Check whether the predicted label matches the actual label
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,"float"))

# Launch the model and initialize the variables
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Train the model for 20000 iterations
for i in range(20000):
    batch = mnist.train.next_batch(50)   # batch size is 50
    if i%100 == 0:
        train_accuracy = accuracy.eval(session=sess,
                                       feed_dict={x:batch[0], y_:batch[1], keep_prob:1.0})
        print("step %d, train_accuracy %g" %(i,train_accuracy))
    # keep_prob, the probability that a neuron's output is kept, is 0.5 during training
    train_step.run(session=sess, feed_dict={x:batch[0], y_:batch[1], keep_prob:0.5})

print("test accuracy %g" %accuracy.eval(session=sess,
      feed_dict={x:mnist.test.images, y_:mnist.test.labels, keep_prob:1.0}))

end = time.perf_counter()
print("running time is %g s" %(end-start))
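The sizes that appear in the network above, in particular the 7*7*64 input to the fully connected layer, follow from simple arithmetic. The sketch below assumes the 'SAME' padding rule, where the output size is the input size divided by the stride, rounded up; the helper name same_padding_output is invented for this example.

```python
import math

def same_padding_output(size, stride):
    # With padding='SAME', the output size is ceil(input / stride).
    return math.ceil(size / stride)

size = 28
size = same_padding_output(size, 1)   # conv1, stride 1 -> 28
size = same_padding_output(size, 2)   # pool1, stride 2 -> 14
size = same_padding_output(size, 1)   # conv2, stride 1 -> 14
size = same_padding_output(size, 2)   # pool2, stride 2 -> 7

flat_features = size * size * 64      # 64 channels after conv2
print(size, flat_features)  # 7 3136

# Trainable parameter counts (weights + biases) per layer.
params = {
    "conv1": 5 * 5 * 1 * 32 + 32,
    "conv2": 5 * 5 * 32 * 64 + 64,
    "fc1": 7 * 7 * 64 * 1024 + 1024,
    "fc2": 1024 * 10 + 10,
}
print(params["fc1"])  # 3212288
```

Most of the parameters sit in the first fully connected layer, not in the convolutions.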

Output

step 0, train_accuracy 0.22
step 100, train_accuracy 0.84
step 200, train_accuracy 0.92
step 300, train_accuracy 0.9
step 400, train_accuracy 0.96
step 500, train_accuracy 0.9
step 600, train_accuracy 1
step 700, train_accuracy 0.96
step 800, train_accuracy 0.92
step 900, train_accuracy 0.98
step 1000, train_accuracy 0.94
step 1100, train_accuracy 0.9
step 1200, train_accuracy 0.96
step 1300, train_accuracy 0.98
step 1400, train_accuracy 0.96
step 1500, train_accuracy 0.98
step 1600, train_accuracy 0.96
step 1700, train_accuracy 1
step 1800, train_accuracy 1
step 1900, train_accuracy 0.98
step 2000, train_accuracy 0.98
step 2100, train_accuracy 0.98
step 2200, train_accuracy 1
step 2300, train_accuracy 0.96
step 2400, train_accuracy 1
step 2500, train_accuracy 0.98
step 2600, train_accuracy 0.98
step 2700, train_accuracy 0.98
step 2800, train_accuracy 0.98
step 2900, train_accuracy 0.96
step 3000, train_accuracy 1
step 3100, train_accuracy 1
step 3200, train_accuracy 0.98
step 3300, train_accuracy 1
step 3400, train_accuracy 0.98
step 3500, train_accuracy 0.96
step 3600, train_accuracy 0.98
step 3700, train_accuracy 0.96
step 3800, train_accuracy 1
step 3900, train_accuracy 1
step 4000, train_accuracy 1
step 4100, train_accuracy 1
step 4200, train_accuracy 0.98
step 4300, train_accuracy 1
step 4400, train_accuracy 1
step 4500, train_accuracy 0.96
step 4600, train_accuracy 1
step 4700, train_accuracy 0.96
step 4800, train_accuracy 0.98
step 4900, train_accuracy 1
step 5000, train_accuracy 1
step 5100, train_accuracy 1
step 5200, train_accuracy 1
step 5300, train_accuracy 0.98
step 5400, train_accuracy 0.98
step 5500, train_accuracy 1
step 5600, train_accuracy 1
step 5700, train_accuracy 0.98
step 5800, train_accuracy 0.98
step 5900, train_accuracy 1
step 6000, train_accuracy 0.98
step 6100, train_accuracy 1
step 6200, train_accuracy 0.98
step 6300, train_accuracy 0.98
step 6400, train_accuracy 1
step 6500, train_accuracy 0.98
step 6600, train_accuracy 0.98
step 6700, train_accuracy 1
step 6800, train_accuracy 1
step 6900, train_accuracy 1
step 7000, train_accuracy 0.98
step 7100, train_accuracy 1
step 7200, train_accuracy 0.96
step 7300, train_accuracy 0.98
step 7400, train_accuracy 0.96
step 7500, train_accuracy 1
step 7600, train_accuracy 1
step 7700, train_accuracy 1
step 7800, train_accuracy 1
step 7900, train_accuracy 1
step 8000, train_accuracy 0.98
step 8100, train_accuracy 1
step 8200, train_accuracy 1
step 8300, train_accuracy 1
step 8400, train_accuracy 0.98
step 8500, train_accuracy 0.94
step 8600, train_accuracy 1
step 8700, train_accuracy 1
step 8800, train_accuracy 1
step 8900, train_accuracy 1
step 9000, train_accuracy 0.98
step 9100, train_accuracy 1
step 9200, train_accuracy 0.98
step 9300, train_accuracy 1
step 9400, train_accuracy 1
step 9500, train_accuracy 1
step 9600, train_accuracy 1
step 9700, train_accuracy 1
step 9800, train_accuracy 1
step 9900, train_accuracy 0.98
step 10000, train_accuracy 1
step 10100, train_accuracy 0.98
step 10200, train_accuracy 1
step 10300, train_accuracy 1
step 10400, train_accuracy 1
step 10500, train_accuracy 1
step 10600, train_accuracy 1
step 10700, train_accuracy 1
step 10800, train_accuracy 1
step 10900, train_accuracy 1
step 11000, train_accuracy 1
step 11100, train_accuracy 1
step 11200, train_accuracy 1
step 11300, train_accuracy 1
step 11400, train_accuracy 0.98
step 11500, train_accuracy 1
step 11600, train_accuracy 1
step 11700, train_accuracy 1
step 11800, train_accuracy 0.98
step 11900, train_accuracy 1
step 12000, train_accuracy 1
step 12100, train_accuracy 1
step 12200, train_accuracy 0.98
step 12300, train_accuracy 1
step 12400, train_accuracy 1
step 12500, train_accuracy 1
step 12600, train_accuracy 1
step 12700, train_accuracy 1
step 12800, train_accuracy 1
step 12900, train_accuracy 1
step 13000, train_accuracy 0.98
step 13100, train_accuracy 1
step 13200, train_accuracy 1
step 13300, train_accuracy 0.98
step 13400, train_accuracy 1
step 13500, train_accuracy 1
step 13600, train_accuracy 1
step 13700, train_accuracy 1
step 13800, train_accuracy 1
step 13900, train_accuracy 1
step 14000, train_accuracy 1
step 14100, train_accuracy 1
step 14200, train_accuracy 1
step 14300, train_accuracy 0.98
step 14400, train_accuracy 1
step 14500, train_accuracy 1
step 14600, train_accuracy 1
step 14700, train_accuracy 1
step 14800, train_accuracy 1
step 14900, train_accuracy 1
step 15000, train_accuracy 1
step 15100, train_accuracy 0.98
step 15200, train_accuracy 1
step 15300, train_accuracy 1
step 15400, train_accuracy 1
step 15500, train_accuracy 1
step 15600, train_accuracy 0.98
step 15700, train_accuracy 1
step 15800, train_accuracy 1
step 15900, train_accuracy 1
step 16000, train_accuracy 1
step 16100, train_accuracy 1
step 16200, train_accuracy 1
step 16300, train_accuracy 0.98
step 16400, train_accuracy 1
step 16500, train_accuracy 1
step 16600, train_accuracy 1
step 16700, train_accuracy 1
step 16800, train_accuracy 1
step 16900, train_accuracy 1
step 17000, train_accuracy 1
step 17100, train_accuracy 1
step 17200, train_accuracy 1
step 17300, train_accuracy 1
step 17400, train_accuracy 1
step 17500, train_accuracy 1
step 17600, train_accuracy 1
step 17700, train_accuracy 1
step 17800, train_accuracy 1
step 17900, train_accuracy 1
step 18000, train_accuracy 1
step 18100, train_accuracy 1
step 18200, train_accuracy 1
step 18300, train_accuracy 0.98
step 18400, train_accuracy 1
step 18500, train_accuracy 1
step 18600, train_accuracy 1
step 18700, train_accuracy 1
step 18800, train_accuracy 1
step 18900, train_accuracy 1
step 19000, train_accuracy 1
step 19100, train_accuracy 1
step 19200, train_accuracy 0.98
step 19300, train_accuracy 1
step 19400, train_accuracy 1
step 19500, train_accuracy 1
step 19600, train_accuracy 1
step 19700, train_accuracy 1
step 19800, train_accuracy 1
step 19900, train_accuracy 1
test accuracy 0.9938
running time is 252.157 s
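The script feeds keep_prob as 0.5 for training steps and 1.0 for evaluation. The effect of that parameter can be sketched in NumPy as follows. This is a simplified illustration of dropout with rescaling, written from the documented behavior of tf.nn.dropout (kept elements are scaled by 1 / keep_prob), not a copy of its implementation; the function name dropout and the input h are assumptions.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    # Keep each element with probability keep_prob and scale the
    # survivors by 1 / keep_prob so the expected value is unchanged.
    mask = rng.rand(*x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.RandomState(0)
h = np.ones((1000, 100))           # hypothetical layer activations

train_out = dropout(h, 0.5, rng)   # roughly half the entries become 0
test_out = dropout(h, 1.0, rng)    # keep_prob = 1.0 leaves h unchanged

print(np.array_equal(test_out, h))          # True
print(abs(train_out.mean() - 1.0) < 0.05)   # True
```

Because the surviving activations are scaled up during training, no extra scaling is needed at test time, which is why evaluation simply feeds keep_prob as 1.0.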
