@Perfect-Demo 2018-05-01T10:18:20.000000Z Words: 12087 Views: 1082

deep_learning_month4_week1_Convolution_model_Application

Machine Learning, Deep Learning

The code has been uploaded to GitHub:
https://github.com/PerfectDemoT/my_deeplearning_homework


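Overview:
This is the first assignment of month4_week1. It uses TensorFlow to build a convolutional neural network with two convolutional layers, two pooling layers, and one fully connected layer.
The network recognizes the digits shown by hand signs.

One pitfall to watch out for:
When you run the forward propagation code, your code may be entirely correct and still produce results that differ from the expected output in the Jupyter notebook. I ran the same code on a classmate's computer and it worked, with the correct results, but on my own computer the results were different. I do not know the cause, but there is a workaround: switch to an older version of TensorFlow. I was originally using TensorFlow 1.6.0; after switching to 1.2.0 the output was correct.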

Let's go through the code step by step.

1. Preparation

1. First, import the packages:

    import math
    import numpy as np
    import h5py
    import matplotlib.pyplot as plt
    import scipy
    from PIL import Image
    from scipy import ndimage
    import tensorflow as tf
    from tensorflow.python.framework import ops
    from cnn_utils import *
    %matplotlib inline
    np.random.seed(1)

2. Load the data

    # Loading the data (signs)
    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

3. Visualize one image from the data set

    # Example of a picture
    index = 6
    plt.imshow(X_train_orig[index])
    print("y = " + str(np.squeeze(Y_train_orig[:, index])))

[Figure: the sample image]

4. Normalize the data and check its shapes

    # Scale pixel values to [0, 1] and one-hot encode the labels
    X_train = X_train_orig/255.
    X_test = X_test_orig/255.
    Y_train = convert_to_one_hot(Y_train_orig, 6).T
    Y_test = convert_to_one_hot(Y_test_orig, 6).T
    print("number of training examples = " + str(X_train.shape[0]))
    print("number of test examples = " + str(X_test.shape[0]))
    print("X_train shape: " + str(X_train.shape))
    print("Y_train shape: " + str(Y_train.shape))
    print("X_test shape: " + str(X_test.shape))
    print("Y_test shape: " + str(Y_test.shape))
    conv_layers = {}
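
The helper convert_to_one_hot comes from cnn_utils, which is not shown in this post. As a rough illustration of what such a helper presumably does, here is a minimal plain-Python sketch. The name one_hot_rows is hypothetical, and it returns the already transposed layout (one row per example, one column per class) that Y_train ends up with after the .T above.

```python
def one_hot_rows(labels, num_classes):
    """Return one row per example, with a 1.0 in the column of its class."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three made-up labels out of 6 classes (illustrative values only)
encoded = one_hot_rows([5, 0, 2], 6)
for row in encoded:
    print(row)
```

Each row sums to 1, which is the label format that the softmax cost later in the post expects.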

2. Writing the functions

1. Write the placeholder function, so that values can be fed in when the graph is run later

    # GRADED FUNCTION: create_placeholders
    def create_placeholders(n_H0, n_W0, n_C0, n_y):
        """
        Creates the placeholders for the tensorflow session.

        Arguments:
        n_H0 -- scalar, height of an input image
        n_W0 -- scalar, width of an input image
        n_C0 -- scalar, number of channels of the input
        n_y -- scalar, number of classes

        Returns:
        X -- placeholder for the data input, of shape [None, n_H0, n_W0, n_C0] and dtype "float"
        Y -- placeholder for the input labels, of shape [None, n_y] and dtype "float"
        """
        ### START CODE HERE ### (≈2 lines)
        # None leaves the batch dimension open, so any number of examples can be fed
        X = tf.placeholder(name='X', shape=(None, n_H0, n_W0, n_C0), dtype=tf.float32)
        Y = tf.placeholder(name='Y', shape=(None, n_y), dtype=tf.float32)
        ### END CODE HERE ###
        return X, Y

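Test code:

    X, Y = create_placeholders(64, 64, 3, 6)
    print("X = " + str(X))
    print("Y = " + str(Y))

Result (the tensor names shown here are those of the notebook's expected output; because the code above passes name='X' and name='Y', a fresh graph will print X:0 and Y:0 instead):

    X = Tensor("Placeholder:0", shape=(?, 64, 64, 3), dtype=float32)
    Y = Tensor("Placeholder_1:0", shape=(?, 6), dtype=float32)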

2. Random parameter initialization

This uses tf.contrib.layers.xavier_initializer(seed = 0). Also pay attention to the arguments passed to tf.get_variable().

    def initialize_parameters():
        """
        Initializes weight parameters to build a neural network with tensorflow. The shapes are:
        W1 : [4, 4, 3, 8]
        W2 : [2, 2, 8, 16]

        Returns:
        parameters -- a dictionary of tensors containing W1, W2
        """
        tf.set_random_seed(1)  # so that your "random" numbers match ours
        ### START CODE HERE ### (approx. 2 lines of code)
        W1 = tf.get_variable(name='W1', dtype=tf.float32, shape=(4, 4, 3, 8),
                             initializer=tf.contrib.layers.xavier_initializer(seed=0))
        W2 = tf.get_variable(name='W2', dtype=tf.float32, shape=(2, 2, 8, 16),
                             initializer=tf.contrib.layers.xavier_initializer(seed=0))
        ### END CODE HERE ###
        parameters = {"W1": W1,
                      "W2": W2}
        return parameters

The seed is set only so that the values match the expected output.
Print the values:

    tf.reset_default_graph()
    with tf.Session() as sess_test:
        parameters = initialize_parameters()
        init = tf.global_variables_initializer()
        sess_test.run(init)
        print("W1 = " + str(parameters["W1"].eval()[1,1,1]))
        print("W2 = " + str(parameters["W2"].eval()[1,1,1]))

Result:

    W1 = [ 0.00131723 0.14176141 -0.04434952 0.09197326 0.14984085 -0.03514394
     -0.06847463 0.05245192]
    W2 = [-0.08566415 0.17750949 0.11974221 0.16773748 -0.0830943 -0.08058
     -0.00577033 -0.14643836 0.24162132 -0.05857408 -0.19055021 0.1345228
     -0.22779644 -0.1601823 -0.16117483 -0.10286498]
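
As a sanity check on these values: assuming xavier_initializer is used with its default uniform setting, the weights are drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), where the fans of a convolution kernel include the kernel's height and width. The helper name xavier_uniform_limit below is made up for this sketch and is not part of TensorFlow.

```python
import math


def xavier_uniform_limit(shape):
    """Bound of the Xavier (Glorot) uniform distribution for a conv kernel
    of shape (height, width, in_channels, out_channels)."""
    height, width, in_channels, out_channels = shape
    receptive_field = height * width
    fan_in = receptive_field * in_channels
    fan_out = receptive_field * out_channels
    return math.sqrt(6.0 / (fan_in + fan_out))


limit_w1 = xavier_uniform_limit((4, 4, 3, 8))    # shape of W1
limit_w2 = xavier_uniform_limit((2, 2, 8, 16))   # shape of W2
print("W1 limit: %.4f" % limit_w1)
print("W2 limit: %.4f" % limit_w2)
```

Under that assumption, every printed entry of W1 and W2 above should lie inside the corresponding bound, which is a quick way to spot a wrong shape or initializer.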

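3. Forward propagation (there is a pitfall here)

This is the pitfall mentioned at the top: your code may be entirely correct and still produce results that differ from the expected output in the Jupyter notebook. The same code gave the correct results on a classmate's computer but different results on mine. I do not know the cause, but switching to an older TensorFlow fixed it: I was on TensorFlow 1.6.0, and after changing to 1.2.0 the output was correct.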

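Explanation:
For the function tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None):

input: the tensor to convolve, of shape [batch, in_height, in_width, in_channels], that is, [number of images in the batch, image height, image width, number of image channels].

filter: the convolution kernels, of shape [filter_height, filter_width, in_channels, out_channels], that is, [kernel height, kernel width, number of input channels, number of output channels]. The third dimension, in_channels, must match the fourth dimension of input. out_channels can be confusing at first: it is the number of filters, and each filter sums its responses over all input channels to produce one output channel. For example, with a three-channel RGB input and out_channels set to 1, the per-channel results are added together and the output is a single-channel feature map.

strides: the stride, given as a list of four integers, one per dimension of input, in the order [batch, height, width, channels]. For example, [1, 2, 1, 1] means a stride of 2 along the height (vertical direction) and 1 along the width (horizontal direction). The first and fourth entries are the strides over the batch and channel dimensions; they are normally kept at 1 so that no example and no channel is skipped.

padding: the padding mode, either 'SAME' or 'VALID'. VALID adds no padding, while SAME pads with zeros where needed. Take a 3*3 image convolved with a 2*2 filter at stride 2: after the first window, the remaining part is narrower than 2*2. VALID simply drops the third column (and row), giving a 1*1 output; SAME pads an extra column (and row) of zeros so that the window fits, giving a 2*2 output.

use_cudnn_on_gpu: bool, whether to use cuDNN acceleration when running on a GPU; defaults to true.

name: an optional name for the operation, that is, for the returned feature map tensor.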
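
The effect of the two padding modes on the output size can be sketched in a few lines. The formulas follow TensorFlow's documented rules for 'SAME' and 'VALID'; the function name output_size is only illustrative.

```python
import math


def output_size(input_size, window, stride, padding):
    """Output size along one spatial dimension for a convolution or pooling
    window, for the two padding modes accepted by tf.nn.conv2d."""
    if padding == "SAME":
        # Zero padding is added as needed, so only the stride matters
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # No padding: windows that do not fit entirely are dropped
        return math.ceil((input_size - window + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# The 3*3 image, 2*2 filter, stride 2 example from the text
valid_size = output_size(3, 2, 2, "VALID")
same_size = output_size(3, 2, 2, "SAME")
print("VALID:", valid_size)
print("SAME:", same_size)
```

With stride 1 and 'SAME' padding, as used by both convolutions in this assignment, the spatial size of the input is preserved.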

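tf.nn.max_pool(value, ksize, strides, padding, name=None)

value: the input to pool. A pooling layer usually follows a convolutional layer, so the input is typically a feature map, again of shape [batch, in_height, in_width, in_channels].

ksize: the size of the pooling window, a list of four integers, usually [1, height, width, 1]. We do not want to pool across the batch or channel dimensions, so those two entries are set to 1. The four entries of strides in tf.nn.conv2d follow the same per-dimension layout.

strides: the stride, likewise a list of four integers.

padding: the same two modes as above, 'SAME' and 'VALID'.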
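
To make the window and stride concrete, here is a small plain-Python sketch of max pooling on a single channel of a single example, which roughly corresponds to ksize=(1, 2, 2, 1) and strides=(1, 2, 2, 1). It is not how TensorFlow implements the operation; the function name max_pool_2d and the input values are made up for illustration.

```python
def max_pool_2d(image, window, stride):
    """Max-pool a 2-D list of numbers with 'VALID' padding
    (windows that do not fit entirely are dropped)."""
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height - window + 1, stride):
        row = []
        for left in range(0, width - window + 1, stride):
            # Collect the values under the current window and keep the largest
            patch = [image[top + i][left + j]
                     for i in range(window) for j in range(window)]
            row.append(max(patch))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool_2d(feature_map, window=2, stride=2)
print(pooled)  # [[6, 4], [8, 9]]
```

Because the 4*4 input divides evenly into 2*2 windows, 'SAME' and 'VALID' would give the same result here.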

tf.contrib.layers.flatten(P): given an input P, this function flattens each example into a 1-D vector while maintaining the batch size. It returns a flattened tensor of shape [batch_size, k].

tf.contrib.layers.fully_connected(F, num_outputs): given the flattened input F, it returns the output computed using a fully connected layer. The full documentation has more details.

Now let's look at the code:

    def forward_propagation(X, parameters):
        """
        Implements the forward propagation for the model:
        CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED

        Arguments:
        X -- input dataset placeholder, of shape (None, n_H0, n_W0, n_C0)
        parameters -- python dictionary containing your parameters "W1", "W2"
                      the shapes are given in initialize_parameters

        Returns:
        Z3 -- the output of the last LINEAR unit
        """
        # Retrieve the parameters from the dictionary "parameters"
        W1 = parameters['W1']
        W2 = parameters['W2']
        ### START CODE HERE ###
        # CONV2D: stride of 1, padding 'SAME'
        Z1 = tf.nn.conv2d(input=X, filter=W1, strides=(1, 1, 1, 1), padding='SAME')
        # RELU
        A1 = tf.nn.relu(Z1)
        # MAXPOOL: window 8x8, stride 8, padding 'SAME'
        P1 = tf.nn.max_pool(value=A1, ksize=(1, 8, 8, 1), strides=(1, 8, 8, 1), padding='SAME')
        # CONV2D: filters W2, stride 1, padding 'SAME'
        Z2 = tf.nn.conv2d(input=P1, filter=W2, strides=(1, 1, 1, 1), padding='SAME')
        # RELU
        A2 = tf.nn.relu(Z2)
        # MAXPOOL: window 4x4, stride 4, padding 'SAME'
        P2 = tf.nn.max_pool(value=A2, ksize=(1, 4, 4, 1), strides=(1, 4, 4, 1), padding='SAME')
        # FLATTEN
        P2 = tf.contrib.layers.flatten(inputs=P2)
        # FULLY-CONNECTED without non-linear activation function (do not call softmax).
        # 6 neurons in output layer. Hint: one of the arguments should be "activation_fn=None"
        Z3 = tf.contrib.layers.fully_connected(P2, 6, activation_fn=None)
        ### END CODE HERE ###
        return Z3

Print the output:

    tf.reset_default_graph()
    with tf.Session() as sess:
        np.random.seed(1)
        X, Y = create_placeholders(64, 64, 3, 6)
        parameters = initialize_parameters()
        Z3 = forward_propagation(X, parameters)
        init = tf.global_variables_initializer()
        sess.run(init)
        a = sess.run(Z3, {X: np.random.randn(2,64,64,3), Y: np.random.randn(2,6)})
        print("Z3 = " + str(a))

Result:

    Z3 = [[-0.44670227 -1.57208765 -1.53049231 -2.31013036 -1.29104376 0.46852064]
     [-0.17601591 -1.57972014 -1.4737016 -2.61672091 -1.00810647 0.5747785 ]]
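
It can help to trace how the shape of a single example changes through forward_propagation. The sketch below only does the arithmetic, assuming 'SAME' padding everywhere (as in the code above) and the kernel shapes from initialize_parameters; the names same_out and trace are illustrative.

```python
import math


def same_out(size, stride):
    """Output size along one spatial dimension with 'SAME' padding."""
    return math.ceil(size / stride)


trace = []
h, w, c = 64, 64, 3                              # one input image: 64 x 64 x 3
trace.append(("X", (h, w, c)))

h, w, c = same_out(h, 1), same_out(w, 1), 8      # CONV2D with W1 (4, 4, 3, 8), stride 1
trace.append(("Z1, A1", (h, w, c)))

h, w = same_out(h, 8), same_out(w, 8)            # MAXPOOL 8x8, stride 8
trace.append(("P1", (h, w, c)))

h, w, c = same_out(h, 1), same_out(w, 1), 16     # CONV2D with W2 (2, 2, 8, 16), stride 1
trace.append(("Z2, A2", (h, w, c)))

h, w = same_out(h, 4), same_out(w, 4)            # MAXPOOL 4x4, stride 4
trace.append(("P2", (h, w, c)))

flat_size = h * w * c                            # FLATTEN: one vector per example
num_classes = 6                                  # FULLYCONNECTED output size

for name, shape in trace:
    print(name, shape)
print("flattened:", flat_size, "-> Z3:", num_classes)
```

So each image is reduced to a vector of 64 features before the fully connected layer maps it to 6 logits, which matches the two rows of 6 values printed for the batch of 2 above.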

4. The cost function

The model uses softmax regression. With TensorFlow, the cost takes only one line of code.

    # GRADED FUNCTION: compute_cost
    def compute_cost(Z3, Y):
        """
        Computes the cost

        Arguments:
        Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (number of examples, 6)
        Y -- "true" labels vector placeholder, same shape as Z3

        Returns:
        cost - Tensor of the cost function
        """
        ### START CODE HERE ### (1 line of code)
        # Per-example softmax cross-entropy, averaged over the batch
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y))
        ### END CODE HERE ###
        return cost

Print the result:

    tf.reset_default_graph()
    with tf.Session() as sess:
        np.random.seed(1)
        X, Y = create_placeholders(64, 64, 3, 6)
        parameters = initialize_parameters()
        Z3 = forward_propagation(X, parameters)
        cost = compute_cost(Z3, Y)
        init = tf.global_variables_initializer()
        sess.run(init)
        a = sess.run(cost, {X: np.random.randn(4,64,64,3), Y: np.random.randn(4,6)})
        print("cost = " + str(a))

Result:

    cost = 2.91034
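
For intuition about what that single line computes, the following plain-Python sketch evaluates the softmax cross-entropy of each example and then averages over the batch. It is a hand-written illustration of the formula, not TensorFlow's implementation; the function names and the tiny inputs are invented for this example.

```python
import math


def softmax_cross_entropy(logits, labels):
    """Cross-entropy between softmax(logits) and a label distribution,
    for a single example."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(labels, log_probs))


def mean_cost(batch_logits, batch_labels):
    """Average the per-example losses over the batch."""
    losses = [softmax_cross_entropy(z, y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Two examples with two classes each; both are labelled class 0
cost = mean_cost([[0.0, 0.0], [2.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
print("cost = %.6f" % cost)

# Equal logits over 6 classes give ln(6), the cost of an uninformed guess
uniform_cost = softmax_cross_entropy([0.0] * 6, [1.0, 0, 0, 0, 0, 0])
print("uniform over 6 classes = %.4f" % uniform_cost)
```

The second value, ln 6 ≈ 1.79, is a useful reference point: a freshly initialized 6-class model usually starts with a cost in that neighbourhood.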

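3. Putting it together: the model function

This function uses all the functions written above: the placeholder function, the random initialization function, the forward propagation function, and the cost function, and it adds backpropagation.
Training uses mini-batches of size 64. Backpropagation takes only one line, the tf.train.AdamOptimizer(...).minimize(cost) call in the code below.

Now let's look at the code: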

    def model(X_train, Y_train, X_test, Y_test, learning_rate=0.009,
              num_epochs=100, minibatch_size=64, print_cost=True):
        """
        Implements a three-layer ConvNet in Tensorflow:
        CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED

        Arguments:
        X_train -- training set, of shape (None, 64, 64, 3)
        Y_train -- training labels, of shape (None, n_y = 6)
        X_test -- test set, of shape (None, 64, 64, 3)
        Y_test -- test labels, of shape (None, n_y = 6)
        learning_rate -- learning rate of the optimization
        num_epochs -- number of epochs of the optimization loop
        minibatch_size -- size of a minibatch
        print_cost -- True to print the cost every 5 epochs

        Returns:
        train_accuracy -- real number, accuracy on the train set (X_train)
        test_accuracy -- real number, testing accuracy on the test set (X_test)
        parameters -- parameters learnt by the model. They can then be used to predict.
        """
        ops.reset_default_graph()  # to be able to rerun the model without overwriting tf variables
        tf.set_random_seed(1)      # to keep results consistent (tensorflow seed)
        seed = 3                   # to keep results consistent (numpy seed)
        (m, n_H0, n_W0, n_C0) = X_train.shape
        n_y = Y_train.shape[1]
        costs = []                 # To keep track of the cost

        # Create Placeholders of the correct shape
        ### START CODE HERE ### (1 line)
        X, Y = create_placeholders(n_H0, n_W0, n_C0, n_y)
        ### END CODE HERE ###

        # Initialize parameters
        ### START CODE HERE ### (1 line)
        parameters = initialize_parameters()
        ### END CODE HERE ###

        # Forward propagation: Build the forward propagation in the tensorflow graph
        ### START CODE HERE ### (1 line)
        Z3 = forward_propagation(X, parameters)
        ### END CODE HERE ###

        # Cost function: Add cost function to tensorflow graph
        ### START CODE HERE ### (1 line)
        cost = compute_cost(Z3, Y)
        ### END CODE HERE ###

        # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer that minimizes the cost.
        ### START CODE HERE ### (1 line)
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
        ### END CODE HERE ###

        # Initialize all the variables globally
        init = tf.global_variables_initializer()

        # Start the session to compute the tensorflow graph
        with tf.Session() as sess:
            # Run the initialization
            sess.run(init)

            # Do the training loop
            for epoch in range(num_epochs):
                minibatch_cost = 0.
                num_minibatches = int(m / minibatch_size)  # number of full minibatches of size minibatch_size in the train set
                seed = seed + 1
                minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)

                for minibatch in minibatches:
                    # Select a minibatch
                    (minibatch_X, minibatch_Y) = minibatch
                    # IMPORTANT: The line that runs the graph on a minibatch.
                    # Run the session to execute the optimizer and the cost, the feed_dict should contain a minibatch for (X,Y).
                    ### START CODE HERE ### (1 line)
                    _, temp_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                    ### END CODE HERE ###
                    minibatch_cost += temp_cost / num_minibatches

                # Print the cost every 5 epochs
                if print_cost and epoch % 5 == 0:
                    print("Cost after epoch %i: %f" % (epoch, minibatch_cost))
                # Record the cost of every epoch for the plot
                if print_cost:
                    costs.append(minibatch_cost)

            # plot the cost (one point per epoch)
            plt.plot(np.squeeze(costs))
            plt.ylabel('cost')
            plt.xlabel('epochs')
            plt.title("Learning rate =" + str(learning_rate))
            plt.show()

            # Calculate the correct predictions
            predict_op = tf.argmax(Z3, 1)
            correct_prediction = tf.equal(predict_op, tf.argmax(Y, 1))

            # Calculate accuracy on the train and test sets
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
            print(accuracy)
            train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
            test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
            print("Train Accuracy:", train_accuracy)
            print("Test Accuracy:", test_accuracy)

            return train_accuracy, test_accuracy, parameters
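
The helper random_mini_batches also comes from cnn_utils and is not shown here. A minimal sketch of the idea, under the assumption that it shuffles the examples and cuts them into consecutive batches, might look like this. The function name split_into_minibatches is hypothetical, it works on indices only, and the figure of 1080 training examples is an assumption about the data set size rather than something printed in this post.

```python
import random


def split_into_minibatches(num_examples, minibatch_size, seed):
    """Shuffle example indices and cut them into consecutive mini-batches.
    The last batch is smaller when num_examples is not a multiple of
    minibatch_size."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # a new seed gives a new shuffle
    return [indices[start:start + minibatch_size]
            for start in range(0, num_examples, minibatch_size)]


# Assumed training set size of 1080, with the post's mini-batch size of 64
batches = split_into_minibatches(num_examples=1080, minibatch_size=64, seed=4)
print("number of batches:", len(batches))
print("first batch size:", len(batches[0]), "last batch size:", len(batches[-1]))
```

If the real helper also returns the smaller final batch, as this sketch does, then the loop in model runs one more batch per epoch than num_minibatches counts, so the printed epoch cost would be slightly higher than a true average.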

Run it:

    _, _, parameters = model(X_train, Y_train, X_test, Y_test)

Result (100 epochs with the cost printed every 5 epochs, so 20 lines of output):

    Cost after epoch 0: 1.917929
    Cost after epoch 5: 1.506757
    Cost after epoch 10: 0.955359
    Cost after epoch 15: 0.845802
    Cost after epoch 20: 0.701174
    Cost after epoch 25: 0.571977
    Cost after epoch 30: 0.518435
    Cost after epoch 35: 0.495806
    Cost after epoch 40: 0.429827
    Cost after epoch 45: 0.407291
    Cost after epoch 50: 0.366394
    Cost after epoch 55: 0.376922
    Cost after epoch 60: 0.299491
    Cost after epoch 65: 0.338870
    Cost after epoch 70: 0.316400
    Cost after epoch 75: 0.310413
    Cost after epoch 80: 0.249549
    Cost after epoch 85: 0.243457
    Cost after epoch 90: 0.200031
    Cost after epoch 95: 0.175452

The cost curve is shown below:
[Figure: cost curve]

Accuracy:

    Tensor("Mean_1:0", shape=(), dtype=float32)
    Train Accuracy: 0.940741
    Test Accuracy: 0.783333
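
The accuracy above is the fraction of examples whose largest logit matches the labelled class, which is what the tf.argmax, tf.equal, and tf.reduce_mean calls in model compute. A plain-Python sketch of the same idea, with invented logits and labels:

```python
def argmax(values):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(batch_logits, batch_one_hot_labels):
    """Fraction of examples whose highest logit matches the labelled class."""
    correct = [argmax(z) == argmax(y)
               for z, y in zip(batch_logits, batch_one_hot_labels)]
    return sum(correct) / len(correct)


# Four made-up examples with three classes; the third one is misclassified
logits = [[2.0, 0.1, -1.0], [0.3, 0.2, 0.9], [1.5, 1.7, 0.0], [0.0, -0.5, -0.2]]
labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [1, 0, 0]]
acc = accuracy(logits, labels)
print("accuracy:", acc)  # accuracy: 0.75
```

Softmax is not needed for prediction, because it does not change which logit is the largest.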

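4. Try it with your own image

    # Note: ndimage.imread and scipy.misc.imresize need an older SciPy
    # (with Pillow installed); both were removed in later SciPy releases.
    fname = "images/thumbs_up.jpg"
    image = np.array(ndimage.imread(fname, flatten=False))
    my_image = scipy.misc.imresize(image, size=(64,64))
    plt.imshow(my_image)

That's it: we have now built a convolutional neural network with TensorFlow.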
