@Perfect-Demo 2018-05-11T10:16:13.000000Z

deep_learning_month4_week2_Residual_Networks

Machine Learning, Deep Learning

The code has been uploaded to GitHub:
https://github.com/PerfectDemoT/my_deeplearning_homework


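This is a personal reference note on building a 50-layer residual network.
It relies heavily on Keras's built-in layers, which keeps the code concise and makes it quick to implement what you need. (Also, training the final 50-layer residual network on a CPU really takes a long time: one pass over 1000 images can take several minutes. It reminded me once again that training for weeks is no exaggeration, or rather, that my own computer simply cannot handle large datasets.)

The final application is image recognition: working out which digit (0 to 5) a hand gesture shows.

Let's go through the concrete steps of implementing the residual network: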

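1. The residual block concept

Figure: residual block

As the figure above shows, the left side is a plain network and the right side is a residual network (this is one residual block). It can be viewed like this:


And by convention:



That is roughly the idea behind a residual network. Next, let's see how it is done concretely for a CNN.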
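
For reference, the two-layer residual block described in this section is commonly written as below. The notation is an assumption borrowed from the usual course convention rather than taken from this post: $a^{[l]}$ is the activation of layer $l$, $W$ and $b$ are weights and biases, and $g$ is the ReLU activation.

```latex
\begin{aligned}
z^{[l+1]} &= W^{[l+1]} a^{[l]} + b^{[l+1]}, & a^{[l+1]} &= g\left(z^{[l+1]}\right) \\
z^{[l+2]} &= W^{[l+2]} a^{[l+1]} + b^{[l+2]}, & a^{[l+2]} &= g\left(z^{[l+2]} + a^{[l]}\right)
\end{aligned}
```

The only difference from a plain network is the extra $a^{[l]}$ term inside the last activation, which is the shortcut.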

2. Building residual blocks for a CNN

1. Import the packages first

    # Written for Keras 2.x on the TensorFlow 1.x backend
    # (the tests further down use tf.Session and tf.placeholder).
    import numpy as np
    import tensorflow as tf
    from keras import layers
    from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
    from keras.models import Model, load_model
    from keras.preprocessing import image
    from keras.utils import layer_utils
    from keras.utils.data_utils import get_file
    from keras.applications.imagenet_utils import preprocess_input
    import pydot
    from IPython.display import SVG
    from keras.utils.vis_utils import model_to_dot
    from keras.utils import plot_model
    from resnets_utils import *
    from keras.initializers import glorot_uniform
    import scipy.misc
    from matplotlib.pyplot import imshow
    %matplotlib inline
    import keras.backend as K
    K.set_image_data_format('channels_last')
    K.set_learning_phase(1)

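2. Identity block: diagram and implementation

Figure: identity block whose shortcut skips two layers

Figure: identity block whose shortcut skips three layers

Now get ready: we will use a lot of Keras layers below. They were all imported earlier, so they are used by name directly with no module prefix; don't be surprised by that.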

    # GRADED FUNCTION: identity_block
    def identity_block(X, f, filters, stage, block):
        """
        Implementation of the identity block as defined in Figure 3

        Arguments:
        X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
        f -- integer, specifying the shape of the middle CONV's window for the main path
        filters -- python list of integers, defining the number of filters in the CONV layers of the main path
        stage -- integer, used to name the layers, depending on their position in the network
        block -- string/character, used to name the layers, depending on their position in the network

        Returns:
        X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
        """
        # defining name basis
        conv_name_base = 'res' + str(stage) + block + '_branch'
        bn_name_base = 'bn' + str(stage) + block + '_branch'

        # Retrieve Filters
        F1, F2, F3 = filters

        # Save the input value. You'll need this later to add back to the main path.
        X_shortcut = X

        # First component of main path
        X = Conv2D(filters = F1, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
        X = Activation('relu')(X)

        # Second component of main path
        X = Conv2D(filters=F2, kernel_size=(f,f), strides=(1,1), padding='same', name=conv_name_base+'2b', kernel_initializer=glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
        X = Activation('relu')(X)

        # Third component of main path
        X = Conv2D(filters=F3, kernel_size=(1,1), strides=(1,1), padding='valid', name=conv_name_base + '2c', kernel_initializer=glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

        # Final step: Add shortcut value to main path, and pass it through a RELU activation
        X = Add()([X, X_shortcut])
        X = Activation('relu')(X)

        return X

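As you can see, this implements the three-layer case. To tell the layers apart, they are also labelled in order (that is what the a, b, c suffixes are for).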
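
To make the last step of the block concrete, here is a minimal sketch in plain Python (no Keras) of "add the shortcut, then apply ReLU". The helper names `relu` and `residual_add` and all the numbers are hypothetical and only for illustration; the real block operates on 4-D tensors.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual_add(main_path, shortcut):
    """Add the shortcut to the main-path output, then apply ReLU."""
    if len(main_path) != len(shortcut):
        raise ValueError("main path and shortcut must have the same size")
    return relu([m + s for m, s in zip(main_path, shortcut)])


# Hypothetical values, chosen only for illustration.
shortcut = [0.5, 0.0, 2.0]   # block input, non-negative as it would be after a ReLU
zeros = [0.0, 0.0, 0.0]      # a main path that outputs all zeros

print(residual_add(zeros, shortcut))             # [0.5, 0.0, 2.0]
print(residual_add([1.0, -3.0, 0.5], shortcut))  # [1.5, 0.0, 2.5]
```

When the main path outputs zeros, the block passes a non-negative input through unchanged, which is why stacking such blocks does not have to hurt the network.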

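It is worth mentioning that readers unfamiliar with Keras may find this part hard to follow and may need to catch up on it first.

Next, let's print an output to check it.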

    tf.reset_default_graph()
    with tf.Session() as test:
        np.random.seed(1)
        A_prev = tf.placeholder("float", [3, 4, 4, 6])
        X = np.random.randn(3, 4, 4, 6)
        A = identity_block(A_prev, f = 2, filters = [2, 4, 6], stage = 1, block = 'a')
        test.run(tf.global_variables_initializer())
        out = test.run([A], feed_dict={A_prev: X, K.learning_phase(): 0})
        print("out = " + str(out[0][1][1][0]))

Reference output:

    out = [ 0.94822997 0. 1.16101444 2.747859 0. 1.36677003]

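3. Convolutional block: diagram and implementation

Figure: convolutional block

As the figure shows, the shortcut path is no longer a plain addition: the input first goes through a convolution and then BatchNorm. This handles the case where the input and output dimensions of the residual block do not match, so the two cannot be added directly.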
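
The mismatch can be checked with a little shape arithmetic. The sketch below is plain Python and uses hypothetical helper names (`conv_out_size`, `conv_block_shapes`); it assumes the layer settings used in this post (1x1 'valid' convolutions, an f x f 'same' convolution in the middle, and stride s on the first main-path convolution and on the shortcut convolution).

```python
def conv_out_size(n, f, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-n // stride)        # ceil(n / stride)
    return (n - f) // stride + 1      # 'valid' padding


def conv_block_shapes(n, channels, f, filters, s):
    """Shapes (height, width, channels) of the main path, the raw input, and the projected shortcut."""
    F1, F2, F3 = filters
    n1 = conv_out_size(n, 1, s, "valid")    # first 1x1 conv, stride s
    n2 = conv_out_size(n1, f, 1, "same")    # middle f x f conv, stride 1
    n3 = conv_out_size(n2, 1, 1, "valid")   # last 1x1 conv, stride 1
    main = (n3, n3, F3)
    identity = (n, n, channels)             # what a plain identity shortcut would carry
    ns = conv_out_size(n, 1, s, "valid")    # shortcut 1x1 conv, stride s
    projected = (ns, ns, F3)
    return main, identity, projected


# Illustrative case: a 15x15x256 input with filters [128, 128, 512] and s = 2.
main, identity, projected = conv_block_shapes(15, 256, 3, [128, 128, 512], 2)
print(main)        # (8, 8, 512)
print(identity)    # (15, 15, 256)
print(projected)   # (8, 8, 512)
```

The raw input no longer has the same shape as the main-path output, while the shortcut after its own 1x1 convolution does, so the addition becomes possible.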

Here is the implementation. It is not very different from the identity block above; the shortcut path just has one extra step.

    # GRADED FUNCTION: convolutional_block
    def convolutional_block(X, f, filters, stage, block, s = 2):
        """
        Implementation of the convolutional block as defined in Figure 4

        Arguments:
        X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
        f -- integer, specifying the shape of the middle CONV's window for the main path
        filters -- python list of integers, defining the number of filters in the CONV layers of the main path
        stage -- integer, used to name the layers, depending on their position in the network
        block -- string/character, used to name the layers, depending on their position in the network
        s -- Integer, specifying the stride to be used

        Returns:
        X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
        """
        # defining name basis
        conv_name_base = 'res' + str(stage) + block + '_branch'
        bn_name_base = 'bn' + str(stage) + block + '_branch'

        # Retrieve Filters
        F1, F2, F3 = filters

        # Save the input value
        X_shortcut = X

        ##### MAIN PATH #####
        # First component of main path
        X = Conv2D(F1, (1, 1), strides = (s,s), name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
        X = Activation('relu')(X)

        # Second component of main path
        X = Conv2D(filters=F2, kernel_size=(f,f), strides=(1,1), padding='same', name=conv_name_base+'2b', kernel_initializer=glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis=3, name=bn_name_base+'2b')(X)
        X = Activation('relu')(X)

        # Third component of main path
        X = Conv2D(filters=F3, kernel_size=(1,1), strides=(1,1), padding='valid', name=conv_name_base+'2c', kernel_initializer=glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis=3, name=bn_name_base+'2c')(X)

        ##### SHORTCUT PATH #####
        X_shortcut = Conv2D(filters=F3, kernel_size=(1,1), strides=(s, s), padding='valid', name=conv_name_base+'1', kernel_initializer=glorot_uniform(seed=0))(X_shortcut)
        X_shortcut = BatchNormalization(axis=3, name=bn_name_base+'1')(X_shortcut)

        # Final step: Add shortcut value to main path, and pass it through a RELU activation
        X = Add()([X, X_shortcut])
        X = Activation('relu')(X)

        return X

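As you can see, apart from the extra shortcut-path lines (the Conv2D and BatchNormalization applied to X_shortcut), this is not much different from the identity block above, so once you understand that one, this one is not hard to follow.

Now let's test it.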

    tf.reset_default_graph()
    with tf.Session() as test:
        np.random.seed(1)
        A_prev = tf.placeholder("float", [3, 4, 4, 6])
        X = np.random.randn(3, 4, 4, 6)
        A = convolutional_block(A_prev, f = 2, filters = [2, 4, 6], stage = 1, block = 'a')
        test.run(tf.global_variables_initializer())
        out = test.run([A], feed_dict={A_prev: X, K.learning_phase(): 0})
        # print(len(out[0]))
        # print(out)
        print("out = " + str(out[0][1][1][0]))

The result is:

    out = [ 0.09018463 1.23489773 0.46822017 0.0367176 0. 0.65516603]

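3. Building the 50-layer residual network

Here is the diagram:
Figure: 50-layer ResNet

In the figure, "ID BLOCK" stands for "Identity block", and "ID BLOCK x3" means you stack three identity blocks together.

Here is a detailed description: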

The details of this ResNet-50 model are:
- Zero-padding pads the input with a pad of (3,3)
- Stage 1:
    - The 2D Convolution has 64 filters of shape (7,7) and uses a stride of (2,2). Its name is "conv1".
    - BatchNorm is applied to the channels axis of the input.
    - MaxPooling uses a (3,3) window and a (2,2) stride.
- Stage 2:
    - The convolutional block uses three sets of filters of size [64,64,256], "f" is 3, "s" is 1 and the block is "a".
    - The 2 identity blocks use three sets of filters of size [64,64,256], "f" is 3 and the blocks are "b" and "c".
- Stage 3:
    - The convolutional block uses three sets of filters of size [128,128,512], "f" is 3, "s" is 2 and the block is "a".
    - The 3 identity blocks use three sets of filters of size [128,128,512], "f" is 3 and the blocks are "b", "c" and "d".
- Stage 4:
    - The convolutional block uses three sets of filters of size [256, 256, 1024], "f" is 3, "s" is 2 and the block is "a".
    - The 5 identity blocks use three sets of filters of size [256, 256, 1024], "f" is 3 and the blocks are "b", "c", "d", "e" and "f".
- Stage 5:
    - The convolutional block uses three sets of filters of size [512, 512, 2048], "f" is 3, "s" is 2 and the block is "a".
    - The 2 identity blocks use three sets of filters of size [512, 512, 2048], "f" is 3 and the blocks are "b" and "c".
- The 2D Average Pooling uses a window of shape (2,2) and its name is "avg_pool".
- The flatten doesn't have any hyperparameters or name.
- The Fully Connected (Dense) layer reduces its input to the number of classes using a softmax activation. Its name should be 'fc' + str(classes).

The description above simply spells out the hyperparameter settings used in the network.
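
As a sanity check on those hyperparameters, the sketch below traces the spatial size and channel count through the network for a 64x64x3 input. It is plain Python with hypothetical helper names (`valid_out`, `trace_resnet50`), and it assumes 'valid' padding for conv1 and both pooling layers, a pooling stride equal to the window size for the average pooling, and that within each stage only the first 1x1 convolution of the convolutional block changes the spatial size.

```python
def valid_out(n, f, stride):
    """Output size of a 'valid' convolution or pooling along one dimension."""
    return (n - f) // stride + 1


def trace_resnet50(size=64):
    """Return (step, height/width, channels) after each step of the description."""
    trace = []
    n = size + 2 * 3                   # zero-padding of 3 on each side
    trace.append(("zero_pad", n, 3))
    n = valid_out(n, 7, 2)             # conv1: 7x7 window, stride 2
    trace.append(("conv1", n, 64))
    n = valid_out(n, 3, 2)             # max pooling: 3x3 window, stride 2
    trace.append(("max_pool", n, 64))
    # (stage, stride s of the convolutional block, output channels F3)
    for stage, s, channels in [(2, 1, 256), (3, 2, 512), (4, 2, 1024), (5, 2, 2048)]:
        n = valid_out(n, 1, s)         # 1x1 conv with stride s; identity blocks keep the size
        trace.append(("stage%d" % stage, n, channels))
    n = valid_out(n, 2, 2)             # average pooling: 2x2 window, stride 2
    trace.append(("avg_pool", n, 2048))
    trace.append(("flatten", n * n * 2048, None))
    return trace


for step in trace_resnet50(64):
    print(step)
```

Under these assumptions the side length goes 70, 32, 15, 15, 8, 4, 2, 1, so the Dense layer receives a vector of 2048 values.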
Now let's look at the code:

    def ResNet50(input_shape = (64, 64, 3), classes = 6):
        """
        Implementation of the popular ResNet50 the following architecture:
        CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
        -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

        Arguments:
        input_shape -- shape of the images of the dataset
        classes -- integer, number of classes

        Returns:
        model -- a Model() instance in Keras
        """
        # Define the input as a tensor with shape input_shape
        X_input = Input(input_shape)

        # Zero-Padding
        X = ZeroPadding2D((3, 3))(X_input)

        # Stage 1
        X = Conv2D(64, (7, 7), strides = (2, 2), name = 'conv1', kernel_initializer = glorot_uniform(seed=0))(X)
        X = BatchNormalization(axis = 3, name = 'bn_conv1')(X)
        X = Activation('relu')(X)
        X = MaxPooling2D((3, 3), strides=(2, 2))(X)

        # Stage 2
        X = convolutional_block(X, f = 3, filters = [64, 64, 256], stage = 2, block='a', s = 1)
        X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
        X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')

        # Stage 3
        X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
        X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block='b')
        X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block='c')
        X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block='d')

        # Stage 4
        X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
        X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block='b')
        X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block='c')
        X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block='d')
        X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block='e')
        X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block='f')

        # Stage 5
        X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
        X = identity_block(X, f=3, filters=[512, 512, 2048], stage=5, block='b')
        X = identity_block(X, f=3, filters=[512, 512, 2048], stage=5, block='c')

        # AVGPOOL
        X = AveragePooling2D((2,2), name='avg_pool')(X)

        # output layer
        X = Flatten()(X)
        X = Dense(classes, activation='softmax', name='fc' + str(classes), kernel_initializer = glorot_uniform(seed=0))(X)

        # Create model
        model = Model(inputs = X_input, outputs = X, name='ResNet50')

        return model

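As you can see, this uses the two functions written earlier:
convolutional_block(X, f, filters, stage, block, s = 2)
identity_block(X, f, filters, stage, block)
That completes a 50-layer residual network.
Note again that Flatten(), Dense(), and so on come from Keras and were imported earlier, so they can be used directly here.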
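
The number 50 can be recovered by counting layers with weights on the main path. The short sketch below does that count in plain Python; the constant and function names are hypothetical, and it follows the usual counting convention (an assumption here) that the 1x1 convolutions on the shortcut paths, the BatchNorm layers, and the pooling layers are not counted.

```python
# Identity blocks per stage, as listed in the model description.
IDENTITY_BLOCKS_PER_STAGE = {2: 2, 3: 3, 4: 5, 5: 2}
CONVS_PER_BLOCK = 3   # each block has three Conv2D layers on its main path


def count_resnet50_layers():
    """Count conv1, the main-path convolutions of every block, and the final Dense layer."""
    # One convolutional block plus the identity blocks in every stage.
    blocks = sum(1 + n for n in IDENTITY_BLOCKS_PER_STAGE.values())
    return 1 + blocks * CONVS_PER_BLOCK + 1


print(count_resnet50_layers())  # 50
```

That is 16 blocks with 3 convolutions each, plus conv1 and the fully connected layer.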

4. Training with the network built above

Use the following statement to call the function written above:

    model = ResNet50(input_shape = (64, 64, 3), classes = 6)

Next:
As seen in the Keras Tutorial Notebook, prior to training a model, you need to configure the learning process by compiling the model.

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
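
For readers who want to see what the `categorical_crossentropy` loss measures, here is a minimal plain-Python sketch for a single example. The function below is a simplified illustration, not the Keras implementation: Keras works on batches of tensors and averages over the batch, and the clipping constant `eps` is an assumption added only to avoid log(0).

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted probability vector."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


# Hypothetical example: the true class is 2 and the model gives it probability 0.7.
y_true = [0, 0, 1, 0, 0, 0]
y_pred = [0.05, 0.05, 0.7, 0.1, 0.05, 0.05]
print(categorical_crossentropy(y_true, y_pred))  # about 0.357, i.e. -log(0.7)
```

Only the probability assigned to the true class contributes, so the loss goes to zero as that probability approaches 1.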

Load the data and check its size:

    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

    # Normalize image vectors
    X_train = X_train_orig/255.
    X_test = X_test_orig/255.

    # Convert training and test labels to one hot matrices
    Y_train = convert_to_one_hot(Y_train_orig, 6).T
    Y_test = convert_to_one_hot(Y_test_orig, 6).T

    print ("number of training examples = " + str(X_train.shape[0]))
    print ("number of test examples = " + str(X_test.shape[0]))
    print ("X_train shape: " + str(X_train.shape))
    print ("Y_train shape: " + str(Y_train.shape))
    print ("X_test shape: " + str(X_test.shape))
    print ("Y_test shape: " + str(Y_test.shape))

The dataset sizes are:

    number of training examples = 1080
    number of test examples = 120
    X_train shape: (1080, 64, 64, 3)
    Y_train shape: (1080, 6)
    X_test shape: (120, 64, 64, 3)
    Y_test shape: (120, 6)
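
The label shapes of (1080, 6) and (120, 6) come from one-hot encoding: each label 0 to 5 becomes a row of six values with a single 1. The sketch below shows the idea in plain Python; `one_hot` is a hypothetical helper written for illustration and is not the `convert_to_one_hot` function from `resnets_utils`.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 at the label's index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three hypothetical labels, six classes (digits 0 to 5).
encoded = one_hot([5, 0, 2], 6)
for row in encoded:
    print(row)
# [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
# [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

With 1080 training labels and 6 classes, the same encoding gives 1080 rows of 6 values.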

Then, start training:

    model.fit(X_train, Y_train, epochs = 2, batch_size = 32)

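You will then find that training is extremely slow (on a CPU, that is; anyone with a GPU can ignore this remark). A look at my training screenshot makes that clear.

With just over a thousand images, one pass over the data takes more than two minutes. So if you want to check the results, either use a GPU or directly use .h5 files that someone else has already trained.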

Here are the links to .h5 files that another author trained on a GPU:

resnet50_20_epochs.h5 link: https://pan.baidu.com/s/1eROf3BO password: qed2
resnet50_30_epochs.h5 link: https://pan.baidu.com/s/1o8kPNUM password: tqio
resnet50_44_epochs.h5 link: https://pan.baidu.com/s/1c1N3AzI password: 2xwu
resnet50_55_epochs.h5 link: https://pan.baidu.com/s/1bpfMA0v password: cxcv
Model file provided by Coursera:
ResNet50.h5 link: https://pan.baidu.com/s/1boCG2Iz password: sefq

Many thanks to this author; here is the link to his blog:
https://blog.csdn.net/hongbin_xu/article/details/78766642

Using his trained models, we get the following accuracy:

    # model = load_model('ResNet50.h5')  # alternative: the model file provided by Coursera
    model = load_model('resnet50_44_epochs.h5')
    preds = model.evaluate(X_test, Y_test)
    print ("Loss = " + str(preds[0]))
    print ("Test Accuracy = " + str(preds[1]))

    120/120 [==============================] - 9s 78ms/step
    Loss = 0.0914498666922
    Test Accuracy = 0.958333337307

You can print the network structure:

    model.summary()

You can also test your own image:

    img_path = 'images/my_image.jpg'
    # Load the image at the 64x64 size the network expects
    img = image.load_img(img_path, target_size=(64, 64))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)   # add the batch dimension: (1, 64, 64, 3)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)
    # Note: scipy.misc.imread only exists in older SciPy releases (removed in SciPy 1.2)
    my_image = scipy.misc.imread(img_path)
    imshow(my_image)
    print("class prediction vector [p(0), p(1), p(2), p(3), p(4), p(5)] = ")
    print(model.predict(x))
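
The printed vector holds one probability per digit, so the predicted digit is the index of the largest entry. A minimal plain-Python sketch, with a hypothetical helper name and made-up probabilities (for a real prediction you would pass the first row of the `model.predict(x)` result, which has shape (1, 6)):

```python
def predicted_digit(probabilities):
    """Return the index (digit 0 to 5) with the highest probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Made-up prediction vector [p(0), p(1), p(2), p(3), p(4), p(5)].
example = [0.01, 0.02, 0.90, 0.03, 0.02, 0.02]
print(predicted_digit(example))  # 2
```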

5. The training code

Although training this example on a CPU takes a long time, we can still look at the process:

We set the number of epochs to 20.

    model = ResNet50(input_shape = (64, 64, 3), classes = 6)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(X_train, Y_train, epochs = 20, batch_size = 32)
    model.save('resnet50_20_epochs.h5')
    # Evaluate the model that was just trained
    preds = model.evaluate(X_test, Y_test)
    print ("Loss = " + str(preds[0]))
    print ("Test Accuracy = " + str(preds[1]))

Then we can see:

    Epoch 1/20
    1080/1080 [==============================] - 15s 14ms/step - loss: 2.5141 - acc: 0.4241
    Epoch 2/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 1.7727 - acc: 0.6194
    Epoch 3/20
    1080/1080 [==============================] - 6s 5ms/step - loss: 1.4935 - acc: 0.6769
    Epoch 4/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 1.5494 - acc: 0.5833
    Epoch 5/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.6902 - acc: 0.7889
    Epoch 6/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.4155 - acc: 0.8593
    Epoch 7/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.2782 - acc: 0.9139
    Epoch 8/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.1665 - acc: 0.9500
    Epoch 9/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.2578 - acc: 0.9185
    Epoch 10/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.1690 - acc: 0.9435
    Epoch 11/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0913 - acc: 0.9694
    Epoch 12/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.1389 - acc: 0.9602
    Epoch 13/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.1490 - acc: 0.9444
    Epoch 14/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.1044 - acc: 0.9694
    Epoch 15/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0435 - acc: 0.9861
    Epoch 16/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0324 - acc: 0.9926
    Epoch 17/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0190 - acc: 0.9926
    Epoch 18/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0577 - acc: 0.9824
    Epoch 19/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0268 - acc: 0.9907
    Epoch 20/20
    1080/1080 [==============================] - 5s 5ms/step - loss: 0.0662 - acc: 0.9787
    120/120 [==============================] - 2s 17ms/step
    Loss = 0.825686124961
    Test Accuracy = 0.833333333333