@Perfect-Demo 2018-05-01

Deep Learning Week 2: Logistic Regression

Machine Learning · Deep Learning


The code has been uploaded to GitHub:
https://github.com/PerfectDemoT/my_deeplearning_homework

This is the first programming assignment in Andrew Ng's deep learning course:

implementing logistic regression.


1. First, import the packages

  import numpy as np
  import matplotlib.pyplot as plt
  import h5py
  import scipy
  from PIL import Image
  import pylab
  from scipy import ndimage
  from lr_utils import load_dataset  # a helper .py file provided with the assignment that loads the dataset

2. Load the data, then display one image to check it

The loading code is as follows:

  # Loading the data (cat/non-cat)
  train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
  # 209 training examples (x and y) and 50 test examples (x and y)

Here is one of the images:
[image]

  # Code to display the image above
  index = 19
  plt.imshow(train_set_x_orig[index])
  pylab.show()
  print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")

Next, the assignment has you check the dimensions of the loaded data. The code is as follows:

  ### START CODE HERE ### (≈ 3 lines of code)
  m_train = train_set_x_orig.shape[0]
  m_test = test_set_x_orig.shape[0]
  num_px = train_set_x_orig.shape[1]
  ### END CODE HERE ###
  print ("Number of training examples: m_train = " + str(m_train))
  print ("Number of testing examples: m_test = " + str(m_test))
  print ("Height/Width of each image: num_px = " + str(num_px))
  print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
  print ("train_set_x shape: " + str(train_set_x_orig.shape))
  print ("train_set_y shape: " + str(train_set_y.shape))
  print ("test_set_x shape: " + str(test_set_x_orig.shape))
  print ("test_set_y shape: " + str(test_set_y.shape))

The output looks like this:
[image]


3. Now we can start writing the functions

First, flatten each 64×64×3 image matrix in the training data into a vector. The code is as follows:

  ### START CODE HERE ### (≈ 2 lines of code)
  train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
  test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
  ### The images are now all vectorized

This turns the original four-dimensional array (number of examples, image height, image width, RGB channels) into a two-dimensional one; after the transpose, its shape is (num_px * num_px * 3, number of examples).
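As a quick sanity check, you can verify the resulting shape on a toy array (the sizes below just mirror the dataset; the contents are random and don't matter):

  # Toy shape check: only the shapes matter here
  toy = np.random.rand(209, 64, 64, 3)    # (examples, height, width, RGB)
  flat = toy.reshape(toy.shape[0], -1).T  # flatten each image, then transpose
  print(flat.shape)                       # (12288, 209), i.e. (64*64*3, examples)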


Next comes normalization:

  # Normalize: scale the RGB values from 0-255 down to 0-1 (divide by 255)
  train_set_x = train_set_x_flatten/255.
  test_set_x = test_set_x_flatten/255.

Now we can write the individual functions.

The sigmoid function
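For reference, the formula implemented here is

$$ s = \sigma(z) = \frac{1}{1 + e^{-z}} $$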

  def sigmoid(z):
      """
      Compute the sigmoid of z

      Arguments:
      z -- A scalar or numpy array of any size.

      Return:
      s -- sigmoid(z)
      """
      ### START CODE HERE ### (≈ 1 line of code)
      s = 1. / (1. + np.exp(-z))
      ### END CODE HERE ###
      return s

The function that initializes w and b (to zeros)

  def initialize_with_zeros(dim):
      """
      This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

      Argument:
      dim -- size of the w vector we want (or number of parameters in this case)

      Returns:
      w -- initialized vector of shape (dim, 1)
      b -- initialized scalar (corresponds to the bias)
      """
      ### START CODE HERE ### (≈ 1 line of code)
      w = np.zeros((dim, 1))
      b = 0
      ### END CODE HERE ###
      assert (w.shape == (dim, 1))
      assert (isinstance(b, float) or isinstance(b, int))
      return w, b

Now for forward and backward propagation.
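For reference, these are the formulas the code below implements. With m examples, the forward pass and cost are

$$ A = \sigma(w^T X + b), \qquad J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + (1 - y^{(i)}) \log(1 - a^{(i)}) \right] $$

and the gradients are

$$ \frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)}) $$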

  def propagate(w, b, X, Y):
      """
      Implement the cost function and its gradient for the propagation explained above

      Arguments:
      w -- weights, a numpy array of size (num_px * num_px * 3, 1)
      b -- bias, a scalar
      X -- data of size (num_px * num_px * 3, number of examples)
      Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

      Return:
      cost -- negative log-likelihood cost for logistic regression
      dw -- gradient of the loss with respect to w, thus same shape as w
      db -- gradient of the loss with respect to b, thus same shape as b

      Tips:
      - Write your code step by step for the propagation. np.log(), np.dot()
      """
      m = X.shape[1]

      # FORWARD PROPAGATION (FROM X TO COST)
      ### START CODE HERE ### (≈ 2 lines of code)
      A = sigmoid(np.dot(w.T, X) + b)                                # compute activation
      cost = np.sum(Y * np.log(A) + (1-Y) * np.log(1-A)) / (-m)      # compute cost
      ### END CODE HERE ###

      # BACKWARD PROPAGATION (TO FIND GRAD)
      ### START CODE HERE ### (≈ 2 lines of code)
      dw = np.dot(X, (A-Y).T) / m
      db = np.sum(A-Y) / m
      ### END CODE HERE ###

      assert (dw.shape == w.shape)
      assert (db.dtype == float)
      cost = np.squeeze(cost)
      assert (cost.shape == ())

      grads = {"dw": dw,
               "db": db}
      return grads, cost

The function that updates w and b (gradient descent)
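Each iteration applies the standard gradient descent update rule, where α is the learning rate:

$$ w := w - \alpha \, \frac{\partial J}{\partial w}, \qquad b := b - \alpha \, \frac{\partial J}{\partial b} $$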

  def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
      """
      This function optimizes w and b by running a gradient descent algorithm

      Arguments:
      w -- weights, a numpy array of size (num_px * num_px * 3, 1)
      b -- bias, a scalar
      X -- data of shape (num_px * num_px * 3, number of examples)
      Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
      num_iterations -- number of iterations of the optimization loop
      learning_rate -- learning rate of the gradient descent update rule
      print_cost -- True to print the loss every 100 steps

      Returns:
      params -- dictionary containing the weights w and bias b
      grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
      costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

      Tips:
      You basically need to write down two steps and iterate through them:
          1) Calculate the cost and the gradient for the current parameters. Use propagate().
          2) Update the parameters using gradient descent rule for w and b.
      """
      costs = []
      for i in range(num_iterations):
          # Cost and gradient calculation (≈ 1-4 lines of code)
          ### START CODE HERE ###
          grads, cost = propagate(w, b, X, Y)
          ### END CODE HERE ###

          # Retrieve derivatives from grads
          dw = grads["dw"]
          db = grads["db"]

          # Update rule (≈ 2 lines of code)
          ### START CODE HERE ###
          w = w - learning_rate * dw
          b = b - learning_rate * db
          ### END CODE HERE ###

          # Record the costs
          if i % 100 == 0:
              costs.append(cost)

          # Print the cost every 100 iterations
          if print_cost and i % 100 == 0:
              print("Cost after iteration %i: %f" % (i, cost))

      params = {"w": w,
                "b": b}
      grads = {"dw": dw,
               "db": db}
      return params, grads, costs

Here is the prediction function:

  def predict(w, b, X):
      '''
      Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

      Arguments:
      w -- weights, a numpy array of size (num_px * num_px * 3, 1)
      b -- bias, a scalar
      X -- data of size (num_px * num_px * 3, number of examples)

      Returns:
      Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
      '''
      m = X.shape[1]
      Y_prediction = np.zeros((1, m))
      w = w.reshape(X.shape[0], 1)

      # Compute vector "A" predicting the probabilities of a cat being present in the picture
      ### START CODE HERE ### (≈ 1 line of code)
      A = sigmoid(np.dot(w.T, X) + b)
      ### END CODE HERE ###

      for i in range(A.shape[1]):
          # Convert probabilities A[0,i] to actual predictions p[0,i]
          ### START CODE HERE ### (≈ 4 lines of code)
          if A[0, i] > 0.5:
              Y_prediction[0, i] = 1
          else:
              Y_prediction[0, i] = 0
          ### END CODE HERE ###

      assert (Y_prediction.shape == (1, m))
      return Y_prediction
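As a quick sanity check, you can exercise these pieces on made-up toy values (the numbers below are illustrative, not from the dataset):

  # Toy sanity check (w, b, X, Y are made-up values)
  w = np.array([[1.], [2.]])
  b = 2.
  X = np.array([[1., 2., -1.], [3., 4., -3.2]])
  Y = np.array([[1, 0, 1]])
  grads, cost = propagate(w, b, X, Y)
  print(grads["dw"].shape)    # (2, 1) -- same shape as w
  print(predict(w, b, X))     # [[1. 1. 0.]] -- probabilities thresholded at 0.5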

4. Now we can finally train the parameters for real (everything above was just testing; now we train on the actual image data)

The code is as follows:

  # Now combine all of the functions above
  # GRADED FUNCTION: model
  print("=============== Finally, time to process the images =========================")

  def model(X_train, Y_train, X_test, Y_test, num_iterations=10000, learning_rate=0.01, print_cost=False):
      """
      Builds the logistic regression model by calling the function you've implemented previously

      Arguments:
      X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
      Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
      X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
      Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
      num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
      learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
      print_cost -- Set to true to print the cost every 100 iterations

      Returns:
      d -- dictionary containing information about the model.
      """
      ### START CODE HERE ###
      # initialize parameters with zeros (≈ 1 line of code)
      w, b = initialize_with_zeros(X_train.shape[0])

      # Gradient descent (≈ 1 line of code)
      parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

      # Retrieve parameters w and b from dictionary "parameters"
      w = parameters["w"]
      b = parameters["b"]

      # Predict test/train set examples (≈ 2 lines of code)
      Y_prediction_test = predict(w, b, X_test)
      Y_prediction_train = predict(w, b, X_train)
      ### END CODE HERE ###

      # Print train/test errors
      print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
      print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

      d = {"costs": costs,
           "Y_prediction_test": Y_prediction_test,
           "Y_prediction_train": Y_prediction_train,
           "w": w,
           "b": b,
           "learning_rate": learning_rate,
           "num_iterations": num_iterations}
      return d

Here is an example call that trains the model:

  d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 10000, learning_rate = 0.001, print_cost = True)

Once training finishes, let's look at the accuracy; these are the print statements inside model() that report it:

  1. print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
  2. print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

The result looks like this:
[image]


Now let's look at the cost curve (the code looks like this):

  costs = np.squeeze(d['costs'])
  plt.plot(costs)
  plt.ylabel('cost')
  plt.xlabel('iterations (per hundreds)')
  plt.title("Learning rate =" + str(d["learning_rate"]))
  plt.show()

The plot looks like this (learning rate 0.001, 10000 iterations):
[image]


Next, let's look at choosing the learning rate (code first):

  # An array of three candidate values. This is just to demonstrate the method;
  # in practice there are principled ways to pick the value (covered in the ML course)
  learning_rates = [0.01, 0.001, 0.0001]
  models = {}
  for i in learning_rates:
      print ("learning rate is: " + str(i))
      models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
      print ('\n' + "-------------------------------------------------------" + '\n')

  for i in learning_rates:
      plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

  plt.ylabel('cost')
  plt.xlabel('iterations')

  legend = plt.legend(loc='upper center', shadow=True)
  frame = legend.get_frame()
  frame.set_facecolor('0.90')
  plt.show()

  # From the plot, 0.001 is the best of the three learning rates. (Well, if you ignore
  # the blue curve's early oscillation, 0.01 might not be bad either; but you will find
  # that the training error is already very small while the test error stays large --
  # overfitting -- so using 0.01 does not actually reduce the test error.)

Here is the plot:
[image]
And the accuracy for each learning rate is as follows:
[image]


Finally, for some fun, let's use our own images.

Now play with your own image: 1 means the prediction is a cat, 0 means it is not.

  ## START CODE HERE ## (PUT YOUR IMAGE NAME)
  my_image = "my_image2.jpg"   # change this to the name of your image file
  ## END CODE HERE ##

  # We preprocess the image to fit your algorithm.
  # Note: ndimage.imread and scipy.misc.imresize were removed in newer SciPy
  # releases; this code targets the older SciPy version used by the course.
  fname = "images/" + my_image
  image = np.array(ndimage.imread(fname, flatten=False))
  my_image = scipy.misc.imresize(image, size=(num_px, num_px)).reshape((1, num_px*num_px*3)).T
  my_predicted_image = predict(d["w"], d["b"], my_image)

  plt.imshow(image)
  print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") + "\" picture.")

The image looks like this:
[image]
And the prediction is:
[image]

I also tried a picture of the Eiffel Tower, and it was classified as not a cat, which feels pretty good... (okay, honestly the recognition accuracy is still fairly low). What's still missing here is regularization to prevent overfitting (and personally I'm sure this model is overfitting, since the training accuracy is very high while the test accuracy is quite low). Preventing overfitting will be implemented next time; a rough preview of what that change could look like is below.
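As a minimal sketch only (not the assignment's code; `lambd` is a made-up hyperparameter name), L2 regularization would change just the cost and dw in propagate(), leaving db as it is:

  # Sketch: propagate() with an L2 penalty (lambd is a made-up name)
  def propagate_l2(w, b, X, Y, lambd=0.1):
      m = X.shape[1]
      A = sigmoid(np.dot(w.T, X) + b)
      # the cost gains a (lambd / (2m)) * ||w||^2 penalty term
      cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m \
             + (lambd / (2 * m)) * np.sum(w ** 2)
      # dw gains a (lambd / m) * w term; db is unchanged
      dw = np.dot(X, (A - Y).T) / m + (lambd / m) * w
      db = np.sum(A - Y) / m
      return {"dw": dw, "db": db}, np.squeeze(cost)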
