@liushiya 2018-10-14T02:51:02.000000Z

Logistic Regression and Support Vector Machine

Machine Learning Experiment


You can click here to view the Chinese version.

Motivation of Experiment

  1. Compare and understand the differences between gradient descent and mini-batch stochastic gradient descent.
  2. Compare and understand the differences and relationships between logistic regression and linear classification.
  3. Further understand the principles of SVM and practice on larger data.

Dataset

The experiment uses the a9a dataset from LIBSVM Data, which contains 32,561 training samples and 16,281 testing samples; each sample has 123 features. Please download the training set and validation set.
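The a9a files can be loaded with scikit-learn's `load_svmlight_file` (for the downloaded files you would call it with their paths, e.g. `load_svmlight_file("a9a", n_features=123)` — the filenames here are assumptions about where you saved them). The sketch below parses a tiny in-memory sample in the same LIBSVM format to show the call:

```python
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# Two toy samples in LIBSVM format: a label followed by index:value pairs.
sample = b"+1 3:1 11:1 14:1\n-1 5:1 7:1 14:1\n"

# n_features=123 matches the a9a feature dimension; X is a sparse CSR matrix.
X, y = load_svmlight_file(BytesIO(sample), n_features=123)

print(X.shape)  # (2, 123)
print(y)        # [ 1. -1.]
```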

Environment for Experiment

python3, including at least the following Python packages: sklearn, numpy, jupyter, matplotlib.
It is recommended to install anaconda3 directly, which ships with the packages above.

Time and Place

2018-10-14, 8:50-12:15 AM, B7-138 (Mingkui Tan), B7-238 (Qingyao Wu)

Submission Deadline

2018-10-28 12:00 AM

Experiment Form

Complete Independently.

Experiment Step

The experimental code and plots should be completed in Jupyter.

Logistic Regression and Batch Stochastic Gradient Descent

  1. Load the training set and validation set.
  2. Initialize the logistic regression model parameters (you can consider initializing with zeros, random numbers, or a normal distribution).
  3. Select the loss function and calculate its derivative; find more details in the PPT.
  4. Determine the batch_size, randomly sample that many examples, and calculate the gradient G of the loss function on the mini-batch.
  5. Use the SGD optimization method to update the model parameters; you are also encouraged to try the Adam optimizer.
  6. Select an appropriate threshold: mark samples whose predicted scores are greater than the threshold as positive, and the rest as negative. Predict on the validation set and compute the loss.
  7. Repeat steps 4 to 6 several times, and draw a graph of the validation loss against the number of iterations.
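The steps above can be sketched as follows. This is a minimal illustration on synthetic data (the real experiment uses a9a); the zero initialization, learning rate, batch size, iteration count, and 0.5 threshold are assumptions you should tune yourself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a9a: labels in {-1, +1}, dense features.
n, d = 1000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, X, y):
    # Average logistic loss with labels in {-1, +1}: mean log(1 + exp(-y w.x)).
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

w = np.zeros(d)                                   # step 2: zero initialization
lr, batch_size, n_iters = 0.1, 64, 200
losses = []
for t in range(n_iters):
    idx = rng.choice(n, size=batch_size, replace=False)   # step 4: mini-batch
    Xb, yb = X[idx], y[idx]
    # Gradient of the logistic loss on the mini-batch:
    # d/dw log(1 + exp(-y w.x)) = -y x * sigmoid(-y w.x)
    G = -(Xb * (yb * sigmoid(-yb * (Xb @ w)))[:, None]).mean(axis=0)
    w -= lr * G                                   # step 5: SGD update
    losses.append(logistic_loss(w, X, y))         # steps 6-7: track loss

pred = np.where(sigmoid(X @ w) > 0.5, 1, -1)      # step 6: threshold at 0.5
accuracy = (pred == y).mean()
```

Plotting `losses` against the iteration index with matplotlib then gives the curve required in step 7.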

Linear Classification and Batch Stochastic Gradient Descent

  1. Load the training set and validation set.
  2. Initialize the SVM model parameters (you can consider initializing with zeros, random numbers, or a normal distribution).
  3. Select the loss function and calculate its derivative; find more details in the PPT.
  4. Determine the batch_size, randomly sample that many examples, and calculate the gradient G of the loss function on the mini-batch.
  5. Use the SGD optimization method to update the model parameters; you are also encouraged to try the Adam optimizer.
  6. Select an appropriate threshold: mark samples whose predicted scores are greater than the threshold as positive, and the rest as negative. Predict on the validation set and compute the loss.
  7. Repeat steps 4 to 6 several times, and draw a graph of the validation loss against the number of iterations.
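A similar sketch for the linear SVM, using the L2-regularized hinge loss and its subgradient. Again, the synthetic data, hyperparameters (C, learning rate, batch size), and the zero decision threshold are illustrative assumptions, not prescribed values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a9a: labels in {-1, +1}.
n, d = 1000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def svm_loss(w, X, y, C=1.0):
    # L2-regularized average hinge loss: 0.5||w||^2 + C * mean(max(0, 1 - y w.x)).
    margins = 1.0 - y * (X @ w)
    return 0.5 * (w @ w) + C * np.maximum(0.0, margins).mean()

w = np.zeros(d)                                   # step 2: zero initialization
lr, batch_size, n_iters, C = 0.05, 64, 300, 1.0
losses = []
for t in range(n_iters):
    idx = rng.choice(n, size=batch_size, replace=False)   # step 4: mini-batch
    Xb, yb = X[idx], y[idx]
    margins = 1.0 - yb * (Xb @ w)
    active = (margins > 0).astype(float)          # subgradient is -y x where margin > 0
    G = w - C * (Xb * (yb * active)[:, None]).mean(axis=0)
    w -= lr * G                                   # step 5: SGD update
    losses.append(svm_loss(w, X, y, C))           # steps 6-7: track loss

pred = np.where(X @ w > 0, 1, -1)                 # step 6: threshold at 0
accuracy = (pred == y).mean()
```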

Finish the experiment report according to your results. The report template can be found in the example repository.

Evaluation

| Item | Proportion | Description |
| --- | --- | --- |
| Attendance | 40% | Ask for leave if there is a time conflict |
| Code availability | 20% | Compiles and runs successfully |
| Report | 30% | Follows the report template |
| Code specification | 10% | Mainly considers whether readable variable names are used |

Requirement for Submission

Submission process

1. Access 222.201.187.50:7001.
2. Click on the corresponding submission entry.
3. Fill in your name and student number, then upload your report in PDF format and your code as a zip archive.

Precautions


Any advice or ideas are welcome; feel free to discuss with the teaching assistants in the QQ group.
