@liushiya 2018-10-14T02:51:02.000000Z

Logistic Regression and Support Vector Machine

Machine Learning Experiment


You can click here to view the Chinese version.

Motivation of Experiment

  1. Compare and understand the differences between gradient descent and mini-batch stochastic gradient descent.
  2. Compare and understand the differences and relationships between logistic regression and linear classification.
  3. Further understand the principles of SVM and practice on larger data.

Dataset

The experiment uses the a9a dataset from LIBSVM Data, which contains 32,561 training samples and 16,281 testing samples; each sample has 123 features. Please download the training set and validation set.
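The a9a files can be loaded with scikit-learn's `load_svmlight_file` (for the downloaded files you would call it with their paths, e.g. `load_svmlight_file("a9a", n_features=123)` — the filenames here are assumptions about where you saved them). The sketch below parses a tiny in-memory sample in the same LIBSVM format to show the call:

```python
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# Two toy samples in LIBSVM format: a label followed by index:value pairs.
sample = b"+1 3:1 11:1 14:1\n-1 5:1 7:1 14:1\n"

# n_features=123 matches the a9a feature dimension; X is a sparse CSR matrix.
X, y = load_svmlight_file(BytesIO(sample), n_features=123)

print(X.shape)  # (2, 123)
print(y)        # [ 1. -1.]
```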

Environment for Experiment

python3, including at least the following Python packages: sklearn, numpy, jupyter, matplotlib.
It is recommended to install anaconda3 directly, which ships with the packages above.

Time and Place

2018-10-14, 8:50-12:15 AM, B7-138 (Mingkui Tan), B7-238 (Qingyao Wu)

Submission Deadline

2018-10-28 12:00 AM

Experiment Form

Complete Independently.

Experiment Step

The experimental code and plots should be completed in Jupyter.

Logistic Regression and Batch Stochastic Gradient Descent

  1. Load the training set and validation set.
  2. Initialize the logistic regression model parameters (you can consider initializing with zeros, random numbers, or a normal distribution).
  3. Select the loss function and calculate its derivative; find more details in the PPT.
  4. Determine the batch_size, randomly sample that many examples, and calculate the gradient G of the loss function on the mini-batch.
  5. Use the SGD optimization method to update the model parameters; you are also encouraged to try the Adam optimizer.
  6. Select an appropriate threshold: mark samples whose predicted scores are greater than the threshold as positive, and the rest as negative. Predict on the validation set and compute the loss.
  7. Repeat steps 4 to 6 several times, and draw a graph of the validation loss against the number of iterations.
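The steps above can be sketched as follows. This is a minimal illustration on synthetic data (the real experiment uses a9a); the zero initialization, learning rate, batch size, iteration count, and 0.5 threshold are assumptions you should tune yourself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a9a: labels in {-1, +1}, dense features.
n, d = 1000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, X, y):
    # Average logistic loss with labels in {-1, +1}: mean log(1 + exp(-y w.x)).
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

w = np.zeros(d)                                   # step 2: zero initialization
lr, batch_size, n_iters = 0.1, 64, 200
losses = []
for t in range(n_iters):
    idx = rng.choice(n, size=batch_size, replace=False)   # step 4: mini-batch
    Xb, yb = X[idx], y[idx]
    # Gradient of the logistic loss on the mini-batch:
    # d/dw log(1 + exp(-y w.x)) = -y x * sigmoid(-y w.x)
    G = -(Xb * (yb * sigmoid(-yb * (Xb @ w)))[:, None]).mean(axis=0)
    w -= lr * G                                   # step 5: SGD update
    losses.append(logistic_loss(w, X, y))         # steps 6-7: track loss

pred = np.where(sigmoid(X @ w) > 0.5, 1, -1)      # step 6: threshold at 0.5
accuracy = (pred == y).mean()
```

Plotting `losses` against the iteration index with matplotlib then gives the curve required in step 7.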

Linear Classification and Batch Stochastic Gradient Descent

  1. Load the training set and validation set.
  2. Initialize the SVM model parameters (you can consider initializing with zeros, random numbers, or a normal distribution).
  3. Select the loss function and calculate its derivative; find more details in the PPT.
  4. Determine the batch_size, randomly sample that many examples, and calculate the gradient G of the loss function on the mini-batch.
  5. Use the SGD optimization method to update the model parameters; you are also encouraged to try the Adam optimizer.
  6. Select an appropriate threshold: mark samples whose predicted scores are greater than the threshold as positive, and the rest as negative. Predict on the validation set and compute the loss.
  7. Repeat steps 4 to 6 several times, and draw a graph of the validation loss against the number of iterations.
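A similar sketch for the linear SVM, using the L2-regularized hinge loss and its subgradient. Again, the synthetic data, hyperparameters (C, learning rate, batch size), and the zero decision threshold are illustrative assumptions, not prescribed values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a9a: labels in {-1, +1}.
n, d = 1000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def svm_loss(w, X, y, C=1.0):
    # L2-regularized average hinge loss: 0.5||w||^2 + C * mean(max(0, 1 - y w.x)).
    margins = 1.0 - y * (X @ w)
    return 0.5 * (w @ w) + C * np.maximum(0.0, margins).mean()

w = np.zeros(d)                                   # step 2: zero initialization
lr, batch_size, n_iters, C = 0.05, 64, 300, 1.0
losses = []
for t in range(n_iters):
    idx = rng.choice(n, size=batch_size, replace=False)   # step 4: mini-batch
    Xb, yb = X[idx], y[idx]
    margins = 1.0 - yb * (Xb @ w)
    active = (margins > 0).astype(float)          # subgradient is -y x where margin > 0
    G = w - C * (Xb * (yb * active)[:, None]).mean(axis=0)
    w -= lr * G                                   # step 5: SGD update
    losses.append(svm_loss(w, X, y, C))           # steps 6-7: track loss

pred = np.where(X @ w > 0, 1, -1)                 # step 6: threshold at 0
accuracy = (pred == y).mean()
```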

Finish the experiment report according to your results. The report template can be found in the example repository.

Evaluation

| Item | Proportion | Description |
| --- | --- | --- |
| Attendance | 40% | Ask for leave if there is a time conflict |
| Code availability | 20% | Compiles and runs successfully |
| Report | 30% | Follows the report template |
| Code specification | 10% | Mainly considers whether readable variable names are used |

Requirement for Submission

Submission process

1. Access 222.201.187.50:7001.
2. Click on the corresponding submission entry.
3. Fill in your name and student number, then upload your report in PDF format and your code as a zip archive.

Precautions


Any advice or ideas are welcome; feel free to discuss with the teaching assistants in the QQ group.
