@Macux 2015-12-01T06:52:13.000000Z

R Language: RandomForest

R Language: Study Notes


1. Preparation:

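    library(randomForest)
    # stringsAsFactors = TRUE keeps y a factor on R >= 4.0, which randomForest needs for classification
    bank1 <- read.csv("bank-full.csv", stringsAsFactors = TRUE)
    # Split the rows by class label
    book.sample1 <- subset(bank1, subset = (y == "yes"))
    book.sample2 <- subset(bank1, select = c(1:17), subset = (y == "no"))
    # Draw 200 rows from each class without replacement to get a balanced sample
    bank.sample1 <- book.sample1[sample(1:nrow(book.sample1), 200, replace = FALSE), ]
    bank.sample2 <- book.sample2[sample(1:nrow(book.sample2), 200, replace = FALSE), ]
    # Stack the two class samples into one 400-row training set
    sample <- rbind(bank.sample1, bank.sample2)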
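The per-class sampling above can also be written as a small reusable helper. This is a minimal sketch rather than the code used in these notes: `balanced_sample` is a hypothetical helper name, and the toy data frame is an assumed stand-in for bank-full.csv so that the block runs on its own.

```r
# Hypothetical helper: draw n rows per class of `label`, without replacement.
balanced_sample <- function(df, label, n) {
  # Row indices grouped by class level
  idx_by_class <- split(seq_len(nrow(df)), df[[label]])
  # Pick n indices inside each class
  picked <- lapply(idx_by_class, function(idx) idx[sample.int(length(idx), n)])
  df[unlist(picked, use.names = FALSE), , drop = FALSE]
}

# Toy stand-in for the bank data (assumption: an imbalanced two-class label y)
set.seed(111)
toy <- data.frame(
  age = sample(18:90, 1000, replace = TRUE),
  y   = factor(c(rep("no", 880), rep("yes", 120)))
)

balanced <- balanced_sample(toy, "y", 100)
table(balanced$y)
```

Calling set.seed before the sampling step, as in this sketch, makes the drawn rows reproducible from run to run.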

2. Build the random forest model:

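    # Fix the seed so the forest is reproducible
    set.seed(111)
    # importance = TRUE stores variable importance; proximity = TRUE stores the row proximity matrix
    bank.rf <- randomForest(y ~ ., data = sample, importance = TRUE, proximity = TRUE, ntree = 1000)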
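The call leaves mtry, the number of variables tried at each split, at its default. For classification, randomForest uses floor(sqrt(p)), where p is the number of predictors; for regression it uses max(floor(p / 3), 1). The arithmetic can be checked with a short sketch, assuming the 17 selected columns are 16 predictors plus the response y. The result agrees with the value of 4 shown in the model printout in the next section.

```r
# Assumption: 17 columns in the data, one of which is the response y
p <- 17 - 1

# Default mtry for a classification forest
mtry_classification <- floor(sqrt(p))

# Default mtry for a regression forest, shown for comparison
mtry_regression <- max(floor(p / 3), 1)

mtry_classification
mtry_regression
```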

3. Print the confusion matrix:

    > bank.rf
    Call:
     randomForest(formula = y ~ ., data = sample, importance = TRUE, proximity = TRUE, ntree = 1000)
                   Type of random forest: classification
                         Number of trees: 1000
    No. of variables tried at each split: 4
            OOB estimate of  error rate: 21.75%
    Confusion matrix:
         no yes class.error
    no  151  49       0.245
    yes  38 162       0.190
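The error figures in the printout follow directly from the confusion matrix, whose rows are the true classes and whose columns are the out-of-bag predictions. As a small worked check (the matrix is retyped by hand from the output above; the variable names are only illustrative):

```r
# Confusion matrix copied from the printed model: rows = actual, columns = predicted
cm <- matrix(c(151,  49,
                38, 162),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual = c("no", "yes"), predicted = c("no", "yes")))

# Per-class error: share of each actual class that was misclassified
class_error <- 1 - diag(cm) / rowSums(cm)

# Overall OOB error: share of all 400 rows that were misclassified
oob_error <- 1 - sum(diag(cm)) / sum(cm)

round(class_error, 3)
round(oob_error, 4)
```

The per-class errors come out as 49/200 = 0.245 for "no" and 38/200 = 0.190 for "yes", and the overall error as 87/400 = 21.75%.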

4. Plot the importance of each variable:

    # bank.rf is the model fitted in step 2 (no bank.rf2 is defined in these notes)
    varImpPlot(bank.rf)

[Figure: variable importance plot produced by varImpPlot]
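The numbers behind the plot can also be read as a table with importance(). The following is a minimal sketch, not part of the original notes: it uses the built-in iris data as an assumed stand-in so that it runs without bank-full.csv, and the object names are illustrative.

```r
library(randomForest)

# iris is a stand-in data set here (assumption), so the block is self-contained
set.seed(111)
iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE, ntree = 500)

# type = 1: mean decrease in accuracy (permutation based)
# type = 2: mean decrease in Gini impurity
imp_acc  <- importance(iris.rf, type = 1)
imp_gini <- importance(iris.rf, type = 2)

# Rank the variables from most to least important by the accuracy measure
ranking <- imp_acc[order(imp_acc[, 1], decreasing = TRUE), , drop = FALSE]
ranking
```

The accuracy-based measure is only available when the forest is fitted with importance = TRUE, as it is in step 2.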
