Feature Selection Based on RFECV and Application and Optimization of Random Forest Prediction Model


SUN Jing

(School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030000, China)

Abstract: Building on a random forest prediction model, this paper proposes an RFECV-based feature selection method. The feature variables are first one-hot encoded. RFECV's built-in cross-validation then evaluates the performance of each candidate feature subset to determine the optimal number of features, while low-importance features are eliminated recursively. Experiments show that, with the random forest, the method trains and predicts faster, yields a lower mean squared error, and selects features with high accuracy.

Keywords: random forest prediction model; one-hot encoding; recursive feature elimination; cross-validation

doi:10.3969/J.ISSN.1672-7274.2024.09.039

CLC number: TP 391                 Document code: B            Article ID: 1672-7274(2024)09-0-03

0   Introduction

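With data volumes growing rapidly, more and more feature data is associated with each data object. During analysis, the influence of these features inevitably has to be computed and assessed, so that the data object is better understood and downstream processing is better served.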
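The procedure summarized in the abstract (one-hot encoding, then recursive feature elimination with cross-validation around a random forest) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available. The synthetic data, the column layout, and every hyperparameter (30 trees, 3 folds, negative MSE as the score) are hypothetical choices made for illustration only; they are not taken from the paper's experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: five numeric columns (only the first two drive the
# target) plus one categorical column whose level "a" shifts the target.
rng = np.random.default_rng(0)
n_samples = 300
numeric = rng.normal(size=(n_samples, 5))
category = rng.choice(["a", "b", "c"], size=n_samples)
shift = np.where(category == "a", 2.0, 0.0)
noise = rng.normal(scale=0.1, size=n_samples)
y = 3.0 * numeric[:, 0] - 2.0 * numeric[:, 1] + shift + noise

# Step 1: one-hot encode the categorical variable and append it to the
# numeric columns, giving 5 + 3 = 8 candidate features.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(category.reshape(-1, 1)).toarray()
X = np.hstack([numeric, encoded])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: RFECV drops the least important feature one at a time and uses
# cross-validation to score each subset size, keeping the best-scoring one.
selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=30, random_state=0),
    step=1,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",
    min_features_to_select=1,
)
selector.fit(X_train, y_train)

# Step 3: train a random forest on the selected features only and compare
# its test MSE with a forest trained on all features.
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)
model = RandomForestRegressor(n_estimators=30, random_state=0)
model.fit(X_train_sel, y_train)
mse_selected = mean_squared_error(y_test, model.predict(X_test_sel))

baseline = RandomForestRegressor(n_estimators=30, random_state=0)
baseline.fit(X_train, y_train)
mse_all = mean_squared_error(y_test, baseline.predict(X_test))

print(f"features kept: {selector.n_features_} of {X.shape[1]}")
print(f"selection mask: {selector.support_}")
print(f"test MSE, selected features: {mse_selected:.4f}")
print(f"test MSE, all features:      {mse_all:.4f}")
```

On real data the encoder and the selector would normally be fitted on the training split only, for example inside a scikit-learn `Pipeline`, so that no information from the test split leaks into feature selection. Whether the selected subset actually lowers the error depends on the data, so the two MSE values are printed for comparison and not assumed.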
