What Is Random Forest?

Sep 29, 2019

Forest. Image from http://afreeimages4you.blogspot.com

Random Forest

Random Forest is a Machine Learning model built on top of the Decision Tree. The difference is that a Random Forest trains many trees instead of one and combines their predictions, which usually makes it more accurate than a single tree. It is one of the most widely used models in Machine Learning.

Example of how the data is split across the individual trees

Similar to Bagging

Bagging also builds many trees, each trained on its own sample of the data. Its weakness is that the trees are not independent: even though the data is spread over many trees, every sample comes from the same dataset, so the trees tend to look alike. Random Forest addresses this problem with Random Sample Feature.
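To make the overlap concrete, here is a minimal sketch of the sampling step in Bagging, assuming each tree receives a bootstrap sample (rows drawn with replacement). The helper name `bootstrap_sample`, the row count, and the seed are illustrative assumptions, not part of the original material.

```python
import random


def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_rows = 1000           # hypothetical dataset size

sample_a = bootstrap_sample(n_rows, rng)  # rows given to tree A
sample_b = bootstrap_sample(n_rows, rng)  # rows given to tree B

# Fraction of the original rows that appear at least once in sample A
unique_a = len(set(sample_a)) / n_rows
# Fraction of the original rows that appear in both samples
shared = len(set(sample_a) & set(sample_b)) / n_rows

print(f"distinct rows in one sample: {unique_a:.2f}")
print(f"rows shared by two samples:  {shared:.2f}")
```

A bootstrap sample contains roughly 63% of the distinct original rows, so any two trees share a large part of their training data. That shared data is where the dependence between bagged trees comes from.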

Random Sample Feature

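In addition to building many trees, Random Forest also samples the features: each tree works with a random subset of the features, so no two trees see exactly the same set (in the standard algorithm a fresh subset is drawn at every split). This makes the trees more diverse and more independent of one another.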
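The feature sampling described above can be sketched in a few lines. This is only an illustration of drawing a random feature subset, not a full tree builder. The helper `sample_features` and the numbers used are hypothetical.

```python
import random


def sample_features(n_features, max_features, rng):
    """Pick max_features distinct feature indices, without replacement."""
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_features = 9           # hypothetical number of columns
max_features = 3         # hypothetical subset size, e.g. sqrt(9)

# Each candidate split draws its own subset of features to choose from
subsets = [sample_features(n_features, max_features, rng) for _ in range(4)]
for i, subset in enumerate(subsets):
    print(f"split {i}: candidate features {subset}")
```

Because every draw is independent, different trees end up splitting on different features, which is what reduces the similarity between them.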

Example of a Random Forest

Random Forest Error Rate

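The error rate of a Random Forest depends on two things:
- Correlation between the trees: the higher the correlation, the worse the forest performs
- Strength of the single trees: the stronger each tree, the better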
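These two factors appear together in a well-known upper bound from Breiman's 2001 paper "Random Forests". It is quoted here as background and is not taken from the course material referenced in this post. In the notation below, PE* is the generalization error of the forest, ρ̄ is the mean correlation between trees, and s is the strength of the individual trees (assumed to be greater than 0).

```latex
PE^{*} \le \frac{\bar{\rho}\,\left(1 - s^{2}\right)}{s^{2}}
```

The bound shrinks when the correlation ρ̄ goes down and when the strength s goes up, which matches the two points above.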

Increasing Number of Features

This means raising the number of features (max_features) considered at every split.

Max Features: the maximum number of features

Drawback: a larger max_features increases the correlation between the trees (which we want to reduce).
Benefit: a larger max_features makes each single tree stronger, because every split can choose from more features and separate the data more cleanly.
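The trade-off above can be checked empirically. The sketch below uses scikit-learn on synthetic data and compares a very small and a very large max_features. The dataset, the helper `tree_stats`, and the use of mean single-tree accuracy as a stand-in for tree strength are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data: 20 features, only 5 of them informative
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)


def tree_stats(max_features):
    """Return (mean single-tree accuracy, mean pairwise tree correlation)."""
    forest = RandomForestClassifier(
        n_estimators=50, max_features=max_features, random_state=0
    )
    forest.fit(X_train, y_train)
    # One row of held-out predictions per tree
    preds = np.array([tree.predict(X_test) for tree in forest.estimators_])
    accuracy = float(np.mean(preds == y_test))
    corr = np.corrcoef(preds)  # tree-by-tree correlation matrix
    pairwise = corr[np.triu_indices(len(preds), k=1)]
    return accuracy, float(pairwise.mean())


stats = {m: tree_stats(m) for m in (1, 20)}
for m, (accuracy, correlation) in stats.items():
    print(f"max_features={m:2d}  tree accuracy={accuracy:.3f}  "
          f"tree correlation={correlation:.3f}")
```

On data like this one would expect both numbers to rise with max_features, but the exact values depend on the dataset and the seed. A good setting sits between the two extremes.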

Tuning Random Forests

Set the parameter max_features:
- For classification, use sqrt(n_features)
- For regression, use n_features / 3
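As a quick worked example of the two rules of thumb above, the sketch below computes both values for a dataset with 30 features. The helper name `suggested_max_features` is hypothetical, and rounding down to a whole number is an assumption, since the rules themselves do not say how to round.

```python
import math


def suggested_max_features(n_features, task):
    """Rule-of-thumb starting value for max_features (rounded down)."""
    if task == "classification":
        return max(1, int(math.sqrt(n_features)))
    if task == "regression":
        return max(1, n_features // 3)
    raise ValueError(f"unknown task: {task!r}")


print(suggested_max_features(30, "classification"))  # 5
print(suggested_max_features(30, "regression"))      # 10
```

These are starting points for tuning, not fixed answers. Trying a few values around them is usually worthwhile.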

Set n_estimators > 100 (the number of trees)
- The scikit-learn default was 10 before version 0.22 and is 100 from version 0.22 onward, so set it explicitly on older versions
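Put together, the tuning advice above might look like the following in scikit-learn. This is a minimal sketch on synthetic data. The dataset shape, the value 200 for n_estimators, and the 5-fold cross-validation are illustrative choices, not recommendations from the course material.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic classification data standing in for a real dataset
X, y = make_classification(
    n_samples=500, n_features=16, n_informative=6, random_state=0
)

forest = RandomForestClassifier(
    n_estimators=200,     # more than 100 trees, set explicitly
    max_features="sqrt",  # sqrt(n_features) candidate features per split
    random_state=0,       # fixed seed so the run is repeatable
)

# 5-fold cross-validated accuracy for this configuration
scores = cross_val_score(forest, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

For a regression task the same pattern applies with `RandomForestRegressor`, passing max_features as an integer or a fraction of the features.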

References
IT554 Pattern Recognition and Machine Learning, SWU

Written by PradyaSin
