ML-E3: Machine learning in 5 lines of Python (plus a real-world demo)
We’ll walk through an ultra-simplified example of training and predicting with a machine learning (ML) model, using a mere five lines of Python code.
We’ll then expand on this example to include steps that you’ll encounter in real-world data science projects, like data loading, train/test/validation splitting, and model saving.
Literal 5-liner
Here’s a five-line demonstration of how to train and predict with a machine learning model in Python, using the scikit-learn library:
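A minimal sketch of such a five-liner, pieced together from the steps described in the breakdown that follows. The variable names clf and predictions are assumptions chosen for illustration, and the classifier uses scikit-learn's default settings:

```python
from sklearn.datasets import load_iris  # loader for the Iris dataset
from sklearn.ensemble import RandomForestClassifier  # the model class
X, y = load_iris(return_X_y=True)  # features X and labels y as arrays
clf = RandomForestClassifier().fit(X, y)  # create the classifier and train it
predictions = clf.predict(X)  # predict labels for the same features
```

Note that this predicts on the very data the model was trained on, so it shows the mechanics of fitting and predicting rather than how well the model generalizes.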
Let’s break down what’s happening in this code:
- We import the Iris dataset from scikit-learn’s datasets module and the RandomForestClassifier from the ensemble module.
- The load_iris(return_X_y=True) function call loads the Iris dataset and returns the features X and labels y.
- We create a random forest classifier and fit it to our data using the .fit(X, y) method. Random Forest is a rock-solid ML algorithm (see below).
- Finally, we use the .predict(X) method on our trained classifier to predict the labels of our features.
Random Forest?
Random Forest is a versatile and widely used machine learning algorithm that belongs to the category of ensemble learning techniques.
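To make "ensemble" a bit more concrete: a fitted scikit-learn forest keeps its individual decision trees in the estimators_ attribute, and its prediction comes from averaging the class probabilities of those trees. Here's a rough sketch of that idea, not a full account of how the algorithm works. The settings n_estimators=10 and random_state=0 are arbitrary choices for illustration, as are the variable names:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Assumed settings: a small forest of 10 trees with a fixed seed for repeatability.
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# The fitted forest exposes its individual decision trees.
trees = clf.estimators_

# Each tree gives per-class probabilities; stack them into one array
# with shape (n_trees, n_samples, n_classes).
tree_probas = np.stack([tree.predict_proba(X) for tree in trees])

# Average across trees, then pick the most probable class for each sample.
avg_proba = tree_probas.mean(axis=0)
manual_predictions = clf.classes_[avg_proba.argmax(axis=1)]

# Fraction of samples where the hand-rolled average matches the forest itself.
agreement = (manual_predictions == clf.predict(X)).mean()
print(len(trees), agreement)
```

The point of the comparison at the end is that the forest's own .predict(X) is just the combined opinion of its trees, which is what makes it an ensemble method.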