Origin of wine part 7

Nelson Punch

Published in

Software-Dev-Explore

2 min read · Nov 2, 2023

Introduction

Training the model is straightforward. Here I create a new pipeline and use it to train the model.

Later I will evaluate the model to see how it performs, and then save it.

Code

Notebook with code

Training the model

Pipeline and Training

The NLPTransformer is included in this pipeline. The training data flows through NLPTransformer, followed by TfidfVectorizer, and ends with LinearSVC.

The model is trained with the best parameters found earlier.
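As a rough sketch of what such a pipeline could look like: the NLPTransformer below is only a hypothetical stand-in (it lowercases the text and strips non-letters, while the real one may do more), the descriptions and cluster labels are made up, and the parameter values are placeholders rather than the actual best parameters.

```python
import re

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


class NLPTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the article's NLPTransformer.

    It only lowercases text and replaces non-letter characters with spaces.
    """

    def fit(self, X, y=None):
        return self  # nothing to learn from the data

    def transform(self, X):
        return [re.sub(r"[^a-z\s]", " ", text.lower()) for text in X]


# Made-up training data: wine descriptions and their cluster labels.
descriptions = [
    "Ripe cherry and plum with soft tannins",
    "Dark cherry, plum and a hint of oak",
    "Crisp lemon and green apple with bright acidity",
    "Zesty lemon, lime and fresh apple notes",
]
cluster_labels = [0, 0, 1, 1]

# Text cleaning -> TF-IDF features -> linear SVM classifier.
pipeline = Pipeline([
    ("nlp", NLPTransformer()),
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # placeholder parameter
    ("svc", LinearSVC(C=1.0)),                       # placeholder parameter
])
pipeline.fit(descriptions, cluster_labels)
```

Keeping the text cleaning inside the pipeline means the same preprocessing is applied automatically whenever the model predicts on a new description.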

Metrics

To see how the trained model performs, I can use accuracy_score, roc_auc_score and classification_report from Scikit-Learn.
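Here is a minimal sketch of how these three functions are called. The labels and scores are made-up values for a two-cluster case, not results from the wine model. Note that roc_auc_score needs scores rather than hard labels (for LinearSVC these could come from decision_function), and with more than two clusters it expects class probabilities, so the notebook may handle this differently.

```python
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score

# Made-up values for illustration only.
y_true = [0, 0, 1, 1]             # actual cluster labels
y_score = [-1.2, 0.6, 0.4, 1.5]   # e.g. output of decision_function
y_pred = [1 if s > 0 else 0 for s in y_score]  # labels implied by the scores

accuracy = accuracy_score(y_true, y_pred)    # fraction of correct labels
auc = roc_auc_score(y_true, y_score)         # ranking quality of the scores
report = classification_report(y_true, y_pred)  # precision/recall/F1 per class

print(f"accuracy: {accuracy:.2f}")
print(f"roc auc:  {auc:.2f}")
print(report)
```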

Save and test model

Save the model

According to the Scikit-Learn documentation, I can use Python's pickle module to save the model.

Here I save the dataframe and pipeline into a pickle file.
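One way this could look, as a sketch: the dataframe contents, the dictionary keys and the file name wine_model.pkl are all assumptions for illustration, and the pipeline is a shortened version without NLPTransformer.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up data standing in for the wine dataframe.
df = pd.DataFrame({
    "description": ["ripe cherry and plum", "crisp lemon and apple"],
    "cluster": [0, 1],
})
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svc", LinearSVC())])
pipeline.fit(df["description"], df["cluster"])

# Hypothetical file name; written to the temp directory for this example.
model_path = Path(tempfile.gettempdir()) / "wine_model.pkl"

# Bundle both objects in one dict so a single file holds everything.
with open(model_path, "wb") as f:
    pickle.dump({"dataframe": df, "pipeline": pipeline}, f)
```

Storing the dataframe next to the pipeline keeps the cluster assignments and the model that predicts them in one place.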

Load and test model

I load the pickle file, retrieve the pipeline, and then test the model with a description. The model takes the description and produces a cluster label, which I can use to find all the wines in that cluster.
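The load-and-lookup step might be sketched like this. The data, column names and dictionary keys are made up, and an in-memory buffer stands in for the pickle file so the example runs on its own. With a real file, the same pickle.load call would receive a file opened in "rb" mode.

```python
import io
import pickle

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Setup with made-up data: a tiny bundle standing in for the saved file.
df = pd.DataFrame({
    "title": ["Red A", "Red B", "White A", "White B"],
    "description": [
        "ripe cherry and plum with soft tannins",
        "dark cherry plum and a hint of oak",
        "crisp lemon and green apple with bright acidity",
        "zesty lemon lime and fresh apple notes",
    ],
    "cluster": [0, 0, 1, 1],
})
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svc", LinearSVC())])
pipeline.fit(df["description"], df["cluster"])

buffer = io.BytesIO()  # stands in for a file on disk
pickle.dump({"dataframe": df, "pipeline": pipeline}, buffer)
buffer.seek(0)

# Load the bundle and pull out both objects.
bundle = pickle.load(buffer)
loaded_df, loaded_pipeline = bundle["dataframe"], bundle["pipeline"]

# Predict a cluster label for a new description,
# then list every wine that belongs to that cluster.
query = "fresh lemon and apple"
label = loaded_pipeline.predict([query])[0]
matches = loaded_df[loaded_df["cluster"] == label]
print(label, matches["title"].tolist())
```

Pickle files can execute arbitrary code when loaded, so only load files that come from a trusted source.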