Origin of wine part 7
Introduction
Training the model is straightforward. Here I will create a new pipeline for training the model.
Later I will evaluate the model to see how it performs, and then save it.
Code
Training the model
Pipeline and Training
The NLPTransformer is included in this pipeline. The training data flows through NLPTransformer, followed by TfidfVectorizer, and ends with LinearSVC.
The model is then trained with the best parameters.
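The three-step flow described above can be sketched as follows. This is a minimal illustration, not the actual notebook code: the NLPTransformer here is a hypothetical stand-in that only lowercases text, the four descriptions and their cluster labels are made up, and C=1.0 is a placeholder rather than the tuned value.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


class NLPTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the real NLPTransformer: it only lowercases text."""

    def fit(self, X, y=None):
        # Nothing to learn in this simplified version.
        return self

    def transform(self, X):
        return [str(doc).lower() for doc in X]


# Made-up toy data: wine descriptions and the cluster label of each one.
descriptions = [
    "Ripe cherry and plum with soft tannins",
    "Dark cherry, plum and a hint of oak",
    "Crisp citrus and green apple, very fresh",
    "Fresh lemon, green apple and bright acidity",
]
clusters = [0, 0, 1, 1]

# Text cleaning -> TF-IDF features -> linear SVM classifier.
pipeline = Pipeline([
    ("nlp", NLPTransformer()),
    ("tfidf", TfidfVectorizer()),
    ("svc", LinearSVC(C=1.0)),  # placeholder value, not the tuned parameter
])
pipeline.fit(descriptions, clusters)

predictions = pipeline.predict(descriptions)
print(predictions)
```

Because every step lives inside one Pipeline object, the same cleaning and vectorizing is applied automatically whenever predict is called on new text.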
Metrics
To see how the trained model performs, I can use accuracy_score, roc_auc_score and classification_report from Scikit-Learn.
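As a rough sketch of how these three functions are called, here is a small two-class example. The labels and scores below are invented for illustration and are not results from the wine model; y_score stands for the kind of values that decision_function returns on a LinearSVC.

```python
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score

# Invented values for illustration only.
y_true = [0, 0, 1, 1, 1, 0]                    # true labels
y_pred = [0, 1, 1, 1, 0, 0]                    # predicted labels
y_score = [-1.2, 0.3, 0.8, 1.5, -0.1, -0.7]    # decision scores for class 1

accuracy = accuracy_score(y_true, y_pred)      # share of correct predictions
auc = roc_auc_score(y_true, y_score)           # ranking quality of the scores
report = classification_report(y_true, y_pred) # precision, recall, F1 per class

print(f"accuracy: {accuracy:.3f}")
print(f"ROC AUC:  {auc:.3f}")
print(report)
```

With more than two clusters, roc_auc_score also needs its multi_class argument and per-class scores that sum to 1, which LinearSVC does not produce directly, so the multi-class case takes some extra work.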
Save and test model
Save the model
According to the Scikit-Learn documentation, I can use Python's pickle module to save the model.
Here I save the dataframe and the pipeline into a single pickle file.
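One way to store both objects in one file is to put them in a dictionary and pickle that. The sketch below assumes a tiny made-up dataframe, a simplified pipeline without the custom transformer, and a hypothetical file name wine_model.pkl written to a temporary folder.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up stand-in for the wine dataframe with its cluster labels.
df = pd.DataFrame({
    "title": ["Wine A", "Wine B", "Wine C", "Wine D"],
    "description": [
        "ripe cherry and plum with soft tannins",
        "dark cherry, plum and a hint of oak",
        "crisp citrus and green apple, very fresh",
        "fresh lemon, green apple and bright acidity",
    ],
    "cluster": [0, 0, 1, 1],
})

# Simplified pipeline (the custom NLPTransformer step is left out here).
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svc", LinearSVC())])
pipeline.fit(df["description"], df["cluster"])

# Hypothetical file name, written to a temporary directory.
model_path = Path(tempfile.mkdtemp()) / "wine_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump({"dataframe": df, "pipeline": pipeline}, f)

print(f"saved to {model_path}")
```

Pickle stores a reference to each class rather than its code, so a custom class such as NLPTransformer has to be importable in whatever program loads the file later.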
Load and test model
I load the pickle file, retrieve the pipeline, and then test the model with a description. The model takes the description and produces a cluster label, which I can use to find all the wines in that cluster.
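The round trip can be sketched as below: load the file, predict a cluster for a new description, then filter the dataframe by that cluster. The data, the simplified pipeline, the file name and the query text are all made up so that the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Setup: build and save a tiny made-up model so there is a file to load.
df = pd.DataFrame({
    "title": ["Wine A", "Wine B", "Wine C", "Wine D"],
    "description": [
        "ripe cherry and plum with soft tannins",
        "dark cherry, plum and a hint of oak",
        "crisp citrus and green apple, very fresh",
        "fresh lemon, green apple and bright acidity",
    ],
    "cluster": [0, 0, 1, 1],
})
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svc", LinearSVC())])
pipeline.fit(df["description"], df["cluster"])
model_path = Path(tempfile.mkdtemp()) / "wine_model.pkl"  # hypothetical name
with open(model_path, "wb") as f:
    pickle.dump({"dataframe": df, "pipeline": pipeline}, f)

# Load the pickle file and retrieve both objects.
with open(model_path, "rb") as f:
    saved = pickle.load(f)
loaded_df = saved["dataframe"]
loaded_pipeline = saved["pipeline"]

# Predict the cluster for a new description (predict expects a list of texts).
query = "fresh lemon and green apple"
label = loaded_pipeline.predict([query])[0]

# All wines that share the predicted cluster.
matches = loaded_df[loaded_df["cluster"] == label]
print(f"predicted cluster: {label}")
print(matches[["title", "description"]])
```

Only load pickle files that come from a source you trust, since unpickling can execute arbitrary code.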
Conclusion
Finally, I have a model that can suggest a group of wines based on my description.
Next
To turn my model into an application and share it with others, I can use Streamlit.