How redBus uses Scikit-Learn ML models to classify customer complaints

Background : redBus receives complaints from customers, mainly about refunds, cancellations, operator queries and so on. Classifying these emails and routing them to the right customer service agent or queue is a humongous task, and the volume grows further on peak days (especially holidays and weekends).

To automate this process (classifying and tagging), redBus took the machine learning route. We had years' worth of hand-tagged data, which served as the knowledge base (training set) for the machine learning system. We decided to build a system that would classify each email into one category from a set of predefined categories. Our CRM system (we use Salesforce) would then automatically redirect incoming emails to the appropriate customer care executive based on the classification done by the ML system.

System overview

The Tech :

ML Library : Scikit-Learn

Algorithm used for classification : Multinomial Naive Bayes

Wrapper (for exposing classification as an API) : Flask

Deployment Server : Linux

The input text is vectorized with a TF-IDF vectorizer, and a Multinomial Naive Bayes model is trained on the resulting vectors to do the classification.
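As a rough illustration of that combination, here is a minimal sketch of training a TF-IDF vectorizer and a Multinomial Naive Bayes model with Scikit-Learn. The sample emails and category names are made up for the example; the real training set was the hand-tagged CRM data, and the actual preprocessing and parameters are not shown in this post.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical hand-tagged emails, invented for this sketch.
emails = [
    "Please refund the amount for my ticket",
    "I have not received my refund yet",
    "I want to cancel my booking for tomorrow",
    "Please cancel the ticket I booked today",
    "The bus operator did not show up at the boarding point",
    "The operator changed the bus without informing us",
]
labels = ["Refund", "Refund", "Cancellation", "Cancellation", "Operator", "Operator"]

# Turn each email into a TF-IDF weighted bag-of-words vector.
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(emails)

# Train the Multinomial Naive Bayes classifier on those vectors.
model = MultinomialNB()
model.fit(features, labels)

# New text must go through the same fitted vectorizer before prediction.
prediction = model.predict(vectorizer.transform(["When will I get my refund"]))
print(prediction[0])
```

Note that the vectorizer is fitted once on the training data and then reused as-is for every new email, which is why both the model and the vectorizer have to be exported later.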

Post-training analysis:

Originally 7 categories were considered, and the accuracy hovered around 70%. Because only 4 of those categories were to be considered going forward, the others were dropped and the system was retrained on those 4 alone. The accuracy rose to 82%. While studying the accuracy, we found that 2 of the categories had more than 90% accuracy while the other 2 were lower. This was primarily due to feature overlap: more often than not, the cancellation of a ticket or bus and a refund go together, so emails about either tend to share the same words.
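A per-category breakdown like the one described above can be computed with Scikit-Learn's metrics. The sketch below uses invented labels and predictions purely to show the mechanics; the numbers are not redBus results, and the category names "Operator" and "Other" are placeholders.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical categories and hand-made predictions for illustration only.
categories = ["Cancellation", "Refund", "Operator", "Other"]
y_true = ["Cancellation"] * 4 + ["Refund"] * 4 + ["Operator"] * 4 + ["Other"] * 4
y_pred = (
    ["Cancellation", "Cancellation", "Refund", "Refund"]
    + ["Refund", "Refund", "Cancellation", "Refund"]
    + ["Operator"] * 4
    + ["Other"] * 4
)

# Overall accuracy across all categories.
overall = accuracy_score(y_true, y_pred)

# Recall per category: the share of each category's emails that were tagged correctly.
per_category = recall_score(y_true, y_pred, labels=categories, average=None)

# Rows are true categories, columns are predicted categories.
matrix = confusion_matrix(y_true, y_pred, labels=categories)

print(overall)
for name, score in zip(categories, per_category):
    print(name, score)
print(matrix)
```

In a confusion matrix of this kind, overlap between two categories shows up as large off-diagonal counts in the cells where their rows and columns meet.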

End product : The classification model is exposed through an API. The API was integrated into the Salesforce system, which calls it for every incoming email.

By Product :

  1. The customer care executive who looks into an issue can verify whether the classifier has tagged the email correctly and give feedback on it. This feedback serves as additional training input and should push the accuracy up further.
  2. While studying the accuracy, we found certain documents on which the system did a better job than the original hand tagging (the mismatch was a result of human error). This report was sent to the Customer Care team as an additional input for their internal training (a cross validation of the manual tags).


The steps for deploying the models behind a REST API are as follows:

  1. Export the models [Joblib, Pickle]
  2. A Python script to load the exported models and classify [Joblib, Pickle]
  3. A Python web app to serve requests and responses [Flask]
  4. A WSGI container to serve the app
  5. Supervisor to get the server process up and keep it running

Let's look at the steps in a bit more detail. I will assume that you have already experimented with building the models.

Step 1 : Exporting Models: Export the model and vectorizer as follows

Export models with Joblib, Pickle
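A minimal sketch of the export step could look like the following. It trains a tiny stand-in model so that it runs on its own; the file names `classifier.joblib` and `vectorizer.pkl` are hypothetical, and Joblib is used for the model and Pickle for the vectorizer only to show both options.

```python
import pickle
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set so the sketch runs end to end.
texts = ["refund my money", "cancel my ticket"]
labels = ["Refund", "Cancellation"]

vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

# Hypothetical output file names.
MODEL_PATH = Path("classifier.joblib")
VECTORIZER_PATH = Path("vectorizer.pkl")

# Persist the trained classifier with Joblib.
joblib.dump(model, MODEL_PATH)

# Persist the fitted vectorizer with Pickle.
with open(VECTORIZER_PATH, "wb") as handle:
    pickle.dump(vectorizer, handle)
```

Both artifacts are needed at serving time, because the classifier only understands vectors produced by the exact vocabulary it was trained with.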

Step 2 : Create a script to load the models

Load models to memory
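Loading could be sketched as below, assuming the model was exported with Joblib and the vectorizer with Pickle. The function name `load_artifacts` and the default file names are hypothetical; use whichever names were chosen when exporting in Step 1.

```python
import pickle
from pathlib import Path

import joblib

# Hypothetical file names; they must match the names used when exporting.
MODEL_PATH = Path("classifier.joblib")
VECTORIZER_PATH = Path("vectorizer.pkl")


def load_artifacts(model_path=MODEL_PATH, vectorizer_path=VECTORIZER_PATH):
    """Load the trained classifier and its fitted vectorizer from disk."""
    model = joblib.load(model_path)
    with open(vectorizer_path, "rb") as handle:
        vectorizer = pickle.load(handle)
    return model, vectorizer
```

Calling `load_artifacts()` once when the process starts keeps both objects in memory, so individual requests do not pay the cost of reading from disk. Only load Pickle or Joblib files that you created yourself, since unpickling can execute arbitrary code.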

Step 3 : Add a python function to actually do the classification given an input text

Classify input text
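The classification function itself can be very small. In this sketch the function name `classify` and its signature are assumptions, and a tiny stand-in model is trained inline only so that the example runs by itself; in the deployed service the model and vectorizer would be the ones loaded in Step 2.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def classify(input_text, model, vectorizer):
    """Return the predicted category for a single piece of text."""
    # The vectorizer expects an iterable of documents, hence the list.
    features = vectorizer.transform([input_text])
    return str(model.predict(features)[0])


# Stand-in model trained on made-up data, for demonstration only.
texts = ["refund my money", "cancel my ticket"]
labels = ["Refund", "Cancellation"]
vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

print(classify("please cancel my ticket", model, vectorizer))
```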

Step 4 : Introduce Flask to serve classification through an API end point

Wrap the classification function with Flask so that it can receive requests and respond with the prediction. Flask is a lightweight web framework for Python.

Pour the models into FLASK
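One way to wire this up is sketched below. The `/classifyText` route and the `input_text` and `classified_output` field names come from the request and response shown later in this post; the `create_app` factory, the error message and the 400 status for a missing field are assumptions added for the example.

```python
from flask import Flask, jsonify, request


def create_app(model, vectorizer):
    """Build a Flask app that serves predictions from an already loaded model."""
    app = Flask(__name__)

    @app.route("/classifyText", methods=["POST"])
    def classify_text():
        # Parse the JSON body; fall back to an empty dict if it is missing or invalid.
        payload = request.get_json(silent=True) or {}
        input_text = payload.get("input_text")
        if not isinstance(input_text, str) or not input_text.strip():
            return jsonify({"error": "input_text is required"}), 400

        # Vectorize with the fitted vectorizer, then predict the category.
        features = vectorizer.transform([input_text])
        label = model.predict(features)[0]
        return jsonify({"classified_output": str(label)})

    return app
```

The module that the WSGI server imports would then build the application once at import time, for example by passing the model and vectorizer loaded in Step 2 to `create_app` and assigning the result to a module-level `app` variable.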

This returns a prediction when it is called with an HTTP POST request.

Step 5: WSGI

Use WSGI to serve the Flask app

The WSGI Python file imports the application object from the Flask app's module so that the WSGI server has a single entry point to load. Place the file in the same directory as the Flask app's Python file.

WSGI Container

Use Gunicorn and Supervisor to get the server up and running

Create a Supervisor config at /etc/supervisor/conf.d/your_config.conf

Supervisor config
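A Supervisor program section for this setup might look like the sketch below. Everything specific in it is an assumption: the paths, the port, the log file locations, and the `wsgi:app` target, which presumes the WSGI file is named `wsgi.py` and exposes a variable called `app`. The program name matches the `your_program_name` placeholder used in the commands that follow.

```ini
[program:your_program_name]
; Hypothetical paths: point these at your own virtualenv and project directory.
command=/path/to/venv/bin/gunicorn --bind 0.0.0.0:8000 wsgi:app
directory=/path/to/your/app
autostart=true
autorestart=true
stdout_logfile=/var/log/your_program_name.out.log
stderr_logfile=/var/log/your_program_name.err.log
```

With `autorestart=true`, Supervisor brings the Gunicorn process back up if it exits unexpectedly.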

Execute the following commands

$ sudo supervisorctl reread
$ sudo supervisorctl update
$ sudo supervisorctl start your_program_name

Access it by sending a POST request with a JSON body to http://ip:port/classifyText.

Request Body : {"input_text": "The team scored more goals. But lost in the end"}

Response : {"classified_output" : "Sport"}
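For completeness, here is a small client sketch that uses only the Python standard library. The helper names and the default base URL are hypothetical; replace the host and port with those of your own server.

```python
import json
import urllib.request


def build_classify_request(input_text, base_url="http://127.0.0.1:8000"):
    """Build the POST request for the /classifyText endpoint."""
    body = json.dumps({"input_text": input_text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/classifyText",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_remote(input_text, base_url="http://127.0.0.1:8000"):
    """Send the text to a running service and return the predicted category."""
    req = build_classify_request(input_text, base_url)
    with urllib.request.urlopen(req, timeout=5) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["classified_output"]
```

`classify_remote` needs the service to be up and reachable, while `build_classify_request` can be inspected offline to check the URL and body being sent.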

This is how we deployed the ML model to serve real-time use cases.

More to do:

  1. Try Word2Vec to see if it improves the accuracy
  2. Solve the class imbalance problem, where a few categories have very sparse features or far fewer training documents than the other categories