I trained a model. What is next?

Kaggle Team
Sep 9, 2020 · 11 min read

Here at Kaggle we’re excited to showcase the work of our Grandmasters. This post was written by Vladimir Iglovikov, and is filled with advice that he wishes someone had shared when he was active on Kaggle. The original post can be found on Vlad’s Ternaus Blog.

Introduction

I participated in machine learning (ML) competitions at Kaggle and other platforms to build my machine learning muscles. I reached 19th place in the global ranking and earned the Kaggle Grandmaster title.

Every ML challenge ended with new knowledge, code, and model weights.

I loved the new learnings but ignored the value that old ML pipelines could bring. The code stayed in private GitHub repositories. Weights were scattered all over my hard drive. In the end, all of them were deleted.

The situation is not unique to Kaggle. The same story plays out in academia: a student trains a model and writes a paper. After the paper is accepted to a conference, the pipeline is abandoned, the training artifacts are deleted, and the student moves on.

This article talks about the small steps you can take after every ML challenge ends.

These steps will:

  • Boost technical knowledge.

As an example, I will use the repository https://github.com/ternaus/retinaface

It was not a part of a Kaggle challenge but was created to illustrate the story.

I. +5 min: Release code to the Public GitHub repository

Most likely, your code is already on GitHub, but in a private repo.

What would you lose by making it public?

There are situations when private should stay private, but in your pet project, your Kaggle solution, or your paper, it may not be the case.

The most common obstacle I have seen: people assume that all public code should be perfect and that they will be judged if it is not.

In reality, no one cares. Just do it. Release it as is, without any polishing.

Making code public is an important psychological step. Releasing imperfect code is a confident, bold move.

Besides, all later steps are based on this one.

Example: https://github.com/ternaus/retinaface

II. +20 min: Improve readability

You can improve the readability of your Python code by adding syntax formatters and checkers.

It is not hard and not time-consuming. Checkers and formatters will not transform bad code into good code, but readability will go up. Think of fixing syntax as basic hygiene: it is like brushing your teeth, but for code.

I wrote a blog post on the topic called Nine Simple Steps for Better Looking python code. Feel free to check it out.

Step 1: configuration files

Add these files to the root of your repository.

Step 2: requirements

Install the required libraries with

pip install black flake8 mypy

Step 3: black

There are countless ways to format code. Formatters like black or yapf modify the code to satisfy a pre-defined set of rules.

It is easier to read a codebase that follows a standard. When you work on code for hours, switching context between different coding styles drains "willpower energy"; there is no need for that without a good reason.

Running

black .

will reformat all python files to follow the set of rules by black.
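As a tiny illustration (a made-up snippet, not code from the repo), black turns inconsistent spacing and quoting into one canonical style:

```python
# Hand-written, inconsistently formatted dict:
before = {  'a':37,'b': 42 }

# After running `black .`, the same line is rewritten to:
after = {"a": 37, "b": 42}

# The content is identical; only the formatting changed.
assert before == after
```

The point is that the transformation is purely cosmetic, so you can run it on the whole repository without fear of changing behavior.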

Step 4: flake8

Running

flake8

will not modify the code, but will check it for syntax issues and print them to the screen.

Fix them.

Step 5: mypy

Python does not enforce static typing, but it is good practice to annotate function arguments and return values.

For example:

class MyModel(nn.Module):
    ...

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ...
        return self.final(x)

You should add typing to your code: it makes the code easier to read.

After updating the code run mypy on the whole repo:

mypy .

If mypy finds issues, fix them.
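For example, with a hypothetical helper like this, mypy can catch call-site mistakes before the code ever runs:

```python
def scale(value: float, factor: float = 2.0) -> float:
    """Multiply a number by a factor."""
    return value * factor


print(scale(3.0))  # 6.0

# A call like scale("3.0") still runs the type checker's gauntlet:
# mypy reports roughly:
#   error: Argument 1 to "scale" has incompatible type "str"; expected "float"
```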

Step 6: pre-commit hook

Running flake8, black, mypy manually all the time is annoying.

There is a tool called pre-commit that addresses the issue.

To enable it, copy this file to your repo: .pre-commit-config.yaml.
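A minimal config might look like this (the rev values below are placeholders; pin whichever hook versions are current):

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 19.10b0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/flake8
    rev: 3.8.3
    hooks:
      - id: flake8
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.782
    hooks:
      - id: mypy
```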

You need to install the pre-commit package on your machine with:

pip install pre-commit

And initialize with:

pre-commit install

You are good to go.

From now on, on every commit, it will run a set of checks and not allow the commit to pass if something is wrong.

The main difference from running black, flake8, and mypy manually is that the hook does not beg you to fix issues; it forces you to. Hence, no "willpower energy" is wasted.

Step 7: Github Actions

You added checks to the pre-commit hook, and you run them locally. But you need a second line of defense: GitHub should run these checks on every pull request.

The way to do it is to add the file .github/workflows/ci.yaml to the repo.

It contains lines like these:

- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install -r requirements.txt
    pip install black flake8 mypy
- name: Run black
  run: black --check .
- name: Run flake8
  run: flake8
- name: Run Mypy
  run: mypy retinaface

that tell GitHub what to check.
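For context, those steps sit inside a workflow file whose skeleton looks roughly like this (the job name and action versions are illustrative):

```yaml
name: CI
on: [push, pull_request]

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      # ...followed by the Install dependencies / black / flake8 / mypy steps shown above
```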

I also recommend giving up the practice of pushing code directly to the master branch.

Create a new branch, modify the code, commit, push to Github, create a pull request, and merge to master.

It is a standard in the industry, but it is exceptionally uncommon in academia and among Kagglers.

If you are not familiar with these tools, it may take more than 20 minutes to add them and fix errors and warnings.

Remember this time. In the next project, add these checks in the first commit, before any code is written. From that moment, every small commit will be checked, and you will need to fix at most a couple of lines each time: tiny overhead, excellent habit.

I would also recommend reading a book Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones. It talks about small changes in your behavior that improve productivity and the quality of your life.

III. +20 min: Create a good readme

A good readme serves two purposes:

  • For yourself: you assume that you will never use this code again, but "never say never." You will, and you will not remember what was happening here. The readme will help you.

  • For others: it tells people what the repository does, whether it solves their problem, and how to use it.

For machine learning repositories, the bare minimum is:

  • An image that shows what the task was and how you solved it. No words should be required. Most likely, after working on a problem for weeks, you have plenty of pictures; they are just not part of the readme. Fix that.

If you need a wall of words to describe how to run training or inference, it is a red flag: you need to refactor your code and make it more user-friendly. People often ask how they can become better programmers. This is an exercise that helps: you will need to rewrite your code. Try to look at your readme through someone else's eyes.

It is also a great exercise that will help develop a habit of looking at the product from the user’s point of view.

Famous: “Customers come first.”

Example: for retinaface, I wrote a wrapper around the model that hides the details of the postprocessing.

Postprocessing is straightforward but tedious, and it adds no value for people who do not care about it, so users should be able to ignore it.

The readme created at this stage will be reused later when we build the library.

IV. +20 min: Make it easy to use your trained model

I guess you write

model = MyFancyModel()
state_dict = torch.load(<path to weights>)
model.load_state_dict(state_dict)

to load pre-trained weights to the model.

It works, and the steps are clear, but it requires having the weights on disk and knowing where they are. A more elegant solution is to leverage the torch.utils.model_zoo.load_url function from PyTorch (TensorFlow and Keras have similar utilities).

We can do:

from retinaface.pre_trained_models import get_model

model = get_model("resnet50_2020-07-20", max_size=2048)

If weights are not on the disk, they are downloaded from the internet and cached on the disk. The model is initialized, and weights are loaded.

This is user friendly, and it is what you see in the torchvision and timm libraries.

Step 1: host weights of the pre-trained model

This was the biggest blocker for me: where can you host model weights if you do not want to deal with AWS or GCP?

Apparently, there is an excellent solution, I would say a loophole: you can attach weights to a release on GitHub.

The limit is 2 GB per file, which is enough for most deep learning models.

Step 2: write a function that initializes the model and loads weights

In my case:
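The actual implementation lives in the repo; a simplified sketch of the idea, with a hypothetical model class and a placeholder release URL, looks like this:

```python
from torch import nn
from torch.utils import model_zoo

# Hypothetical registry: replace the URL with the asset attached to your GitHub release.
MODEL_URLS = {
    "resnet50_2020-07-20": "https://github.com/<user>/<repo>/releases/download/v0.1/weights.pth",
}


class MyFancyModel(nn.Module):
    """Stand-in for your actual network architecture."""


def get_model(model_name: str, device: str = "cpu") -> nn.Module:
    model = MyFancyModel()
    # load_url downloads the file on first call and caches it locally,
    # so subsequent calls read from disk instead of the network.
    state_dict = model_zoo.load_url(
        MODEL_URLS[model_name], map_location=device, progress=True
    )
    model.load_state_dict(state_dict)
    return model.eval()
```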

This functionality will be leveraged when we build the Colab notebook and the web app.

V. +20 min: Make a library

In this step, you lower the barrier to entry for your model. The goal is to make predictions possible without

git clone retinaface

Step 1: add necessary dependencies to the requirements.txt

You can use

pip freeze > requirements.txt

or update it manually.

Step 2: change the file structure of the repo

Create a “main folder,” in my case, called “retinaface,” the same as the repository.

  • Move all the important code there.

Doing it manually and updating all imports would be painful. PyCharm or similar IDE will do it for you.

This is a common way to structure code in the repositories.

I hope that, in the future, you will follow this pattern from the beginning. If you want something even more structured, check out the Cookiecutter package.

Step 3: add config file

  • Add setup.py to the root of the repository, with content similar to the setup.py in the example repo.
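A bare-bones sketch of such a file, where every metadata value is a placeholder to replace with your own:

```python
from setuptools import find_packages, setup

setup(
    name="retinaface",  # placeholder: the name your package will have on PyPI
    version="0.0.1",
    description="Retinaface face detector",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    packages=find_packages(exclude=["tests"]),
    install_requires=open("requirements.txt").read().splitlines(),
)
```

Reading requirements.txt for install_requires keeps the library dependencies and the repo dependencies in one place.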

Step 4: create an account at PyPI

If you do not have an account at PyPI, it is time to create it.

Step 5: build a library and upload to PyPI

python setup.py sdist
twine upload dist/*

(The legacy python setup.py sdist upload command is deprecated; install twine first with pip install twine.)

That is it. Your repository is a library, and everyone will be able to install it with:

pip install <your_library_name>

If you check the package's page at PyPI, you will see that it uses the readme from your repo to present the project.

We will use the functionality of this step for Google Colab and for a Web App.

VI. +20 min: Create a Google Colab notebook

It is a good practice to add a Jupyter notebook to the repository to show how to initialize the model and perform inference. Example.

We can do better.

We enabled a “fancy” model initialization and the pip install magic in the previous two steps. Let’s leverage this.

We can create a Google Colab notebook.

Now, the only thing that someone needs to play with your model is a browser! More people will be able to check it out.

Example.

Do not forget to add a link to the notebook in your readme and update the version at PyPI.

VII. +20 min: Create a WebApp

Many Data Scientists assume that building a web app is a complicated procedure that requires specialized knowledge.

For complex projects, the assumption is correct: such web apps require skills that Data Scientists may not have.

But building a simple web app that demonstrates the model is easy.

I created a separate GitHub repository for the web app, but you could also keep it in the repository with the model.

A blog post that describes details: How to Deploy Streamlit on Heroku

Step 1: Add code for the app


"""Streamlit web app"""
import numpy as np
import streamlit as st
import torch
from PIL import Image
from retinaface.pre_trained_models import get_model
from retinaface.utils import vis_annotations

st.set_option("deprecation.showfileUploaderEncoding", False)


@st.cache
def cached_model():
    m = get_model("resnet50_2020-07-20", max_size=1048, device="cpu")
    m.eval()
    return m


model = cached_model()

st.title("Detect faces and key points")

uploaded_file = st.file_uploader("Choose an image...", type="jpg")

if uploaded_file is not None:
    image = np.array(Image.open(uploaded_file))
    st.image(image, caption="Before", use_column_width=True)
    st.write("")
    st.write("Detecting faces...")
    with torch.no_grad():
        annotations = model.predict_jsons(image)
    if not annotations[0]["bbox"]:
        st.write("No faces detected")
    else:
        visualized_image = vis_annotations(image, annotations)
        st.image(visualized_image, caption="After", use_column_width=True)

Less than 40 lines.

Step 2: add config files

You will need to add files:

  • setup.sh — you can use this file without changes.

  • Procfile — a one-line file that tells Heroku how to start the app.
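Following that blog post's approach, a minimal setup looks like this; the app entry point app.py is a convention you can change:

```shell
# setup.sh: write a Streamlit config so the app binds to the port Heroku assigns
mkdir -p ~/.streamlit/
echo "[server]
headless = true
port = $PORT
enableCORS = false
" > ~/.streamlit/config.toml

# Procfile (a separate one-line file, no extension) contains:
# web: sh setup.sh && streamlit run app.py
```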

Step 3: add requirements.txt

Step 4: register at Heroku

Step 5: push the code

heroku login
heroku create
git push heroku master

You are live. Check out the example at https://retinaface.herokuapp.com/

VIII. +4 hours: Write a blog post

Many people undervalue their work. They assume that if they know how to do something, everyone knows it. It is not the case.

One man’s trash is another man’s treasure.

Your article will help other people and improve your career opportunities.

I work at Lyft Level 5 and apply deep learning techniques to self-driving problems. Before Lyft, I worked at the debt collection agency TrueAccord. You can read about my job search in the blog post "Shifting Careers to Autonomous Vehicles."

One of the reasons I was able to do this career shift is because I shared my knowledge in blog posts and meetups. This attracted the attention of recruiters and hiring managers.

Worked for me — will work for you.

For machine learning, I would recommend writing the text that covers:

  • What was the problem?

If you have read this far and found this article useful, you can say "thank you!" by writing a blog post about one of the machine learning problems you faced and how you solved it.

Example:

IX. +days: Write a paper that describes your solution to the machine learning competition

Even if your paper is not a breakthrough, it can be published and have value to other people. Writing academic papers is a different skill, and you may not have it yet. Not a problem: you can collaborate with people who know how to write in an academic format.

Examples:

Just look at my Google Scholar. The bump in the last few years comes from papers summarizing participation in different machine learning competitions.


Besides, your paper will not stand alone. It comes in a package with:

  • Github repository that has clean code and a good readme.

It is not just a paper anymore; it is a cohesive story that demonstrates "ownership" and "communication." Both are crucial for your career growth, but I will talk about that next time :)
