MLOps-CI/CD for ML Projects

Sarang Mete
Nov 8, 2022


CI: Continuous Integration. CD: Continuous Deployment/Delivery.


CI (Continuous Integration):

Check in code almost daily, or after every small change, and run automated tasks such as linting and testing on every commit.

CD (Continuous Deployment/Delivery):

If every check passes on a commit, deploy it to production. Deployment runs steps such as copying the code, activating the environment, and running the main file.

If you don’t want to use a deployment service, you can simply copy the code to production once CI has passed and run a shell script that creates and activates the environment and runs main.py, or start a Docker container instead.
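
The manual route can be sketched as a small Python script that runs the deployment steps in order and stops at the first failure. This is only an illustration: the names environment.yml, my_env and main.py are placeholders, and it assumes conda is installed on the production machine.

```python
import subprocess


def deployment_steps(env_file="environment.yml", env_name="my_env", entry_point="main.py"):
    """Return the shell commands for a manual deployment, in order.

    All three default names are placeholders for illustration.
    """
    return [
        # Create the conda environment described in the environment file.
        f"conda env create -f {env_file}",
        # Run the entry point inside that environment.
        f"conda run -n {env_name} python {entry_point}",
    ]


def deploy(steps, dry_run=True):
    """Run each step in order and stop at the first one that fails.

    With dry_run=True the commands are only printed, which is handy for
    checking the plan before touching a production machine.
    """
    for step in steps:
        print(f"$ {step}")
        if dry_run:
            continue
        result = subprocess.run(step, shell=True)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling deploy(deployment_steps()) prints the plan without executing anything; pass dry_run=False to actually run it.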

Third-party services such as CircleCI, Jenkins, TeamCity, and GitHub Actions provide a separate server that builds your code, tests it for issues, and then deploys it to a server or publishes it to PyPI.

Python is an interpreted language, so ‘building’ a Python project does not mean compiling it; it means running tests and similar checks.
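
One way to see why the ‘build’ has to execute code: Python resolves names only when a line actually runs, so a typo inside a function loads without complaint and surfaces only when the function is called. The snippet below is a small sketch of that behaviour; the area function and its typo are made up for the example.

```python
def first_runtime_error(source, call):
    """Load `source`, evaluate `call`, and return the error's class name.

    Returns None when the call succeeds.
    """
    namespace = {}
    # Loading succeeds even if a function body contains a typo, because
    # names are looked up only when the line is executed.
    exec(compile(source, "<example>", "exec"), namespace)
    try:
        eval(call, namespace)
    except Exception as error:
        return type(error).__name__
    return None


# "hieght" is a deliberate typo for "height".
BUGGY = "def area(width, height):\n    return width * hieght\n"
FIXED = "def area(width, height):\n    return width * height\n"
```

Here first_runtime_error(BUGGY, "area(2, 3)") reports a NameError that only shows up on the call, which is exactly the kind of problem a test run catches.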

In the example here, we’ll use pylint for linting, pytest for unit testing, pytest-cov for test coverage, and CircleCI for CI/CD.

How to make sure code is linted?

1. Specify a minimum score in pylintrc; pylint exits with an error when the score falls below it. With the value shown here, anything short of a perfect 10 fails.

fail-under=10

2. Add a pylint check in CircleCI on the ‘development’ branch; the build fails if the score is below the threshold.
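
To make the threshold rule concrete, here is a rough sketch that reads the score from pylint’s text report and applies the same comparison. Pylint does this itself when fail-under is set, so this is not how pylint is implemented; it only mimics the rule, and the helper names are made up.

```python
import re

# Pylint's text report ends with a line such as:
#   "Your code has been rated at 8.75/10"
SCORE_PATTERN = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def parse_score(pylint_output):
    """Extract the overall score from pylint's text report, or None."""
    match = SCORE_PATTERN.search(pylint_output)
    return float(match.group(1)) if match else None


def passes_threshold(pylint_output, fail_under=10.0):
    """Mimic the fail-under rule: pass only if score >= fail_under."""
    score = parse_score(pylint_output)
    return score is not None and score >= fail_under
```

With fail_under=10.0, a report rated 9.99/10 does not pass, which is why a lower threshold is often more practical while a codebase is being cleaned up.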

How to make sure all test cases are passing?

Add a pytest check in CircleCI; the build fails if any test case fails.
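
For reference, this is roughly what a test file looks like. pytest collects functions whose names start with test_, and a single failing assert makes pytest exit with a non-zero code, which in turn fails the CircleCI step. The normalize_whitespace function is a hypothetical stand-in for real project code.

```python
def normalize_whitespace(text):
    """Collapse runs of whitespace into single spaces and trim the ends.

    Hypothetical function standing in for real project code.
    """
    return " ".join(text.split())


# pytest would discover and run both functions below by their test_ prefix.
def test_collapses_inner_spaces():
    assert normalize_whitespace("a   b") == "a b"


def test_strips_ends():
    assert normalize_whitespace("  hello \n") == "hello"
```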

How to set code coverage check in circleci?

Store the pytest-cov HTML report as a CircleCI artifact so the coverage can be inspected for each build.
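
Storing the report only makes coverage visible; it does not fail the build. If you also want a hard limit, pytest-cov has a --cov-fail-under option. The helper below is a small sketch that assembles such a command line; the function name is made up and the 80 percent figure in the usage note is an arbitrary example, not a recommendation from this project.

```python
def coverage_command(source="src", tests="tests/", html_dir="tests/cov_html", fail_under=None):
    """Build a pytest command line that measures coverage of `source`.

    fail_under is optional; when given, pytest exits with an error if
    total coverage is below that percentage.
    """
    args = ["pytest", f"--cov={source}", f"--cov-report=html:{html_dir}"]
    if fail_under is not None:
        args.append(f"--cov-fail-under={fail_under}")
    args.append(tests)
    return args
```

For example, " ".join(coverage_command(fail_under=80)) gives pytest --cov=src --cov-report=html:tests/cov_html --cov-fail-under=80 tests/.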

Sample .circleci/config.yml

version: 2.1
jobs:
  build_test:
    docker:
      - image: continuumio/miniconda3
    working_directory: ~/code
    steps:
      # Step 1: check out source code into the working directory
      - checkout
      # Step 2: create the conda environment and install dependencies
      - run:
          name: install dependencies
          command: |
            conda env create -f environment.yml
            source activate nlp_text_cleaner
      # Step 3: run linting, tests, and coverage
      - run:
          name: run tests and linting
          command: |
            source activate nlp_text_cleaner
            pylint src
            pytest -vv
            pytest --cov-report html:tests/cov_html --cov=src tests/
      # Step 4: keep the HTML coverage report as a build artifact
      - store_artifacts:
          path: tests/cov_html
  pypi_publish:
    docker:
      - image: continuumio/miniconda3
    working_directory: ~/code
    steps:
      # Step 1: check out source code into the working directory
      - checkout
      # Step 2: create the environment, build the package, check it, and publish to PyPI
      # (build and twine are assumed to be installed through environment.yml)
      - run:
          name: run pypi publish
          command: |
            conda env create -f environment.yml
            source activate nlp_text_cleaner
            python -m build
            python -m twine check dist/*
            python -m twine upload dist/*
workflows:
  build_test_publish:
    jobs:
      - build_test
      - pypi_publish:
          requires:
            - build_test
          filters:
            branches:
              only:
                - master
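
The workflow section at the end of the config is the part that decides what runs when. As a plain-Python model of those rules (not CircleCI code, just the logic spelled out): build_test has no filter, so it runs on every branch, while pypi_publish runs only on master and only after build_test has succeeded.

```python
def jobs_that_run(branch, build_test_passed):
    """Model of the build_test_publish workflow for one commit."""
    # build_test has no branch filter, so it always runs.
    jobs = ["build_test"]
    # pypi_publish requires build_test to succeed and is filtered to master.
    if build_test_passed and branch == "master":
        jobs.append("pypi_publish")
    return jobs
```

So a passing commit on a feature branch is tested but never published, and a failing commit on master is not published either.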

Pre-commit hooks vs. CircleCI (CI):

Both validate code. Pre-commit hooks validate on the local machine before a commit is created. CircleCI validates at the repository level once the code is checked in.

So it’s a good idea to share pre-commit hooks among team members: create a folder inside your repo and ask the team to copy the hooks into .git/hooks on their local machines.
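
The copy step can be scripted so nobody has to do it by hand. Here is a minimal sketch, assuming the shared hooks live in a folder called hooks at the repo root (a hypothetical name; use whatever folder your team picks).

```python
import shutil
import stat
from pathlib import Path


def install_hooks(repo_root, shared_dir="hooks"):
    """Copy every file in <repo_root>/<shared_dir> into .git/hooks.

    "hooks" is a hypothetical folder name. Returns the installed file names.
    """
    repo_root = Path(repo_root)
    if not (repo_root / ".git").is_dir():
        raise FileNotFoundError(f"{repo_root} is not the root of a git repository")
    target = repo_root / ".git" / "hooks"
    target.mkdir(exist_ok=True)
    installed = []
    for hook in sorted((repo_root / shared_dir).iterdir()):
        if not hook.is_file():
            continue
        destination = target / hook.name
        shutil.copy(hook, destination)
        # Git only runs hooks that are executable.
        destination.chmod(destination.stat().st_mode | stat.S_IXUSR)
        installed.append(hook.name)
    return installed
```

Each team member would run install_hooks(".") once from the repo root after cloning.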

Pre-commit hooks should handle small, fast checks such as linting and code formatting. Once those are validated before commit, the CI build won’t fail over such trivial issues, which saves time for the build team.
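
A hook is just an executable file, so it can be written in Python. The sketch below lints only the staged Python files; it assumes git and pylint are available on the developer’s machine, and the function names are made up for the example.

```python
import subprocess
import sys


def staged_python_files(git_output):
    """Pick the .py paths out of `git diff --cached --name-only` output."""
    return [line for line in git_output.splitlines() if line.endswith(".py")]


def main():
    """Lint staged Python files; a non-zero return value blocks the commit."""
    diff = subprocess.run(
        # ACM = added, copied, or modified files in the staging area.
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    files = staged_python_files(diff.stdout)
    if not files:
        return 0  # nothing to lint
    return subprocess.run([sys.executable, "-m", "pylint", *files]).returncode
```

To use it as a real hook, save it as .git/hooks/pre-commit with a Python shebang line at the top, make it executable, and end the file with sys.exit(main()).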

A couple of my projects where I’ve used CircleCI, for reference: code_template and nlp_text_cleaner.

If you liked the article or have any suggestions/comments, please share them below!

Let’s connect and discuss on LinkedIn

References:

Continuous Integration With Python: An Introduction — Real Python

https://circleci.com/blog/publishing-a-python-package/
