Write Fast, Efficient, and Production-Ready PyTorch Deep Learning Models (Part 1)

Axen Georget
Published in PhysicsX
9 min read · Feb 5, 2024
Source: DALL-E

Arguably a backbone of modern society, programming is a powerful tool that lets individuals and companies bring their ideas to life. In the machine learning field, it has enabled the creation of powerful algorithms and, more broadly, what we call artificial intelligence.

As these new paradigms become popular, many programmers, scientists, and mathematicians from different backgrounds are entering the world of machine learning. Accessible programming languages like Python and powerful libraries like PyTorch empower these individuals to create world-changing machine learning algorithms.

But when creating deep learning models, the gap between experimental models and production-ready models is wide. This makes it difficult for anyone without a software engineering background to create usable products.

This series of 4 articles tries to make this step easier. While non-exhaustive, it covers the most important concepts leading to production-ready PyTorch deep learning models, with a special focus on speed, efficiency, and best practices.

These articles aim to help anyone with a machine learning and programming background to write better code. It can be used by data scientists to improve their models, by machine learning engineers as a cheat sheet, or simply by anyone curious to learn more about this subject.

Part 1 / Software Engineering: Essential Concepts

In this article, we will review essential software engineering concepts. These are absolute requirements when it comes to writing production-ready code, and applying them will have a bigger impact than you might expect.

In the context of machine learning, it is common for these practices to be overlooked in favour of fast experimentation and iteration. This often leads to issues with reproducibility and maintainability, ultimately slowing down progress in the field. Researchers should be encouraged to follow them.

Programs must be written for people to read, and only incidentally for machines to execute.
Structure and Interpretation of Computer Programs by H. Abelson & G. Sussman

Always remember why you write code. A common misconception is to think that code is written for computers. Sure, a computer will run your code, but humans will read, fix, and improve it. Whether it is a personal project or large-scale software, the code should be easy to enhance with new features.

Software engineering best practices are the foundations of well-written, easy-to-maintain, and efficient code. The following concepts are not exhaustive, but knowing them should give you a solid base.

Version Control

Simply summarised, a Version Control System (VCS) allows you to record changes to a set of files (or a single file) over time. It is an essential tool for any software engineer.

The most popular, industry-standard tool is Git. It has the advantage of being fairly easy to use while being extremely powerful. Many web-based platforms offer hosted version control, easing collaboration on software projects; the most popular are GitHub and GitLab.

The advantages of using such tools when working with other people are substantial. It allows many programmers to contribute to the same codebases while offering features to manage conflicts, review code, etc. When working with machine learning models, version control tools also greatly improve reproducibility.

Vital for teams, these tools are also useful for individuals. Although they may seem like a waste of time on personal projects and codebases, they are always a positive investment.

Coding Style

Code can be written in many different ways, and many programming languages can achieve the same task. Within any single language, there is never just one way to do things. This applies to naming conventions, but also to the formatting itself.

A coding style usually specifies all these little details, ultimately defining conventions. These include details about the right place to insert line breaks, the number of lines each function should contain, the correct way to indent, and many more.

Whether you are working in a team, or alone, it is generally a good idea to define/choose a coding style that will need to be respected within a project, a team, or even a company. Unifying how code looks is one of the best ways to make sure code is readable, understandable, and maintainable over time.
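As a simple, hypothetical illustration (assuming PEP 8 is the chosen convention for a Python project), here is the same function written without and with a consistent style:

# Without a consistent style: cryptic names, cramped formatting.
def CalcAvg(L):
    s=0
    for x in L: s+=x
    return s/len(L)


# Following PEP 8: descriptive snake_case names, consistent spacing.
def calculate_average(values: list[float]) -> float:
    total = sum(values)
    return total / len(values)

In the Python ecosystem, formatters and linters such as Black or Ruff can enforce a chosen style automatically.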

Modularity and Reusability

Two principles are really helpful when it comes to writing better, cleaner, and more maintainable code: modularity and reusability.

Modularity consists of dividing your code into independent, ideally small, and cohesive parts. Each part should ideally perform only one specific task or function. Concerns should be separated, making sure that each piece of the code has a single responsibility and a clear purpose.

On the other hand, reusability helps make sure that there is no duplication. Your code should be usable in different contexts without requiring significant modification.

Let’s take an example: some logic is duplicated across different files or functions, and this logic contains a bug. Fixing the bug means applying the same fix in every place it appears. If the codebase is big enough, completely fixing the bug becomes time-consuming and difficult.

As a software engineer, you should always try not to duplicate or repeat code (the Don’t Repeat Yourself, or DRY, principle). Remember, a good programmer is a lazy programmer, to the point where you should never want to write the same thing more than once.

Many different paradigms exist to help a software engineer improve modularity and reusability: Object Oriented Programming (OOP), functions, classes, and plenty of design patterns. Studying and practising these paradigms is helpful and important when it comes to writing better code.
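As a small, hypothetical illustration, imagine that training and inference pipelines both need to normalise a list of values. Extracting that logic into a single function keeps the code modular and avoids duplication:

def normalise(values: list[float]) -> list[float]:
    """Scale values so that they sum to one."""
    total = sum(values)
    return [value / total for value in values]


def preprocess_training_data(values: list[float]) -> list[float]:
    # Reuses the shared helper instead of re-implementing the logic.
    return normalise(values)


def preprocess_inference_data(values: list[float]) -> list[float]:
    # Any bug fix made in normalise() is automatically picked up here too.
    return normalise(values)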

Documentation

One simple tool that helps clarify code is comments. Most programming languages offer the possibility to write comments (text that is ignored by the compiler or interpreter). They can help with quick debugging, but most importantly they can guide other programmers on how to use a specific function, explain a specific piece of logic, etc.

Commenting is a powerful tool, and it should be used extensively to write docstrings for your functions. Docstrings specify the inputs and outputs of a function, as well as any other information needed to call it. However, comments should not be overused outside of docstrings.

A perfect piece of code should be self-explanatory and, arguably, should not require any comments. To help in that matter, when naming functions, variables, classes, or files, be as specific and descriptive as possible; do not be afraid of long variable names. Combined with the concepts described in the previous section (Modularity and Reusability), this should make your code easy to read and understand. That being said, it is sometimes not enough, and comments can help in those cases.

Here is a simple example showing how you can make your code clearer with documentation:

# Example of not-so-clear code
# -------------------------------------------------------------------------

def add_matrix(m: list[list[int]]) -> None:
    for i in range(len(m)):
        for j in range(len(m[i])):
            m[i][j] += 1


# Example of clearer code
# -------------------------------------------------------------------------

def add_one_matrix(matrix: list[list[int]]) -> None:
    """Add one to a matrix in-place.

    Args:
        matrix: matrix to add to, represented as a two dimensional array.
    """
    n_rows = len(matrix)
    for row_index in range(n_rows):
        row = matrix[row_index]

        n_columns = len(row)
        for column_index in range(n_columns):
            row[column_index] += 1

Generally speaking, if you feel that you need to add a comment, take some time to see if you can improve the readability of the code itself instead.

Automated Testing

Automated tests are one of the most important parts of any codebase. Some situations require writing more tests than others, but it is a very good (and important) habit to develop. Tests increase the robustness of your code, reduce the chances of bugs, and most importantly help avoid regressions.

Two main types of tests can be written: unit tests and integration tests. Unit tests should test very specific functions of your code individually. Integration tests should aim to test the interaction of different components together, as well as the software as a whole.
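For instance, here is a minimal sketch of a unit test for the add_one_matrix function from the documentation example, assuming it lives in a hypothetical matrix_utils module and that pytest is used as the test runner:

from matrix_utils import add_one_matrix  # hypothetical module path


def test_add_one_matrix_increments_every_element():
    matrix = [[0, 1], [2, 3]]

    add_one_matrix(matrix)

    # The function works in-place, so the original matrix is modified.
    assert matrix == [[1, 2], [3, 4]]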

One of the most common metrics for automated tests is code coverage: it represents the percentage of a codebase that is executed by the tests. Achieving 100% code coverage is usually a good start to ensure the robustness of your software. But this metric should be used carefully, as it is easy to inflate with meaningless tests.

Writing bug-free code is almost impossible, and that is okay. However, with automated tests it is possible to get closer to perfect code. Every time you encounter a bug or unwanted behaviour, it should be seen as a sign that something is not tested properly. Before trying to fix anything, write the missing test. It will help you find the origin of the problem, and it will also make sure that this bug never occurs again.

Finally, keep in mind that writing tests helps make your code production-ready, even if it is unlikely to have a direct impact on its speed or efficiency. Moreover, writing automated tests for your machine learning models and functions will increase the stability of your accuracy, performance, and results in general.

Data Structures

Fundamental to computer science, data structures allow you to solve a wide variety of software engineering problems. Even if this seems less fun than programming itself, it should not be overlooked. For anyone wanting to write fast and efficient code, studying the different data structures is unavoidable.

Choosing the right data structure for a problem depends on two main aspects. First, it depends on what the data is and in what form it is ingested. Most importantly, it depends on how the data will be used.

Here are the most common ones:

  • Arrays: fixed-size structure holding items.
  • Linked Lists: dynamic-sized sequential structure holding items in a linear order.
  • Stacks and Queues: respectively LIFO (Last In, First Out) and FIFO (First In, First Out) structures that allow efficient manipulation of items at one end of the collection.
  • Hash Tables: structures that map keys to values, enabling efficient direct addressing of specific keys.
  • Trees and Graphs: hierarchical data structures useful in a wide range of applications (particularly in machine learning).

A good way to train and get better at using the right data structures is simply to practice using them. You can use interview-like programming problems or bigger toy projects to apply these data structures and understand how and when they are useful.
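As a small, hypothetical illustration of why this choice matters, membership tests behave very differently on a Python list (an array-like structure scanned linearly) and a set (backed by a hash table):

import time

items = list(range(1_000_000))
items_set = set(items)

start = time.perf_counter()
999_999 in items  # linear scan through the whole list
list_lookup_seconds = time.perf_counter() - start

start = time.perf_counter()
999_999 in items_set  # hash lookup, roughly constant time
set_lookup_seconds = time.perf_counter() - start

print(f"list: {list_lookup_seconds:.6f}s, set: {set_lookup_seconds:.6f}s")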

Computer Architecture

Computer architecture is a complex subject, and knowing the end-to-end structure of a computer may not be essential when it comes to software engineering or machine learning development. However, being aware of the high-level principles is a necessity to write efficient code.

One of the most important aspects of computer architecture is the memory. It is used by computers to load and run code, and by software to create variables. Most modern programming languages abstract the process of allocating and freeing memory, but it is still important to know what is going on behind these abstractions to use them correctly.

The most common types of memory allocation are static and dynamic. Static memory allocation is when software allocates a fixed-size data structure, which is done at compile time. Dynamic memory allocation, on the other hand, is when software allocates memory at runtime. Pointers are used to access dynamically allocated memory and usually have a size of 4 or 8 bytes (32 or 64 bits).

Moreover, it is also useful to clearly understand the different data types, and more specifically how they use memory. As a general idea, remember that everything is stored as a certain number of bits (8 bits make a byte): the more of them you use, the more memory space you consume. It is always good practice to minimise this space, as it can improve performance and also ensure that your software is compatible with different hardware.

When working with deep learning models, this is particularly useful to know. Some models can use a lower floating-point precision for the weights or for some heavy operations. For instance, going from 64-bit floats to 32-bit floats halves the memory used by those values and, depending on the hardware, can noticeably speed up many operations.
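Here is a minimal sketch using PyTorch to illustrate the memory side of this (the actual speed-up depends on your hardware):

import torch

# One million values stored in 64-bit and 32-bit floating point.
x64 = torch.randn(1_000_000, dtype=torch.float64)
x32 = x64.to(torch.float32)

# element_size() is the number of bytes per element: 8 for float64, 4 for float32.
print(x64.element_size() * x64.nelement())  # 8000000 bytes
print(x32.element_size() * x32.nelement())  # 4000000 bytes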
