How modularity makes everything better

Carlos Uziel Perez Malla
Published in MoonVision
3 min read · May 7, 2019

If you have any experience with software development, you have most likely come across the term modular programming. In essence, it means partitioning a program into multiple modules with simple responsibilities (see, for instance, the SOLID principles) that interact with each other and can be interchanged, thus avoiding a monolithic design. The main benefits of modularity are reusability of modules and easier maintenance.

This time, I will talk about a real project at MoonVision where having a modular system saved substantial time and resources.

New feature? Just add a new component

All of our computer vision pipelines are composed of many simple steps that take the original image to the final result. Each of these steps is a component (a module) in our execution graph, which starts with a data provider and ends with storing the results, something that can be done in several ways. We also have different kinds of graphs, depending on whether we want to train a model, run inference on a dataset or simply record a video of the pipeline results, among other things.
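To make this concrete, here is a minimal sketch of what such a component-based execution graph could look like. The class names and components are hypothetical and greatly simplified, not MoonVision's actual framework:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Toy stand-in for an image plus its metadata."""
    data: bytes


class DataProvider:
    """First node of the graph: produces frames, e.g. from a camera or disk."""
    def run(self, item: None) -> Frame:
        return Frame(data=b"raw-image-bytes")


class Resize:
    """Intermediate node: scales the frame by a fixed factor (placeholder)."""
    def __init__(self, factor: float):
        self.factor = factor

    def run(self, item: Frame) -> Frame:
        return item  # a real implementation would resize the image here


class ResultStore:
    """Last node: persists the results, e.g. to a database or a video file."""
    def run(self, item: Frame) -> None:
        print(f"storing {len(item.data)} bytes")


def execute(graph: List) -> None:
    """Run the graph by feeding each component's output into the next one."""
    item = None
    for component in graph:
        item = component.run(item)


execute([DataProvider(), Resize(factor=0.25), ResultStore()])
```

Because every step shares the same small interface, swapping a component (say, replacing ResultStore with a video recorder) does not require touching the rest of the graph.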

Chaining modules in our pipelines is made possible by type annotations in Python. Knowing beforehand which types each module has to handle guarantees that inputs are processed correctly and the right outputs are returned, making modules trivially interchangeable.
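As an illustration (again with hypothetical names, not the exact mechanism we use), the annotations on each module's run method can be inspected up front to verify that a chain of modules is wired correctly before anything is executed:

```python
from typing import get_type_hints


class Image: ...
class Detections: ...


class Detector:
    def run(self, item: Image) -> Detections: ...


class VideoWriter:
    def run(self, item: Image) -> None: ...


def check_chain(*components) -> None:
    """Verify that each module's output type matches the next module's input type."""
    for upstream, downstream in zip(components, components[1:]):
        out_type = get_type_hints(upstream.run).get("return")
        in_hints = get_type_hints(downstream.run)
        in_type = next(t for name, t in in_hints.items() if name != "return")
        if out_type is not in_type:
            raise TypeError(
                f"{type(upstream).__name__} returns {out_type.__name__}, "
                f"but {type(downstream).__name__} expects {in_type.__name__}"
            )


try:
    check_chain(Detector(), VideoWriter())  # mismatch: Detections vs. Image
except TypeError as err:
    print(err)
```

Catching this kind of mismatch at graph-construction time is much cheaper than discovering it deep inside a long-running pipeline.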

For one of our recent projects, we needed to detect objects of very different sizes: the smallest object was roughly 20 times smaller than the largest one. For this reason, we decided that the best way to maximize both efficiency and accuracy was to use multiple detection models instead of a single one. This way, we could train a detector for the larger objects on heavily downscaled images and another detector for the smallest objects on higher-resolution images. This design yielded considerable speedups at inference time, since in a production environment inference usually runs on small form-factor computers that rely solely on CPU power.

Using multiple detectors means that the results of each detector must eventually be merged into a single result, and scaling the image at various points adds further complexity. By adding simple components that take care of these tasks, we can easily build complex graphs in which all components interact seamlessly.
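The sketch below (hypothetical names, deliberately simplified) shows the two kinds of helper components just described: one that rescales detections back into the original image's coordinate system, and one that merges the outputs of several detectors into a single result:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str
    box: tuple   # (x1, y1, x2, y2) in the coordinates of the scaled image
    score: float


def rescale(detections: List[Detection], scale: float) -> List[Detection]:
    """Map boxes predicted on a scaled image back to original-image coordinates."""
    return [
        Detection(d.label, tuple(coord / scale for coord in d.box), d.score)
        for d in detections
    ]


def merge(*per_detector: List[Detection]) -> List[Detection]:
    """Combine the outputs of several detectors into one result list."""
    merged: List[Detection] = []
    for detections in per_detector:
        merged.extend(detections)
    return merged


# Large objects detected on a heavily downscaled image, small objects on a
# higher-resolution one; after rescaling, both live in the same coordinate system.
large = rescale([Detection("pallet", (10, 10, 60, 50), 0.97)], scale=0.25)
small = rescale([Detection("screw", (400, 300, 420, 315), 0.91)], scale=1.0)
print(merge(large, small))
```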

Take a look at Figure 1 to see how all of the above fits together.

Figure 1: Simplified inference graph for one of our latest projects. The modules highlighted in orange were the new additions that made this project a success in a very short period of time.

Another benefit of representing both traditional and deep learning inference in a single symbolic graph is serialization: we can share an entire application between a device running on a single camera stream and an auto-scaling compute cluster.
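For illustration, one way such a graph could be serialized (a hypothetical, heavily simplified format, not necessarily how we do it) is as plain data describing each component and its parameters, which can then be shipped to an edge device or a cluster and instantiated there:

```python
import json

# A graph described as data: component names plus their parameters.
graph_description = [
    {"component": "CameraProvider", "params": {"stream": "rtsp://camera-01"}},
    {"component": "Resize",         "params": {"factor": 0.25}},
    {"component": "Detector",       "params": {"weights": "large_objects.pt"}},
    {"component": "ResultStore",    "params": {"target": "database"}},
]

# Serialize once ...
serialized = json.dumps(graph_description)

# ... and rebuild the same pipeline anywhere the component registry exists.
for step in json.loads(serialized):
    print(step["component"], step["params"])
```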

Conclusion

Modularity brings major flexibility to the system and makes changes on the go much easier to implement. For us, this has translated into much faster deployment of new features and timely completion of client projects.

Check out what we do at https://www.moonvision.io/ and try our platform at app.moonvision.io/signup.
