The ML Native Language: Mojo

A language for a world of infinite machine learning but finite compute

Paul Chan
NYU Data Science Review
4 min read · Nov 10, 2023


[Cover image: Family Guy [5]]

With the advent of ChatGPT, large language models and machine learning have come to dominate the tech sphere, yet the underlying technologies that enabled this boom all rest on a language that leaves much to be desired: Python. Whether it’s dynamic typing, a slow runtime, or the global interpreter lock preventing true multithreading, Python’s limitations are hamstringing AI development. Of course, Python still dominates the field because of its incredible developer ecosystem, with libraries that have become essential to modern ML, whether it’s scikit-learn, NumPy, or PyTorch. But what if developers didn’t have to trade performance for those libraries? What if they could have the performance of a systems-level language like C while using the libraries of Python? That would be incredible, and it also happens to be exactly what Mojo is.

Mojo is an entirely new language developed by Modular AI, a startup that recently raised $100 million from Google Ventures and Greylock [2]. Founded in 2022 by Chris Lattner, creator of the LLVM and Clang compilers and chief architect of Swift, and Tim Davis, formerly of Google Brain, Modular is developing Mojo alongside an AI inference engine with the grand goal of accelerating AI development.

Though Mojo is an entirely new language, it is designed as a superset of Python: the goal is for Python syntax, libraries, and projects to be fully compatible with Mojo. To make the switch, simply download Mojo and rename your Python files from .py to .mojo or .🔥 (yes, the fire emoji is a valid Mojo file extension). Note that Mojo is currently only available on Ubuntu Linux and on macOS with Apple M1 or M2 silicon, though there is a free Mojo Jupyter notebook playground for testing. Mojo’s performance becomes apparent when comparing it against CPython and even C++: in Modular’s internal benchmarks, Mojo runs up to 68,000 times faster than CPython and up to 6.6 times faster than C++.
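To illustrate the interoperability, here is a minimal sketch of calling an existing Python library from a Mojo program, based on the syntax in Modular's documentation at the time of writing (it assumes NumPy is installed in the Python environment Mojo is configured to use, and details may change as the language evolves):

```mojo
from python import Python

fn main() raises:
    # Import a CPython module directly into Mojo.
    let np = Python.import_module("numpy")
    # Call into NumPy exactly as you would from Python.
    let arr = np.arange(6).reshape(2, 3)
    print(arr)
```

The same `Python.import_module` call works for any installed Python package, which is how Mojo code keeps access to the existing ecosystem.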

Mojo Internal Evaluations (https://docs.modular.com/mojo/)

This performance increase can significantly expand the scale of projects that GPU-poor engineers [3] can take on, especially for compute-intensive tasks like training LLMs. In a similar vein, at a time of GPU scarcity and rising AI regulation, squeezing the most out of existing computing resources through performance optimization is becoming increasingly necessary. And even with access to thousands of H100 GPUs, you would still want Mojo to achieve the highest possible training performance. As for how Mojo achieves this boost, one key mechanism is automatic parallelization, whereby the compiler divides computations into tasks that can be executed across multiple cores.
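As a concrete sketch of that parallelization, Mojo's standard library exposes a `parallelize` helper that fans a workload out across cores. The example below follows the pattern shown in Modular's docs at the time of writing; the work function here is a hypothetical stand-in, and the exact API may shift as the language evolves:

```mojo
from algorithm import parallelize

fn main():
    # Hypothetical work function: task i would process chunk i
    # of some larger computation.
    @parameter
    fn work(i: Int):
        print("running task", i)

    # Distribute 8 independent tasks across the available cores.
    parallelize[work](8)
```

Each task runs independently, so the scheduler is free to spread them over however many cores the machine has.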

Despite being a superset of Python, Mojo’s features make it far more than just a new flavor of Python. One crucial feature is optional typing, which lets developers choose whether or not to enforce type safety in their programs. This flexibility is key to interoperability with Python, and it is also an advantage when developing production-level programs that must handle dynamic user input.
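In practice, optional typing shows up in Mojo's two function styles: `def` for Python-like dynamic code and `fn` for strictly typed code. A minimal sketch, assuming the syntax documented at the time of writing:

```mojo
# `fn` requires declared types and enforces them at compile time,
# which lets the compiler generate fast, specialized code.
fn add(x: Int, y: Int) -> Int:
    return x + y

# `def` behaves like Python: annotations are optional and values
# are handled dynamically, easing ports of existing Python code.
def add_dynamic(x, y):
    return x + y
```

The same program can mix both styles, so a team can start with dynamic `def` code and tighten it into typed `fn` code where performance matters.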

Another key feature of Mojo is its integration with MLIR, which provides direct access to low-level primitives for optimization [1]. This significantly enhances development versatility: users can rely on high-level abstractions for rapid development while also leveraging zero-cost abstractions to increase performance, an ability usually confined to low-level languages.
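One visible example of those zero-cost abstractions is Mojo's built-in `SIMD` type, which maps onto hardware vector registers through MLIR's vector primitives. The sketch below follows the syntax in Modular's docs at the time of writing and may change as the language evolves:

```mojo
fn main():
    # A SIMD value packs four 32-bit floats into one vector register.
    let a = SIMD[DType.float32, 4](1.0, 2.0, 3.0, 4.0)
    # One vectorized multiply squares all four lanes at once; no loop,
    # and no runtime overhead compared to hand-written vector code.
    let b = a * a
    print(b)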

Overall, the multifaceted nature of Mojo, along with its interoperability with Python, makes it an ideal choice for the modern machine learning engineer.

See you at compile time!

References:

  1. “Modular: Inference Engine.” Modular.com, www.modular.com/engine. Accessed 31 Oct. 2023.
  2. Wiggers, Kyle. “Modular Secures $100M to Build Tools to Optimize and Create AI Models.” TechCrunch, 24 Aug. 2023, techcrunch.com/2023/08/24/modular-raises-100m-for-ai-dev-tools/.
  3. Barr, Alistair. “The Tech World Is Being Divided into ‘GPU Rich’ and ‘GPU Poor.’ Here Are the Companies in Each Group.” Business Insider, Business Insider, 28 Aug. 2023, www.businessinsider.com/gpu-rich-vs-gpu-poor-tech-companies-in-each-group-2023-8.
  4. Hacker News discussion, news.ycombinator.com/item?id=37324683.
  5. Fertman, Kim, et al. “You May Now Kiss the… Uh… Guy Who Receives.” Family Guy, season 4, episode 25, Fox Broadcasting Company, 30 Apr. 2006. Television.
