The New Kid on the Data Science Block: Marimo

Chris Van Yperen
Published in bigdatarepublic
5 min read · Jun 24, 2024

Recently, I came across a cool new tool called marimo. It’s not really a new tool, but an improved version of a tool that virtually all data scientists already heavily rely on. The marimo team describes it as “The future of Python notebooks”. The marimo team consists of two members according to their website, Akshay Agrawal and Myles Scolnick. Both these guys have earned their stripes at companies like Google and Palantir Technologies, which gives me the confidence that they are up for the task of revamping such an iconic tool as the Jupyter Notebook.

The initial commit took place on the 14th of August 2023 according to the git history. Since it was quite an extensive commit, it seems that the team had already invested quite some time in development before then. Since then it has been growing in popularity, or at least in GitHub stars, and it really took off in January 2024. Whether it was a popular New Year’s resolution or just a slow week at the office, more and more people have adopted marimo as their Jupyter notebook replacement. Let’s dive into why they have done that and whether you should too.

Reactive Design

At the heart of marimo is its reactive design. When you run a cell or interact with a UI element, marimo automatically updates all affected cells, ensuring that code and outputs remain consistent. This prevents bugs and enhances the reliability of your notebooks. The interactivity doesn’t stop there; marimo allows you to bind sliders, tables, plots, and more directly to Python without the need for callbacks. This makes the creation of dynamic and interactive visualizations straightforward and intuitive. More on these visualizations later.
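To make the idea of "updating all affected cells" concrete, here is a minimal pure-Python sketch of reactive execution. It is a toy, not marimo's implementation: the `ToyNotebook` class, the cell names, and the helper `defs_and_refs` are all made up for illustration. It only assumes the general approach of reading each cell's code to see which variables it defines and which it reads, and it assumes the cells contain no dependency cycles.

```python
import ast


def defs_and_refs(source):
    """Return (names a cell assigns, names it reads but does not assign)."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            (defs if isinstance(node.ctx, ast.Store) else refs).add(node.id)
    return defs, refs - defs


class ToyNotebook:
    """Hypothetical toy notebook: cells are strings, variables link them."""

    def __init__(self, cells):
        self.cells = dict(cells)  # cell name -> source code
        self.info = {name: defs_and_refs(src) for name, src in self.cells.items()}
        self.namespace = {}
        self.log = []  # order in which cells actually ran

    def children(self, name):
        """Cells that read at least one variable defined by `name`."""
        defs = self.info[name][0]
        return [c for c, (_, refs) in self.info.items() if c != name and defs & refs]

    def run(self, *names):
        """Run the given cells and everything downstream, parents first."""
        # Collect the starting cells and all of their descendants.
        affected, stack = set(), list(names)
        while stack:
            cell = stack.pop()
            if cell not in affected:
                affected.add(cell)
                stack.extend(self.children(cell))
        # Execute each cell only after its affected parents have run.
        done = set()
        while len(done) < len(affected):
            for cell in affected - done:
                parents = {p for p in affected if cell in self.children(p)}
                if parents <= done:
                    exec(self.cells[cell], self.namespace)
                    self.log.append(cell)
                    done.add(cell)

    def edit(self, name, source):
        """Replace a cell's code, then rerun it and its dependents."""
        self.cells[name] = source
        self.info[name] = defs_and_refs(source)
        self.run(name)


nb = ToyNotebook({
    "price_cell": "price = 10",
    "quantity_cell": "quantity = 3",
    "total_cell": "total = price * quantity",
    "report_cell": "report = f'total is {total}'",
})
nb.run(*nb.cells)  # initial run of every cell
nb.log.clear()
nb.edit("price_cell", "price = 20")
print(nb.log)                  # ['price_cell', 'total_cell', 'report_cell']
print(nb.namespace["report"])  # total is 60
```

Editing the price reruns the total and the report but leaves `quantity_cell` alone, which is the behaviour the paragraph above describes: only the cells that depend on a change are recomputed.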

Collaborative Nature

Marimo is designed to enhance collaboration among data scientists. Its real-time collaboration features allow multiple users to work on the same notebook simultaneously, seeing each other’s changes in real time. This makes it easy to brainstorm, debug, and develop models together. Additionally, marimo supports commenting and annotations directly within the notebook, enabling team members to leave feedback, suggestions, or explanations. This level of collaboration streamlines workflows, reduces misunderstandings, and fosters a more cohesive team environment, making it an ideal tool for both small teams and large organizations.

Emphasis on Reproducibility

One of marimo’s key strengths is its emphasis on reproducibility. This is crucial for ensuring consistent results, especially in collaborative environments. For instance, imagine you’re working on a machine learning project with a team. You run an analysis and share the notebook with your colleagues. In traditional notebooks, hidden states or non-deterministic execution could cause your colleagues to get different results when they rerun the notebook. With marimo, hidden states are eliminated, and the execution order is deterministic. This means that when your team reruns the notebook, they will get the same results as you did, ensuring that everyone is on the same page and the integrity of the data is maintained.
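The hidden-state problem is easy to reproduce without any notebook at all. The sketch below simulates it in plain Python with a shared dictionary standing in for the kernel's memory; the helper names (`run_cell`, `rerun_from_scratch`) and the example cells are invented for illustration and are not part of marimo or Jupyter. The last step mimics, in a very simplified way, the idea of removing a deleted cell's variables from memory.

```python
def run_cell(source, namespace):
    """Execute one cell in a shared namespace; return the names it created."""
    before = set(namespace)
    exec(source, namespace)
    return set(namespace) - before - {"__builtins__"}


def rerun_from_scratch(cells):
    """What a colleague sees: the saved cells, top to bottom, fresh session."""
    namespace = {}
    try:
        for source in cells:
            exec(source, namespace)
        return "ok"
    except NameError as err:
        return f"NameError: {err}"


# The author's session. The first cell is run and later deleted from the
# notebook, so it is never saved, yet the shared namespace remembers `scale`.
session = {}
deleted_cell_names = run_cell("scale = 100", session)
saved_cells = ["raw = [1, 2]", "scaled = [x * scale for x in raw]"]
for source in saved_cells:
    run_cell(source, session)

author_result = session["scaled"]  # works, thanks to hidden state
colleague_result = rerun_from_scratch(saved_cells)

# Scrubbing the deleted cell's names surfaces the problem in the author's
# own session instead of on a colleague's machine.
for name in deleted_cell_names:
    session.pop(name, None)
try:
    run_cell(saved_cells[1], session)
    after_scrub = "ok"
except NameError as err:
    after_scrub = f"NameError: {err}"

print(author_result)     # [100, 200]
print(colleague_result)  # NameError: name 'scale' is not defined
print(after_scrub)       # NameError: name 'scale' is not defined
```

The author's copy "works" only because of a variable that no longer exists in the saved notebook. Once the leftover name is removed, the author hits the same error the colleague would, which is the point at which the mistake is cheapest to fix.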

Consider a scenario where you are building a predictive model for sales forecasting. You can bind sliders to input parameters for the model, and as you adjust these sliders, marimo updates all relevant cells in real-time. This live updating feature ensures that any changes made are consistently reflected throughout the notebook. When the project progresses to the stage where different team members need to validate or extend the model, marimo’s reproducibility ensures that each team member’s modifications produce consistent and reliable results, enhancing collaboration and trust in the model’s outcomes.

By ensuring a reproducible environment, marimo allows data scientists to code with confidence, knowing that their results are reliable and consistent. This reproducibility is particularly valuable in collaborative projects, where maintaining data integrity and consistency across different team members and stages of the project is critical.

Developer-Friendly Features

Marimo’s integration with GitHub Copilot provides intelligent code suggestions, helping users write code faster and with fewer errors. This tool leverages AI to predict and complete code snippets, significantly reducing the time spent on coding tasks. Whether you’re writing complex algorithms or simple functions, GitHub Copilot streamlines the process and enhances your coding experience.

Fast autocomplete further boosts coding efficiency by predicting and completing code snippets as you type, reducing the time spent on repetitive tasks. Additionally, built-in code formatting ensures that your code remains clean and readable, adhering to best practices and style guidelines automatically. This makes it easier for teams to maintain a consistent coding style and for new team members to understand and contribute to the codebase quickly.

Seamless Deployability

Marimo also excels in deployability, providing users with versatile options to execute and share their work. Each notebook can be run as a script or deployed as an app, broadening the scope of how and where you can use your analyses. For instance, you might develop a complex data visualization in a marimo notebook, then easily convert it into a standalone app for colleagues to interact with without needing to navigate the notebook interface. Similarly, a machine learning model trained in marimo can be deployed directly from the notebook to a production environment, facilitating a seamless transition from development to deployment.

marimo’s “Grid Layout” example notebook

Advanced Visualization and Expressiveness

Visualization and expressiveness are other areas where marimo shines. You can use markdown, LaTeX, tables, accordions, tabs, and grids to create rich, informative notebooks that effectively communicate your findings. Additionally, marimo supports deploying your notebooks as interactive web apps, providing a seamless way to share your work with others.

Even if you agree that this sounds great, visualizations need to be, well…, visualized. Luckily, marimo provides some great examples to show what it is capable of. This example in particular really reminded me of a use case that I would normally make a quick dashboard for in Dash. Although Dash already does a great job of making this a simple process, this just takes it to the next level by combining the notebook and interactive visualization into a single framework. Other great examples are the ISS notebook, the reactive plots notebook, and the Pokémon Statistics notebook.

Conclusion

In conclusion, marimo is more than just a Python notebook — it’s a robust platform that enhances productivity, collaboration, and the overall data science experience. Whether you are working solo or as part of a team, marimo offers the tools you need to take your data science projects to the next level with ease and confidence.

For more details on marimo and to get started, visit the marimo documentation. If you want to see some more examples or follow along with some tutorials, check them out here.
