Visualizing da Vinci’s Greatest Work

Behind the scenes of the concept, design, and development of codex-atlanticus.it

The Codex Atlanticus is the largest existing collection of original drawings and texts by Leonardo da Vinci and is preserved at the Biblioteca Ambrosiana in Milan. It comprises 1,119 leaves dating from 1478 to 1519. Because of this wide time span, the book covers a great variety of subjects: from architecture to engineering, from art to natural sciences.

The story of the Codex Atlanticus begins with da Vinci’s pupil Francesco Melzi who, after the death of the Florentine genius in 1519, inherited a large collection of his manuscripts. In the following years, the collection passed from person to person and risked being broken up by thefts, until the sculptor Pompeo Leoni took possession of some volumes at the end of the sixteenth century. He started to organize the 1,119 leaves by arranging the drawings on large sheets of 64.5 x 43.5 cm, the size used for atlases at the time. This is why the collection became known all over the world as the Codex Atlanticus.

After several hereditary misadventures and looting by Napoleon’s troops that brought the codex to the Bibliothèque Nationale in France for some time, the Codex Atlanticus came back to the Biblioteca Ambrosiana in Milan where it has been kept for 200 years. Only a dozen pages are exhibited at a time.

Sala Federiciana at the Ambrosiana Library

How the concept was born

Our idea stemmed from the desire to celebrate the 500th anniversary of da Vinci’s death with a digital project. Initial experiments investigated possible areas for an in-depth analysis of such a broad body of work by analyzing the content, chronology, dimensions, and scope of Leonardo da Vinci’s entire production.

These initial studies identified the Codex Atlanticus as the work with the greatest experimental interest, and therefore showed the need for a tool to analyze and communicate all 1,119 pages. From an analytical point of view, the pages contain many different types of drawings and writings. In order to create an application capable of representing such heterogeneous content, our research had to extend into multiple dimensions.

The first dimension concerns the pictures of the pages. Thanks to an archive belonging to Mondadori Portfolio, the leaves were digitized, cropped and cleaned up in order to be used in the application.

The second dimension concerns the writing. We could not include the original transcription of the texts because the ancient Italian, together with dialectal forms and several abbreviations, is extremely difficult to understand, even for an Italian today. Another possibility was the so-called critical transcription (a specific approach that transcribes the text into contemporary Italian), but that, unfortunately, is protected by rights belonging to third parties and is, in any case, too extensive to be translated into English.

At this point, a new path opened up: finding out which topics and subjects are included on every single page in order to represent da Vinci’s thoughts. A 1970s philological study of the Codex Atlanticus’ content by Augusto Marinoni, a professor of Romance Studies at the Catholic University of Milan, provided a solution. Considered the greatest expert on the philology of Leonardo da Vinci, Marinoni published his studies in the book Il Codice Atlantico di Leonardo da Vinci: indici per materie e alfabetico (The Codex Atlanticus by Leonardo da Vinci: subject and alphabetical indexes), edited by Pietro C. Marani in 2004. In addition to a substantial critical analysis of the work, the philological study contains a list of 140 topics and where they are covered, as well as an estimate of the year of writing for every sheet.

Thanks to this philological study we started to conceive a visualization capable of showing the evolution of Leonardo da Vinci’s thought through the years of his life.

Creating a brand new dataset

The very first step in the process was to understand how to get and build a consistent dataset. The index of Marinoni’s work was, in fact, a classification of all the subjects and topics contained in each page of the Codex. Starting from the digitization of this authoritative source we were able to create our very first textual dataset describing all 1,119 pages.

At the end of this (manual!🖖) transcription phase we had an Excel file of more than 45,000 rows containing the list of topics covered on each page and, for most pages, the year in which they were written. Given the large number of topics (140) identified by Marinoni, we decided to group them into five subjects that could be easily associated with the content of the pages and could help us make the information accessible.
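As a rough illustration of this grouping step (not the project's actual code), the sketch below collapses topic rows into one record per page with a count for each subject. The topic names, the subject labels, and the `summarize_pages` helper are all hypothetical; the article does not name the five subjects or describe the spreadsheet columns.

```python
from collections import Counter, defaultdict

# Hypothetical topic -> subject mapping. The real study lists 140 topics,
# and the five subjects are not named in the article.
TOPIC_TO_SUBJECT = {
    "bridges": "subject_a",
    "fortifications": "subject_a",
    "flight": "subject_b",
    "optics": "subject_c",
}

# Each row stands for one spreadsheet line: (page, topic, year or None).
ROWS = [
    (1, "bridges", 1487),
    (1, "fortifications", 1487),
    (1, "optics", 1487),
    (2, "flight", None),  # some pages have no estimated year
    (2, "flight", None),
]


def summarize_pages(rows, topic_to_subject):
    """Collapse topic rows into one record per page with subject counts."""
    pages = defaultdict(lambda: {"year": None, "subjects": Counter()})
    for page, topic, year in rows:
        record = pages[page]
        if year is not None:
            record["year"] = year
        record["subjects"][topic_to_subject[topic]] += 1
    return dict(pages)


PAGES = summarize_pages(ROWS, TOPIC_TO_SUBJECT)
```

Keeping counts per subject, rather than a plain list, makes it possible to show later how much of a page each subject occupies.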

We decided to make this dataset open source and available on the website codex-atlanticus.it.

Finally, we completed the dataset by adding high-resolution images of the pages, which are protected by copyright and which we obtained thanks to Mondadori Portfolio and the Ambrosiana library.

Understanding and sketching data

The design of the page module was the result of a long process of sketching data. Our goal was to find a flexible visualization of a single page that could be easily scaled down, remain readable and, at the same time, give an intuitive overview of the codex’s composition.

We created a custom module that visually summarizes the subjects contained on a page as well as the numerical and chronological position of the page in the collection. The final result came from several attempts to combine the page data in a single visualization.

Some sketches from the initial phases

Given the amount of data that we had to process, we mostly tested our design through implementation, writing scripts that could plot the design for all the pages of the codex. This process allowed us to design and test many different options before choosing the most effective one for the application.
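A script of this kind needs, at a minimum, a position for every page module on the canvas. The sketch below is one simple way to compute such a grid; the `grid_positions` function and its parameters (column count, cell size, gap) are illustrative assumptions, not values taken from the project.

```python
def grid_positions(n_pages, columns, cell_w, cell_h, gap):
    """Return the top-left (x, y) of each page module in a row-major grid."""
    positions = []
    for index in range(n_pages):
        row, col = divmod(index, columns)
        positions.append((col * (cell_w + gap), row * (cell_h + gap)))
    return positions


# Lay out all 1,119 pages of the codex; 40 columns is an arbitrary choice.
ALL_POSITIONS = grid_positions(1119, columns=40, cell_w=10, cell_h=20, gap=2)
```

Changing only the column count or cell size and re-running the script is a cheap way to compare layout options across the whole codex.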

Target and devices

Once the visualization module was defined, we started focusing on the design of the entire experience. The design process started with user research on the different target audiences for the three platforms we decided to develop. This research guided the design of the interfaces for three devices: web, mobile, and a 27-inch touchscreen located in the Ambrosiana library, among the original pages of the Codex Atlanticus. When working on these different devices we needed to keep in mind that our goal was to create a product that could be easily accessible and useful both for experts and non-experts in Leonardo’s work.

We wanted to create a tool that could help our target audience explore the Codex Atlanticus through data visualization. To do so, we had to make it as easy as possible to explore and understand the data, letting users come up with their own stories and conclusions. Given the complexity of the project, we found it really helpful to combine user research with user testing in order to build an effective design.

Looping between design and development

The design was built around the data visualization module. We decided to create a system to show the content of the pages and navigate among them. The data visualization becomes the page itself: the subjects are represented by different colors, and two vertical lines show the chronological and numerical position of the page within the total span of years and the total number of pages.
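To make the encoding concrete, here is a minimal sketch of how the values behind one page module could be computed. It assumes that each subject gets a colored band proportional to its share of the page's topics, and that each line is placed at a fraction between 0 and 1. The `page_module` function and these scaling choices are assumptions for illustration; only the page count and the year range come from the article.

```python
TOTAL_PAGES = 1119                 # leaves in the codex, per the article
FIRST_YEAR, LAST_YEAR = 1478, 1519  # date range of the leaves


def page_module(page_number, year, subject_counts):
    """Compute band extents and line positions (all in 0..1) for one page."""
    total = sum(subject_counts.values())
    bands = []
    start = 0.0
    if total > 0:
        for subject, count in sorted(subject_counts.items()):
            share = count / total
            bands.append((subject, start, start + share))
            start += share
    # Numerical position of the page within the whole codex.
    page_pos = (page_number - 1) / (TOTAL_PAGES - 1)
    # Chronological position; None when the year is not known.
    year_pos = None
    if year is not None:
        year_pos = (year - FIRST_YEAR) / (LAST_YEAR - FIRST_YEAR)
    return {"bands": bands, "page_pos": page_pos, "year_pos": year_pos}


EXAMPLE = page_module(1, 1478, {"subject_a": 1, "subject_b": 3})
```

Because every value is a fraction, the same module can be drawn at any size, which matters when it is scaled down to show all the pages at once.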

The page is then used as a small multiple to display all the pages of the codex on the main screen of the interface, giving a general overview of the entire content. The central visualization is surrounded by additional elements that provide extra information (such as a bar chart of how often each subject appears in the pages displayed) and allow the user to filter the pages by year of writing, page range, subjects, and topics.
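The filtering and the bar chart can be thought of as two small operations on the page records: select the pages that match the active filters, then total the subjects across the selection. The sketch below assumes a simple record layout and hypothetical subject labels, and it excludes pages with no known year whenever a year filter is active, which is a design assumption rather than documented behavior.

```python
from collections import Counter

# Hypothetical page records; subject labels are placeholders.
PAGES = [
    {"number": 1, "year": 1487, "subjects": {"subject_a": 2, "subject_c": 1}},
    {"number": 2, "year": None, "subjects": {"subject_b": 2}},
    {"number": 3, "year": 1508, "subjects": {"subject_a": 1, "subject_b": 1}},
]


def filter_pages(pages, year_range=None, page_range=None, subjects=None):
    """Keep pages matching every active filter (ranges are inclusive)."""
    selected = []
    for page in pages:
        if page_range and not (page_range[0] <= page["number"] <= page_range[1]):
            continue
        if year_range:
            year = page["year"]
            if year is None or not (year_range[0] <= year <= year_range[1]):
                continue
        # Keep the page if it contains at least one of the chosen subjects.
        if subjects and subjects.isdisjoint(page["subjects"]):
            continue
        selected.append(page)
    return selected


def subject_totals(pages):
    """Sum subject counts over the given pages, e.g. to feed a bar chart."""
    totals = Counter()
    for page in pages:
        totals.update(page["subjects"])
    return totals
```

For example, `subject_totals(filter_pages(PAGES, subjects={"subject_b"}))` would give the bar chart values for only those pages that touch `subject_b`.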

General overview of the application

The navigation of the entire application has been structured as a gradual zoom into the details of a single page, guiding users through exploratory storytelling.

Given the complexity of the project and the amount of data we had to deal with, the design process was often combined with the coding. It was not a linear process from idea to development but a loop from design to implementation to testing and back to design.

How data visualization becomes a tool to access information

This project can be seen as the result of combining different disciplines, where data visualization interacts with data exploration, user experience, coding, and approaches from the field of digital humanities. The purpose of the project is to make a substantial amount of heterogeneous data accessible and readable through an interface driven by data visualization. The application allows users to navigate complex collections of works, exploring the content at different levels of detail, and brings knowledge of such precious content to a broad audience.

One of the most challenging and exciting aspects of the project was the design of the application for the touchscreen. The touchscreen is situated in the Federiciana room together with a selection of original pages of the Codex Atlanticus. The location of the screen adds important value to the application, allowing the audience to understand and dig into the analysis of da Vinci’s work. It represents a new digital way to access information in service of ancient and precious material, otherwise difficult to obtain.


Written by: Matteo Bonera, Sara Perozzi, Giulia Zerbini
