Helen D. Wall
Feb 18, 2016

120kMoMA peak years — A data visualization study of years information in The Museum of Modern Art Collection dataset

In my previous 120kMoMA study post, I presented data visualizations of major categories in MoMA’s dataset, which was made available on GitHub in 2015. In this second study of the Collection, I looked specifically at the year information: artists’ birth and death years (ArtistBio column), the year an artwork was created (Date column) and the year it was acquired (DateAcquired column).

I concentrated on the following artwork classifications: Painting and Sculpture listed in the Painting & Sculpture Department, Drawing in the Drawings Department, and A&D Architectural Drawing and A&D Architectural Model in the Architecture & Design Department. The latter two, A&D Architectural Drawing and A&D Architectural Model, are combined here under Architecture.

Surprising results were found in age created. The highest percentage of artworks in MoMA’s Collection were created by artists in their late 30s to early 40s, in all four classifications. The average age for each classification was 42, except for Drawing, where it was 40. This might be explained by the 2005 Gift of The Judith Rothschild Foundation Contemporary Drawings Collection, which may have included many younger artists. Female and Male patterns were similar, with a notable dip in Architecture for women at age 65, then a rise in the 70s. Exceptional standouts are Flora Manteola and Josefa Santos of the firm M/SG/S/S/S, whose works in the Collection span from 1962 to 2015.

The age created peak is worth further investigation, both within the MoMA Collection and in the art world in general. One factor to consider is “The 10,000 Hour Rule” discussed in Malcolm Gladwell’s book Outliers. He provides case studies to show that success is correlated with intensive amounts of time devoted to working at one’s craft. Of course, many other factors come into play as well, such as timing, talent and circumstances. Exploring artists’ critically productive years would be a worthwhile study in itself, although this would require amassing data from a wide range of museums and private collections, gallery and auction house records, catalogues raisonnés and publications.

Notable in year created is the peak in the 1960s, a time when there were significant changes in styles and movements, including abstract expressionism, pop art, minimalism, and conceptual art, as well as a shift in the art world center from Europe to America. Whether these explain the phenomenon would require further data digging.

The standouts in the subtle ebb and flow of year accessioned are the 2005 Judith Rothschild Drawing Gift, mentioned earlier, and Architecture acquisitions (even considering that 3,000 works from The Ludwig Mies van der Rohe Archive were not included in this analysis).

More surprising results were found in year created to accessioned (above) and year created to posthumously accessioned (below). A large percentage of artworks were accessioned within the first 10 years after the work was created, in all four classifications and for both genders. In the posthumously accessioned chart there are two notable peaks for women. All the works acquired within 5 years of death were by women born from 1885 (Sonia Delaunay-Terk) to 1941 (Hanne Darboven). The second peak is 20 years after death, a 10-year lag compared with accessions of works by men.

Notes about data cleaning for dates analysis
“Date” (when the artwork was created) and “DateAcquired” were listed in various ways in the MoMA dataset, including month/day/year (e.g., January 24–February 6, 1956), range (1972–73), and circa. For this analysis, the year in the “Date” column was extracted to a new column, YearCreated. When a range of years was given, the end year was used for Painting and Sculpture; for Architecture the begin year was used, since “DateAcquired” could fall within the date range. Any artworks in the above classifications with a range greater than 5 years were not included. The Drawing set was large enough to use only works for which a single year was given. YearBorn and YearDied were extracted from “ArtistBio.”
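The extraction rules above can be sketched in a few lines. This is a minimal Python illustration, not the code used for the study (that work was done in R). The function names, the 1800 to 2099 year window and the regular expressions are my own assumptions.

```python
import re

# Assumption: years of interest fall between 1800 and 2099.
YEAR = re.compile(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)")
# Abbreviated ranges such as "1972-73", written with a hyphen or a dash.
SHORT_RANGE = re.compile(r"(?<!\d)((?:1[89]|20)\d{2})\s*[\u2013\u2014-]\s*(\d{2})(?!\d)")


def year_span(date_text):
    """Return (begin_year, end_year) found in a free-text date, or None."""
    # Expand "1972-73" to "1972-1973" so both ends are four-digit years.
    expanded = SHORT_RANGE.sub(
        lambda m: f"{m.group(1)}-{m.group(1)[:2]}{m.group(2)}", date_text
    )
    years = [int(y) for y in YEAR.findall(expanded)]
    if not years:
        return None
    return min(years), max(years)


def year_created(date_text, use_begin=False, max_range=5):
    """Pick one year from a date string, or None if it should be excluded.

    use_begin=False takes the end year (the Painting and Sculpture rule);
    use_begin=True takes the begin year (the Architecture rule).
    Ranges longer than max_range years are dropped.
    """
    span = year_span(date_text)
    if span is None:
        return None
    begin, end = span
    if end - begin > max_range:
        return None
    return begin if use_begin else end


print(year_created("1972\u201373"))                  # end year of the range
print(year_created("1972\u201373", use_begin=True))  # begin year of the range
print(year_created("c. 1960"))                       # circa dates keep the year
```

Calling `year_created` with `use_begin=True` corresponds to the Architecture rule; the default corresponds to Painting and Sculpture.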

To analyze when artworks were acquired by the Museum, I used the year noted in the “MoMANumber” column (accession number). This YearAccessioned number was simpler than extracting the year from various formats in the “DateAcquired” column. The accession number was also useful to identify multiple entries. For example, there could be many architectural drawings and models associated with a single project, so duplicates were removed and shown as a single entry.
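As a rough illustration of this step, the Python sketch below pulls the year out of an accession number and collapses entries that share one accession. It assumes numbers shaped like 885.1996 or 123.2005.1, where the second dot-separated field is the year and anything after it marks a part of the same accession. That format, the function names and the grouping rule are assumptions on my part, not a description of the original R workflow.

```python
import re

# Assumption: "<item>.<year>" optionally followed by ".<part>".
ACCESSION = re.compile(r"^[^.]+\.(\d{4})(?:\.|$)")


def year_accessioned(moma_number):
    """Return the four-digit year in an accession number, or None."""
    match = ACCESSION.match(moma_number.strip())
    return int(match.group(1)) if match else None


def accession_key(moma_number):
    """Return the "<item>.<year>" prefix shared by parts of one accession."""
    text = moma_number.strip()
    match = ACCESSION.match(text)
    return text[: match.end(1)] if match else None


def drop_duplicates(records):
    """Keep the first record seen for each accession key."""
    seen = set()
    kept = []
    for record in records:
        # Fall back to the raw number when the assumed format does not match.
        key = accession_key(record["MoMANumber"]) or record["MoMANumber"]
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


# Hypothetical rows, not taken from the dataset.
rows = [
    {"MoMANumber": "123.2005.1"},
    {"MoMANumber": "123.2005.2"},
    {"MoMANumber": "885.1996"},
]
print([r["MoMANumber"] for r in drop_duplicates(rows)])
```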

Once dates were defined in YYYY format, calculations were simple. YearCreated minus YearBorn = AgeCreated. YearAccessioned minus YearCreated = YearCreatedtoAccessioned. YearAccessioned minus YearDied = YearsPosthumouslyAccessioned.
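A small worked example may make the three subtractions concrete. The Python sketch below is illustrative only; the input years are hypothetical, and treating a negative posthumous value as "not posthumous" is my assumption rather than a rule stated in the study.

```python
def diff(later, earlier):
    """Subtract two years, returning None if either one is missing."""
    if later is None or earlier is None:
        return None
    return later - earlier


def derived_measures(year_born, year_died, year_created, year_accessioned):
    """Compute the three derived columns from four YYYY values."""
    posthumous = diff(year_accessioned, year_died)
    # Assumption: a negative value means the work was acquired during the
    # artist's lifetime, so it is left out of the posthumous measure.
    if posthumous is not None and posthumous < 0:
        posthumous = None
    return {
        "AgeCreated": diff(year_created, year_born),
        "YearCreatedtoAccessioned": diff(year_accessioned, year_created),
        "YearsPosthumouslyAccessioned": posthumous,
    }


# Hypothetical values, not taken from the dataset:
# born 1900, died 1970, work created 1942, accessioned 1975.
example = derived_measures(1900, 1970, 1942, 1975)
print(example)
```

With these made-up years, the artist was 42 when the work was created, the work entered the collection 33 years later, and that was 5 years after the artist's death.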

For gender analysis, I found the bulk of women artists’ names on the Modern Women MoMA page. More names were discovered by adding these to the Artist column, sorting and finding similar first names. In the Architecture dataset women were often found in the list of project credit names. For this reason, each person in a multiple name listing was given a separate entry, so that Male and Female could be more accurately counted.
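Giving each credited person a separate entry might look like the Python sketch below. It assumes names in a multiple-name listing are separated by commas and that rows are plain dictionaries; the separator and the function name are assumptions, and the names in the example are invented.

```python
def split_credits(record, sep=","):
    """Return one copy of the record per name in its Artist field."""
    names = [name.strip() for name in record["Artist"].split(sep) if name.strip()]
    # Every other field is carried over unchanged to each new row.
    return [{**record, "Artist": name} for name in names]


# Hypothetical row, not taken from the dataset.
row = {"Artist": "First Architect, Second Architect", "Title": "Project model"}
for entry in split_credits(row):
    print(entry)
```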

The dataset sizes varied: Painting (2,147), Sculpture (1,371), Architecture (709; 1,102 by Gender), Drawing (6,737), artworks by Men (10,098) and artworks by Women (1,259). For this reason, the graphs were created with ggplot in R, with the y-axis normalized to show proportion rather than frequency.
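To show why proportion makes groups of very different sizes comparable, here is a minimal Python sketch of the normalization. It is not the ggplot code used for the graphs, and the ages listed are invented for illustration.

```python
from collections import Counter


def proportions(values):
    """Map each distinct value to its share of the group (shares sum to 1)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: counts[value] / total for value in sorted(counts)}


# Hypothetical AgeCreated values for a large group and a small group.
large_group = [40, 40, 40, 40, 42, 42, 45, 45]
small_group = [40, 40, 42, 45]

# Raw counts differ (4 vs 2 works at age 40), but the proportions match.
print(proportions(large_group))
print(proportions(small_group))
```

Both groups put half of their works at age 40, which a frequency axis would hide behind the difference in group size.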

The findings in this study are based on The Museum of Modern Art Collection data downloaded from GitHub on 10/31/2015, DOI 10.5281/zenodo.35610. The visualizations were created in R.

Find the link to Part 1 of 120kMoMA here

My initial work on the MoMA dataset started in Lev Manovich’s “Social and Cultural Computing” course at the Graduate Center, CUNY, during Fall 2015.