“What does the data tell us?: Representation, Canon, and Music Encoding”

Anna Kijas
May 29, 2018


*Keynote text delivered at the Music Encoding Conference, University of Maryland, May 24, 2018.

I am thrilled to be here with you today. I would like to begin by thanking the organizers, Raffaele Viglianti and Stephen Henry, for inviting me to give this keynote. I would also like to thank the students and staff at the University of Maryland Libraries, MITH, and the Clarice Smith Performing Arts Center who were involved in making this conference run smoothly. Thank you.

When Raff and Stephen invited me to give this keynote, they told me that the theme of the conference was “encoding and performance.” My original inclination was to talk about the ways in which I have engaged with digital tools and methods to facilitate faculty research and pedagogical initiatives, to discuss the affordances that digital editions and encoded music can bring to a music seminar, and to consider ways in which students might interact with these materials. I will talk a bit about these things, but what was really on my mind, and what is the focus of this keynote, is the issue of representation and canon. My hope is that this talk will encourage discussion and reflection.


From 2010 to 2015, I worked at the University of Connecticut (UConn) as the Music & Dramatic Arts Librarian and a digital humanities specialist. The music department offered degrees from the undergraduate through the doctoral level. One of my many responsibilities was to teach several sessions of a graduate music research and bibliography course, in which students were introduced to key resources (both analog and digital) in music bibliography and research. One of the goals of this course was to expose students to the research process and its activities, such as finding and accessing primary and secondary materials, creating a bibliography, or writing a literature review.

A reading room at the Finnish National Archives in Helsinki. Photo: The National Archives of Finland/Marko Oja.

Archival research was new to most of the graduate students enrolled in this course. One of my goals, therefore, was not only to discuss and show them the various thematic catalogues, indexes, and bibliographies that could lead them to manuscripts or early editions, but also to demonstrate the process of searching for digitized materials in the numerous open access digital collections that had come online in the 21st century, as well as how to locate interactive or analytical music resources.

Example of digitized manuscript in Bibliothèque nationale de France. Lili Boulanger, Esquisses et brouillons pour Clairières dans le ciel, 1914–16 (manuscrit autographe).

Digital music collections, such as Gallica (BnF), Polona (from the National Library of Poland), Early American Sheet Music (Library of Congress), Sheet Music Consortium, or the IMSLP Petrucci Music Library, have, in the last 20 years or so, made music manuscripts and early editions widely accessible and discoverable to scholars, students, and music enthusiasts.

Example of digitized sheet music in Sheet Music Consortium. Teresa Carreño, Gottschalk Waltz, 1863.

For students (performers and musicologists alike), access to materials in a digital environment can facilitate their research and enable them to analyze and study compositional works previously only available through a visit to the archives, on microfilm, or in a facsimile edition. As a subject and digital humanities librarian, I not only introduced students to these materials, but also demonstrated ways in which they might wish to engage with digitized materials and encoded data, hoping that a few of them might show interest in digital humanities or digital musicology as a way to leverage the affordances of digital tools and methods. The digital collections mentioned above, amongst many others, are important resources; however, they primarily offer access to high-quality images with minimal metadata and no underlying music data that can be extracted for further analysis and study. How do most students interact with these materials? In general, I found that students view the images online, bookmark URLs, extract measures or selections for their papers (often as screenshots), and download or print individual or full images for further study or annotation.

Creating notes and saving items in Polona about Maria Szymanowska, including, Vingt Exercises et Préludes (Leipzig: Breitkopf u. Härtel, [1819]). Nr. 43354729.

Some of the platforms, such as Polona, enable users to create notes or annotations about the material they are viewing and retrieve or view them all together, as well as add items to a collection for later review or study.

Over time, digital library collections, including those from the Internet Archive, Gallica, Polona, and the HathiTrust, have applied OCR to textual materials, such as historical newspapers or journals, enabling users to search within the texts and, in some cases, to correct the text. Unfortunately, this is not yet a standard practice for music materials. As many of you are aware, optical music recognition (OMR) is a technology that is still under development and cannot yet be applied at large scale to digitized music materials. When students create and save notes for digitized music materials, it is akin to writing on a paper sticky (post-it) note and putting it into their score or text. In general, projects that have attempted to facilitate user interaction with source materials in a single digital workspace have focused on digital music editions.

This includes the Online Chopin Variorum Edition (OCVE). The OCVE has aggregated Fryderyk Chopin’s manuscripts and printed editions from libraries in Europe into a single platform where users can compare and view sources, melodic examples, or create annotations and commentaries that they can save in their own collection. This and other more recent digital music editions (such as the Beethoven Werkstatt, Giuseppe Sarti Edition or Carl Maria von Weber Gesamtausgabe) have focused on addressing issues around the “work concept,” especially for composers or repertoire that may not have a definitive textual and music source history.

Impromptu, Op. 29, bar 1 compared across multiple sources in OCVE, http://www.chopinonline.ac.uk/ocve/.

These digital music editions are valuable contributions; however, there are some drawbacks. Not all of them are open access, and there is not yet a way to aggregate these editions and other music data through a single portal or search interface, which would facilitate access and discovery, especially for students who are new to the research process.

In addition to digital music editions, there are encoding projects and resources that enable analysis, manipulation, and comparison of musical patterns and repertoires via graphical user interfaces, such as Themefinder (Center for Computer Assisted Research in the Humanities, Stanford University), Medieval Music Database, The Josquin Research Project (Stanford), or the CANTUS manuscript database (Waterloo U/international). Because of the underlying encoded music data, these tools enable students and scholars to search across repertoire. As with digitized manuscripts, however, many of these tools focus primarily on early music (often liturgical) repertoire from Western Europe, and while they may occasionally include a small percentage of composers from other time periods, they generally exclude works by women, people of color, and non-Western composers.

Computational musicology projects developed over the last several decades, such as Computerized Mensural Music Editing (CMME) (UC Davis/Utrecht), MuseData (CCARH, Stanford), or Scribe (now NeoScribe) music software, have also focused primarily on early music repertoire. Although the music data is generally accessible online, doing anything with this data presents a high barrier to entry for graduate students who may have little or no familiarity with encoding, programming, or computational analysis. Newer initiatives focused on the application of open and interoperable schemas and standards, including the Music Encoding Initiative (MEI), the music21 toolkit (MIT, Cuthbert), and MusicXML, with which scholars are creating music datasets and digital music editions that represent various repertoires, genres, and time periods, present a greater range of possibilities for the study and analysis of music. These initiatives, along with efforts from the IIIF (International Image Interoperability Framework) and optical music recognition (OMR) communities, continue to push the boundaries of what will be possible when digitized music sources across collections and platforms can be displayed and searched through a single interface, such as the Single Interface for Music Score Searching and Analysis (SIMSSA) (under development). Ongoing efforts built upon the decades-long work of scholars, students, library and archives professionals, technologists, and others have brought us closer to the promise of what music encoding and OMR can offer to the music research and scholarly community. Yet there are still a number of challenges that need to be addressed, especially in the areas of pedagogy and training, archival selection, recovery, and canonization.

Acts of Recovery

How has the musicological community participated in acts of recovery, and how has this translated into the work of the music encoding community? I should first explain what I mean by acts of recovery: the uncovering or recovery of histories, narratives, and works by underrepresented or marginalized people. In academia, scholars primarily engaged with second-wave feminist thinking (1960s–80s) became increasingly interested in decentering historical (hegemonic) narratives. In musicology this took a number of forms, including acts of recovery, development of a feminist (and later queer) music criticism, rethinking the canon, examination of ideological and cultural constructions, and application of methods from outside the discipline (such as cultural studies, literary criticism, feminist theory, queer studies, philosophy, and anthropology). I will mention a handful of publications by scholars who aimed to decenter the canon through acts of recovering women musicians. These publications and scholars have a place on my bookshelf and have shaped my approaches to and understanding of canon and gender in relation to musicology.

One such publication was a 1986 collection of essays, Women Making Music: The Western Art Tradition, 1150–1950, edited by Jane M. Bowers and Judith Tick, which focused on highlighting neglected women composers overlooked by historical musicology. This text was an important contribution to musicology and is often on the reading list for university music courses, usually those relegated to the category of “women in music.” In honor of the 25th anniversary of WMM at the 2012 Feminist Theory in Music Conference, Tick shared that, looking back, she and Bowers “did not confront the use and misuse of ‘greatness’ in contemporary historical musicology. (At its most perverse, misuse implicitly promotes the notion that the least bit of trivia about a ‘great man’ is more important than scholarship on ‘second-tier’ musicians.) We did not destabilize the idea of a ‘canon.’” Although the editors may not have explicitly stated that their aim was to destabilize the canon, their publication can be viewed as an implicit, yet profound, contribution to decentering it. In 1993, Ruth A. Solie’s Musicology and Difference pushed the boundaries by exploring and applying concepts central to disciplines outside of musicology, including critical theory, ethnography, and post-structural theory, in order to address issues of otherness, gender, sexuality, and ideology. Essays in Susan C. Cook and Judy S. Tsou’s Cecilia Reclaimed (1994) blurred the line between Western and non-Western music, and between high and popular culture, thus demonstrating that “the West is to be understood as a specific culture among many others.”

Alongside scholarly writing, a number of publishing initiatives sprang forth that focused on uncovering and preserving works by women composers from the medieval through the contemporary period. These included the music presses ClarNan (est. 1984), Furore Press (est. 1986), and Hildegard Publishing Company (est. 1988). Anthologies of music by women also gained popularity, in an effort to resurrect compositions by women and package them so that they could supplement content in university music courses. One example is James Briscoe’s Historical Anthology of Music by Women (1986; later 2004 ed. New HAMW), which presented scores and recordings of vocal and instrumental music by composers from ancient Greece through the 20th century alongside biographical articles written by leading scholars.

On a personal note: although these presses and anthologies were around as early as 1984, my own understanding of canon was first shaped and reinforced by my Polish piano teacher, trained in Western music traditions, who encouraged my love of Chopin but never assigned any repertoire by women composers. Perhaps it never crossed his mind to mention that there was a Clara Schumann or a Maria Szymanowska. Or, more likely, it is because this assignation of value and superiority to works within the canon has been passed down across generations by teachers to their students, whether they are performers, composers, or scholars. Students that I work with have described similar experiences of minimal exposure to works by women or marginalized figures during their undergraduate and even graduate studies. As Katherine Bergeron wrote in the prologue to Disciplining Music (1992), “The canon, always in view, promotes decorum, ensures proper conduct. The individual within a field learns, by internalizing such standards, how not to transgress.” It was not until I was an undergraduate performance major that I searched for and “discovered” women composers, thanks to the encouragement of several musicologists and mentors.

In the essay “What Do We Want to Teach,” written 19 years ago in Rethinking Music, Ellen Koskoff observed that “Simply creating a canon is not a problem; nor is embodying it with one’s own meaningful values. The problem comes with canonization — the institutionalization of certain works over others through the imposition of hierarchies of self-invested value upon other people and their musics.” How does this apply to the work we do as music scholars, librarians or archivists? When we teach music history, theory, or repertoire, when we program or perform works, when we create data — we are privileging works by certain composers and excluding others. We know this of course, but it becomes glaringly apparent when we look at, for example, whose works are performed by major orchestras in the United States. Staff at the Baltimore Symphony Orchestra have analyzed data from 89 of the largest symphony orchestras in the United States with membership in the League of American Orchestras. For the 2015–2016 season, they found that in a total of 2978 concerts in which 504 composers were represented, 98.3% of the composers were male and 1.7% female.

(Partial view) Poster by Rachel Upton and Ricky O’Bannon. https://www.bsomusic.org/stories/what-data-tells-us-about-the-2015-16-orchestra-season.aspx.

And if you are curious which composers’ works are most frequently performed, here is the breakdown.

(Partial view) Poster by Rachel Upton and Ricky O’Bannon.

In the Winter 2018 issue of Symphony, Jesse Rosen, President and CEO of the League of American Orchestras, interviewed several thought leaders from academic or performance organizations to consider this question: “are orchestras culturally specific?” Cecilia Olusola Tribble, community and organizational development coordinator for the Metro Nashville Arts Commission, responded, “If we think about how white art forms, white people, white icons, composers, have always been at the fore of writing history — that is the issue. The question of cultural specificity is raised in a way that doesn’t take responsibility for the fact that classical music, historically and presently, is a colonizing force, and is a tool of colonization. Not only here in the United States, but globally.” Her statement is reflected in the content taught in music history courses, the repertoire our students perform, and the music we encode.

This reality is also visible in the technology and algorithms we use on a daily basis. For example, if you go to Google and type in “music composers,” your results will display a bar of images that you can scroll through. Here are a few of them…

What do you notice? In order to find women composers, black composers, Asian composers, etc. you need to add an additional attribute term to your search, such as “women music composers” or “black music composers.”

The default according to this algorithm is that “music composer = white male.”

For decades, music encoding has largely been the domain of scholars interested in early music. Access to medieval and Renaissance music sources coincided with the availability of textual or music manuscripts and early printed editions, first on microfilm and then online as digital images (the latter since the late 1990s). The work of libraries and archives in creating accessible online content has benefited the early music community and enabled the development of music encoding projects. Digital library collections, such as DIAMM (a comprehensive resource listing polyphonic music manuscripts up to c. 1600) or the British Library Catalogue of Illuminated Manuscripts, as well as many individual institutional digital library collections, continue to make high-resolution digital images available.

With funding and interest in building interoperable platforms, such as Europeana, Polona, Gallica, and HathiTrust, there has been an increase in accessible, aggregated digital content and metadata from individual institutions. While there has been growth in the number of manuscripts or early editions by underrepresented or unknown composers that can be accessed and studied online, institutions still privilege composers who hold a prominent place in the Western music canon. For example, while the British Library has made numerous music manuscripts and editions by well-known composers available as digital images, including those by Beethoven, Handel, Haydn, and Schubert, works by women composers, such as Ethel Smyth (1858–1944), Francesca Caccini (1587–1641), or Thea Musgrave (b. 1928), have not yet been digitized.

“International Women’s Day 2018,” British Library Music Blog, (March 8, 2018). http://blogs.bl.uk/music/2018/03/international-womens-day-2018.html.

And these are the more notable women composers who have made it into our music history courses, whose works were firsts in their own right, such as Smyth’s opera Der Wald, which premiered in 1903 and remained the only opera by a woman to be staged at the Met until 2016; or Caccini’s opera La Liberazione di Ruggiero dall’Isola d’Alcina (1625), believed to be the first opera composed by a woman. Works by these and other women are missing from encoded datasets and digital music editions. Even when published scores or digital images are available, these composers continue to be overlooked and excluded from projects, especially those receiving grant funding for the creation of large music datasets. If we continue to exclude works by women, people of color, and non-canonical composers, then how useful will our data be, and for whom?

Decisions about which composers, repertoire, and genres to select for encoding or digital music edition creation are often linked to the practices and research interests of the scholars at a particular institution or even geographic location. They are also tied to the disciplinary traditions passed from faculty to student and are influenced by the particular holdings (analog or digital) of an archive or library collection, as well as by funding sources.

Libraries and archives digitize content to make it accessible for scholars to use in their research and often expose data or create datasets that can be used for encoding, computational analysis, digital edition creation, or tool building. Some scholars may be involved in acquisitions decisions, in particular when expensive manuscripts or rare materials are being considered, as well as in the selection or prioritization of content for digitization. One problem that has arisen is that universities often focus their efforts on large-scale digitization of hegemonic texts, such as literary corpora by white European male authors, liturgical manuscripts, or Western early music editions, which again reinforces canonization. The collections in libraries and archives, primarily those in first-world countries with access to digital imaging equipment and digital library infrastructure, as well as institutional or grant funding, perpetuate not only canonization, but also colonization.

While I cannot fully address the issue of colonization in this talk, I will point out that there are scholars and projects challenging colonization in the archives and academia. A few recent examples include Elizabeth Maddock Dillon’s (Northeastern) work on the Early Caribbean Digital Archive, which is working to uncover a literary history of the colonial Caribbean that is “written or related by black, enslaved, creole, and/or colonized people,” and Tamara Levitz’s (UCLA) 2017 talk at the Society for American Music (SAM), in which she called for addressing “structures of inequality and white supremacy in” [SAM].

With regard to canonization, most librarians or archivists see this happen firsthand in the materials that are housed in their institutions or prioritized for digitization. At Boston College, I have thus far been involved with two music encoding projects.

Michael Noone, Graeme Skinner, and Boston College Libraries, The Burns Antiphoner, (2016). DOI 10.17605/OSF.IO/2SAUF; burnsantiphoner.bc.edu.

One of these projects focused on a 14th-century liturgical antiphoner (the Burns Antiphoner), which we encoded, made available through a Diva.js viewer in several formats, including JSON and MEI XML, and contributed to the CANTUS database. The encoded data augments the digitized manuscript in a way that may contribute to scholarly research and a greater understanding of this particular genre of music. Without diminishing our accomplishment, however, we have in a sense contributed low-hanging fruit rather than challenging the notion of canon.

Recently, there has been a noticeable increase in discourse around library, archives, faculty, and institutional engagement in de-centering and decolonizing collections. A number of scholars, among them, Elizabeth Maddock Dillon (Northeastern U), Tonia Sutherland (University of Alabama), Lae’l Hughes-Watkins (Kent State University), and Safiya Noble (University of Southern California) have been leading conversations on erasure, colonization, and social justice in the archives, as well as algorithmic bias. Professional organizations, such as the Digital Library Federation (DLF), are providing support to professionals engaged in these efforts through grants, initiatives, and resources, including bibliographies around topics, such as “Ethics and Social Justice” for advancing hidden collections or documenting culturally sensitive materials.

As more libraries and archives begin to thoughtfully evaluate their collections, as well as their selection and digitization processes, they will require allies and advocates among their faculty. In addition to the institutional mission or strategic goals, curricular and research needs drive priorities around acquisition and digitization, and this necessitates close interaction between librarians/archivists and faculty. Institutional and faculty priorities often take precedence over library- or archives-led projects; it is therefore critical that there is faculty buy-in for initiatives focused on recovering or de-centering collections, which may lead to fruitful collaborations.

Lansing Urban Renewal (Michigan State University, student work for RCAH 192).

Some of these collaborations may take the form of digital pedagogy projects, which may use archival materials that focus on recovering historical narratives and social justice issues, as demonstrated in projects such as the student-developed Lansing Urban Renewal project from Michigan State University or the Women, Work, and Song in Nineteenth-Century France exhibit from the McGill University Libraries, which features musical collections by women through scholarly essays, digitized content, and exhibits.

Kathleen Hulley, Ph.D., and Kimberly White, Ph.D., Women, Work, and Song in Nineteenth-Century France, McGill University Libraries.

My intention is not to criticize early music scholars or the music encoding community in their efforts to make data available for analysis and study or to create encoded editions. This is an important area of research and contribution to our understanding of music. Instead, I suggest that we examine and consider the digital canon that we are creating: a canon that does not challenge or decenter, and that has largely excluded work by women, people of color, and other underrepresented groups. As we continue to seek materials and create larger datasets in order to develop our machine learning capabilities, optical music recognition functionality, and tools for searching, aggregating, and analysis (IIIF, SIMSSA), we must keep in mind which composers and repertoire we are including and which we may be excluding.

Canonization is, of course, not a new phenomenon and is not limited to musicology. This is a problem that can be found in other disciplinary areas, including literary studies and digital humanities. As Texas A&M literary scholar Amy Earhart writes in “Can Information be Unfettered? Race and the New Digital Humanities Canon” (2012), “Without careful and systematic analysis of our digital canons, we not only reproduce antiquated understandings of the canon but also reify them through our technological imprimatur.” She continues… “In digital humanities, however, we have much theoretical work to do in the selection of materials and application of digital tools to them.”

In literary studies, with access to digitized texts and the TEI, a number of digital edition projects were born in the 1990s and early 2000s, largely focused on white male authors, such as the Walt Whitman Archive, Dante Gabriel Rossetti Archive, Algernon Charles Swinburne Project, or the Mark Twain Project Online. A few encoding projects centered around women authors or feminist literary history also came online and still continue to develop, specifically the Women Writers Project (WWP), Willa Cather Archive, and Orlando. The WWP, in particular, is engaged in recovery of texts by early modern women writers. The project team is creating an encoded dataset of these texts that can be analyzed, studied, and visualized by students and scholars. It is also important to note that, through training students, scholars, and librarians in the TEI and developing assignments and teaching materials from their collections, the WWP project team has created a community of practitioners.

The musicology community is beginning to follow in the footsteps of its colleagues in literary studies. With the development of MEI and MusicXML (early 2000s), developers, scholars, and archives and library professionals are applying the standards and schemas drawn from the TEI and XML communities to music sources in an effort to build digital editions.

MEI Projects, http://music-encoding.org/community/projects-users.html.

Within the last decade or so a number of digital music editions have been under development or published that focus on music of composers or repertoire from the 16th, 18th, and 19th centuries. In terms of encoding projects, the musicology community does not yet have anything comparable to the Women Writers Project, Willa Cather Archive, or Orlando. What are we waiting for?

If we take a look at which projects have received grants, we will find that in the United States, there were a total of 15 projects (since 2005) funded by the National Endowment for the Humanities and Andrew W. Mellon Foundation. A query in the NEH grants database with the search terms: “digital music,” “music encoding,” “musicxml,” and “optical music recognition” retrieved ten proposals from 2008–2016, for a total of $1,001,056.

National Endowment for the Humanities grants for music encoding projects, 2008–2016.

The Music Encoding Initiative (MEI) benefited from funding through the Digital Humanities division’s joint NEH and DFG (German) program, receiving funding in both 2009 and 2010 to focus on developing the data model and standard. Four of the ten projects focused primarily on encoding early music, specifically Renaissance repertoire. The one project focused on the musical style of Western music from 1300 to 1900, which aimed to build “one of the largest online repositories of symbolic musical data,” did not include a single woman composer or person of color in the public-facing dataset. In this chart, you will also see that the majority of the projects fall into the tool building category.

The Andrew W. Mellon Foundation has funded music encoding projects since the early 2000s. Querying their grants database with the search terms “digital music,” “digital musicology,” and “music encoding” retrieves 5 relevant projects, funded at a total of $1,621,000. Three of these projects (shown in blue) focused on tool building and creation of digital music editions (UK institutions), while the other two (shown in orange) focused on encoding a musical corpus (Indiana U).

Andrew W. Mellon Foundation grants for music encoding projects, 2005–2016.

This sampling of data from the NEH and Mellon grants also illustrates what Amy Earhart notes in her own observations of digital humanities projects, that “examination of funded projects reveals that the shift toward innovation has focused on technological innovation, not on innovative restructuring of the canon through recovery.” As scholars continue to pursue future grant projects there should be a conscious effort to be more inclusive and perhaps seek out partners across institutions (including libraries and archives) who may house or have access to materials that have been overlooked. Grant agencies should also encourage applications for music encoding projects that explore or address issues of intersectionality, diversity, and recovery.

Digital music edition projects exist largely in institutions or centres located in Europe, where a strong tradition of scholarly editing has flourished, carries more value, and often receives greater resources and institutional support than in other parts of the world. This can be seen with a number of recent projects at institutions such as the Danish Centre for Music Publication, Akademie der Wissenschaften und Literatur Mainz, Programme Ricercar at the Centre for Renaissance Studies in Tours (CESR/University of Tours), and Universität der Künste Berlin. Many of these institutions are also building tools for computational analysis or graphical user interfaces for non-programmers, which are meant to break down some of the barriers associated with music encoding. While one of the benefits of encoding music is that it enables scholars to encode all versions of a manuscript or printed edition, individuals or institutions engaged in compiling digital music editions are still stuck on the singular composer/creator model. In her essay on “Editing Early Modern Women’s Manuscripts,” Texas A&M English scholar Margaret Ezell makes the following observation: “editors do not please to select certain types of material and this is in part because perhaps we are not yet changing some of the basic assumptions about what an ‘edition’ does, or in [historian Michael] Hunter’s terms, what is ‘appropriate.’” If those of us in positions of privilege and authority, in selecting music sources for digitization, encoding, or edition creation, are looking for items that represent a “complete” collection or meet the criteria of the “work concept,” then we will continue to overlook the sources that would otherwise contribute valuable data and content for a richer understanding of musical history and a more inclusive digital canon.

What can we gain when we recover, research, and analyze works typically excluded from the canon? Aspects of musical style, compositional process, attribution, or musical stylometry across genres could be better analyzed and studied if we had a greater representation of musical works, especially in a dataset. Musicologists have analyzed (with traditional tools) stylistic features of composers such as Fanny Mendelssohn Hensel (1805–1847), who collaborated closely with other musicians, and some of whose compositions were published under the name of her brother, Felix. Creating data or encoded editions of works by Hensel and other women would enable distant reading across repertoire and the identification of features that are unique to these composers or shared with musicians who may have been their contemporaries, relatives, or mentors. Through my research on underrepresented composers (and performers), including Maria Szymanowska (1789–1831), a predecessor of and influence on Chopin, and Teresa Carreño (1853–1917), I have found that close musical communities and mentors were very important to these and other women musicians. As is still the case today, their music teachers had an impact on their compositional style and performance repertoire. What could we learn from the experiences of women composers if we had a dataset to explore? Would it be possible to more easily examine and identify connections between teachers and students, as well as influences on their compositional development? How might elements in the compositional process deviate from, or exemplify, the musical structures and experiences that we expect based on our understanding of the canon? How can we leverage music data and technology to investigate musical communities of practice across the centuries at close and distant levels of reading?

In the 2004 volume Empirical Musicology, Nicholas Cook wrote, “recent developments in computational musicology present a significant opportunity for disciplinary renewal… there is potential for musicology to be pursued as a more data-rich discipline than has generally been the case up to now…” As more content becomes available in forms that can be used for optical music recognition or reformatted for encoding and data analysis, we cannot continue to ignore works by women, people of color, and other marginalized figures, or exclude them from large data-driven projects in musicology. Doing so will result in a poorly developed dataset that will impede our understanding of musical development over time.

Post-Script: Promises & Suggestions

In the thirteen years or so since digital humanities became commonplace on university campuses, in conference presentations, and in publications, music encoding and digital musicology have also gained more interest among scholars, students, and library and archives professionals. Funding bodies such as the NEH, the Mellon Foundation, the Canadian Social Sciences and Humanities Research Council, the UK Arts and Humanities Research Council, and others have supported digital music projects to encode musical corpora, build data models, create digital editions, and develop tools. Librarians, archivists, and other specialists have collaborated with scholars on many of these projects, have presented on schemas, standards (MusicXML, MEI), and encoding projects, and have provided workshops or training at annual conferences, such as those of the Music Library Association or Digital Humanities.

As a community, we have invested much time, many resources, and a great deal of intellectual labor in music encoding. As we know, music encoding is resource intensive and also presents numerous barriers to those unfamiliar with the schemas, standards, programming, and tools. At the same time, it holds many promises for the future of music research and the music community. Applications of IIIF and OMR to a larger body of digitized content can make the underlying music data available to scholars and students through search interfaces, such as SIMSSA and other federated search engines. Libraries and archives can explore the use of OMR on their collections in order to extract music notation and incipits, as well as encode music according to MusicXML or MEI standards to meet the Library of Congress’s Recommended Formats Statement for long-term preservation. Larger datasets that better represent diverse repertoires can provide a more accurate understanding of developments in music notation, melodic borrowing, methods of imitation and variation, authorship, and other musical features.

Although there is a growing community of music encoders who are willing to work with XML and develop programming expertise in order to contribute to building tools, analyzing datasets, and creating digital editions, there is still no standard curriculum or lesson plan in place in most musicology or library and information science graduate programs. Institutions and programs such as CCARH (Stanford), DDMAL (McGill), or Indiana University, amongst others around the globe, offer students and professionals the opportunity to engage with current digital musicology technologies and standards; however, many of us are self-taught or become acquainted with the languages and tools after we earn our degrees and have already moved into our careers as faculty or information professionals. There has been a growing need for training in music encoding, as well as in other areas of digital musicology. Efforts such as pre-conference workshops, THATCamps, or summer institutes, such as DHSI, have served to provide some training, yet too often music faculty who teach in undergraduate through graduate programs do not fully consider the affordances of digital pedagogy, which may include music encoding, and the ways in which it can enhance their students’ toolkit. Post-graduate opportunities related to music encoding or other areas of computational musicology are often available, but the pool of students who may be interested or have some expertise in this area is still limited.

Students learn from faculty and institutional programs to stick to the canon. Those of us in the music encoding community need to consider cultural constructions just as much as our colleagues who teach music history classes. For starters, we should not rely only on the music examples provided in music history or theory textbooks, but also look beyond them to resources such as Music Theory Examples by Women, which identifies theoretical concepts as they are applied in repertoire from early to modern music.

Music Theory Examples by Women, http://musictheoryexamplesbywomen.com/.

When we create encoded music examples, we need to include works by women and underrepresented composers. There are a number of initiatives and online sites, including the Composer Diversity Database and the Women Composers Database, where unfamiliar or non-canonical composers and works can be identified.


We have already begun to see students, faculty, and performance groups push back against the traditional canon, demanding that curricula be rewritten and concert programs revised to include more than a handful of women or marginalized composers.


Although many of us are working to create tools that do not require musicologists to be familiar with programming or markup languages and schemas in order to use them, there is value in knowing what happens in the “black box.” There is still hesitancy among faculty to explore music encoding or the application of computational musicology unless they are already familiar with it and use it in their own research. And yes, there is still skepticism about using a computer to study musical works. I have been told a number of times by faculty that music encoding is not a scholarly act and that they do not have time for it.

Moving forward, Music Scholarship Online (MuSO) promises to be an equivalent for music-focused digital projects of NINES, 18thConnect, and the Advanced Research Consortium, which have served as peer reviewers and aggregators of DH (primarily textual/TEI) projects. While MuSO will not solve the issue of canonicity, it can be used as a tool not only to bring together these dispersed datasets, editions, and other projects, but also to promote transparency through peer review and to critique our progress in addressing canonization. Peer review of digital musicology projects may also persuade scholars to venture into this area, rather than continuing to pursue the traditional publication methods often tied to promotion & tenure. If, as Earhart suggests, “standards and institution have become a core part of project success and sustainability, crucial to the canonization of digital work,” then initiatives such as MuSO and SIMSSA may be able to shift us towards acts of recovery.

As Tim Crawford and Richard Lewis wrote in their 2016 review of the “Music Encoding Initiative” in JAMS, “There may still be some musicologists who would maintain that, since the discipline has managed quite nicely for the best part of two centuries using traditional (non-digital) resources, approaches that require the use of a computer are, somehow, invalid or unnecessary. But for the rest of us — and certainly for most younger researchers — it is obvious that modern tools are needed to enable new modes of investigation that will produce genuinely useful insights into historical repertoires.” If we are to take up Crawford and Lewis’s challenge and use our “modern tools” to “enable new modes of investigation that will produce genuinely useful insights into historical repertoires,” then we must make sure that we, our students, and our collaborators are mindful of whose works or repertoire we recover, and that we consider the cultural constructions that have determined, up to this point, what we have selected and reproduced digitally.


Baker, Vicki D. “Inclusion of Women Composers in College Music History Textbooks.” Journal of Historical Research in Music Education 15, no. 1 (2003): 5–19.

Bergeron, Katherine; Philip V. Bohlman, eds. Disciplining Music. (Chicago: University of Chicago Press, 1992).

Citron, Marcia J. Gender and the Musical Canon. (Cambridge: Cambridge University Press, 1993).

Clarke, Eric F.; Nicholas Cook, eds. Empirical Musicology. (Oxford: Oxford University Press, 2004).

Cook, Nicholas; Mark Everist, eds. Rethinking Music. (Oxford: Oxford University Press, 1999).

Cook, Susan C., Judy S. Tsou, and Susan McClary, eds. Cecilia Reclaimed. (Urbana: University of Illinois Press, 1994).

Crawford, Tim; Richard Lewis. “Review: Music Encoding Initiative.” Journal of the American Musicological Society 69, no. 1 (Spring 2016): 273–285.

Downie, John Stephen; Sayan Bhattacharyya, Francesca Giannetti, Eleanore Dickson, and Peter Organisciak. “The HathiTrust Digital Library’s Potential for Musicology Research,” Digital Libraries for Musicology. Manuscript under review.

Dumitrescu, Theodor; Karl Kügle; Marnix van Berchum, eds. Early Music Editing: Principles, Historiography, Future Directions. (Turnhout, Belgium: Brepols, 2013).

Earhart, Amy. “Can Information be Unfettered? Race and the New Digital Humanities Canon.” Debates in the Digital Humanities. (University of Minnesota Press, 2012).

Ezell, Margaret J. M. “Editing Early Modern Women’s Manuscripts: Theory, Electronic Editions, and the Accidental Copy-Text.” Literature Compass 7/2 (2010): 102–109.

Hughes-Watkins, Lae’l. “Moving Toward a Reparative Archive: A Roadmap for a Holistic Approach to Disrupting Homogenous Histories in Academic Repositories and Creating Inclusive Spaces for Marginalized Voices,” Journal of Contemporary Archival Studies 5, Article 6 (2018).

Levitz, Tamara. “Decolonizing the Society for American Music.” The Bulletin XLIII, no. 3 (Fall 2017).

Rosen, Jesse. “Are Orchestras Culturally Specific?” Symphony (Winter 2018): 14–20.

Solie, Ruth A., ed. Musicology and Difference: Gender and Sexuality in Music Scholarship. (Berkeley: University of California Press, 1993).

Tick, Judith. “Reflections on the twenty-fifth anniversary of Women Making Music.” Women & Music 16 (2012): 133–138.