Subjective Data

Hannah Davis
Jun 8, 2017 · 2 min read

I’m an artist, musician, and data scientist. My practice is centered in AI, particularly through data sonification, natural language processing, and generative music. For a few years now I’ve been working on an algorithm called TransProse, which identifies emotions in a piece of text and translates them into a musical piece with a similar undertone.

Through my work on emotions in AI, I’ve become particularly interested in the subjectivity of data, and have recently started further research into this area. Studying subjective data means studying things like:

Emotions and other subjective experiences: How do we incorporate these into AI, especially as AI starts moving towards more complex and personal areas? How can we capture abstractions like emotions in datasets, and how can we model things like personality and life experience?

Identifying subjectivity in “objective” datasets: If we examine commonly used machine learning datasets, what types of subjectivity do we find? Answering this means looking at how the datasets were created, the motivations and demographics of the taggers (if taggers are present), where the data comes from (if not manually tagged), who funded the dataset, what the dataset leaves out, and where the dataset breaks down.

“Artisanal data”: In the current AI landscape, particularly related to art, it is possible to create and use explicitly subjective datasets for artistic purposes with no claims to objectivity, completeness, or usefulness. (This, to me, is interesting and positive.) Two of my favorites are Shinseungback Kimyonghun’s Animal Classifier and Sebastian Schmieg’s Decision Space.

Bias retention over time: A culture’s current set of values is a type of bias and is inherently present in datasets. How can we avoid bias retention over time, and is it possible to update datasets with newer values?

Terminology: All the above issues require labels, terms, and vocabulary to talk about them more concretely. How can we create these and use them with regularity? Would it make sense to label the ingredients of our datasets like we label the ingredients of our food? (Data marinara, anyone?)
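
As a rough illustration of what an “ingredients label” for a dataset might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `DatasetLabel` class, its field names, and the sample dataset are all hypothetical, and the fields simply mirror the questions raised above (where the data comes from, who tagged it, who funded it, what it leaves out). It is one possible shape, not an established standard.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DatasetLabel:
    """Hypothetical 'ingredients label' for a dataset; all field names are assumptions."""
    name: str
    source: str                      # where the data comes from
    collected: str                   # when it was gathered, e.g. a year
    funder: str = "unknown"          # who paid for the dataset
    tagger_notes: str = "unknown"    # who labeled the data, and why
    known_gaps: list = field(default_factory=list)  # what the dataset leaves out

    def missing_fields(self):
        """Return the names of fields still marked 'unknown' or left empty."""
        return [k for k, v in asdict(self).items() if v in ("unknown", "", [])]


label = DatasetLabel(
    name="example-emotion-corpus",   # made-up dataset, for illustration only
    source="hand-tagged sentences",
    collected="2017",
    known_gaps=["non-English text"],
)

# Print the label as JSON, then list what is still undocumented.
print(json.dumps(asdict(label), indent=2))
print("Still undocumented:", label.missing_fields())
```

A label like this makes the gaps visible: any field left at its default shows up in `missing_fields()`, which is a small nudge to document a dataset’s subjectivity rather than leave it implicit.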

Check back here and at dataobscura.org for more!


Data Obscura is Hannah Davis, mykola bilokonsky, and collaborators investigating liminal spaces in the digital landscape.
