The Dataset Nutrition Label Project | Open Leaders 6
The Dataset Nutrition Label Project empowers data scientists and policymakers with practical tools to improve AI outcomes
I interviewed Open Leaders 6 participant Kasia Chmielinski to learn more about the Dataset Nutrition Label Project and how you can contribute to the work.
Q: What is the Dataset Nutrition Label Project?
A: The Dataset Nutrition Label Project aims to create a standardized, recognizable label framework — similar to food nutrition labels — for datasets that improves industry behavior around dataset transparency, leading to healthier artificial intelligence overall.
Q: Why did you start the Dataset Nutrition Label Project?
A: Many artificial intelligence algorithms are found to make decisions that are biased towards or against specific people or outcomes. Much of this bias is in fact gleaned from the underlying data used to train the algorithms themselves, and yet the data science field lacks standardized ways to interrogate that data ahead of time. This is why we created the Dataset Nutrition Label Project: to provide a quick way for practitioners to assess the ‘health’ of their data before algorithms are built using that data.
Q: What was your experience like at MozFest this year?
A: So many smart people doing so many weird and magical things! I really enjoyed meeting people whose interests spanned virtual and digital spaces; social, artistic, and mathematical concepts; geographic regions; diversity of thought and life experience.
Q: What challenges have you faced working on this project?
A: Data comes in all shapes and forms, which makes standardization of metadata incredibly challenging. A ‘nutrition’ label for a dataset about trees in Central Park will look different from a label for an image dataset of human faces. Instead of building a single label format, we are focusing on building a framework that is flexible depending on the kind of data you are assessing.
Q: What kind of skills do I need to contribute to your project?
A: We are currently looking for dataset authors or curators to experiment with us on building labels, especially those in the Open Data space. We’re also looking for UX and UI designers who want to prototype the best way to visualize large amounts of data and communicate this to data scientists at multiple skill levels.
Q: How can others contribute to your project?
A: We would love to work with you! Check out our paper and prototype, or take a look at our code and drop us a comment. If you’d prefer to have a conversation, you can reach us at firstname.lastname@example.org. We look forward to hearing from you!
Q: How has the Open Leaders program helped you with your project?
A: The Open Leaders program provided a structure for us to better understand ourselves. Through the program, we were able to assess our readiness for contribution, the ways in which we are seeking collaboration, and a rough roadmap for what comes next. Also: we were able to connect with some amazing experts for advice and feedback.
Q: What meme or gif best represents your project?