Trustworthy AI: Check Where the Machine Learning Algorithm is Learning From

We care about what our children learn, but we do not yet care about what our robots learn from. One key idea behind trustworthy AI is that you verify which data sources your machine learning algorithms are allowed to learn from. As we have emphasised in our forthcoming academic paper and in our experiments, a key reason why you see too few artists from small countries, or too few womxn, in the charts is that big tech recommendation systems and other autonomous systems are learning from historically biased or patchy data.

A key mission of our Digital Music Observatory, our modern, admittedly subjective take on what the future European Music Observatory should look like, is to provide not only high-quality data on the music economy, the diversity of music, and the audience of music, but also high-quality metadata. The quality, availability, and interoperability of metadata (information about how the data should be used) are key to building trustworthy AI systems.
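
To make the metadata point concrete, here is a minimal sketch with made-up field names and a toy catalogue rather than any real schema: a system can only surface Slovak music if its records carry a usable country of origin in the first place.

```python
# A minimal sketch (not the Digital Music Observatory's actual code) of why
# metadata quality matters: a recommender can only surface Slovak repertoire
# if the records carry usable origin metadata. The field names below
# (isrc, origin_country) are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Track:
    title: str
    isrc: Optional[str]            # e.g. "SK-A01-21-00001"; None if unreported
    origin_country: Optional[str]  # ISO 3166-1 alpha-2 code, e.g. "SK"

def country_of(track: Track) -> Optional[str]:
    """Best-effort country lookup: explicit metadata first, then the ISRC prefix."""
    if track.origin_country:
        return track.origin_country.upper()
    if track.isrc and len(track.isrc) >= 2:
        return track.isrc[:2].upper()
    return None  # the track is invisible to any country-aware filter or quota

catalogue = [
    Track("Song A", "SK-A01-21-00001", None),   # origin recoverable from the ISRC
    Track("Song B", None, "SK"),                # explicit origin metadata
    Track("Song C", None, None),                # patchy metadata: origin unknown
]

known = [t for t in catalogue if country_of(t)]
print(f"{len(known)}/{len(catalogue)} tracks have a usable country of origin")
```

In this toy catalogue one track in three cannot be attributed to any country at all, and no downstream algorithm, however well intentioned, can correct for a bias it cannot even measure.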

Traitors in a war used to be executed by firing squad, and it was a psychologically burdensome task for soldiers to have to shoot former comrades. When a ten-marksman squad fired eight blanks and two live rounds, the traitor was certain to die, and the soldiers could walk away with a semblance of consolation in the fact that each of them had an 80% chance of not having been the one who killed a former comrade. This is a textbook example of assigning responsibility and blame in systems. AI-driven systems such as the YouTube or Spotify recommendation systems, the shelf organization of Amazon books, or the workings of a stock photo agency come together through complex processes, and when they produce undesirable results, or, on the contrary, improve life, it is difficult to assign blame or credit [..] If you do not see enough women on streaming charts, or if you think that the percentage of European films on your favorite streaming provider, or of Slovak music on your music streaming service, is too low, you have to be able to distribute the blame in more precise terms than just saying “it’s the system” that is stacked against women, small countries, or other groups. We need to be able to point the blame more precisely in order to effect change through economic incentives or legal constraints.

Assigning and avoiding blame: read the earlier blog post here.

This is precisely the type of work we are doing with the continued support of the Slovak national rightsholder organizations. In our work in Slovakia, we reverse engineered some of these undesirable outcomes. Popular video and music streaming recommendation systems have at least three major components based on machine learning. The problem is usually not that an algorithm is nasty or malicious; algorithms are trained with machine learning techniques, and machines often “learn” from biased, faulty, or low-quality information. Our Slovak musicologist data curator, Dominika Semaňáková, explains how we want to teach machine learning algorithms to learn more about Slovak music in her introductory interview.

Read more about our Slovak music use case here.
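
The mechanism is easy to see in a toy example. The sketch below is not any real platform’s algorithm; it is a deliberately naive most-popular recommender, with made-up listening data, that shows how a system trained on historically skewed charts simply reproduces the skew.

```python
# A toy illustration (not any real platform's algorithm) of how a system
# "learns" bias from its training data: a naive most-popular recommender
# trained on historical charts that under-represent a small repertoire
# simply reproduces that under-representation.

from collections import Counter

# Hypothetical historical listening log: the "SK" repertoire is rarely present,
# so it is rarely surfaced, so it generates even less future data.
history = ["US"] * 80 + ["UK"] * 15 + ["SK"] * 5

def recommend_top(history: list[str], k: int = 3) -> list[str]:
    """Recommend the k most frequent repertoires in the training data."""
    return [repertoire for repertoire, _ in Counter(history).most_common(k)]

print(recommend_top(history, k=2))  # ['US', 'UK']: the small repertoire never appears
```

Nobody wrote a rule that demotes Slovak tracks; the under-representation is inherited entirely from the data the system was allowed to learn from, which is why auditing the training data, and not just the code, is the first step of any trustworthy AI check.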

These undesirable outcomes are sometimes illegal, as they may go against non-discrimination or competition law. (For our ideas on what can go wrong, see Music Streaming: Is It a Level Playing Field?) They may undermine national or EU-level cultural policy goals, media regulation, child protection rules, and fundamental rights protections against unjustified discrimination. They may also mean that Slovak artists earn significantly less than American artists.

In our academic (pre-print) paper we argue for new regulatory considerations to create a better and more accountable playing field for deploying algorithms in quasi-autonomous systems, and we suggest further research to align economic incentives with the creation of higher-quality and less biased metadata. The need for further research on how these large systems affect fundamental rights, consumer and competition rights, and cultural and media policy goals cannot be overstated.

Incentives and investments in metadata

The first step is to open up and understand these autonomous systems, and this is our mission with the Digital Music Observatory: a fully automated, open-source, open-data observatory that links public datasets in order to provide a comprehensive view of the European music industry. It produces key business and policy indicators, as well as research experiment data, following the data pillars laid out in the Feasibility study for the establishment of a European Music Observatory.
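
As a flavour of what “linking public datasets” means in practice, here is a minimal sketch built on an invented two-table example; the column names and numbers are illustrative assumptions, not the observatory’s actual schema or data.

```python
# A minimal sketch of the kind of dataset linking an open data observatory does:
# joining a (hypothetical) streams table with a (hypothetical) works metadata
# table to produce a simple policy indicator, the share of domestic repertoire
# in total streams. Column names and figures are assumptions for illustration.

import pandas as pd

streams = pd.DataFrame({
    "isrc":    ["SK0012100001", "US0012100002", "US0012100003"],
    "streams": [10_000, 250_000, 180_000],
})

works = pd.DataFrame({
    "isrc":           ["SK0012100001", "US0012100002", "US0012100003"],
    "origin_country": ["SK", "US", "US"],
})

# Link the two public datasets on a shared identifier and compute the indicator.
linked = streams.merge(works, on="isrc", how="left")
share_sk = linked.loc[linked["origin_country"] == "SK", "streams"].sum() / linked["streams"].sum()
print(f"Share of Slovak repertoire in streams: {share_sk:.1%}")
```

Real indicators are built from far larger, versioned, and documented sources, but the principle is the same: the indicator is only as good as the metadata used in the join.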

Join our Digital Music Observatory as a user, curator, or developer, or help us build our business case.

Join our open collaboration Music Data Observatory team as a data curator, developer, or business developer. More interested in antitrust, innovation policy, or economic impact analysis? Try our Economy Data Observatory team! Or does your interest lie more in climate change, mitigation, or climate action? Check out our Green Deal Data Observatory team!

Originally published at https://dataandlyrics.com on June 8, 2021.

Daniel Antal

Co-founder of Reprex, a reproducible research company and member of the Dutch AI Coalition. Data scientist behind the screen, analogue photographer in the sun.