The big issues in social data science in 2021: a view from Melbourne, Australia (Part 2)

Data & Policy Blog
Mar 19, 2021

By Anthony McCosker (@ACMcCosker), Diane Sivasubramaniam, Liz Seabrook, Kath Albury (@KathAlbury), Sam Wilson & Jane Farmer (@jane_c_farmer) from the Swinburne University Social Innovation Research Institute. This is the second of two articles, and it focuses on data user and producer rights, long-term data usability, and ethical Artificial Intelligence. Part 1 outlined key contemporary themes relevant to social data science researchers.


The rights of end-users and data producers

How do we bring in the end-users and producers of data, and engage them in data science and in the uses of their data?

One of the challenges with data science is making sense of the output the ‘black box’ gives you. There has been a move toward Explainable AI: techniques that demystify how a model works and make it possible to explain why particular decisions are made. But who is currently involved in interpreting AI?
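
To make the idea of ‘explaining’ a model more concrete, here is a minimal sketch of one common technique, permutation importance, using scikit-learn. The model, dataset and feature names are hypothetical stand-ins for illustration, not anything drawn from the discussion above.

```python
# A minimal, illustrative sketch of one explainability technique:
# permutation importance, which measures how much a model's accuracy
# drops when each feature is shuffled. Data and feature names are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real social dataset.
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
feature_names = ["age", "income", "postcode_risk", "service_visits", "tenure"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and record the average drop in test accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

for name, mean_drop in sorted(zip(feature_names, result.importances_mean),
                              key=lambda pair: -pair[1]):
    print(f"{name:>15}: accuracy drop when shuffled = {mean_drop:.3f}")
```

Even a ranked list like this still has to be interpreted by someone, which is exactly the question of who gets to do that interpreting.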

We think we are going to see a further breaking down of silos, with better collaboration between data scientists and subject matter experts. What we would like to see is greater involvement of the people whose data is being used: ‘end-contributors’ having a greater say in interpreting ‘their own data’, and more ‘experts by experience’ brought into the mix.

How can public interest technology be co-designed with non-experts?

In the context of co-design, which seems to be intrinsic to public interest technology, how can laypeople be meaningfully involved with technical experts in the co-design of public interest technology? Do people need to understand what is ‘under the hood’ in order to meaningfully participate in technology co-design? Is knowing how (and why, and with what effect) something works a necessary aspect of genuine participation in co-design? If not, why not? What does this mean for complicated, advanced technologies that exist in webs of interconnection with systemic effects on diverse psycho-social systems?

How can the voices and needs of marginalised people influence automated content moderation practices?

With large and heavily automated social media platforms so deeply engrained in our social and economic fabric, there is substantial interest across the globe in the impact of automated content moderation on vulnerable and marginalised communities. Movements around the world are urging change. To remain relevant, platform policies developed in privileged sites and spaces (i.e., male-dominated tech companies in the Global North) must become more inclusive of the needs of stigmatised and marginalised people.


What happens to our data?

How, when and where is consent implied, contested and negotiated in data collection practices and data-sharing practices?

Norway’s Data Protection Authority recently fined Grindr for on-selling sensitive data about users to third parties. In cases like this, how, when and where is consent implied, contested and negotiated? Whose data is considered ‘sensitive’, and why?

The implications of trust: where does our data go, and what rights do we have?

The development of new technologies often prompts scientists to consider how to generate trust in those technologies among the community. Often, however, we are faced with the opposite problem: an uncritical acceptance of technology.

While there are clear societal benefits (particularly during the COVID-19 pandemic) to tracking movement, activity and other forms of personal data, the sheer prevalence of personal data collection means we inevitably become habituated to it. The next step is to address both that habituation and the unnecessary or unlawful skimming of data from individual transactions in ways that were never intended.

What are the policy implications of voice data capture and processing?

This relates to the rise of Ambient AI. Listening devices are multiplying, along with the tools and models used for processing voice and natural language. There are serious policy implications surrounding voice capture and processing in ‘private’ spaces and environments through devices like personal assistants and Google Home. Policy and security measures have to keep pace with expectations of personal privacy protection in these spaces.

How can we deal with issues around the re-use of data?

A recent case in the US saw the FTC force an app developer (Ever) to delete image data collected from users of its photo storage app, after the company pivoted to using that data to train and develop facial recognition services.

Not only was the company forced to delete the image database, but also any models and algorithms it had trained on it. This could have significant implications for the use and re-use of data collected through digital tech, and it is perhaps a first step toward holding data-collecting tech companies to account.


Ethical AI

Can we codify ethics into design, development and use of AI systems?

Ethical AI should not be an afterthought. The decisions made by AI have significant impacts, and there have been some notable examples of how decision-making by AI perpetuates inequality.

There is a tendency to address these ethical issues after a system has been built, with more or ‘better’ data and adjustments to models. Substantial international effort is driving the development of ethical principles for AI. However, questions remain: how do we translate these principles into concrete and robust methods for designing, developing, implementing and maintaining ethical AI systems? This year, we have seen Google take a major step away from this goal by firing high-profile AI ethics experts.
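
As one small illustration of what translating a principle into a concrete, testable method might look like, the sketch below computes a simple demographic parity gap (the difference in positive-decision rates between two groups) over hypothetical decisions. The data, group labels and threshold are assumptions for illustration only.

```python
# A minimal sketch of turning an ethical principle ("automated decisions
# should not differ systematically between groups") into a concrete check.
# The decisions, group labels and threshold below are hypothetical.
import numpy as np

rng = np.random.default_rng(seed=42)

# 1 = favourable automated decision (e.g. application approved), 0 = not.
decisions = rng.integers(0, 2, size=1000)
# A sensitive attribute recorded for each person.
groups = rng.choice(["group_a", "group_b"], size=1000)

rate_a = decisions[groups == "group_a"].mean()
rate_b = decisions[groups == "group_b"].mean()
parity_gap = abs(rate_a - rate_b)

print(f"Positive-decision rate, group_a: {rate_a:.2%}")
print(f"Positive-decision rate, group_b: {rate_b:.2%}")
print(f"Demographic parity gap:          {parity_gap:.2%}")

# Built into a deployment pipeline, a check like this makes the principle
# routine rather than an afterthought.
THRESHOLD = 0.05  # an illustrative, negotiated tolerance
if parity_gap > THRESHOLD:
    print("Fairness check failed: investigate before release.")
else:
    print("Fairness check passed.")
```

Of course, a single metric is not ethics: which metric, which groups and which threshold are themselves contested choices that call for the kinds of participation discussed above.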

When intelligent systems are embedded in our day-to-day lives, how can people opt out, or correct mistakes in the data those systems hold about them?

We need to keep the end-users of automated systems in mind while those systems are being designed. When intelligent systems are embedded in multiple facets of our daily lives, how do we enable people to opt out? How do we design systems with intuitive interactions that let people tell an algorithm to ‘stop’ or ‘go’, flag that it got something wrong, or edit the data it holds about them? See, for example, this work by Albrecht Schmidt and others.
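
To make this concrete, here is one hypothetical shape such an opt-out or correction request could take; the field names and structure are illustrative assumptions and do not come from Schmidt’s work or this article.

```python
# A hypothetical sketch of the kind of record an intelligent system could
# accept so that people can opt out, flag an inaccurate decision, or ask
# for their data to be corrected. Field names are illustrative only.
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class RequestType(Enum):
    OPT_OUT = "opt_out"              # stop automated processing for this person
    FLAG_INACCURATE = "inaccurate"   # a specific decision was wrong
    CORRECT_DATA = "correct_data"    # the underlying data needs editing


@dataclass
class UserFeedbackRequest:
    user_id: str
    request_type: RequestType
    decision_id: str | None = None                 # which automated decision, if any
    corrected_fields: dict[str, str] = field(default_factory=dict)
    submitted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Example: a person flags that a decision about them used an outdated address.
request = UserFeedbackRequest(
    user_id="user-123",
    request_type=RequestType.CORRECT_DATA,
    decision_id="decision-456",
    corrected_fields={"address": "new address supplied by the user"},
)
print(request)
```

The harder design question is the interface around such a record: how people discover that it exists, and what the system is obliged to do once it is received.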

This is the blog for Data & Policy, the partner journal for the Data for Policy conference. You can also find us on Twitter. Here are the instructions for submitting an article to the journal. Part 1 of this article is available here.

Data & Policy Blog

Blog for Data & Policy, an open access journal at CUP (cambridge.org/dap). Eds: Zeynep Engin (Turing), Jon Crowcroft (Cambridge) and Stefaan Verhulst (GovLab)