Practical ethics in product development: Mitigating risk

Aoife Spengeman
Wellcome Data
May 10, 2019

In previous blogs we have written about how we at Wellcome Data Labs are developing and embedding an ethics methodology in our product development. We carried out a workshop where we thought about cases of abuse, misuse, and unintended consequences of what we are building. This creative exercise got us thinking about what we can do to mitigate unintended harms that may result from our product.

About our product

We are developing an online service to track and analyse the reach of research in policy documents of major global organisations. It allows users to see where the research has been cited in policy documents, and to analyse the contents of those documents. For more information, check out our open GitHub repo.

A severity versus likelihood plot

After identifying how our product could be used in unintended ways, we plotted the cases on a likelihood versus severity plot. This exercise brought us back to reality and gave us focus. The two questions to answer were:

What is the likelihood of this case happening?

How severe would this case be if it did happen?

Template Likelihood versus Severity plot

After plotting our use cases on this graph, we narrowed our focus to what was highest risk. We summarised these risks as follows:

  • People may perceive the outputs of the product to be more accurate than they actually are.
  • Our tool may reinforce biases that already exist in the research citation system, such as unfairness in citation counts: just because a research paper has been cited a lot does not necessarily indicate its quality.
  • People may use the tool to hunt for sensational or negative stories about researchers, particularly if the research is on a subject of live political debate (e.g. vaccines, climate change or animal testing).
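The ranking exercise above can be sketched as a simple score: treating likelihood and severity as ratings on a shared scale and multiplying them gives one common way to prioritise a risk register. The scores below are purely illustrative placeholders, not the team's actual ratings.

```python
# A minimal sketch of the likelihood-versus-severity exercise.
# The risk cases come from the workshop; the 1-5 scores here are
# illustrative assumptions, not Wellcome Data Labs' real ratings.

def risk_score(likelihood, severity):
    """Combine the two ratings into a single priority score."""
    return likelihood * severity

# Each case: (description, likelihood 1-5, severity 1-5).
cases = [
    ("Outputs perceived as more accurate than they are", 4, 4),
    ("Tool reinforces existing citation biases", 3, 4),
    ("Tool used to hunt for negative stories on researchers", 2, 5),
]

# Sort highest-risk first, so the team knows where to focus.
ranked = sorted(cases, key=lambda c: risk_score(c[1], c[2]), reverse=True)

for name, likelihood, severity in ranked:
    print(f"{risk_score(likelihood, severity):>2}  {name}")
```

A multiplicative score is only one convention; teams sometimes use the maximum of the two ratings instead, so that a low-likelihood catastrophe is never buried by the multiplication.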

Taking responsibility where we can

This exercise sparked discussion about our own role in preventing unintended consequences from our product. Are we responsible if someone decides to use our tool to find negative stories on researchers or research areas? How do we prevent reinforcing bias? Are we the right people to take action on this? The exercise left us with some interesting questions to think about.

Adding the sociological lens

Wellcome Data Labs works with a social scientist, who helps the team see things from a different perspective. Our social scientist joins us for around two days a month to review our ethics methodology and to share in-depth analysis from a sociological point of view.

For our current product, he helped us re-focus on the wider implications. Do we adequately understand the current problems with the academic citation system? We are aware that how research gets cited is not exactly a fair system (Balietti, 2016). There are factors other than merit that influence whether a research publication will get cited. For example, one research review (Borsuk, Budden and Leimu-Brown, 2009) found that research with multiple authors was more likely to get cited, arguing this could be because of the connections the authors may have with publishers. Additionally, the way in which citations occur is not always accurate: another paper (Teixeira et al., 2013) found that “22% of citations were inaccurate, and another 15% unfairly gave credit to the review authors for other scientists’ ideas”. We need to understand this research context better to know how our product will affect it.

Committing to transparency

As a first step to reducing ethical risks, we agreed that it is paramount to ensure users have the knowledge to interpret the outputs of the algorithms correctly. We are currently carrying out the following activities as part of a content design process:

  • Making sure that users understand how our algorithm works at a high level, with the chance to learn the technical details if they want to.
  • Making sure that the limitations and accuracy levels of the algorithm are explained, so that users can interpret the outputs accurately.

Our next priority will be to understand how we can best fit into the research and funding world that we are now providing a service to, so that we do not negatively influence that community. Who are the members of this community? What does fairness currently look like in this area? What precise effect do we expect our product to have? Over the next year we plan to do research and analysis to achieve this and will be regularly reporting on our progress.

Another commitment we are making is for our data scientists to analyse bias within the algorithms and report publicly on the findings. We will check whether our algorithm is better or worse at finding citations of research publications depending on factors such as the type of publication and the country of publication. Once we have this information, we believe we will be better placed to understand the fairness of the algorithm and the unintended consequences it may have. Keep an eye out for this analysis on our Wellcome Data Labs Medium publication!
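One common way to frame the bias check described above is per-group recall: of the citations we know exist, what fraction does the algorithm find in each group (e.g. per country of publication)? The sketch below assumes a hypothetical labelled sample; the field names `country` and `found_by_algorithm` are illustrative, not the team's actual schema.

```python
# A hedged sketch of a per-group bias check: compare how often the
# algorithm finds known citations across groups. The data and field
# names are hypothetical assumptions for illustration only.
from collections import defaultdict

def recall_by_group(labelled_citations, group_key):
    """Fraction of known citations the algorithm found, per group."""
    found = defaultdict(int)
    total = defaultdict(int)
    for record in labelled_citations:
        group = record[group_key]
        total[group] += 1
        if record["found_by_algorithm"]:
            found[group] += 1
    return {g: found[g] / total[g] for g in total}

# Hypothetical labelled sample: each record is a citation known to
# exist, and whether the algorithm detected it.
sample = [
    {"country": "GB", "found_by_algorithm": True},
    {"country": "GB", "found_by_algorithm": True},
    {"country": "BR", "found_by_algorithm": True},
    {"country": "BR", "found_by_algorithm": False},
]

print(recall_by_group(sample, "country"))
```

A large gap in recall between groups would be one concrete, reportable signal that the algorithm serves some parts of the research community better than others; the same grouping could be repeated for publication type or any other factor of interest.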

References:

Balietti, S. (2016). Here’s how competition makes peer review more unfair. [online] The Conversation. Available at: https://theconversation.com/heres-how-competition-makes-peer-review-more-unfair-62936 [Accessed 9 May 2019].

Borsuk, R., Budden, A. and Leimu-Brown, R. (2009). The Influence of Author Gender, National Language and Number of Authors on Citation Rate in Ecology. The Open Ecology Journal, [online] 2(1), pp.25–28. Available at: https://www.researchgate.net/publication/262104578_The_Influence_of_Author_Gender_National_Language_and_Number_of_Authors_on_Citation_Rate_in_Ecology [Accessed 9 May 2019].

Teixeira, M., Thomaz, S., Michelan, T., Mormul, R., Meurer, T., Fasoli, J. and Silveira, M. (2013). Incorrect Citations Give Unfair Credit to Review Authors in Ecology Journals. PLoS One, [online] 8(12). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3859513/#pone.0081871 [Accessed 9 May 2019].


UX researcher at Wellcome Trust Data Labs. Thinking about ethics in data science, human-centred design, and best UX research and design practices.