What is Data Ethics?
“With great power comes great responsibility”
The first movement in data science was heavily focused on accuracy and efficiency — how we can get to the “truth” as quickly as possible. We are, however, quickly moving towards the second major movement in which the focus is more on responsibility and data ethics. Whether we like it or not, the question is no longer what we can do with data but rather what we should or shouldn’t do.
On the surface, data science and ethics seem unrelated. The former is rooted in science and statistics, while the latter concerns itself with philosophy and morals. The truth, however, is that the two areas of study are inherently connected. Oxford philosophy professor Luciano Floridi characterizes data ethics “as a new branch of ethics that studies and evaluates moral problems related to data (including generation, recording, curation, processing, dissemination, sharing and use), algorithms (including artificial intelligence, artificial agents, machine learning and robots) and corresponding practices, in order to formulate and support morally good solutions” [1]. In other words, data ethics studies the moral questions that arise in every aspect of data science, from data collection to machine learning. In many ways, thinking deeply about ethics is essential to doing data science itself, since much of what is done in data science directly affects the lives of real people.
One major trend in data ethics right now is the concern for the privacy of personal data. As we spend more and more of our lives online, the privacy of our data in the digital world has become increasingly important. According to a 2021 poll by Morning Consult, 83% of respondents said that national privacy legislation should be passed, and 72% believed that the government should be responsible for regulating how companies collect data[2]. In response to this general shift towards privacy, companies such as Apple have made significant changes to how they track and use our data for personalized ads. Starting with iOS 14.5, Apple introduced a feature called “App Tracking Transparency”. Now, when an app wants to track us and share our information with third parties, a window shows up asking for our permission[7]. Of course, advertising companies have other methods, such as fingerprinting[3], to continue tracking us, but the move is still symbolic: it represents the growing attention paid to privacy and to how personal data is used. According to a report by Insider Intelligence, as of December 2021 only 37% of U.S. iPhone and iPad users had opted to continue allowing companies to track them on their devices[4]. As you can see, data ethics isn’t some theoretical study. It deals with moral dilemmas that already govern our everyday lives.
What can we do?
So what can we do to keep up with this movement? Here are the general aspects that one should consider when doing anything related to data science, according to Harvard Business School Online’s Business Insights Blog[6]:
Ownership: It is important to remember that we do not own the personal data we collect; it belongs to the individuals it describes. How we obtain that data must be accounted for through written agreements or terms and conditions.
Transparency: It is important to make sure that people know exactly how their data is being used and for what purpose. Users should be able to have access to enough information, perhaps through a detailed policy, to decide whether they want to take part or not.
Privacy: It is general practice that personally identifiable information (PII) shouldn’t be shared or made publicly available. This includes information such as full name, birth date, address, etc.
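To make the privacy principle a little more concrete, here is a minimal sketch of removing direct identifiers from a record before it is shared. Everything in it is an assumption made for illustration: the field names (full_name, birth_date, address, user_id), the pseudonymize function, and the choice of a keyed hash (HMAC-SHA256) to replace the user ID are hypothetical and do not come from any of the sources cited here.

```python
import hashlib
import hmac

# Hypothetical field names for direct identifiers; adjust to your own schema.
DIRECT_IDENTIFIERS = {"full_name", "birth_date", "address"}


def pseudonymize(record, secret_key, id_field="user_id"):
    """Return a copy of `record` without direct identifiers.

    The value in `id_field` (if present) is replaced by a keyed hash, so
    rows belonging to the same person can still be linked to each other
    without exposing the original ID. `secret_key` must be bytes and
    should be stored separately from the shared data.
    """
    # Build a new dict so the caller's record is left untouched.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if id_field in cleaned:
        raw_id = str(cleaned[id_field]).encode("utf-8")
        cleaned[id_field] = hmac.new(secret_key, raw_id, hashlib.sha256).hexdigest()
    return cleaned


if __name__ == "__main__":
    example = {
        "user_id": 42,
        "full_name": "Ada Example",
        "birth_date": "1990-01-01",
        "address": "1 Main St",
        "plan": "pro",
    }
    print(pseudonymize(example, secret_key=b"replace-with-a-real-secret"))
```

Keep in mind that this is pseudonymization, not anonymization. The fields that remain can sometimes be combined to re-identify a person, so a step like this supports the ownership and transparency principles above but does not replace them.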
Apart from the general principles, we can also rely on other online resources such as the Oxford-Munich Code of Conduct or this handy checklist[5] posted on Medium by the Big Data club at Berkeley. One important checklist item is making sure that we not only ask for informed consent but also explain succinctly what exactly the users are consenting to. Sometimes this is difficult to do when using data for research purposes, because telling participants everything could affect the outcome of the study. In this case, we must weigh the potential benefits against the human costs. For example, back in 2014, Facebook conducted a study on “emotional contagion” by intentionally manipulating the feeds of almost 700,000 Facebook users in order to study how the words used in posts would affect the users’ emotions[8]. While the research was published in the Proceedings of the National Academy of Sciences, Max Masnick, a researcher at the University of Maryland, states that “As a researcher, you don’t get an ethical free pass because a user checked a box next to a link to a website’s terms of use”[8]. Ethical considerations are never black and white, and more has to be done to properly gain the consent of individuals. For example, it is generally considered the researcher’s responsibility to explain how the data will be used and how the study will be conducted before participants give any form of consent.
As people become increasingly aware of how their data is being used and of the importance of privacy, data ethics will only grow in importance. While it is often difficult to think about ethics, let alone account for it in our work, following data ethics guidelines helps ensure that we are doing our best to keep up with this movement.
Reference List:
[1]Floridi, L., & Taddeo, M. (2016). What is data ethics? Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2083), 20160360. https://doi.org/10.1098/rsta.2016.0360
[2]Sabin, S. (2022, September 29). States are moving on privacy bills. Over 4 in 5 voters want Congress to prioritize protection of online data. Morning Consult. Retrieved November 7, 2022, from https://morningconsult.com/2021/04/27/state-privacy-congress-priority-poll/
[3]Peterson, T. (2022, November 18). WTF is device fingerprinting? Digiday. Retrieved December 3, 2022, from https://digiday.com/marketing/what-is-device-fingerprinting/
[4]Goetzen, N. (2022, February 22). The shakeout from Apple’s Privacy Update. Insider Intelligence. Retrieved December 3, 2022, from https://www.insiderintelligence.com/content/shakeout-apple-privacy-update
[5]Lou, S., & Yang, M. (2020, August 12). Things you need to know before you become a data scientist: A beginner’s Guide to Data Ethics. Medium. Retrieved November 7, 2022, from https://medium.com/big-data-at-berkeley/things-you-need-to-know-before-you-become-a-data-scientist-a-beginners-guide-to-data-ethics-8f9aa21af742
[6]5 principles of data ethics for business. (2021, March 16). Business Insights Blog, Harvard Business School Online. Retrieved November 7, 2022, from https://online.hbs.edu/blog/post/data-ethics
[7]Chen, B. X. (2021, April 26). To be tracked or not? Apple is now giving us the choice. The New York Times. Retrieved November 7, 2022, from https://www.nytimes.com/2021/04/26/technology/personaltech/apple-app-tracking-transparency.html
[8]Arthur, C. (2014, June 30). Facebook emotion study breached ethical guidelines, researchers say. The Guardian. Retrieved November 19, 2022, from https://www.theguardian.com/technology/2014/jun/30/facebook-emotion-study-breached-ethical-guidelines-researchers-say