Phase IV: Narrowing the Context

Min Kim
Breaking Out of Filter Bubbles
4 min read · Dec 4, 2016

I now know that I want to work on something related to making the workings of algorithmic models more transparent and fair. Below is a short list of specific contexts that I'll explore.

  • In e-commerce, on sites such as Amazon or Staples, price tags fluctuate according to the user's geolocation, neighborhood, and other personal data, pushing toward the maximum amount of $$ the user might be willing to pay. It could be interesting to explore the socioeconomic or gender variables within this context.
  • Most job searches are now conducted online, i.e. through Google. It could be interesting to test just how much of an impact the personal data Google has stored about me (age, gender, socioeconomic status, etc.) has on the search results. Do men's and women's job searches show different results?
  • In the context of social media platforms, does everyone's newsfeed differ according to how they interact with the platform? How much do the feeds differ?

Points of Interest

  • Transparency in the algorithm models
  • Agency & control by the users
  • Fairness
  • Building back trust between the user & the company

Possible Methods

  1. Categorize the user's individual data into two groups: 're-usable' and 'not re-usable' by third parties. If there's a way to aggregate the types of data we're mined for and to figure out what purpose each type serves for the companies versus the users, maybe we could develop a system where a <digital tool> informs the user of the possible consequences, how likely each is to happen (on a scale), and the possible gains on the user's part (curated content, etc.), so that the user can control what to reveal and what to withhold. It could also let the user set limits on how long that data, if they choose to share it, can be used. (A rough sketch follows after this list.)
  • The user has transparency into the possible ramifications.
  • The user has agency over what data about themselves to release or not.
  • Somewhat utopian and not immediately commercially applicable or lucrative for companies, but on moral grounds companies need to be open to audits (they hold so much power), and it would build trust back into the relationship between users and big corporations.
  • The aggregated data would need to be constantly updated, but I imagine the categorization system would remain fairly steady.
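Just to make the idea concrete for myself, here's a minimal sketch (in Python, with hypothetical names and fields) of how such a <digital tool> might represent each piece of data: its re-usability category, the possible consequence and its likelihood on a scale, the gain for the user, and an optional time limit on sharing.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Reusability(Enum):
    REUSABLE = "re-usable by third parties"
    NOT_REUSABLE = "not re-usable by third parties"


@dataclass
class DataItem:
    """One piece of personal data the user can choose to reveal or withhold."""
    name: str                       # e.g. "geolocation", "age", "gender"
    category: Reusability           # the two-way split described above
    consequence: str                # plain-language description of what could happen
    likelihood: int                 # 1 (unlikely) to 5 (very likely) -- the 'scale system'
    user_gain: str                  # what the user gets in return, e.g. "curated content"
    share: bool = False             # the user's choice: reveal or withhold
    expires: Optional[date] = None  # optional limit on how long shared data may be used


def sharing_summary(items: List[DataItem]) -> List[str]:
    """Summarize, for the user, what each shared item could lead to."""
    summary = []
    for item in items:
        if item.share:
            until = f" (until {item.expires})" if item.expires else ""
            summary.append(
                f"Sharing '{item.name}': {item.consequence} "
                f"(likelihood {item.likelihood}/5); gain: {item.user_gain}{until}"
            )
        else:
            summary.append(f"Withholding '{item.name}'.")
    return summary


# Example: the tool could surface this before the user decides what to reveal
geo = DataItem("geolocation", Reusability.REUSABLE,
               "prices may be adjusted to your neighborhood", 4,
               "local results", share=True, expires=date(2017, 6, 1))
print(sharing_summary([geo])[0])
```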

  2. A digital 'you' that you are the sole owner of and that others can access only on your terms. What if there were a digital 'me', perhaps multiples of it, that I could consciously choose to present depending on the situation? The idea is to help the user curate the right kind of model(s) of themselves. In The Master Algorithm, Pedro Domingos talks about a personal data bank that stores personal data, anonymizes it for the user, then gives the user control over which aspects of that data are used and how, when interacting with third-party sites such as Facebook and Google.
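Again, a very rough sketch with hypothetical names, just to illustrate how a user-owned data bank might map each persona to the fields it's allowed to reveal to a given third party:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Persona:
    """One curated 'digital me' the user can consciously choose to present."""
    name: str                 # e.g. "job-search me", "shopping me"
    revealed: Dict[str, str]  # the fields this persona is allowed to expose


@dataclass
class PersonalDataBank:
    """User-owned store of personal data; third parties only ever see a persona."""
    raw_data: Dict[str, str] = field(default_factory=dict)
    personas: List[Persona] = field(default_factory=list)

    def present(self, persona_name: str, third_party: str) -> Dict[str, str]:
        """Release only what the chosen persona allows to the named third party."""
        for persona in self.personas:
            if persona.name == persona_name:
                print(f"Releasing {sorted(persona.revealed)} to {third_party}")
                return dict(persona.revealed)
        raise ValueError(f"No persona named '{persona_name}'")


# Example: the same user presents different selves to different sites
bank = PersonalDataBank(
    raw_data={"age": "29", "gender": "F", "occupation": "designer"},
    personas=[
        Persona("job-search me", {"occupation": "designer"}),
        Persona("shopping me", {}),  # reveals nothing
    ],
)
bank.present("job-search me", "Google")
```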

Potential Contexts

  • Staples.com product search, and their fluctuating prices;
  • Flight search depending on the country of origin and country of search, and the fluctuating prices;
  • Job searches displaying different results depending on the user’s gender, age, or socio-economic status

Context: Google search algorithm

“Gender distributions of specific occupations were unrepresentative of the gender distributions in the real world and that search can influence people’s perception of gender distribution in reality.”

I'm seeing two different problems here: 1. the search displaying biased or 'unrepresentative' results, and 2. the user not having control over what kind of data is pulled, and where from. As much as I'd like to tackle both problems, the first is a much larger one; it requires far more technical skill than I have, and it isn't at a scale I alone could work on.

We can't dig into the question of bias and discrimination without first setting benchmarks for what is fair, when something should be considered biased, or what neutrality even means.

I'll be focusing on the latter; even if I can't immediately impact the first, I think it'll be important in the near future to enable the actual users to see what is going on (transparency & agency). Even though the user might not be able to control what they are being shown, they would at least have a say in what kind of data they'd like to store in search engines' databases, so that they can consciously create their own models (ref. The Master Algorithm: "what mental model do you want the algorithm to have of you?"). And perhaps with the option to create multiples of it (personas for different purposes), you could a. open up a world of possibilities (by searching for the same thing with different personas, aka models, to yield different results), and b. have both agency and fairness, as well as trust in the company.

Open Questions

  • What else could this system do for the user? For the companies? What other utility would it serve?
  • So what would I need to do as a designer? What would the design process be? How would I go about researching this?
  • How would this work in social media? A user setting terms with one other company (i.e. Google Search) might be plausible, because you're just setting the reusability status for one other party. What about Facebook? When you first register, how would the info you input (birthdate, gender, occupation, etc.) be displayed to Facebook itself versus to your friends (or any other people)?
  • As a platform/company learns more and more about you, can it provide better or more curated, premium services?

Inspirations

https://www.ghostery.com/

http://www.businessinsider.com/datawallet-lets-you-sell-your-data-2016-6

https://idcubed.org/

http://citizenme.com/

https://www.qiyfoundation.org/about-qiy/

http://dataprivacylab.org/projects/onlineads/

http://jots.pub/index.html

https://webtap.princeton.edu/
