Data Walking for Social Good

By Brittany Fiore-Gartland, Anissa Tanweer, & Meg Drouhard

When you tell people you are doing a data walk, they are immediately intrigued. The two words are simultaneously experienced as a contradiction of sorts and a surprisingly pleasing pairing, conjuring many different imaginations of what a data walk might be.

At the eScience Institute, a hub for data science at UW, data is most often talked about as the stuff of bytes and bits and interacted with via an increasingly complex computing infrastructure and software ecosystem. Yet a host of scholars across many fields, including Critical Data Studies and Science and Technology Studies (STS), have demonstrated time and again the social, embodied, and contingent nature of data, not to mention the social and political consequences of data in use. This sociotechnical lens on data, so well articulated and researched in these fields, finds few productive ways to cross over and intervene in the practical everyday work of data science. To talk about data as human and sociotechnical, while centrally important to changing the discourse, often leaves a gap in engaging practitioners where they are and generating paths forward together. In light of this gap, we have advocated for creating more opportunities for critical data scholarship and data science practice to engage with each other as a way to strengthen and improve both (Neff et al. 2017). What if data scientists could experience the ways data are human, embodied, and contingent, much like an ethnographer of data might? It is in this gap that the data walk becomes a compelling proposition for bridging discourse and practice and generating new collaborative forms of inquiry.

But what is a data walk? There are many possibilities. Thanks to Alison Powell, who has developed the concept and fleshed out what one version of a data walk could look like, we had a very healthy and robust place to start. Powell defines datawalking as:

“A research process for producing radical data through collaborative walks. Data walking creates a process for observing, reflecting on and seeking to intervene in how data influences civic space. By playing roles as photographers, note-takers and map-makers, participants develop ways to think about and reflect on what data might be, and what role it plays in key social issues.” (www.datawalking.org)

Our data ethnography team wanted to experiment with data walking as a way to engage data science researchers in an embodied experience of data. We imagined a kind of nature walk — full of mindfulness and careful observation — but focused on data.

A Data Walk Experiment

We took up this idea in the context of the Data Science for Social Good program (DSSG) at eScience. As ethnographers embedded in the UW Data Science Environment focused on studying the ethical and social dimensions of data science, we saw the data walk as both a research endeavor and a contribution to the DSSG program. This program hosts 16 DSSG fellows for the summer to work on four DSSG projects that are each led by one or two project leads, who are from academic, nonprofit and government sectors. The project leads bring a DSSG project to the program and work with data scientists and DSSG fellows at the eScience Institute to complete the project over the summer quarter. The fellows have a wide range of disciplinary backgrounds, but are all allied in their desire to expand their experience and develop their skills in the DSSG arena. We decided to run a pilot data walk as an initial orientation activity for the fellows because there was a need for an icebreaker at the beginning of the DSSG program. We could have easily orchestrated an alternate kind of data walk that was more deeply engaged with each team and focused around a particular relevant issue to that team. In fact, this is something we hope to do in the future, but given the constraints of the program timeline and our own bandwidth, we decided this initial walk would be more of an introduction to the program and participants, a low stakes way to try the activity out and learn something through the experience.

We started with a short presentation motivating the data walk. Rob Kitchin (2014) talks about data as immersed in data assemblages — “amalgams of systems of thought, forms of knowledge, finance, political economies, governmentalities and legalities, materialities and infrastructures, practices, organisations and institutions, subjectivities and communities, places, and marketplaces — that frame how data are produced and to what ends they are employed” (xvi). This was a framing concept we used to introduce ways of seeing data embodied in the world on a data walk. We also presented a structure and expectations for the walk, along with some suggested questions for reflection during the walk, including: What is missing? What is data-rich? Data-calm? Data-resistant? We then facilitated discussion with the whole group of fellows, along with some of the project leads, soliciting their definitions of data science for social good. We used their responses to highlight the primary facets and dimensions of data science for social good and we organized the fellows’ responses into three themes: 1) Data for accountability, transparency, accessibility, openness, 2) Data as a site for community and stakeholder engagement, 3) Data as embedded in a network and infrastructure. We split the group into three teams around each of the themes based on people’s interests. In these sub-groups we began the process of designating roles and clarifying how the data walk could support the particular theme.

These themes were marginally helpful in focusing the groups. Some participants found the themes to be ambiguous and vague, and as the walks began, each of the groups ended up discussing a particular issue around which they could notice and reflect on data. That all three groups took this step independently without being prompted to do so in the instructions indicates to us that a focusing issue helps make the exercise more concrete, and in this case, helped the data walk participants know what was data when they saw it. For group 1, the issue that emerged was urban development; for group 2, the issue was construction; for group 3, the issue was homelessness.

Photo credit: Brett Bejcek

We had each team assign roles that were slightly adapted from the ones suggested on datawalking.org. The roles included: Interviewer, Photographer, Note-taker, Collector, Navigator, and Map maker. One ethnographer accompanied each group and we each played different participant-observer roles within our groups. The roles helped organize people and provide a bit of structure, especially for those who experienced the task as overly ambiguous. However, when we do the activity again we will experiment with different roles, either tied to people’s particular expertise and perspective or potentially associated with different layers of the data assemblage.

Data Walk Reflections

We were surprised at how easily people’s notion of data became something quite broad and inclusive as we walked. For instance, when the first author’s group was asked about what they were considering to be data, they agreed that it could be anything that conveys some sort of information. The group described how this could be a crosswalk or a stop sign. This group even picked up a stick that had been spray painted bright blue. The stick was part of a blue mark sprayed on the ground that was unintelligible to us as data walkers, yet meaningful to those working on the construction site nearby. The group guessed at what it meant, but in this case, although they could, quite literally, collect the data, they couldn’t interpret it without the expertise and context of the particular construction project.

Developing critical data literacies

Each group produced a blog post as a creative response to the walk, reflecting on their particular experience. What was fascinating was that the insights gained through this exercise reflected many of the core distilled contributions of a wide swath of scholarship in STS and critical data studies. The data walk provided a rich context for developing these critical data literacies. As Group 1 reflected on what data calm and data rich meant, they found that a stroll through the rose garden was data calm for one participant, but for an ecologist in the group, the rose garden and pond were more akin to her research sites, and so she considered the garden space data rich. The group distilled this insight in their blog post to “One person’s data poor is another person’s data rich…”, articulating a key insight about the contextual, relational nature of data discussed by many scholars, including ourselves as ethnographers.

Group 2 reasoned that perhaps data rich and data calm weren’t necessarily on opposite ends of a spectrum. They considered air quality and noise pollution to be data rich because of the measures that could be collected, yet they also expressed that ideally air quality could be data calm, which they defined as “consistent, not fluctuating…or data that’s well known or understood.” Thus, the group concluded, even for data that could be readily measured and have numerous metrics that are valued for their potential impact on human health and well-being, it is possible that “data calmness” might exist in the sense of continuity or consistency of a particular state. Group 2’s reflections indicated that their approach to characterizing data would be contextual and interpretive with respect to both the manner in which data was constructed as well as the purposes for which its analysis might be most impactful.

While Group 3 was discussing how data could be used to understand the issue of homelessness, they took time to observe from a distance an interaction between several bicycle-mounted police officers and a few individuals who appeared to be homeless. As the conversation ended on amicable terms and the cops went on their way, the fellows reflected on how biased an analysis of the homelessness problem would be if it relied on data about arrests for vagrancy. This is because it would inevitably miss many of the inconstant factors and subjective choices that determine whether an arrest happens at all…things like “the mood of the police officers, how polite the homeless people are, the time of day, whether or not people are watching.” Their reflections surfaced another critical data literacy as they recognized the subjective interpretations and biases that get baked into data from their very inception.

The data walk offered the DSSG fellows some of the tools and opportunities to experience data ethnographically. In other words, through using the techniques of observation, participation, documenting, and interviewing, the fellows became the instrument for collecting, processing, and interpreting data in the world. This produced valuable insights that operate at the capillary level (Foucault 2003, p. 27), the level of micro-circuitry that makes us who we are and shapes how we produce knowledge and power in society. These are the building blocks of data literacies that will ultimately shape what kind of society we live in. In thinking about the power of a data walk, we find a striking resonance with ethnographic work by Samantha Hautea, Sayamindu Dasgupta, and Benjamin Mako Hill (2017). They study youth who are programmatically analyzing public data about their own learning and social interaction in the Scratch online community, and find that in the discussions among these youth, many of them readily grasp profound implications of data analytics relevant to them and their community, and a set of critical data literacies emerge that can be mapped easily to expert critiques. Their findings and our own experiences in the data walk indicate that offering data practitioners a personally relevant and embodied experience with data, along with space for reflection, is an important way to develop critical data literacies. This is an approach that should continue to be explored and extended.

Data Assemblage as Palimpsest

A palimpsest is defined as “something having usually diverse layers or aspects apparent beneath the surface” (Merriam-Webster). It comes from the Greek word palimpsestos meaning “scraped again,” which refers to a writing material (such as parchment or tablet) that was re-used after earlier writing had been erased, such that two (or more) layers of writing are visible. In other words, you can see how the materials and meanings have changed over time through the layers of residue. This is the metaphor that emerged from the first author’s reflection on data walking, captured in this fieldwork memo:

Mindful data walking makes us aware of the limitations of what we can see at the surface. As an ethnographer and data walker I am drawn to examine more closely, to see what is underneath, to probe further in time and space, to inhabit the surrounds. The process calls attention to the layers that are now erased, almost invisible and overwritten with new layers. Sometimes the under-layers are barely visible and you have to physically unearth them. On our walk around a construction site, our attention immediately was directed to the beautiful images of the futuristic, modern-looking building printed on the mesh fencing. This signaled future-oriented data, a projection of what will be and an attempt to engage the community and elicit buy-in for the long-term vision of the building.

There was a website written on the fencing that was supposed to have highlights, but when we tried to go to the website there was nothing there. Evidently, it, too, was under construction. We looked a little closer and saw another sign. It was a bit hidden only because it had started to camouflage with the natural setting around it. It was quite dirty: growing moss, caked with grime, and covered in pine needles. The stand of the sign looked like it had been there quite a long time as it was rusted. The sign depicted a greenhouse roof, which looked like it had been a futuristic image at some point in time, and had been developed in part by the Green Futures Research and Design Lab. It was a “green roof demonstration”, ostensibly on the bleeding edge of a green future, yet its aged look told a different story. And then, juxtaposed next to the most recent layer of modern building signs and projections, it was clearly a legacy of what had been there before. It was the residue from a “future” replaced by a new future.

These enmeshed, aging layers of infrastructure that exist when you scratch beneath the surface made us reflect on the legacies and residues of data assemblages upon which newer layers of infrastructure and the latest, greatest promises for the future are laid. We reflected, too, on how these layers can be felt today, and on the environmental conditions that continue to shape them and change their meaning. This is painfully seen in how the traces and effects of the redlining practices and maps of the first half of the 20th century live on in our current digital systems, as revealed explicitly in Bloomberg’s 2016 analysis of the reach of Amazon’s same-day delivery service in major U.S. cities (Ingold and Soper 2016). By combining data about the ZIP codes eligible for Amazon’s same-day delivery with U.S. Census Bureau data, they found that in six major same-day delivery cities, the service area excludes predominantly black ZIP codes to varying degrees. One example is Atlanta, in which the areas south of downtown do not receive same-day service, whereas the places in the northern parts of the city do. This service map overlaid onto census data shows how the predominantly black areas of south Atlanta are not provided the same service as the predominantly white areas of the northern parts of the city. These maps, which reveal in stark terms the inequality present in current-day access to services, have been called out as one form of digital redlining (see Talbot 2016; Weise 2016; Gralla 2016). These Amazon same-day service maps demonstrate how the early redlining maps and the associated discriminatory practices can live on in digital systems. The metaphor of palimpsest helps us think through the traces, the power dynamics, and those layers that have been scraped away but continue to act in the data-intensive world around us.

Data Walk Aspirations

We have many aspirations for the data walks of the future. We would like to work with particular groups and collaborations to develop focused data walks on a particular issue or place. We want to use the walk to bring in the appropriate stakeholders and use it as a space for soliciting and sharing knowledge and wisdom across the collective group. We are also interested in using the activity to generate multiple perspectives on data — or a richer experience of the data assemblage people work with every day. This could be incredibly useful for thinking through various stakeholder relationships and social implications in product design and development. Developing a sociotechnical and humanistic lens on data is an essential building block of data literacy, which supports the generation of more socially valuable, relevant, and ethical data science. If you think you have a situation or team or project that might benefit from a data walk, reach out and let us know!

References:

Foucault, M., & Ewald, F. (2003). “Society Must Be Defended”: Lectures at the Collège de France, 1975–1976 (Vol. 1). Macmillan.

Gralla, P. (2016, May 11) “Amazon Prime and the racist algorithms.” Computerworld. https://www.computerworld.com/article/3068622/internet/amazon-prime-and-the-racist-algorithms.html

Ingold, D. and Soper, S. (2016, April 21) “Amazon doesn’t consider the race of its customers. Should it?” Bloomberg. https://www.bloomberg.com/graphics/2016-amazon-same-day/

Kitchin, Rob (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures & their Consequences. Sage Publications, London.

Hautea, S., Dasgupta, S. and Mako Hill, B. (2017) Youth Perspectives on Critical Data Literacies. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, NY, USA, 919–930. DOI: https://doi.org/10.1145/3025453.3025823

Neff, G., Tanweer, A., Fiore-Gartland, B., Osburn, L. (2017) Critique and Contribute: A Practice-based Framework for Improving Critical Data Studies and Data Science. Big Data, Special issue on social and technical trade offs 5(2): 85–97. https://doi.org/10.1089/big.2016.0050 (OA)

Talbot, D. (2016, April 25) “Amazon Prime or Amazon Redline?” MIT Technology Review. https://www.technologyreview.com/s/601328/amazon-prime-or-amazon-redline/

Weise, E. (2016, April 22) “Amazon same-day delivery less likely in black areas, report says.” USA Today https://www.usatoday.com/story/tech/news/2016/04/22/amazon-same-day-delivery-less-likely-black-areas-report-says/83345684/