Code Work in Science: How It Changes, and Why It Matters How We Talk About Change
As software projects in science become more ambitious (and expensive in time, energy, and money), they increasingly require a language that recognizes and rewards the collective pursuit of uncertain possible worlds.
I conducted an 18-month interview and observation study of code work in science. The implication of my findings for design and practice is to articulate goals in ways that can reward community- and skill-related progress, because there is relatively less uncertainty associated with these than with increasingly large or ambitious programming projects. The take-aways are aimed at anyone about to invest time and energy in a project that carries with it an inevitable amount of risk:
- When building something integrative or holistic, consider ways that its components can be made available piece-wise and engage with the existing skill and enthusiasm in the target-user scientific community.
- Focusing on technological deliverables and goals creates pressure; focusing on time spent cultivating community or learning a new vocabulary or way of thinking creates momentum.
- Set initial, ambitious goals, against which outcomes will be measured, without unintentionally privileging any one part (technological, social, or cognitive) of the working environment over the others.
When I talk about “code work,” I include scripting, software engineering, using command-line interfaces, or typesetting your manuscript in LaTeX. Based on the feedback to this work so far, I believe my findings apply to many domains that experiment with code. However, my study focused on code work in oceanography. The study of the ocean draws people from many disciplines, and so there are many different complementary methods, all of which can and do engage differently with code work.
The setting of my qualitative study is “oceanography” intersecting in some way with “code work.” The population consists of four teams in the Pacific Northwest, as well as natural scientists who had attended Software Carpentry workshops or similar interventions intended to give scientists an overview of computing skills in a few days.
I interviewed people individually and observed group events, illustrated to the left. Over 18 months, I conducted about 300 hours of observation and collected several dozen interviews. The use of qualitative methods alone is not groundbreaking in computer science; I review excellent work in software engineering and computer supported cooperative work (CSCW) that makes use of qualitative methods. However, this work is unique in recognizing idealized software engineering concepts (“etic”) while simultaneously aiming to contextualize and describe the participants’ own meanings and values on the relevant subjects (“emic”).
Two concepts form the foundation of this conceptual framework: the working environment and the perfect world. The working environment is a highly personalized set of social, cognitive, and technical resources at an individual’s disposal. “Social” resources include colleagues, supervisors, mentors, and office-mates. “Cognitive” resources include, for example, being very good at interpreting a specialized visualization that most people outside your field would take a long time to understand. “Technical” resources comprise all those things that must be downloaded, installed, and otherwise “set up” for you to be productive.
Think of the perfect world as the world you would build if only you had the time. Or the world you would have built if only you knew in the past what you know now. Crucially, having a concept of a perfect world is not itself reason enough to build anew.
The perfect world is collectively-imagined, socially-negotiated, and envisioned with respect to an audience. Where the working environment is an individual-centric way to understand how code work is done, the perfect world is a more socially-situated concept.
The working environment is subject to change, and the direction of this change is informed by a collective vision of the perfect world. When I talk about deliberate change, I am talking about the myriad small decisions to try something new or go the familiar route. Together, these may accumulate to the kind of slow drift that is only visible in retrospect. These small moments of flux, on the other hand, are visible, but quickly forgotten, because we lack ways to constructively articulate the decision-making work that they require. However, they are important, so here is a set of concepts to talk about them.
Here’s an example of using this vocabulary to actively listen to a study participant describing deliberate change in code work practice:
I’ve been doing increasing amounts of data processing, starting during my PhD with a lot of my own data, using some larger online databases. I am planning on starting doing more of that (1). I guess there’s two things, one is that often we’re just learning on our own, and that’s effective to a certain extent but if you ever want to try something new, there’s a lot of inertia for trying something totally new (2), so it’s nice to get an introduction into whatever’s going to be new to you, to give you that boost so that you can actually not be afraid to try it. In my case, it was GitHub that I’d heard a lot about (3), didn’t really quite understand how it worked or what its purpose was, so I wanted to learn about that. The second thing is that I’m starting a lab of my own as a professor in the fall. And with my own students and postdocs coming in, I want us to do things like GitHub and version control. [I want to learn these skills myself] so that I can best teach them and establish good practices within my own lab (4).
This person had been “doing an increasing amount of data processing” (programmatically, involving code work) and is “planning on starting doing more” (1). This pertains to an anticipation of the near-or-far future, and is an example of future-oriented motivation for programming skill expansion or acquisition. Points (2–4) exemplify typical sentiment toward GitHub as an instantiation of a desirable best practice. Because the speaker places trust in the social context (3), it is not necessary to “quite understand how it worked or what its purpose was” in order to attend a workshop. Attending the workshop was itself the act of change deliberately undertaken.
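For readers who, like the participant, have heard of GitHub without quite understanding “how it worked or what its purpose was”: the workflow taught at workshops like Software Carpentry boils down to a handful of commands. The sketch below is hypothetical (the directory, file name, and commit message are invented for illustration) and shows only the local half of the practice; sharing on GitHub adds a remote on top of it.

```shell
# A minimal, hypothetical first encounter with Git version control.
repo=$(mktemp -d)            # throwaway directory standing in for a project
cd "$repo"
git init -q                  # turn the directory into a Git repository
git config user.name  "Example Scientist"        # identity Git records
git config user.email "scientist@example.org"    # with each commit

echo 'print("processing data")' > process.py     # hypothetical analysis script
git add process.py                               # stage the new file
git commit -q -m "Add first data-processing script"
git log --oneline            # the history now records this deliberate change
```

Each commit is a small, named checkpoint, which is one reason version control maps so naturally onto the “myriad small decisions” described above: the history makes otherwise-forgotten moments of change visible in retrospect.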
How long is it necessary to put time and energy into something before giving up on it? The kind of change that Chapter 6 focuses on requires a step into the unknown.
As software projects in scientific contexts involve more people, longer time spans, and more ambitious collaboration between disciplines, understanding how coding practices influence scientific inquiry is increasingly important. The discussion of “best practices” in open science encourages the sharing of negative results and disappointing data as a top priority. This call for reflection on failure must be extended to include code work. With data sharing as with code sharing, repeating “best practices” is not sufficient to inspire change, even among scientists who openly feel they “should” adopt them. The conceptual framework I propose creates optimistic vocabulary for reflecting upon deliberate changes, big and small.
This is the extra-short, figures-only version of my recently-completed dissertation, which you can get in full [PDF] here. If you only read one chapter, read Chapter 7. It’s full of ethnographic vignettes and provides additional detail and context for the take-aways listed in the beginning of this post. Feedback happily welcome!