Coding human centered data

Published in Getsalt, May 3, 2023

Coding data is when a researcher takes all of the data they collected and transforms it into meaningful insights. In human-centered design and ethnography, the data are usually stories — even snippets of stories, or observed behavior.

However, there is a problem that we run into when using the term “coding,” especially how it is used in human-centered design or ethnography (not to be confused with the writing of computer programs).

On a theoretical level, when we speak about coding it is easy to get lost in a wash of processes: in-vivo coding, grounded coding, a-priori coding, structured and unstructured data analysis, an entire world of statistics and technical analysis. These discussions are important, but the question they answer is quite narrow: how do you compare what one person says over here with what that person over there said, and how do you compare all of that with what everyone else in your research study says?

The problem arises when we consider the practical level. In practice, not only do we need to compare what everyone says, but we also need to extract a deeper meaning from these comparisons, something like a more foundational truth. We then need to take these meanings and comparisons, all of the data so far, and represent them somehow to an audience. To take it a step further, we then need to come up with ideas that play with the deep, foundational truth that we must discover.

The problem is that everything in this practical level is also referred to as “coding,” and the theoretical language doesn’t cover it. In lieu of technical language, we in the design industry have come up with many tricks to use on the practical side of things. For example, there are user personas, “How might we…” statements, 2x2s, journey maps, importance-impact charts; the list goes on and on. There are hundreds of these kinds of tricks. They all solve different things, and it’s not always clear why to pick one over another, when to use it, or what employing the trick is going to provide.

So, to avoid these kinds of pitfalls, I’ve attempted to describe “coding” — what to do with the data now that the researchers have got it — using a different kind of language. The goal is to describe the essence of what I’m usually dealing with and what I usually do with it, to ultimately come up with clever ideas.

When you are looking to somehow map qualitative data, stories that you heard from fieldwork, are you considering:

contrasting perspectives?
complementary perspectives?
perspectives that change over time (unstable perspectives)?
or perspectives that resemble other perspectives from a different subject?

These four types of views (contrasting, complementary, changing, or analogous) can be represented by four maps that allow you to gain insight from your data:

A spectrum or matrix map handles contrasting perspectives. It allows you to compare and contrast multiple perspectives at once. You can make a spectrum where one side is the opposite of the other. Afterward, you can cross it with another spectrum to make a matrix.
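For readers who like to see the mechanics, here is a minimal sketch of crossing two spectra into a matrix. The axis names (`attachment`, `stability`), the scores, and the helper functions are all invented for illustration; in real work the placement comes from interpretation, not from numbers like these.

```python
from collections import defaultdict

# Hypothetical scores: each perspective is placed on two spectra, each running
# from -1.0 (one extreme) to +1.0 (its opposite). The names and numbers are
# invented for illustration and do not come from any study.
perspectives = {
    "Jack": {"attachment": 0.9, "stability": -0.8},
    "Gertrude": {"attachment": -0.6, "stability": 0.7},
    "Townsfolk": {"attachment": 0.4, "stability": 0.5},
}


def quadrant(scores, x_axis, y_axis):
    """Return a label such as 'high attachment / low stability'."""
    x = "high" if scores[x_axis] >= 0 else "low"
    y = "high" if scores[y_axis] >= 0 else "low"
    return f"{x} {x_axis} / {y} {y_axis}"


def build_matrix(perspectives, x_axis, y_axis):
    """Group perspective names by the quadrant they fall into."""
    matrix = defaultdict(list)
    for name, scores in perspectives.items():
        matrix[quadrant(scores, x_axis, y_axis)].append(name)
    return dict(matrix)


matrix = build_matrix(perspectives, "attachment", "stability")
for cell, names in matrix.items():
    print(cell, "->", names)
```

In this toy data one quadrant stays empty. An empty or crowded cell does not prove anything, but it may be a reasonable place to start asking questions.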

A timeline map handles changing perspectives. It allows you to look at how perspectives shift over time. The Y-axis tends to be positive-negative, good-bad, certain-uncertain, etc. The X-axis is some length of time like an hour, a day, a week, a lifetime; you get the idea.
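A time-based map can be sketched in a few lines as well. The example below assumes a made-up week of "certainty" readings for one person and a hypothetical helper, `largest_shift`, that finds the biggest jump between neighboring points. It is only one way to look for a moment worth investigating, not a standard method.

```python
# Hypothetical readings of one person's certainty over a week, scored from
# -1.0 (very uncertain) to +1.0 (very certain). All values are invented.
timeline = [
    ("Mon", 0.6),
    ("Tue", 0.5),
    ("Wed", -0.4),
    ("Thu", -0.5),
    ("Fri", 0.2),
]


def largest_shift(points):
    """Return (from_label, to_label, change) for the biggest jump between neighbors."""
    best = None
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        change = y1 - y0
        if best is None or abs(change) > abs(best[2]):
            best = (t0, t1, change)
    return best


start, end, change = largest_shift(timeline)
print(f"Biggest shift: {start} -> {end} ({change:+.1f})")
```

The sharpest change is where I would go back to the stories and ask what happened in between.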

A map of concentric circles handles complementary perspectives that grow based on their proximity, magnitude, or extremity. The outside is related to the inside. For example, the inner circle is your comfort circle, each circle out can hold situations where you feel less and less comfortable, and the outermost circle is panic.

A comparative map lets you build analogy and compare like with like. A and B are different, but there might be some property that is the same between them. Note that there is a debate about how ethical and valid it is to use this method, especially in anthropological contexts: the argument is that history does not repeat itself, so we cannot claim the motivations of one group to be the same as another's just because there is something similar about them. Comparative frameworks are used to generate ideas and hypotheses, not to provide evidence.

There are more but really these four are the major ones I use. You can also combine these maps together into a single glorious framework if you need to.

The goal of creating these diagrams is to map the perspectives you find in your data and discover particular moments of tension or mystery or an attribute, previously hidden, yet suddenly revealed. This tension, mystery, or attribute becomes the cornerstone for new questions and new ideas.

What makes this difficult is that pretty much nobody can fully articulate their perspective on anything. That said, people do speak in themes. Most likely you’ll find thematic patterns among the things you hear people say:

“When people talk about Jill, they always bring up flowers or her radiance or beauty.”

“However, when people talk about Jack, they never talk about his looks. Instead, they bring up his character.”

“This group of people over here do this, that group over there does that.”

If only you could condense people’s thoughts into clear ideas like that in every project… Most likely, you’ll find even more sparse patterns. Little repeated words, or slight actions. Dozens of them with subtle variation. Perhaps the only thing you notice is a certain wistful look people get when answering questions.

The point is that you will most likely have themes, not perspectives. It is your duty to piece together the perspective a person holds about a particular subject from the various themes and other notes or information that you might have. These perspectives then get built into a framework and then you’ll discover something that will suddenly make everything clear.

If I had to make this into a process, it would look like this:

Prose and poetry → Data → Themes → Perspectives → Frameworks → Ideas

As I was saying before, the term “coding” usually encompasses everything from Themes to Perspectives to Frameworks.

Just to make this clear: a researcher starts by interviewing someone and looking at the prose and poetry that the person is using. This turns into snippets of stories that the researcher assembles into a knowledgescape. The researcher considers what common or important stuff is coming up over and over in the data and groups that according to theme. These themes are then reflected by different perspectives (contrasting, complementary, changing, or analogous) and a framework is born.

The framework is usually the stopping point but at least for designers, the frameworks must also generate ideas. If this entire chain is done with due diligence, then a single framework can generate nearly endless ideas and even help edit ideas that have not worked out previously.

We have already discussed going from the prose and poetry of an interview to the data — and building that knowledgescape.

Next, going from the Data to the Themes seems intuitive. It’s all about staring at the data, squinting at it, looking at it from far away, and letting your pattern-recognizing brain do what it does naturally. This step can also be called “labeling unstructured data,” and some computers can do this, but I won’t bore you with those details. In fact, most discussions of in-vivo or grounded coding fall into this step.

So we come up on the first tricky part. How do you go from Themes to Perspectives? How do you get many perspectives of multiple people? We will need to be really careful about the exact meanings of those two words:

A Theme — an idea that seems to somehow pervade an interview or many interviews. A cluster of data that share some common through-line like a word, a feeling, an act, a texture, a color, etc.

Theme 1 — “Jill is beautiful.”

Theme 2 — “Jack has a very determined character.”

Theme 3 — “Jack seems to love Jill. It is unclear how Jill feels about Jack.”

Theme 4 — “Gertrude is independent and not very related to either Jack or Jill.”

A Perspective — tying a collection of themes back to an individual or a group which then reveals something new about their character.

Perspective 1 — “Not only does Jack love Jill, but he is so enamored that he cannot do or think about anything else. He is obsessed. Everything relates to Jill.”

Perspective 2 — “Gertrude is not obsessed with Jill. She lives a full life.”

The way to collect themes is by clustering certain data together. You cluster little tidbits of information from everything you have, and then you must try to name that cluster. The name of the cluster is the Theme, the line that connects the little data pieces. Objectivity here is important. It is far too easy to transform data or to impose a common through-line that didn’t emerge from the data itself.
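The clustering step can be sketched roughly as follows. The snippets, the hand-applied codes, and the `min_sources` threshold are all assumptions made up for this example. The threshold in particular is just one possible guard against a theme that only one voice (or only the researcher) supports, not an established rule.

```python
from collections import defaultdict

# Hypothetical coded snippets: (interviewee, code, quote). The codes are
# labels a researcher attached by hand; nothing here is real data.
snippets = [
    ("P1", "jill-beauty", "She always reminds me of flowers."),
    ("P2", "jill-beauty", "Jill is radiant."),
    ("P3", "jill-beauty", "Such a beautiful person."),
    ("P1", "jack-character", "Jack never gives up."),
    ("P2", "jack-character", "He is very determined."),
    ("P3", "jack-luck", "Jack just got lucky."),
]


def cluster_by_code(snippets):
    """Group quotes and the people who said them under each code."""
    clusters = defaultdict(lambda: {"quotes": [], "sources": set()})
    for person, code, quote in snippets:
        clusters[code]["quotes"].append(quote)
        clusters[code]["sources"].add(person)
    return clusters


def supported_themes(clusters, min_sources=2):
    """Keep clusters voiced by at least `min_sources` different people.

    The threshold is an assumed rule of thumb, not an established standard.
    """
    return sorted(
        code for code, c in clusters.items() if len(c["sources"]) >= min_sources
    )


clusters = cluster_by_code(snippets)
print(supported_themes(clusters))
```

A cluster that fails the check is not thrown away. It simply stays a note rather than a Theme until more of the data speaks to it.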

The next tricky step is to connect the Themes to an individual or a group to get a perspective. One common way, at least in the design industry, is to create a “user persona”: strawmen who are stand-ins for individuals the researchers have spoken to. This is done in order to generalize from one individual to a group of people.

Jack as a user persona — Jacks love Jills. A Jack is a male aged from 18–35, likes apples, works at a bank, lives in New England.

It’s not enough to stop at building this generalized strawman. The Themes you connect to Jack must also work together to reveal something about his character.

Jack as a user persona (continued) — A Jack’s love can be so profound that he stops eating apples and stops coming into work. He is penniless, hungry, yet unwilling to think about this, let alone begin to fix the situation he is in.

Here is the rub: it’s natural to look for Jack’s opposite, to look in our data for someone who is indifferent to Jill. Like Gertrude. There is this sneaky thought that happens where many researchers feel a drive to juxtapose Jack and Gertrude. But what if Gertrude also liked Jill? Not in the same way as Jack, but with some kind of adoration. What if Gertrude also loved Jill as obsessively as Jack, but these feelings developed differently? What if everyone in the town is obsessed with Jill? Shouldn’t we take a closer look at Jill? And what if Gertrude is obsessed with Jack? When we go to write Gertrude’s perspective, it is vital to capture the truth of what is important to her. The focus of the overall project can shift by accurately revealing the nuance across all perspectives. Capturing accurate, revealing perspectives is at the heart of building a powerful Framework.

This space, between Themes, Perspectives, and Frameworks is also exactly where more modern philosophies of design and ethnography can be found. For example, a researcher practicing reflexivity could include their own research approach as data to be grouped in themes, and later connected to a perspective.

“Jack seems to love Jill but only when filling out an online form. The researchers will now conduct at-home interviews in the town to see if the Jack-loves-Jill assumption is correct.”

The reflective researcher could also link the themes to a perspective about themselves, or at least attempt to in order to articulate their own stance.

Philosophies like inclusivity in design look to see if the Themes, which emerge from the data, exclude anyone by their very nature. This excluded character could then become the primary focus when writing a perspective: either it is this character whose perspective is being captured or they become the subject which the perspective revolves around.

Participatory design combines these two ideas of 1) allowing a researcher to be a part of the data, themes, and perspectives and 2) examining who is excluded in the data. Interviewees, users, and other “stakeholders” can see and comment on the themes and perspectives. The researcher’s job is then to find other voices or people who are somehow involved in or affected by the research and put them into this co-design mix as well. The idea is that all of these people have access to the data and themes and can help group themes and even help write perspectives about themselves and others.

The next tricky part is moving from Perspectives to Frameworks. It is at this moment that a lot of interpretive work happens. Are the perspectives you wrote really contrasting? Could the perspectives change (i.e. are they stable or unstable)? Or are you actually looking at a set of perspectives across time rather than in one particular moment? Answering these questions can be hard, and you will most likely answer them over and over again. Remember, we are not dealing with thematic clusters anymore, but with multiple perspectives, each of which is revealing something deeper about a person or group.

When people say that human-centered design is “iterative,” they are usually speaking about writing a set of perspectives, trying to map them onto a framework, finding that it does not yield any good results, and going back to see whether any perspectives need refining. Either they need to capture more themes, or the perspectives they wrote didn’t capture the heart of the person or group.

A good question now is: how do you know if a framework is working?

Many people say that the point of a framework is to create a neat and tidy map of all the perspectives and everything is accounted for. No, the framework should not be descriptive. The point of a framework is to highlight a moment of tension or mystery. This moment either guides you to ask new questions or it is a place of opportunity for a good idea.

Now comes the third tricky part. As you might have guessed, one researcher could construct the entire chain of Themes-Perspectives-Frameworks and discover a moment of mystery. It’s possible that a second researcher could create their own chain of Themes-Perspectives-Frameworks from the same data but end up discovering a very different moment of mystery.

In fact, these two researchers (let’s call them Jack and Jill for old times’ sake) could have the same set of Themes but write their Perspectives differently from one another and, in so doing, end up constructing different Frameworks. Jack could end up with a contrasting spectrum while Jill could end up with a time-based framework.

On one hand, this speaks to the importance of capturing the heart of someone when writing Perspectives, as opposed to inventing a contrasting perspective just to have some juxtaposition. It also explains why so many researchers turn to statistics and algorithmic coding when clustering data into Themes: it brings some objectivity to counter human biases and fallibility. Even philosophies like participatory design rely in part on the “wisdom of the crowds” to ensure that thematic clusters are relevant and perspectives are accurate.

On the other hand, the fact that a single research topic could result in multiple, equally valid frameworks speaks to the complex nature of the types of problems that human-centered design and ethnography are trying to solve. This complexity is relatively small in issues focused on particular goods and services, as in the realm of human factors. However, it rears its head when designers or ethnographers are dealing with more systemic issues, like governmental or society-wide policies, race- or feminism-related structures, etc. A researcher might even start in a relatively benign space but, as the research continues, discover that this benign-ness was hiding much more complex, interwoven topics.

So when you have multiple, valid, frameworks with multiple moments of mystery, doesn’t it mean that multiple ideas could affect that “deep foundational truth,” and all be answers to a single research question? If you had that thought, then you would be correct. This is what Horst Rittel and Melvin Webber were talking about in 1973 when discussing “wicked problems.” No single idea will “solve” a systemic issue, but instead an idea can shift an issue and transform it into something else. So any of the ideas that arise from any of these multiple frameworks (if they are valid and true) has the capacity to impact the people the researcher has been studying, but impact them in different ways.

If anyone has read enough Spengler or Adorno, then you’d recognize a criticism of this particular way to code data. I’ve presented a process that starts with a group of people and, through data collection, extrapolating themes, writing perspectives, and building frameworks, ends with an idea. It is theoretically possible to use this process in reverse: start with an idea, find people to sell it to (“potential users”), break the idea into key components from which a framework can be built, derive the necessary perspectives and themes, and, instead of “collecting data,” create “identifiers” for people to resonate with. I wrote about this before, and so did Eugene Schwartz in his textbook Breakthrough Advertising. In a similar vein, without changing direction, it is possible to use this process in a predatory way and prey on feelings of inadequacy, fear, and temptation to create something desirable yet destructive.

In the words of people who proclaim the death of our civilization, doing so allows the machine of mass consumption to present the people with the same object of desire, disguised by personalization. In turn, rather than expanding thought and making new ideas, it becomes possible to regurgitate and spread a kind of sameness that takes on different forms to appeal to individual tastes. It keeps people “under the deafening drum-fire of theses, buzzwords, and standpoints” (Decline of the West, Spengler). Or, in a less grandiose way, isn’t it possible to use this to make trivial stuff which is at minimum entertaining yet profitable? Just fun commodities to pass the time?

In the quest to create new products, services, and policies, are we the culture that is “compulsively engineering our own destruction, or creating wings that allow us to fly into the future?” (Leonard Bernstein).

I’m not sure.

I choose to look at this type of work as an endless trip into the depths of modern human experience in all its dynamism. To me, engaging with this work is an act of preserving the nuances of human life. By playing with this stuff and showcasing it, we offer some freshness that may create progress.

Anyway, that is what it means to me, but that is not the important part. The important part is the question that lies in the center of this duality: whether design offers freedom or control. This question is the very first question that should be asked at the beginning of every project:

“Why have you decided to study this topic, or this group of people in the first place?”

I was hired by a land developer to help them rebuild a library. There wasn’t a lot of room to work with, and the surrounding community was quite poor relative to the people in the nearby towns. So I did my best, conducted some interviews, got some data, wrote some perspectives which were OK’d by the people living there. Finally I handed over this nice, tidy report.

I never stopped to ask why the developer chose to specifically target the library. Who even started this project anyhow: the community? Some public official in the state government trying to assuage an outcry from this town? A land developer looking to make a quick buck? How would the answer to this question change the report I ended up submitting?

Ultimately, this information, where the project originates, is really the final ingredient that you must consider when thinking about how to code your data.

Written by Yuri Zaitsev