The Journey to Democratize Data

Some first steps in our user research process.

Erin DeLaney
Redivis
Feb 8, 2017


Data are numbers and figures, but they are also people’s lives. Whether it’s revealing the hypocrisy and inhumanity in an immigration order, articulating how misguided foreign aid policies can hurt women’s health, or even something as trivial as mapping inauguration crowds, data are the best way we have to represent large patterns in humanity.

But these data are only powerful if we are able to use them. As the design lead at Redivis, it’s my job to design a platform that empowers people to understand and tell the stories in data. It is within this mission that Ian Mathews, Sean McIntyre and I aim to create tools and interfaces to make huge amounts of information accessible, collaborative and intuitive.

After launching the beta version of our platform last spring, we approached building the full version eager to dig in and start solving some problems. Unsurprisingly, many questions came up: How will it fit in the current technological landscape? How will these ambitious tools work? And fundamentally — where do we start? I wrote this post to reflect on our journey to find answers using the principles of human-centric design, and to share some of the resulting insights that continue to drive us forward.

User Research

Our goal is ambitious, but it's important to start with a scope that is focused and attainable. In our beta, we worked with researchers at Stanford and focused largely on building visualization tools. The resulting interactive visualizations are powerful, turning a meaningless list of numbers into something our visual selves can quickly process.

3,144 data points on median household income, interactively displayed in context

However, visualizations are only as meaningful as the data that underlie them, so in moving beyond our beta version we wanted to bring the access and curation of raw data into our scope. We are lucky enough to be working with Stanford's Center for Population Health Sciences, which gives us access both to academic researchers who will use Redivis as data consumers and to an organization with huge amounts of data it wants to share.

With these users as our starting point, we set out to learn as much as we could about data usage specifically in academic research. We conducted research in the form of interviews and workplace observation with a cohort that included statisticians, radiologists, data librarians, med students, health researchers and full professors. Each one was kind enough to take time to answer our probing questions about their work processes — they even seemed to enjoy it, with one researcher likening our interview to “data science therapy.”

Unsurprisingly, one of the biggest takeaways from this process was that researchers really like doing research, and just want to work on it without being bogged down or sidetracked by everything else. Hooray, we are in the right space! But what exactly is this "everything else"? How can we define those problems so we can start working on a solution?

Research Synthesis

Here is where we take the hours and piles of user research and distill them down to key insights, commonalities, patterns, and trends, turning an unwieldy amount of information into the compass for the next phases of our product development and ultimately answering our basic question: who is our user and what do they need?

Enter the (somewhat cliched but still incredibly helpful) sticky notes. We took over an office in our workspace as our design headquarters and talked about our users for days. (I love being part of a team that’s as into the design process as I am. 😎 )

We began processing our data by discussing our impressions of individual reactions, creating POV statements and profiles for each interviewee, and mapping interviewees along various axes (straightforward vs. iterative in workflow, experienced vs. novice in tool use).

I then retreated into my computer for some more quantitative analyses, clustering common themes and comments and mapping individuals through their research processes.
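For the curious, here is a rough sketch of what that kind of theme clustering could look like in Python. The file name, cluster count, and library choices are purely illustrative assumptions, not a description of our actual process:

# Minimal, hypothetical sketch: group interview comments into candidate themes.
# Assumes one comment per line in a plain text file; all names are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

with open("interview_comments.txt") as f:
    comments = [line.strip() for line in f if line.strip()]

# Represent each comment as a TF-IDF vector, ignoring common English stop words.
vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)

# Group the comments into a handful of candidate themes.
labels = KMeans(n_clusters=5, random_state=0, n_init=10).fit_predict(vectors)

for label, comment in sorted(zip(labels, comments)):
    print(label, comment)

A pass like this doesn't replace reading the interviews; it just surfaces recurring phrases to sanity-check the themes we pulled out by hand.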

Some Insights

The research process most of our users conducted fell into three large phases: obtaining information, processing data, and distributing findings. While obtaining data was chronologically the first phase, our research showed it was also the phase where most projects failed or got derailed entirely. We know our ultimate solution will address all aspects of this research pipeline, but it was great to confirm a natural starting point.

A breakdown of the steps in our typical user's research process

To focus on the "obtain" phase, I mapped users chronologically through the steps of their most recent project. From this it became clear that researchers generally aren't seeking out and using datasets they are unfamiliar with. Even though everyone said they want to work with the best possible datasets for their research, the perceived high investment cost of finding and using a new dataset incentivizes sticking with known materials.

Users mapped chronologically through the "obtain" steps of their most recent project, with terminating lines indicating where a project was discontinued

This made sense, given that accessing restricted data was the number one pain point our users brought up and the number one reason for a research project to stop altogether. In trying to further understand this major blockage, we realized that those who had the most success navigating the bureaucracy and hassle around working with data were generally part of a larger group or had a personal connection that facilitated finding and securing the dataset. (Most of the time this meant drawing on their lab or working group's resources.)

Next Steps

With a better understanding of our users and our starting point, we are now developing and implementing solutions to address some of these key pain points and blockages. We can't remove all restrictions around data (nor would we want to, as patient privacy in healthcare data, for example, is paramount), but we can create a system that streamlines access and lets users know up front whether the dataset they are preliminarily interested in is worth the time it takes to jump through all the hoops to access it.

We can also focus our energy on building tools that make the process of learning a new dataset easier and more standardized, in order to enable exploration. In the fast-paced world of academic research, where missteps can seriously damage a career, we can build an environment where users feel confident exploring and pursuing the research they really want to be doing. Because ultimately, we all need that groundbreaking research (in healthcare, in economics, in policy) to exist.

I, for one, am excited for all that is to come. Stay tuned for prototyping and product implementation that builds on this research, and please don’t hesitate to reach out — we would love to hear your story as well.

Erin DeLaney is the design lead at Redivis. She studied humans at Stanford (‘10), and in her spare time tinkers with LEDs.

Want to join the Redivis team on a quest for democratizing data? Good news, we’re hiring! 💪
