What If You Could Touch Data?

Tangible new perspectives on data accessibility

Rebecca Sutton Koeser
Nightingale

--

A few years ago, I became interested in creating physical representations of data. I’d been working as a software developer for several years in the Digital Libraries / Digital Humanities space, so while data visualization wasn’t my primary job, I’d certainly created my fair share of charts and graphs. I’d also been tinkering with 3D printing and modeling on the side, and I knew that printable 3D models could be created with code or based on parameters. Why couldn’t those parameters come from data?

I was curious what insights and possibilities working with three dimensions would offer, and I also hoped it could improve accessibility for data visualization. I went looking for other work in this space, and found an amazing gallery of “physical visualizations”.

A portion of the gallery of physical visualizations at dataphys.org; the collection is CC-BY-SA.
South American quipus encode information using string color, length, and knots. The Yakama Native American tribe used strings of hemp as personal diaries, where they marked major life events by a knot, a bead, or a shell.

I found these objects fascinating and inspiring, but also frustrating. On the one hand, it’s amazing to think of the Incan Quipu and Yakima Time Ball, which were used as data storage devices, as physical representations of data. Seeing them helped me recognize that this was a broader and older field than I’d previously thought, and reminded me that humans have been tracking and making sense of information with the materials available to them for centuries. I also found 3D printed objects like a weather bracelet and a network visualization, which expanded my thinking about the possibilities of 3D printing for representing data. On the other hand, I was frustrated that few of these objects came with user-friendly instructions that would enable me to create my own versions based on different data. Either they offered little information or they had complicated steps that looked intimidating to follow, without any guarantee of success.

Above: Screenshot of OpenSCAD showing one of my first attempts to create 3D-printable data. Below: A series of 3D-printed charts displayed together (the dark blue is the model shown in OpenSCAD).

I started experimenting with my own simple approaches to 3D modeling and printed data, and made some slow progress. I generated and printed a series of models from time analytics data: my team’s development activity across multiple projects and supporting codebases over the course of a year. I used OpenSCAD, which builds models from a scripting language for creating and transforming solids, and I created a row of cubes of varying height based on my data. As I worked on printing this, I gained tactile insights into the data: when I leaned down to look at project activity side by side, I could clearly see where we’d switched from one project to another, and I could feel with my finger the gap that indicated the holidays. I’m not sure there were any insights that couldn’t have been discerned from a good data visualization; perhaps it’s simply that the slower and more iterative process of modeling and printing the data meant that I inevitably spent more time working with this data and thinking about it.
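To make the idea of “a row of cubes of varying height” concrete, here is a minimal sketch in Python that writes OpenSCAD source for such a row. It is an illustration only, not the original script: the function name `bars_to_scad`, the sample activity counts, and all dimensions are assumptions made up for this example.

```python
def bars_to_scad(values, width=5.0, gap=1.0, height_scale=1.0, base=1.0):
    """Return OpenSCAD source for a row of boxes, one per data value.

    All names and dimensions here are illustrative assumptions.
    """
    total_length = len(values) * (width + gap) - gap
    lines = [
        "// generated from data: one box per value",
        "union() {",
        # a thin plate under the whole row keeps the bars connected when printed
        f"  cube([{total_length:g}, {width:g}, {base:g}]);",
    ]
    for i, value in enumerate(values):
        x = i * (width + gap)
        # each bar starts at the bottom of the plate, so a zero value
        # is flush with the plate and can be felt as a gap
        height = base + value * height_scale
        lines.append(
            f"  translate([{x:g}, 0, 0]) cube([{width:g}, {width:g}, {height:g}]);"
        )
    lines.append("}")
    return "\n".join(lines)


# hypothetical weekly activity counts; the zeros stand in for a holiday gap
weekly_activity = [12, 9, 0, 0, 7, 15]
scad_source = bars_to_scad(weekly_activity, height_scale=0.5)
print(scad_source)
```

Saving the printed text to a `.scad` file and opening it in OpenSCAD should show the row of bars. Swapping in a different list of numbers is all it takes to model different data.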

In the fall of 2018, I learned about an opportunity to develop a data physicalization installation for a Digital Humanities conference. I asked three current and former colleagues if they would collaborate with me to develop a few pieces for an installation the next summer, and thus began the project that we eventually named “Data Beyond Vision.” We decided to draw on data from Digital Humanities projects upon which we’d worked, which would allow us to bring a new perspective and sensibility to materials with which we were already familiar. One of the datasets we chose to work on came from the “Shakespeare and Company Project”, which all four of us had worked on. The Project tells the story of the English-language bookshop and lending library in Paris owned and operated by Sylvia Beach from 1919 to 1941, based on archival materials held at Princeton University. Although the bookshop was run by a woman, Beach and her bookshop seemed to us to be famous mainly for their proximity to famous men: James Joyce (Beach was the first publisher of Ulysses), Ernest Hemingway, and other members of the Lost Generation who patronized her lending library. And yet, our experience from working on the project was that this was a community of women readers. We wanted to foreground that activity to make it more visible.

The other project we chose was “Derrida’s Margins”, which catalogues every quotation, citation, footnote, and reference in Jacques Derrida’s de la Grammatologie, and connects them to annotations in his personal copies of the texts he cites. None of us are particularly fond of Derrida or deconstructionism, but we thought it would be interesting to bring a new perspective to the data and highlight the massive amount of work that went into curating the data for this project.

We brainstormed different methods and materials we could use to represent the data, and this was where working with collaborators was incredibly valuable. I got new input and advice on my 3D modeling, and I worked with one collaborator on an idea to use weaving to represent one dataset, while our two other collaborators pursued possibilities for data representation with paper using origami and kirigami. They also suggested ways to make our data representations more participatory and engaging: we could make our installation interactive by showing a weaving in-progress that visitors could add to, or by providing paper and instructions to fold a representation of some aspect of our project data.

At some point along the way, I encountered Catherine D’Ignazio and Lauren F. Klein’s work on Data Feminism and Feminist Data Visualization, and I quickly recognized that Data Beyond Vision was a profoundly data feminist project. In particular, it aligns strongly with these three principles of data feminism: elevating emotion and embodiment; considering context; and making labor visible.

Physical objects have to be approached in space, with the body. You have to choose, and change, your angle or perspective, and you have to get close enough to touch. Making data representations tangible combats the assumed neutrality and objectivity of visualizations. Feminist philosopher Donna Haraway describes that assumed neutrality as the “god trick of seeing everything from nowhere”, and D’Ignazio and Klein elaborate that what seems to be a neutral, omniscient view is still a partial perspective. With a physical object, you’re inevitably more aware of yourself in relation to the object, and you can’t touch or grasp the entire thing at once.

Creating physical representations of data is much slower and more laborious, at least for now, because there aren’t existing tools that can be easily applied. Creating 3D models is an iterative process of modeling, printing, testing, trying, failing, adjusting, and trying again. Folding paper or weaving yarn requires working with your hands and spending more time with the data.

We also had an inside perspective that gave us a better understanding of the data. Since we worked on the Digital Humanities project teams that generated the data, we had seen and touched some of the archival materials it was drawn from; we knew how large the teams were and how much time went into collecting, refining, and curating the data; and we had contributed to it ourselves. This gave us a deeper understanding of the limitations and potential of the data, which made it natural for us to consider the context and take it into account when presenting it.

I continued to prototype and refine my approach to 3D modeling time-series data. I tried Legos, and I experimented with simple three-dimensional versions of bar charts. I was working with a new, more complex set of data that I knew fairly well but not deeply: monthly membership totals from the Shakespeare and Company lending library. The first time I showed a 3D printed prototype to Joshua Kotin, the faculty director on the Project, he immediately recognized the dip in the middle as the place where we were missing records. I had not previously noticed this; somehow it had not been obvious to me in all my other work on the project. Perhaps this was because I had been focused on data modeling, migrations, and dealing with edge cases; the 3D prototype gave me a different perspective.

Eventually, I was inspired to try a 3D variation on a lollipop chart, and was surprised to find it was much more readable than the 3D bar chart.

Various prototypes: Lego, 3D model, 3D printed bar chart and lollipop chart
A screenshot of Blender scripting view showing a portion of my code and a model generated by the code.

Somewhere along the way, I gave up on OpenSCAD for creating my models. The scripting language that OpenSCAD uses was too limited and simplistic, and it couldn’t handle the size and complexity of the models I wanted to generate. Eventually I settled on Blender, an open source 3D creation suite that supports creating printable models. Blender has a powerful Python API and script view that lets you import and run your code, and then see and interact with the model you’ve generated. After some trial and error, and much consultation of the documentation and the Blender Stack Exchange site, I eventually developed Python scripts that could read in CSV files and generate lollipop chart models directly from the data.
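As a rough illustration of what a CSV-to-model script can look like, here is a minimal sketch. It is not the script described above. The CSV column names (`year`, `month`, `total`), the function names, and every dimension are assumptions for this example. The `bpy` calls are standard Blender primitives in recent versions (2.8 and later), and they only run inside Blender’s own Python.

```python
import csv
import io


def read_monthly_totals(csv_file):
    """Read rows with year, month, total columns into (year, month, total) tuples."""
    reader = csv.DictReader(csv_file)
    return [(int(r["year"]), int(r["month"]), int(r["total"])) for r in reader]


def lollipop_specs(rows, spacing=10.0, height_scale=0.5, min_height=1.0):
    """Turn data rows into (x, y, stem_height) positions on a month-by-year grid."""
    first_year = min(year for year, _, _ in rows)
    specs = []
    for year, month, total in rows:
        x = (month - 1) * spacing          # months run left to right
        y = (year - first_year) * spacing  # one row per year, earliest year first
        specs.append((x, y, min_height + total * height_scale))
    return specs


def build_in_blender(specs, stem_radius=1.0, head_radius=2.5):
    """Add a cylinder stem and a sphere head per data point (Blender only)."""
    import bpy  # only importable in Blender's bundled Python

    for x, y, height in specs:
        # a cylinder is centred on its location, so lift the stem by half its height
        bpy.ops.mesh.primitive_cylinder_add(
            radius=stem_radius, depth=height, location=(x, y, height / 2)
        )
        bpy.ops.mesh.primitive_uv_sphere_add(
            radius=head_radius, location=(x, y, height)
        )


# hypothetical sample data standing in for a real CSV file
sample = io.StringIO("year,month,total\n1919,11,4\n1919,12,10\n1920,1,16\n")
specs = lollipop_specs(read_monthly_totals(sample))
print(specs)
```

Keeping the geometry calculation separate from the Blender calls means the layout can be checked in ordinary Python first. `build_in_blender(specs)` is then run from Blender’s scripting view once the numbers look right.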

For the Shakespeare and Company Project membership data, I wanted to show two datasets alongside each other. The Project is based primarily on two sets of archival materials: logbooks with subscription information, and lending library cards which detail the borrowing histories for a subset of members. Both records are incomplete: logbooks are missing for some years, and many of the lending library cards were lost or destroyed (Beach sometimes even used the backs as scratch paper!). I wanted to see how these two sources combined and complemented each other to represent the membership of the library over time.

A close up of 3D printed lollipop chart with labels.

What I came up with is a two-variable lollipop chart of membership totals by month and year. Months run left to right from January to December along the short end, and each row is a year starting with 1919 at the front. Each data point is represented as a half-lollipop. If the number of members with subscriptions in a particular month exactly matched the number of borrows, the two halves would line up and make a single lollipop. But where they don’t match, you can see and touch the difference between the two. I also used two different shapes, so that it would be easier to discern the two data series by touch. This piece is interesting to explore by touch, because it’s large enough that you can’t touch the whole thing at once, and it’s not clear exactly where to start.
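The pairing behind a two-variable chart like this can be sketched in a few lines of Python. This is a hypothetical illustration, not the Project’s code. The function name `pair_series`, the dictionary layout, and the sample counts are all invented. Keeping a missing month as `None` rather than zero is one possible design choice, not necessarily how the printed model treats gaps.

```python
def pair_series(subscribers, borrowers):
    """Combine two {(year, month): count} dicts into one record per month.

    A month missing from one source stays None rather than 0, so a gap in
    the records is not mistaken for a month with no members.
    """
    paired = []
    for key in sorted(set(subscribers) | set(borrowers)):
        left = subscribers.get(key)
        right = borrowers.get(key)
        both = left is not None and right is not None
        paired.append({
            "year": key[0],
            "month": key[1],
            "left_half": left,    # e.g. members with subscriptions
            "right_half": right,  # e.g. members with borrowing activity
            # two halves of equal height read as a single whole lollipop
            "whole": both and left == right,
            "difference": abs(left - right) if both else None,
        })
    return paired


# invented counts for illustration only
subscribers = {(1921, 1): 40, (1921, 2): 42}
borrowers = {(1921, 2): 42, (1921, 3): 15}
paired = pair_series(subscribers, borrowers)
for record in paired:
    print(record)
```

Each record holds the two half-lollipop heights for one month, so a modeling script can place the halves side by side and a reader can feel where they differ.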

Have access to a 3D printer and want to make this yourself? I’ve made the models available with instructions on how to slice and print them. The labels will work best with a dual filament printer, but the actual model is printed in pieces and can be printed on any 3D printer.

Rendering of two-variable 3D printable lollipop chart and labels representing Shakespeare and Company lending library membership over time.

For our Data Beyond Vision installation, we also created data physicalizations by folding, weaving, and stacking.

The origami piece, created by Nick Budak, gives the illusion of two intersecting shapes, a cube and an octahedron. Those two shapes provide a volumetric representation of the borrowing activity of the famous members of the Shakespeare and Company lending library (the cube) compared with the activity from lesser known members (the octahedron). Names of the lesser known members are printed on the paper used to create the octahedron, as a way of making them known. This object can be picked up and held, which means that you can literally “grasp” the two sets of data.

The weaving uses a pattern created by Gissoo Doroudian to represent the references in chapter 1 of de la Grammatologie. Each type of reference (epigraph, citation, quotation, footnote) is represented by a distinct yarn and weaving pattern. This piece is interesting and engaging to touch, and brings a much more human scale to the data. At our installation, we displayed a loom set up with an in-progress weaving and directions, and invited visitors to participate and weave a few rows of data.

Want to try making it? Gissoo has written up detailed instructions that teach the weaving patterns she included and provide the pattern to recreate this weaving.

The stacking piece, created by Xinyi Li, uses the technique of pop-up box folds to encode multiple variables into a hybrid of time-series and stacked bar charts that shows part-to-whole relationships. Each unit represents one year with nine variables from Shakespeare and Company Project lending library membership data, conveying the number of active members, members with borrowing activity, and the difference between renewing members and new members.

Want to try your hand at cutting and folding? We have directions and a downloadable PDF of the model; all you need is a printer, card stock, and a blade.

Do you want to learn more?

We recently published an article on the Data Beyond Vision project, with more details, including some of the theoretical background to our work and approach, and how we see data physicalization in relationship to other modes of representing and visceralizing data. The article includes a section devoted to each object, with detailed descriptions of how and why we created them, what insights can be gained from them in contrast to familiar data visualizations, and our ideas about possible next steps. We thought it was important to provide instructions (where permissions allow) to recreate our objects, so that people can follow along with what we’ve created, experience what it’s like to touch and hold a data physicalization, and hopefully be inspired and empowered to create their own versions.

Data physicalization has tremendous potential to give new perspectives on data, to make it approachable and accessible in new ways, and to remind us all of the incredible amount of labor that goes into collecting and curating data and visualizing it. Paper and yarn are readily available and easy to work with, and they have such expressive range that they could be used in a variety of ways to represent data. I’m excited to continue my own explorations of writing code to generate printable 3D models of data, and hope I may eventually be able to clean up and generalize my code to the point that I could share it. I hope that in the future it will be even easier to print braille labels to display with, or better yet embed in, printed models. And while the 3D printed models I created as part of “Data Beyond Vision” were inspired by standard data visualizations like bar charts and lollipop charts, I look forward to future models that take greater advantage of newer modeling algorithms and alternate fabrication methods to create something distinctively new and different.

Rebecca Sutton Koeser is the Lead Developer at the Center for Digital Humanities at Princeton, where she and her team design and develop custom software for Digital Humanities research.
