D3.js: Data Visualization with JavaScript for Beginners

liz lovero
Jul 16, 2015 · 5 min read

The power of the unaided mind is highly overrated… The real powers come from devising external aids that enhance cognitive abilities. — Donald Norman

In many ways, D3.js (aka Data-Driven Documents, in JavaScript) was a coding gateway drug for me. Last year, while still wholly unfamiliar with JavaScript, I was able to use the library to make an interactive visualization with data from the Harry Ransom Center, where I was a graduate research assistant. D3.js produced beautiful results, gave me a great deal of control over design and display, and was frankly easier to work with than making a comparable diagram in Adobe Illustrator. I was hooked!

Turning code into something pretty was a big thrill. Using that code to make a point at work feels like super success! \\ ٩( ᐛ )و //

About D3.js

D3.js is a JavaScript library written in 2011 by Michael Bostock with Vadim Ogievetsky and Jeffrey Heer as members of the Stanford Vis Group. (Serious Academic Paper here.) Bostock went on to be the editor of interactive graphics at the NYTimes and hosts the D3.js source code on his GitHub.

The library can be used to make a huge range of interactive visualization types — really anything that you can imagine — and the dynamic effects have been repurposed for games and tarot card fortune telling.


The basics are as follows: (adapted from the docs)

  • D3 is a JavaScript library embedded within an HTML webpage.
  • D3 uses pre-built JavaScript functions to select elements, create SVG objects, style them, or add transitions, dynamic effects or tooltips to them.
  • These objects can also be styled using CSS.
  • Large datasets can be easily bound to SVG objects using simple D3.js functions to generate rich text/graphic charts and diagrams.
  • The data can be in various formats, most commonly JSON, comma-separated values (CSV) or tab-separated values (TSV).
  • For n00bs, there is a gallery on GitHub of example code that can be easily adapted to your purposes, including both HTML/JS and sample data.
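
To make the "bind data to SVG objects" idea concrete, here is a minimal sketch in plain JavaScript that draws one rectangle per data value. It is not D3 itself: in D3 the same mapping is done with calls such as d3.select, selection.data, selection.enter, selection.append and selection.attr. The function name barsToSvg, the sizes and the sample numbers are all made up for illustration.

```javascript
// Dependency-free sketch: turn an array of numbers into SVG <rect> bars.
// barsToSvg, the sizes, and the sample data are hypothetical.
function barsToSvg(values, barWidth = 20, height = 100) {
  const max = Math.max(...values);
  const rects = values.map((v, i) => {
    const h = Math.round((v / max) * height); // scale the value to a pixel height
    const x = i * barWidth;                   // one bar per datum, left to right
    const y = height - h;                     // SVG's y axis points down
    return `<rect x="${x}" y="${y}" width="${barWidth - 2}" height="${h}" />`;
  });
  return `<svg width="${values.length * barWidth}" height="${height}">${rects.join("")}</svg>`;
}

const svg = barsToSvg([4, 8, 15, 16]);
console.log(svg);
```

Pasting the printed string into an HTML page shows four bars. D3 adds what this sketch leaves out: updating existing elements when the data changes, plus transitions and interaction.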

D3.js is also widely used by outlets like the Guardian and NY Times.

From the Guardian UK http://www.theguardian.com/world/interactive/2013/feb/12/state-of-the-union-reading-level

Using D3 as a coding novice

For this novice, the easiest plan of action was to select a template that would best represent my data. I went with the interactive heat map to compare the growth and development of collections at the two largest UT archives over time.

Fixing up the Dataset

I had an existing dataset of XML documents (encoded EAD finding aids, basically formulaic metadata describing the contents of an archival collection) from the two largest University of Texas archives: the Harry Ransom Center (where I was working) and the Dolph Briscoe Center for American History. A colleague helped scrape data from Texas Archival Resources Online (TARO), a public website of UT collections info.

Example of a single finding aid in EAD

Here’s what the source data looked like: first as a finding aid on TARO, then as an XML doc. I wanted to group the collections to get a better sense of how UT was approaching its collecting. Specifically, was the University just obtaining the papers of famous individuals, was it more focused on records from organizations, or was it producing a lot of thematic collections, usually assembled by another collector and donated en masse?

XML file of a single finding aid.

The scraper used keywords from the XML tags (aided by a fair bit of hard-coded data normalization on my part) to sort the collections into three groups to illustrate trends in collecting at the respective institutions in the last 15 years.

  • Records (collections belonging to organizations)
  • Papers (collections belonging to individuals or families)
  • Other (for thematic collections like Circus ephemera)
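
The grouping step can be sketched roughly like this. The keyword rules, the function name classifyCollection and the sample titles are assumptions for illustration; the actual scraper and its normalization rules are not shown in this post.

```javascript
// Hypothetical keyword rules: sort a collection title into one of three groups.
function classifyCollection(title) {
  const t = title.toLowerCase();
  if (/\brecords\b/.test(t)) return "Records"; // organizations
  if (/\bpapers\b/.test(t)) return "Papers";   // individuals or families
  return "Other";                              // thematic collections
}

// Made-up titles, one per group.
const sampleTitles = [
  "Example Publishing Company Records",
  "Jane Doe Papers",
  "Circus Ephemera Collection",
];
const groups = sampleTitles.map(classifyCollection);
console.log(groups);
```

Real finding aids are messier than this, which is where the hard-coded normalization comes in.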

The final result was a three-column TSV file with the group, year, and value (the number of collections).
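
As a rough illustration of that layout, here is a plain JavaScript sketch that parses a few made-up rows into objects. The header names (group, year, value) and the numbers are assumptions, and parseTsv is a hypothetical helper; in a D3 page, d3.tsv loads such a file for you.

```javascript
// Made-up sample rows in the three-column layout described above.
const tsvText = [
  "group\tyear\tvalue",
  "Records\t2000\t12",
  "Papers\t2000\t140",
  "Other\t2001\t0",
].join("\n");

// Hypothetical helper: split a TSV string into one object per data row.
function parseTsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const columns = headerLine.split("\t");
  return lines.map((line) => {
    const cells = line.split("\t");
    const row = {};
    columns.forEach((name, i) => {
      row[name] = cells[i];
    });
    // Every cell arrives as a string, so convert the numeric columns.
    row.year = Number(row.year);
    row.value = Number(row.value);
    return row;
  });
}

const rows = parseTsv(tsvText);
console.log(rows);
```

The string-to-number conversion is the step that is easiest to forget: without it, values compare and sort as text.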


Putting it all together

I used the Interactive HeatMap (which is based on this example I found from Trulia) to display my TSV data.

This is the basic import of data and layout code.

I adapted the gridsize and legend element to fit my data, and selected some colors to cover the range. Since I had a pretty diverse set of data (values from 0 or 1 to several hundred), I played around quite a bit with the scale of the data, as seen below.
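
One common way to handle such skewed values is a quantile color scale, which assigns colors by rank rather than by raw size, so a few very large counts don't wash out everything else. The sketch below is a simplified plain JavaScript version of that idea, not the code from my heat map; D3 ships its own quantile scales. The palette, the sample values and the name makeQuantileScale are assumptions.

```javascript
// Hypothetical palette (light to dark) and made-up sample values.
const colors = ["#ffffd9", "#7fcdbb", "#1d91c0", "#081d58"];
const values = [0, 1, 2, 3, 5, 8, 40, 120, 300];

// Simplified quantile scale: bucket values by their rank in the data.
function makeQuantileScale(data, range) {
  const sorted = [...data].sort((a, b) => a - b);
  // Cut points that split the sorted data into range.length roughly equal buckets.
  const thresholds = [];
  for (let i = 1; i < range.length; i++) {
    thresholds.push(sorted[Math.floor((i * sorted.length) / range.length)]);
  }
  return (v) => {
    let bucket = 0;
    while (bucket < thresholds.length && v >= thresholds[bucket]) bucket++;
    return range[bucket];
  };
}

const colorFor = makeQuantileScale(values, colors);
console.log(colorFor(0), colorFor(3), colorFor(8), colorFor(300));
```

With a linear scale, everything under 40 here would land in nearly the same pale shade; the quantile version spreads the small values across the palette.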

[Images: the heat map at different data scales]

So, now feast your eyes on the final results!

For more on D3 and dataviz generally

I used the online tutorial DashingD3.js to learn all the basics of manipulating the data and SVG elements necessary to build my first viz. I can’t recommend it strongly enough to the newbie; it is the best place to learn basic D3.js.

Also, Mike Bostock has adapted his really insightful talk from Eyeo 2014 about the nature of visualization and algorithms into a blog post, and there is this video of it too. He rocks a serious manbun while delivering sharp insights.
