Learning data viz with D3
It was a Sunday morning in Melbourne. Having watched a few video courses on data viz with D3, I decided to set myself a goal: by the end of 2020, I will feel confident designing data visualizations with the complexity and visual impact of those by the New York Times. Today marks 30 days on that journey.
For me, data viz is a creative outlet, and a powerful intersection of design and code. It lets you tell captivating stories. The satisfaction of producing something visual, something alive with motion that you can interact with, keeps me motivated to keep playing. And because everything is new, I’m learning an incredible amount really quickly, and that fast feedback loop is compelling. Gamification is built right in.
For 30 days, I journalled in a notebook on Observable, learning out loud as I began the journey to master this craft. You can find all my entries in this collection, Journal: Getting Started with Data Viz.
Here are the resources I used, my learning process, and a reflection on the journey so far. The second half of this story is an appendix that summarises the journal entries.
Resources
- Learning what “good” looks like
- Data exploration and visualization tools
Process
- Step 1: Gather resources
- Step 2: Data exploration
- Step 3: Data visualization
Reflections
- The learning curve
- Gather resources as you go
- Weeding out the noise
- Avoiding mistakes
- Finding your way in D3 docs
- Motivating projects
- Conclusion
About Diana MacDonald
Appendix
- Journal entries
Resources
Learning what “good” looks like
When I teach design for coders, I encourage engineers to crank up the volume on design in their lives. Follow designers on Twitter, subscribe to newsletters, have coffee with designers at work. A lot of engineers learn to build websites from “hello world” examples with no styling, so it’s important to start improving the balance of how much quality design they’re seeing day to day. Lots of engineers also tell me they don’t “have an eye for design” or don’t know what sources to trust because “design is subjective”, so it’s helpful to have a hand in changing that.
Here are some of the industry legends in data viz recommended to me. Their works have helped me learn what “good” looks like in data viz:
And some folk to follow on Twitter:
Data exploration and visualization tools
During the first 30 days, I explored data in Vega-Lite and then levelled up with a D3 visualization:
- Vega-Lite: a visual grammar for specifying charts in JSON. It’s convenient for quickly switching between “marks” (bar, area, point, etc.) with a single underlying data set to get a sense of the shape of the data.
- D3: A JavaScript library for visualizing data with HTML, SVG, and CSS. It’s powerful and works seamlessly on the web. It’s a swiss-army knife for visualising data in digital products, making it a great tool for web developers.
Moving forward, I want to skip Vega-Lite and instead use Datawrapper. It’s a much faster way to see your data, while D3 gives you far greater control over the final results.
To search D3 documentation by method name and so on, I use Dash.
Process
Data visualization has a wildly steep learning curve. It’s multidisciplinary, requiring expertise in data, design, and development. Within each, you need to learn the theory and tools, as well as practice the craft.
To start with, I wanted to focus on only the development step. This is a challenging place to start, since you cannot build a visualization without data or design. This may have made things more difficult for me, but it helped me reach a tangible outcome faster: producing charts.
While building my first bar chart, I needed to learn the minimum amount about Observable itself and data wrangling.
Observable:
- Observable user manual. If you’re learning Observable, I suggest you read at least the first two sections. Side note: Observable is hands down the best live coding environment I’ve used. It blows CodePen, JSFiddle, and all the others out of the water. It’s snappy and reliable. You can edit a cell, run it, and see the result instantly.
Data wrangling:
- JavaScript Data Wrangling by Michael Freeman, Interactive Info Vis
- Learn JS Data on Observable based on Learn JS Data
Step 1: Gather resources
For each chart type I wanted to learn, I’d start with learning some theory. I’d focus on the function of the chart, such as comparison, concept visualisation, correlation, distribution, geographical data, part to whole, or trend over time. I’d also consider the appropriate data types for them and look out for caveats. Finally, I’d compare them to similar charts, such as differentiating line charts from sparklines or scatterplots in 10 May 2020: Line charts.
For learning the theory:
Standing on the shoulders of giants, I would then check out visual examples, code examples, and examples in design systems.
For visual examples:
- Search Pinterest
- On Dataviz project, each chart comes with visual examples, such as you can see under Stream graph
- The Upshot by The New York Times and other news sites
- Chartipedia and Grafiti.io chart search engines
For code examples:
For design systems:
- Shopify, Polaris: Data visualizations
- Morningstar Design System: Charts
- Urban Institute Data Visualization Style Guide
- AntV: G2Plot a charting library
For any remaining questions I had, I would search Slack archives:
- Data Visualization Society
- D3 Slack, linked from the D3 wiki
Even if you’re not into live chat, these Slack communities are a treasure trove of questions answered by industry experts, so you can treat them like a modern and kind Stack Overflow for data viz.
Step 2: Data exploration
For each chart type, I would assemble a dataset appropriate to that chart type. Either I would collect my own or search for a beginner-friendly dataset (that is clean and not missing data):
To see the shape of the data and learn a data exploration tool, I’d build each chart type in Vega-Lite:
- My notebook on Working with Vega-Lite
- Data Visualization Curriculum by Jeffrey Heer
- Observable for Vega-Lite by Visnu Pitiyanuvath
Step 3: Data visualization
After gathering resources and exploring data, I built charts in D3. Generally, I’d find a handful of working notebooks on Observable and recreate them by hand from memory instead of forking, to ensure I understood every line.
For learning D3, I recommend reading these tutorials:
- Introduction to D3 by Tongshuang Wu, UW Interactive Data Lab
- How to learn D3 by Amelia Wattenberger
- Interactive Charts with D3.js by Amelia Wattenberger
- Learn D3: Introduction by Mike Bostock
For more resources, see my Links notebook on Observable.
Reflections
The learning curve
Beyond data viz being a multidisciplinary field, learning to wield D3 requires advanced knowledge of SVG, JavaScript, and CSS, and a healthy amount of patience. Otherwise, it can be challenging to learn these web languages at the same time while attempting to debug and overcome setbacks.
I’ve tackled projects with steep learning curves before, though one notable difference is the amount you need to learn in data viz before you can reliably produce a single, basic visualization. With most other projects, I’ve worked quickly to skill up enough to complete the first, end-to-end realistic piece of work, before continuing on to iterate, refine, practice and expand on that end-to-end piece. For example:
- In music, you might focus on slowly playing a song correctly, before building up speed.
- In product design, you might focus on producing a minimum viable product, then evolve from there.
- In web development, you might focus on scaffolding an app and deploying it to the Internet, then building it out from there.
In data visualization, there’s very little you can achieve without first grasping data collection, cleaning, and other wrangling. You must also know how to code, and learn a new library. With D3, especially, you need to know exactly what you’re trying to achieve with every mark, axis, label, legend, and interaction. This means applying visualization design principles at the same time as coding.
Little strokes fell great oaks
It’s easier to tackle a huge task with steady and persistent effort. Aim for 1% better every day. I made this work for me by aiming to publish a notebook every day. There was no minimum length or time required (though I am sure it would feel silly to publish a single sentence). Sitting down to write 1 notebook, or otherwise change 1 line of code, was enough to keep data viz top of mind for 30 days. It also helps to focus on just the first five minutes to get yourself started.
Gather resources as you go
Along the way, I published useful Links in Observable.
Whenever I stumbled across a link that might be useful in the future but I wasn’t ready to look at it yet, I collected it in my private notes. When the day arrives that I want to build maps or learn Python, I already have great resources to fast-track that effort. Once I’ve road tested them, I’ll add them to my published links notebook.
Weeding out the noise
D3 changed a lot from v3 to v4, and again some to v5. At this time, the Internet is still chockers with tutorials and blog posts for older versions. Some key things to look out for:
- Use `.join()` instead of separate `.enter()`, update, and `.exit()` handling in the general update pattern.
- Use d3-delaunay instead of the deprecated d3-voronoi.
- Use scale variables named `x`, `y`, and `color` instead of `xScale`, `yScale`, and `colorScale`.
Avoiding mistakes
Where possible, learn from Mike Bostock’s notebooks. They’re the most up-to-date notebooks you’ll find and follow lots of great conventions that will save you headaches, even before you understand the reasons for them. One drawback is that these SVGs don’t appear to have been designed accessibly.
What to use in your own notebooks:
- Use a specific version of D3 (or other modules) in imports, e.g. `d3 = require("d3@5")`
- Use generic, conventional variable names, such as `x`, `y`, `xAxis`, `yAxis`, `margin`, `height`, and `data`. This makes it easy to adapt other people’s code to your needs.
- Similarly, rename columns or row headers from your data when you load it in so you can reuse your own chart code with different datasets and only update your code in 1 place (the data cell) instead of every cell where the data field names are used. For example: `Object.assign(d3.csvParse(await FileAttachment(…).text(), ({MyDataTitleField: name, MyDataXField: x, MyDataYField: y}) => ({name, x: +x, y: +y})))`. Learn more in 23 May 2020: Area Charts in Vega-Lite and CSV Parsing.
- Pin a `data[0]` cell to remind yourself what the data looks like.
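The column-renaming tip can be sketched in plain JavaScript. This is a minimal sketch rather than the notebook's actual code: `parseCsv` is a hypothetical, deliberately naive stand-in for `d3.csvParse` (it does not handle quoted fields), and the CSV text is made up for illustration.

```javascript
// Hypothetical, naive stand-in for d3.csvParse: splits on newlines and commas
// (no support for quoted fields) and passes each row object through `row`.
function parseCsv(text, row = d => d) {
  const [header, ...lines] = text.trim().split("\n");
  const columns = header.split(",");
  return lines.map(line => {
    const cells = line.split(",");
    return row(Object.fromEntries(columns.map((c, i) => [c, cells[i]])));
  });
}

// Made-up dataset with dataset-specific column names.
const csvText = `MyDataTitleField,MyDataXField,MyDataYField
Monday,1,4.5
Tuesday,2,6`;

// Rename the fields once, here, and coerce the numeric ones with unary plus.
const data = parseCsv(
  csvText,
  ({MyDataTitleField: name, MyDataXField: x, MyDataYField: y}) => ({name, x: +x, y: +y})
);

console.log(data[0]); // the chart code can now rely on name, x, and y
```

With the renaming confined to the data cell, swapping in a different dataset should only require changing the three source field names in the row function.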
What to look out for when learning from other people’s notebooks:
- Check the D3 import in the appendix at the end to see if it’s pulling in other modules.
- Check the `data` cell to see what data wrangling work might be necessary and the shape of the data.
- Check the appendix for anything else unusual.
Some debugging tips I’ve figured out:
- To `console.log()` something in the middle of a chain of method calls, you can return `console.log(variable) ||` (notice the logical OR pipes `||`). For example: `.join("circle").attr("stroke", d => console.log(d) || d.r)`
- To test the behaviour of your scale functions, first try running them with a literal value, like `x(123)`, instead of the intended usage, `x(d.x)`
- Check your brackets, e.g. `.range([height - margin.bottom, margin.top]);` (one challenge of Observable is debugging when silly stuff happens like forgetting brackets)
- Give elements that you append IDs or classes to make them easier to identify during debugging and to make sure they don’t clash with other things on the page (if you put your chart on a page with other elements). That is, don’t depend on broad selectors, such as `d3.selectAll('path')`.
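The `console.log(...) ||` tip works because `console.log()` returns `undefined`, so the expression falls through to whatever follows the `||`. A minimal sketch in plain JavaScript, using made-up data in place of a D3 selection; the `peek` helper is a hypothetical name, not part of D3 or Observable.

```javascript
// Made-up data standing in for the data bound to a selection.
const circles = [{r: 3}, {r: 5}, {r: 8}];

// console.log returns undefined, so `undefined || d.r` evaluates to d.r:
// the datum is logged and the accessor still returns the radius.
const radii = circles.map(d => console.log(d) || d.r);

// Hypothetical helper that does the same job without relying on `||`,
// and labels the logged value.
const peek = (value, label = "peek") => (console.log(label, value), value);

const doubled = circles.map(d => peek(d.r, "r") * 2);
```

The same arrow function body can be used inside `.attr()` or `.style()` accessors while debugging, then removed once the values look right.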
Finding your way in D3 docs
D3’s documentation is written in a style that’s not intuitive to me. Here are some hints to give you the lay of the land:
- Many functions are overloaded. This means the function will do one thing with no arguments, something else with 1 argument, something else with an argument of a different type, and so on. You may need to scroll up and down the docs to find adjacent explanations for a function called with different arguments.
- To learn about, for example, `.domain()` on a `scaleLinear()`, you need to know that `scaleLinear()` is a type of continuous scale (along with power, log, identity, time, and radial), so you need to search for `continuous.domain()`
- To learn about, for example, `.on()`, you need to know that it’s used on a selection, such as `d3.select()`, `d3.selectAll()`, `d3.create()`, so you need to search for `selection.on()`. In examples, it will also likely appear after a selection’s “transformation method”, such as `selection.attr()`, `selection.classed()`, `selection.style()`, `selection.text()`, `selection.html()`, `selection.append()`, or `selection.insert()`.
- There are loads of aliases and the like, so reading the docs can mean bouncing around a lot, following links to “equivalent” functions.
- There’s a bit of jargon to wrap your head around and it’s often not linked to a definition. Sometimes the jargon isn’t explicitly defined anywhere, just sort of inferred. I found myself occasionally rewriting documentation to strip out the jargon so I could make sense of it. As an example, compare the line.defined() documentation with my explanation of `line.defined()` in 12 May 2020: D3 Line Chart.
- Sometimes the syntax is buried within prose, but it’s worth digging for. I’d prefer to see the parameters and return values called out more prominently, like I’ve seen in documentation on MDN.
I’m under the impression there’s a huge amount of work under way to migrate old tutorials and examples from bl.ocks.org to Observable. Some folk are putting in an incredible effort to ease the learning curve into D3, and I’m excited about where it’s headed.
Motivating projects
Given the steep learning curve, I’ve heard a few folk talk about finding motivating and manageable projects. In Sérgio Spagnuolo’s round-up of Mike Bostock’s Reddit AMA, Mike Bostock to humans: ‘Try to look for small problems first’, a few quotes stood out to me:
Try to look for small problems first, the sort of thing you can solve once per day, whenever you have time. The rewards from early victories are strong motivation to keep going.
… think of small coding problems that you are comfortable solving, and then increasingly ramp up to larger problems as you go. The satisfaction you derive from solving the smaller problems will motivate you to keep going.
I find it to be the easiest thing in the world to work on something if you are passionate about it, and you can break it up into small pieces (like examples) that you can publish and share with others for external validation. So probably, choosing to work on things you are excited about, and then finding space to avoid distractions or interruptions is the key.
In a similar vein, Josh Temple wrote in Why I abandoned online data courses for project-based learning:
The key is to find a project that you’re so excited to bring to reality that you don’t mind pushing through the obstacles along the way. Here are four attributes that I find make a project motivating. Maximize one or more of these and you’ll have a much better chance of actually finishing your project.
- Utility: “This would make my life easier and save me time.”
- Passion: “This would solve a problem I deeply care about.”
- Curiosity: “This dataset really intrigues me and I want to explore it more.”
- Competition: “I want to win this prize and beat other competitors.”
For my part, I’m yet to tackle a large and meaningful project, so the greater challenges are persevering through bugs and maintaining momentum. I’ve mostly focused on tackling hard problems over the weekend with the time, space, and focus to work through thorny problems or track down documentation and solutions. On work days, I’ve scaled back my efforts to focus on learning theory or tweaking the previous day’s chart in some small way.
Once I have a good handle on the basic tools, I will be deciding on a lovable project. I am noodling with some ideas already.
Conclusion
And that’s a wrap. In the appendix below, you can find more details about what I covered and lessons learned each day in my journal.
- Follow me on Twitter to see what happens when I make it to 100 days.
- Join the Data Visualization Society Slack to connect and chat about data viz.
Otherwise, I only want to add that I’m grateful for the selfless people who have spent countless hours creating tutorials, books, and courses, and making them available for free. I’m inspired by the passionate creative coders who make beautiful works that have impact in the world, and share them every day. Thanks so much for all that you do.
About Diana MacDonald
Diana MacDonald is the author of Practical UI Patterns for Design Systems and creator of Typey Type for Stenographers. She leads the design systems team at Culture Amp. Raised in the tropical north of Australia, she has spent the last decade in the tech industry, exploring the digital space with progressive organisations like Culture Amp, Bellroy, and SitePoint. Blurring the lines of designer and developer, she believes in the value of considered, inclusive, and remarkable stories. She wants to help you effortlessly execute your digital ideas.
Appendix
Topics covered and lessons learned each day in Journal: Getting Started with Data Viz.
Day 1–7: Learning the basics with bar charts
- Google Trends suggests that “data viz” is more common than “dataviz” in the US, but globally they’re almost on par.
- D3 bar chart.
26 Apr 2020: Bar Chart Revisited
- Setting `.padding(0.1)` reduces the width of each bar rather than adding space between each bar. This makes sense for conveniently working with bars and increasing padding without blowing the whole chart out.
- `.tickSizeOuter(0)` removes the stump on the bottom left of the chart (it hides the first tick on the axis).
- `.nice()` ensures the top of the chart finishes at 13% instead of 12% with the highest bar poking above the highest tick.
- `width` is the current width of a cell and is part of Observable’s standard library. It is a reactive variable that instantly responds to changes in the window size.
- “Use caution when applying global styles: if a style affects a cell’s height, the runtime may not notice unless the affected cell is re-evaluated.” (Introduction to HTML)
- Some numbers can’t be represented exactly in binary floating-point, meaning code like `tickFormat(d => (d * 100) + "%")` can produce “7.000000000000001%” instead of 7%. You could force a “fix” using `.toFixed([digits])`, or round it using `Math.round(yourNumberHere * 10) / 10`, but use the better option, `d3.format(".1f")` from the d3-format module.
- Showing the median as a line when it’s one value or highlighting bars when it’s two.
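The floating-point issue in the tick format entry can be reproduced in plain JavaScript. A minimal sketch of the two manual workarounds; the third option uses the built-in `Intl.NumberFormat` as a stand-in for `d3.format`, which is a substitution made here so the example runs without D3.

```javascript
// 0.07 has no exact binary representation, so scaling it exposes the error.
const raw = 0.07 * 100;
console.log(raw + "%"); // "7.000000000000001%"

// Option 1: fix the number of decimal places (returns a string).
const fixed = raw.toFixed(1); // "7.0"

// Option 2: round to one decimal place (returns a number).
const rounded = Math.round(raw * 10) / 10; // 7

// Option 3 (a stand-in for d3.format, using the built-in Intl API):
// format the original proportion as a percentage instead of multiplying it.
const percent = new Intl.NumberFormat("en-US", {
  style: "percent",
  maximumFractionDigits: 1
}).format(0.07); // "7%"
```

Formatting the original proportion avoids the multiplication that introduces the error in the first place, which is also why a dedicated formatter tends to be the more robust choice.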
27 Apr 2020: Finalising the Bar Chart
- The general term for lines showing the median, mean, benchmarks, or “best fit” is “reference line” (or “reference band” for showing a range).
- Use `.attr('dy', '1em')` (specifically 1em) to shift text along its y-axis by the height of the text.
- Set styles using `.style('font', '400 12px/1.5 "Work Sans", "Open Sans"')` instead of `.attr('style', 'font: 400 12px/1.5 "Work Sans", "Open Sans"')`
- Use `Object.assign(target, ...sources)` to add labels for axes as properties to the `data` object after parsing it. Add the `await` keyword to ensure the data promise is resolved before adding the new properties to it.
- There is no `translateY` SVG transform (it’s a CSS transform). Use `translate` instead, e.g. `.attr('transform', 'translate(456, 0)')`
- Styled axes by Mike Bostock. Use a right axis with tick sizes that match the length of the chart to push it to the left side of the chart.
- Mike Bostock’s Stack Overflow answer to styling axes
- D3 chart title.
- D3 axes labels.
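The `Object.assign` tip for axis labels relies on arrays being objects that can carry extra properties. A minimal sketch with made-up rows and label text; in a notebook the rows would come from an awaited CSV parse rather than a literal.

```javascript
// Made-up rows standing in for parsed CSV data.
const rows = [
  {name: "A", value: 0.08},
  {name: "B", value: 0.12}
];

// Object.assign copies the label properties onto the array itself and
// returns that same array, so the rows stay iterable as usual.
const data = Object.assign(rows, {x: "Letter", y: "Frequency (%)"});

console.log(data.length); // 2
console.log(data.y);      // "Frequency (%)"
```

Chart code can then read `data.x` and `data.y` for axis labels while still mapping over `data` for the marks.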
28 Apr 2020: What Else I Learned From That Bar Chart
- Pin a specific version of D3 in the appendix to future-proof against breaking changes: `d3 = require("d3@5")`
- `band.range([range])` … sets the scale’s range to the specified two-element array of numbers.
- `band.rangeRound([range])` sets the scale’s range to the specified two-element array of numbers while also enabling rounding. This will “give the results as integer values, using Math.floor() to avoid overflowing the range. This will usually leave some unused space, that must be allocated to the left and right padding (even if those have been set to 0). Align them with band.align.” (d3.scaleBand on Observable)
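The effect of padding, rounding, and alignment on a band scale can be worked through by hand. This is a sketch of the arithmetic as I understand it from the band scale's documented behaviour, not D3's own code; `bandLayout` and its option names are hypothetical, and it assumes the range runs from a smaller to a larger number.

```javascript
// Hypothetical helper approximating the layout a band scale computes.
function bandLayout({count, range: [start, stop], padding = 0, align = 0.5, round = false}) {
  // One "step" is a band plus the gap after it; inner and outer padding
  // are both set to `padding`, as with band.padding().
  let step = (stop - start) / Math.max(1, count - padding + padding * 2);
  if (round) step = Math.floor(step);
  // Leftover space is split before and after the bands according to align.
  let first = start + (stop - start - step * (count - padding)) * align;
  let bandwidth = step * (1 - padding);
  if (round) {
    first = Math.round(first);
    bandwidth = Math.round(bandwidth);
  }
  return {step, bandwidth, first};
}

const exact = bandLayout({count: 5, range: [0, 100], padding: 0.1});
const rounded = bandLayout({count: 5, range: [0, 100], padding: 0.1, round: true});
console.log(exact, rounded);
```

In this example the rounded layout uses a step of 19 and a bandwidth of 17, leaving a few unused pixels that `align` distributes to either side, which matches the "unused space" described in the quote.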
29 Apr 2020: Setting up Observable
- Interactions: Drag the slider to set the exit transition duration on mouseover/hover.
- Use `.align([align])` on bands to align all the bands to the left (0), center (0.5), or right (1).
- Observable view.
30 Apr 2020: What Else Can I Do With This Bar Chart?
- Reorder adding SVG elements so they appear in the correct order on screen. For example, ensure the bar value labels appear above the median line for readability.
- A CSS hack to give labels some distinction: apply a white text shadow in 4 directions using `.style("text-shadow", "white 1px 1px, white -1px -1px, white 1px -1px, white -1px 1px")`
- D3 bar labels.
1 May 2020: Formatting Numbers and Sorting a Bar Chart
- D3 Format to format numbers the way you like using d3.format. This is a longer explanation of the format specifier.
- Use JavaScript’s `array.sort()` to order bars before binding them.
- D3 grouped bar chart with color.
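Sorting before binding can be sketched in plain JavaScript. The data below is made up, and copying the array before sorting is a design choice added here rather than something the journal entry prescribes.

```javascript
// Made-up data; in a chart these would be the rows bound to the bars.
const data = [
  {name: "B", value: 0.02},
  {name: "A", value: 0.08},
  {name: "C", value: 0.03}
];

// array.sort() sorts in place, so copy first to leave the source untouched.
const byValueDescending = [...data].sort((a, b) => b.value - a.value);
const byName = [...data].sort((a, b) => a.name.localeCompare(b.name));

// A band scale's domain would follow the sorted order, so bars are drawn in it.
const domain = byValueDescending.map(d => d.name);
console.log(domain); // ["A", "C", "B"]
```

Keeping the unsorted array around makes it straightforward to offer several sort orders, as in a sortable bar chart.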
Day 8–11: Data wrangling and exploration with Scatterplots
2 May 2020: Exploring More of Observable and Vega-Lite
- Use HTML tagged template literals to add CSS style tags: “html`<style> .highlight { background: yellow; } </style>`”
- Read Observable’s Introduction to HTML by Mike Bostock to learn about reactive Markdown, “${tex`\KaTeX`}” for math, sparklines for a chart inline with prose, using canvas, using D3, using SVG tagged template literals, using `viewof` to react to an input element’s changing value, and adding global CSS using html tagged template literals.
- Read Observable’s Introduction to Promises by Mike Bostock to learn about Observable’s `delay`, `tick` and `when` functions.
- Read Observable’s not JavaScript by Mike Bostock to learn some “gotchas” to look out for working with Observable, such as “you’ll need to wrap object literals in parentheses or use a block statement with a return” e.g. `object = ({foo: "bar"})`. It also introduces Observable’s “special mutable operator so you can opt-in to mutable state: you can set the value of a mutable from another cell”. Finally, it talks about how to import things across notebooks.
- “inlineCode” helper to add some custom inline code styling in Observable.
- D3 sortable bar chart.
- Vega-Lite scatterplot.
3 May 2020: Exploring More Vega-Lite
- I published a fork: Vega-Lite API Vega-Lite V4. This way I can intentionally import Vega-Lite major version 4 into my notebooks. When Vega-Lite major version 5 comes out with non-backwards-compatible breaking changes, my older notebooks using this import won’t randomly break. This future-proofs my notebooks using Vega-Lite V4.
4 May 2020: Vega-Lite Scatterplot and Heat Map, and More Posts on Getting Started With Data Viz
- Read Observable’s Introduction to Imports by Mike Bostock to learn a bit more about Observable imports, like how imported cells are lazily evaluated and you can import from private notebooks, even though that could cause broken behaviour in a published notebook.
- Use `~~~` for code fences in Observable including tagged code fences: `~~~js`
- The SVG hyperlink element `<a>` is a container around any shape. If it’s around a circle, it will have a circle-shaped hover/click target.
- In Observable, a view is “a user interface element that directly controls a value in the notebook”. It has two parts: a view (typically an interactive DOM element) and value (any JavaScript value). See also: Introduction to Views by Mike Bostock.
- It seems that a viewof is a second hidden cell that shows the current value of a user interface input element. See: A Brief Introduction to Viewof by Mike Bostock.
5 May 2020: Vega-Lite Scatterplot continued
- Scatterplot theory.
- Vega-Lite proportional area (bubble) charts.
- Vega-Lite color encoding.
- Vega-Lite rows.
- Vega-Lite quantize scale.
- Heatmap theory.
- Vega-Lite Heatmap.
- Tidy data is where:
1. Each variable forms a column
2. Each observation forms a row.
3. Each type of observational unit forms a table.
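The tidy data rules can be illustrated by reshaping a "wide" table. A minimal sketch in plain JavaScript; the table, field names, and numbers are made up for illustration.

```javascript
// Made-up "wide" table: the year variable is spread across column names
// instead of having its own column.
const wide = [
  {name: "A", "2018": 10, "2019": 12},
  {name: "B", "2018": 7, "2019": 9}
];

// Tidy version: each variable (name, year, value) is a column and
// each observation is a row.
const tidy = wide.flatMap(({name, ...years}) =>
  Object.entries(years).map(([year, value]) => ({name, year: +year, value}))
);

console.log(tidy.length); // 4
console.log(tidy[1]);     // {name: "A", year: 2019, value: 12}
```

Tools such as Vega-Lite generally expect this row-per-observation shape, since each encoding channel refers to a single field.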
Day 12–15: D3 Scatterplots with legends and tooltips
- Vega-Lite tooltips.
- Vega-Lite grid removed.
- Vega-Lite line mark interpolation.
7 May 2020: D3 Scatterplot with Legends
- d3-time-format
- Use `man strftime` on the command line to figure out the arguments for the `d3.timeParse()` function so that “%d” is replaced by the day of the month as a decimal number (01-31) and “%b” is replaced by national representation of the abbreviated month name.
- Renaming fields upon parsing CSV data file. There’s a longer post about CSV parsing in 23 May 2020: Area Charts in Vega-Lite and CSV Parsing.
- A JavaScript Map for color, `const exerciseColorMap = new Map([["Bike", "#8242a8"], ["Run", "#ff1493"], ["Walk", "#FFCE1E"]])`, to conditionally set the data point’s stroke color according to its type.
- Read scale.ticks by Fil to learn more about ticks. Roughly, `.ticks()` gives us an array of some values from the scale’s domain while `.tickFormat()` takes the same arguments and gives us the formatting for those `ticks`. This kind of detail is really helpful for making sense of the API: “The tick format provided by time scales ignores the specified count; the returned string is based solely on the given time. This is sometimes surprising, but allows the format to behave consistently across views when the domain changes, improving readability during animated transitions and zooming.”
- A screenshot of creative commons licensing used in a notebook.
- D3 scatterplot with legend
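The colour Map from the entry above can be used as a lookup inside an accessor. A minimal sketch: the `type` field, the sample rows, and the grey fallback colour are assumptions made for illustration, not details from the notebook.

```javascript
// Colour lookup as described in the journal entry.
const exerciseColorMap = new Map([["Bike", "#8242a8"], ["Run", "#ff1493"], ["Walk", "#FFCE1E"]]);

// Hypothetical accessor: fall back to a neutral grey for unknown types.
const strokeFor = d => exerciseColorMap.get(d.type) || "#999999";

// Made-up rows; "Swim" is deliberately missing from the Map.
const activities = [{type: "Run", km: 5}, {type: "Swim", km: 1}];
const strokes = activities.map(strokeFor);

console.log(strokes); // ["#ff1493", "#999999"]
```

In a D3 chain the accessor could be passed directly, for example as the second argument to `.attr("stroke", ...)`.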
8 May 2020: D3 Scatterplot with Tooltips
- Read Observable’s Introduction to require. You can have multiple inputs at once such as: `d3CsvAndFetch = require("d3-dsv@1", "d3-selection@1")`
- Using Mike Bostock’s color legend, add swatches as a separate cell: `swatches({ color: d3.scaleOrdinal(["Bike", "Run", "Walk"], ["#8242a8", "#ff1493", "#FFCE1E"]) })`
- Choropleth with Tooltip by Duy Nguyen shows an example of a legend presented on a chart using the same color legend notebook
- Import Susie Lu’s d3-legend using: `d3Legend = require('d3-svg-legend')`. Notice the ‘-svg-’ in the module name there unlike the name of the project.
- D3 scatterplot with legend
- Selection join by Mike Bostock shows a join example with separate enter and update behaviour.
- D3 scatterplot with tooltips using Line Chart with Tooltip by Mike Bostock, callout and bisect.
- I added a crude fix, `if (!b) { return a; }`, to prevent errors when mousing over the right edge of the chart. These errors appear in the original notebook.
- SVG has its own span element for styling parts of text elements: `tspan`.
- Read MDN’s `toLocaleString` documentation
- Voronoi Tooltip by Ajayyy shows notebook appendix formatting that chunks out “setup”, “scales”, and “data”. It also adds cells specifically for “plotAreaWidth” and “plotAreaHeight” to subtract margins, which keeps some of that noise out of the chart cell: `plotAreaWidth = width - margin.left - margin.right`
- Expand d3 imports with other modules: `d3 = require("d3@5", "d3-delaunay@5")`
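The `if (!b) { return a; }` guard from the tooltip entry is easier to see in isolation. A minimal sketch in plain JavaScript: `bisectLeft` here is a hand-written stand-in for a D3 bisector, `nearest` is a hypothetical helper name, and the points and `x` field are made up.

```javascript
// Plain JavaScript stand-in for a left bisector on d.x: returns the first
// index at or after `lo` whose x is greater than or equal to the target.
function bisectLeft(data, target, lo = 0) {
  let hi = data.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (data[mid].x < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Find the datum closest to a pointer position already converted to data space.
function nearest(data, target) {
  const index = bisectLeft(data, target, 1);
  const a = data[index - 1];
  const b = data[index];
  if (!b) { return a; } // past the right edge: there is no right-hand neighbour
  return target - a.x > b.x - target ? b : a;
}

// Made-up, sorted points.
const points = [{x: 0}, {x: 10}, {x: 20}];
console.log(nearest(points, 4).x);  // 0
console.log(nearest(points, 17).x); // 20
console.log(nearest(points, 25).x); // 20, thanks to the guard
```

Without the guard, the last call would try to read `b.x` on `undefined`, which is the kind of error that appears when the pointer moves past the final data point.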
9 May 2020: D3 Scatterplot with Voronoi Tooltips
- “Use transparency to help with overlapping data points”.
- Read this in-depth tutorial on Using a d3 voronoi grid to improve a chart’s interactive experience by Nadieh Bremer to understand the intent and usage of voronoi in charts, but then read Learn D3: Interaction by Mike Bostock to learn how to use d3-delaunay to implement polygons for tooltip trigger regions.
- It looks like d3-delaunay replaces d3-voronoi completely, even though `d3.voronoi` appears to be built into d3 and recommended in the API docs while `d3-delaunay` appears to be separate. The `d3-voronoi` repo shows this message: “Deprecation notice: Consider using the newer d3-delaunay instead of d3-voronoi. Based on Delaunator, d3-delaunay is 5-10× faster than d3-voronoi to construct the Delaunay triangulation or the Voronoi diagram, is more robust numerically, has Canvas rendering built-in, allows traversal of the Delaunay graph, and a variety of other improvements.”
- Links to other articles and notebooks about using voronoi.
- Voronoi tooltips with overlay.
- Note: you don’t need the SVG overlay. It may be possible to use `delaunay.find()` directly without the overlay as shown in Summer heat 🍦 🌞 by Fil.
Day 16–17: Line chart theory and exploration
- Line chart theory
- Spline graphs, monotone functions, curving functions, and oversampled data.
- Don’t “cut” the Y Axis
- Spaghetti plot
- Dual axis
11 May 2020: Vega-Lite Line Charts
- Vega-Lite line charts
- “You have to be like the worst tabloid newspaper in the front and the Academy of Science in the back” — Hans Rosling
- Covid Tracker Tracker [sic]
Day 18–21: D3 line charts with annotations and shading
- D3 line chart with 1 series
- A Y Axis label can double as a chart subtitle. Here, I’ve used “Australian life expectancy in years” as the Y Axis label and chart title.
- Add classes to SVG groups just to name them, so you can easily find and navigate them when inspecting them in the console.
- Different “nice” formatting:
.domain(d3.extent(data, d => d.year)).nice(d3.timeYear.every(2))
- Use the line function’s “defined” accessor, `.defined()`, to check if there are any data points missing. If so, it skips the missing line segments. This is a longer explanation of “defined”. This is the line generator code: `line = d3.line().defined(d => !isNaN(d.life_expect)).x(d => x(d.year)).y(d => y(d.life_expect));`
- Examples of cutting the Y Axis
- Sparklines, including inline canvas sparkline
- “The only function of economic forecasting is to make astrology look respectable.” — John Kenneth Galbraith
13 May 2020: D3 Multi-Line Chart
- Reformatted data cell to include a time series array of names and associated values (`series`) and the list of years as an array (`dates`).
- The console showed an error: `Uncaught TypeError: d3.least is not a function`. The fix was to also require `d3-array`: `d3 = require("d3@5", "d3-array@2")`.
- The subsequent error, `y = TypeError: t is not iterable`, revealed that my time series values data was a single value instead of an array of all the values for that series.
- Multi-line chart with dot labels and hover to emphasise 1 line at a time using `bisectLeft` and moved, entered, and left methods.
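The `series` and `dates` shape described above can be built from tidy rows. A minimal sketch with made-up rows; the property names `name` and `values` inside each series are assumptions about the shape, chosen so that every series carries an array of values aligned to `dates`.

```javascript
// Made-up tidy rows: one observation per name and year.
const rows = [
  {name: "A", year: 2018, value: 10},
  {name: "A", year: 2019, value: 12},
  {name: "B", year: 2018, value: 7},
  {name: "B", year: 2019, value: 9}
];

// Sorted list of distinct years, shared by every series.
const dates = [...new Set(rows.map(d => d.year))].sort((a, b) => a - b);

// One entry per name, with values as an ARRAY aligned to `dates`.
const names = [...new Set(rows.map(d => d.name))];
const series = names.map(name => ({
  name,
  values: dates.map(year => {
    const match = rows.find(d => d.name === name && d.year === year);
    return match ? match.value : NaN; // NaN marks a missing observation
  })
}));

const data = {series, dates};
console.log(data.series[1]); // {name: "B", values: [7, 9]}
```

If `values` were a single number instead of an array, code that iterates over it would fail with a "not iterable" error like the one in the entry above.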
14 May 2020: D3 Multi-Line Chart with Annotations
- D3-annotations: `d3 = require("d3@5", "d3-array@2", "d3-svg-annotation@2")`
- Example of `d3.annotationCallout`
- The annotation note title also lightens when hovering over other lines.
15 May 2020: D3 Multi-Line Chart with Shading
- Multi-line chart with shading
- A notebook by Zachary Dodge taking another approach to appendix cells for managing the width, height, margins and “plot area” (called `boundedWidth` and `boundedHeight` here)
Day 22–24: Pie and donut chart theory and exploration
16 May 2020: Donut Charts and Pie Charts
- Pie chart and donut chart theory
17 May 2020: Pie Charts in Vega-Lite
- Vega-Lite pie chart of scone ingredients
- Errors due to versions: It looks like the example in Hello, Vega-Lite by Mike Bostock is using an older version of Vega-Lite. The notebook from which we’re requiring `vegalite` appears to be an old version of the notebook.
- Use `vegaEmbed = require("vega-embed@5")`
- Use an embed function `embed = function(spec) { return vegaEmbed(spec, {loader: vegaEmbed.vega.loader({baseURL: 'https://vega.github.io/vega/'}), actions: true}) }`
- Wrap data keys in double quotes, because JSON demands strings for keys.
- `"view": {"stroke": null},` removes the chart border
- `"legend": null,` removes the category color legend in `"color": {"field": "category", "type": "nominal", "legend": null}`
- It looks like `"stack": true` places the labels correctly. The Vega-Lite stack docs focus on stacking bar charts and only mention once that it could be applied to the `theta` channel. It does not describe what it means to have things “stacked” with a theta channel so I will just file it away in my brain as “the thing that does what I need with pie chart labels”. After a bit more reading I spotted “For now, you need to add `stack: true` to theta to force the text to apply the same polar stacking layout.” in Pie Chart with Labels
- Add `, "stroke": "#fff"` to the `arc` mark to distinguish sectors.
- Use `"radius": {"field": "quantity", "type": "quantitative", "scale": {"type": "sqrt", "zero": true, "range": [40, 100]}},` to create a Vega-Lite radial plot.
- As far as I can tell, the default sort order is ascending but that will only work on a flat array of data using the specific key `values`. My data uses an array of objects, so I needed to add this line to my top-level encoding: `"order": {"field": "quantity", "sort": "descending"}`
- Vega-Lite tooltips: `"tooltip": [{"field": "category", "type": "nominal"}, {"field": "quantity", "type": "quantitative"}]`
- Chart title (`"title": {"text": "A Scone", "dy": -12},`) and padding (`"padding": {"top": 24, "right": 24, "bottom": 24, "left": 24},`)
- Chart colors: `"color": { "field": "quantity", "type": "nominal", "scale": { "range": ["#FDBAAB", "#FFEBA5", "#99C3E1", "#90D1C5"] }, "legend": null },`
18 May 2020: Donut Chart in Vega-Lite
- To change a Vega-Lite pie chart to a donut chart: "mark": {"type": "arc", "innerRadius": 100, "outerRadius": 150, "stroke": "#fff"}
- Add arbitrary text using a text mark and a value key instead of a field. See the Vega-Lite text docs and the Vega-Lite value docs.
- Give the arc its own encoding and tooltip to avoid tooltips over the center of the donut.
- In Observable, give images alt text: { const image = await FileAttachment("image.png").image(); image.alt = "Screenshot of donut chart with open cell clipping tooltip."; image.width = 375; return image; }
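A sketch of how the donut notes above could combine into one layered spec, written as a JavaScript object. The data values are made up, donutSpec is a hypothetical name, and giving the text layer its own one-row dataset (so the label is drawn once) is my assumption rather than something from the journal.

```javascript
const donutSpec = {
  "view": {"stroke": null},
  "layer": [
    {
      // The arc layer carries its own data-driven encoding, including the
      // tooltip, so the tooltip only appears over the ring.
      "data": {
        "values": [
          {"category": "Flour", "quantity": 150},
          {"category": "Milk", "quantity": 90},
          {"category": "Butter", "quantity": 30},
          {"category": "Sugar", "quantity": 15}
        ]
      },
      "mark": {"type": "arc", "innerRadius": 100, "outerRadius": 150, "stroke": "#fff"},
      "encoding": {
        "theta": {"field": "quantity", "type": "quantitative"},
        "color": {"field": "category", "type": "nominal", "legend": null},
        "tooltip": [
          {"field": "category", "type": "nominal"},
          {"field": "quantity", "type": "quantitative"}
        ]
      }
    },
    {
      // Arbitrary text: a text mark with a fixed value instead of a field.
      // One empty row so the label is drawn once (an assumption).
      "data": {"values": [{}]},
      "mark": {"type": "text", "fontSize": 18},
      "encoding": {"text": {"value": "A Scone"}}
    }
  ]
};
```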
Day 25–27: D3 donut charts with footnotes and styled tooltips
19 May 2020: Donut Chart in D3
- Choose sequential, diverging, categorical or cyclical color scales as appropriate for the data.
- “There is no difference between [line or thick arcs], other than the thinnest donut being worse than the rest (we’re not sure exactly why)” in The humble pie chart: part2 by John Nixon, Office for National Statistics.
- Experiments with different color schemes and approaches to adding them to range: .range(d3.schemePastel1), .range(["#8242a8","#cf3f93","#fc5972","#ff8a51","#ffbf40"]), .range(d3.quantize(t => d3.interpolateCool(t * 0.8 + 0.1), data.length)), .range(d3.quantize(t => d3.interpolateSpectral(t * 0.8 + 0.1), data.length).reverse())
- Color resources.
- D3 donut chart.
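The t * 0.8 + 0.1 trick in the range experiments above is easier to see with numbers. This is a sketch only: quantizeSketch is a plain stand-in for d3.quantize (evenly spaced samples from t = 0 to t = 1) and interpolateGreySketch is a hypothetical interpolator standing in for d3.interpolateCool or d3.interpolateSpectral.

```javascript
// Plain stand-in for d3.quantize: n evenly spaced samples of an interpolator,
// with the first sample at t = 0 and the last at t = 1.
function quantizeSketch(interpolator, n) {
  const samples = [];
  for (let i = 0; i < n; i++) samples.push(interpolator(i / (n - 1)));
  return samples;
}

// Hypothetical interpolator that returns a grey CSS color for t in [0, 1].
const interpolateGreySketch = t => {
  const level = Math.round(255 * (1 - t));
  return `rgb(${level}, ${level}, ${level})`;
};

const sliceCount = 5; // stands in for data.length

// Sampling t directly reaches both extremes of the scheme...
const full = quantizeSketch(interpolateGreySketch, sliceCount);

// ...while t * 0.8 + 0.1 squeezes the samples into [0.1, 0.9],
// skipping the lightest and darkest ends.
const trimmed = quantizeSketch(
  t => interpolateGreySketch(t * 0.8 + 0.1),
  sliceCount
);
```

Here full starts at pure white and ends at pure black, while trimmed stays inside those extremes, which is presumably why the same remapping is handy for keeping pie sectors away from the washed-out ends of a color scheme.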
20 May 2020: Donut Chart in D3 with Footnotes
- To add a “source” line of text, append a text element and set the innerHTML using .html() from D3’s selection.html.
- Alternatively, dump all the source details inline in a .html() call.
- Use a figcaption element to add notes and source: <figcaption style="padding-bottom: 12px; text-align: center; font-family: 'Work sans'">Source: <a href="https://www.sarahcooks.com.au/2020/04/small-batch-scones.html" style="fill: #3c72d7;">Sarah Cooks: Small batch scones</a></figcaption>
- Wrap the chart and figcaption element in a figure element: <figure style="max-width: 100%;">${chart} ${caption}</figure>
- “Avoid chart footnotes where possible. If extra information is needed: annotate the chart, include the information in the statistical commentary accompanying the chart, add a footnote to the chart title”
- Some notebooks showing source or notes.
- The SVG title element can also be used on a “graphics element”, including <circle>, <ellipse>, <image>, <line>, <mesh>, <path>, <polygon>, <polyline>, <rect>, <text>, and <use>. As with title attributes used on HTML elements, this produces shabby tooltips if you hover for long enough.
21 May 2020: Donut Chart in D3 with Tooltips
- D3 donut chart with tooltips.
- The approach here uses HTML for the tooltip itself and translates it into position using CSS according to the data passed to the onmouseover function from the D3 selection.on() listener on the arcs. The advantage of using HTML and CSS is that the tooltip automatically resizes based on the width of the text content inside it.
- I have two working theories on the absence of standard conventions or libraries for styled tooltips in D3 charts:
1. Every visualisation is unique and D3 and SVG both give you such control that you might as well tailor your tooltip to every situation. Compare the 3 examples in AntV’s G2 Visualization Grammar Tooltip examples.
2. Data viz folk generally use other methods to let people access “details on demand”, such as showing values directly on data on hover (especially for bar charts).
- d3-selection: “By convention, selection methods that return the current selection use four spaces of indent, while methods that return a new selection use only two. This helps reveal changes of context by making them stick out of the chain”
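A minimal sketch of the HTML tooltip approach from this entry, assuming the tooltip is an absolutely positioned element that is moved with a CSS transform. showTooltip, hideTooltip and the 12px offset are hypothetical. The functions only touch style and textContent, so they work on a real DOM element or on a plain object.

```javascript
// Move an HTML tooltip next to the pointer with a CSS transform and fill in
// its text. Because the tooltip is HTML, its box resizes to fit the label.
function showTooltip(tooltip, {x, y, label}, offset = 12) {
  tooltip.textContent = label;
  tooltip.style.opacity = 1;
  tooltip.style.transform = `translate(${x + offset}px, ${y + offset}px)`;
  return tooltip;
}

// Hide the tooltip again when the pointer leaves the arc.
function hideTooltip(tooltip) {
  tooltip.style.opacity = 0;
  return tooltip;
}
```

In the chart these would presumably be called from the mouseover and mouseout listeners registered with selection.on() on the arcs, with x, y and label taken from the pointer position and the bound datum.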
Day 28–30: Area charts and CSV parsing
22 May 2020: Area Charts, Stacked Area Charts, Stream graphs and Ridgelines
- Theory about area charts, stacked area charts, stream graphs and ridgelines
23 May 2020: Area Charts in Vega-Lite and CSV Parsing
- Vega-Lite area chart showing 863 cells written across 28 Observable notebooks.
- Using a CSS counter to count cells in notebooks via the minimap: element.style { counter-reset: count 0; } .minimap-cell-row:before { content: counter(count)"."; counter-increment: count; }
- Wrangling CSV data
- Observable file attachments: Calling attachment.text() returns a promise to a string. We then wait for this promise to resolve before calling d3.csvParse.
- await waits for a Promise.
- From the d3-dsv module, d3.csvParse calls dsvFormat(",").parse() with a comma as the delimiter, which constructs a new DSV parser and formatter. Just below dsvFormat in the docs, you can see dsv.parse(string[, row]), which tells us about the DSV parser.
- Row conversion function, parameter and body.
- Instead of d in our cell, you can see an Object using ES6 destructuring assignment syntax to assign the data row’s name (e.g. “Number of cells”) to a local JavaScript variable (e.g. y).
- Our d destructuring assignment is wrapped in parentheses because they’re “required when using object literal destructuring assignment without a declaration.”
- The => indicates an ES6 arrow function expression using the syntax, (param1, param2, …, paramN) => expression.
- The docs show that destructuring within the parameter list is also supported as advanced syntax, which is what we’re doing.
- “In a concise body, only an expression is specified, which becomes the implicit return value. In a block body, you must use an explicit return statement.” In order to use a “concise body” to return our object literal, we wrap our expression in parentheses.
- From the d3-time-format module, we then use d3.timeParse, which is an alias for locale.parse (on the default locale), to create a time parser: const timeParser = d3.timeParse("%d %b %Y");
- This time parser expects a date that looks like “23 May 2020”. We use it to parse the “Date” as our X value: x: timeParser(x),
- The + in the line y: +y is an Arithmetic operator: Unary plus (+) that “attempts to convert it into a number, if it isn’t already”.
- We add extra JavaScript properties to our data array to store our X Axis and Y Axis labels: const extraPropertiesSource = {xAxisLabel: "Date →", yAxisLabel: "↑ Cells"}; return Object.assign(dataObjectTarget, extraPropertiesSource);
- This uses Object.assign(target, …sources) to copy all the (enumerable own) properties from one or more source objects to a target object and return it.
- Experimenting with different timeUnit values: yearmonth and utcmonthdate
- The Vega-Lite time unit docs explain that “The specifier monthdate is sensitive to month and date, but not year, which can be useful for binning time values to look at seasonal patterns only.” and to “use UTC time, you can add the utc prefix (e.g., "utcyear", "utcyearmonth").”
- Read Vega-Lite axis to learn about "titleAngle": 0 to rotate the Y Axis label and "titlePadding": 25 to pad it from the tick markers.
- Using labelExpr described under Labels in Vega-Lite: Axis documentation and the example shown under Example: Using Axis labelExpr to Display Initial Letters of Month Name, I can show the first letter of the X Axis ordinal datum label using "labelExpr": "datum.label[0]". This is a bit useless here, but it may come in handy in the future.
- Using Vega-Lite condition: "condition": {"test": "datum['cumulative_count'] > 850", "field": "cumulative_count"},
24 May 2020: Area Charts in D3 with Tooltips
- Bisect for tooltips.
- Redrawn SVG path for a callout with the tip underneath the tooltip.
- SVG path visualizer by Mathieu Dutour
- SVG path editing to include variables to adapt to the width of the text label’s bounding box.
- In Radial Area Chart by Mike Bostock, the text labels appear to have a white shape around them, achieved by duplicating each text element and giving the copy behind the actual text a large white stroke.
- D3 area chart with tooltips.
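As a sketch of the callout path idea above, assuming the tip sits at the origin with the box above it: calloutPath is a hypothetical helper, and the padding and tip sizes are placeholder defaults. In the chart, the label width and height would presumably come from the text element's bounding box.

```javascript
// Build SVG path data for a callout whose tip points down at (0, 0), with
// the box sitting above the tip and sized to fit the label plus padding.
function calloutPath(labelWidth, labelHeight, padding = 10, tip = 5) {
  const w = labelWidth + 2 * padding;
  const h = labelHeight + 2 * padding;
  // From the tip: up-left to the box edge, along the bottom to the left
  // corner, up, across the top, down, back along the bottom, then close.
  return `M0,0l${-tip},${-tip}H${-w / 2}v${-h}h${w}v${h}H${tip}Z`;
}

// An 80 by 20 label gives a 100 by 40 box centred above the tip.
const examplePath = calloutPath(80, 20);
```

Setting this string as the d attribute of a path inside the tooltip group, and recomputing it whenever the label text changes, keeps the box the right size for the text.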