A visual analysis of UK number 1s: digging into design

Becky Rush
9 min read · Apr 14, 2019

I’m no designer, and this article will probably prove that. Plus, due to my data mishaps and spending far too long on the analysis side of the project, I left barely any time to properly concentrate on my design process.

Design Process

I doodled sketches of different ideas for the graphs throughout the data analysis and research stages of the project.

When I began the design phase properly, I collated the information to present in the article. I wrote down interesting points I had identified during data analysis in a notebook and then transferred these onto cue cards. This allowed me to group ideas into themes and consider which order they should be presented in. I wrote a draft article structure which included some of the main points from the data analysis to be highlighted, with examples of graphs which could be used.

Design Inspiration

To design the graphs, I drew inspiration from existing data visualisations. Below, I briefly discuss a few examples, grouped into 3 themes.

1. Scrollytelling
A number of articles make use of scrollytelling. When done well, this can effectively illustrate complicated ideas or direct the reader’s attention to certain aspects of the data.

This full-screen scroll based data visualisation proportionally depicts the number of immigrants affected by Trump’s proposed deportations. It is a particularly captivating and memorable visualisation, making use of animation on scroll to emphasise the points being discussed.

This visualisation shows, as the text is scrolled, how the number of bands performing shrinks as the venues get larger. It also integrates audio previews of the tracks being discussed and uses a dark background, which is unusual.

This is a great example of a scrollytelling approach using graphs. As the user scrolls down the page, labels are added to the graph or nodes are highlighted — drawing the user’s attention to the specific parts of the data. The graphs are interactive, and the user is encouraged to explore the data after the author has discussed their thoughts on it.

2. Static
Often, articles use static images and graphs, with no user interaction, to evidence points made in the article.

  • Washington Post: Corruption in Latin America

The area and radius of the circles show the quantity of money taken as bribes and the benefits received. This clearly indicates which countries have had the greatest involvement with the corruption, but it is hard to understand without first reading the text.

  • The Financial Times: A Visual History of Women’s Tennis

This Gantt-style graph shows the length of time female tennis players held a Women’s Tennis Association ranking. The use of annotations directs the reader to particular points in the data without needing an interactive graph.

3. Dynamic

Multiple articles use interactive graphs, coded with JavaScript, which encourage the user to interact with the data and explore it for themselves, making it more personally relevant.

  • The Pudding: The Most Timeless Artists of All-Time, The Largest Vocabulary in Hip Hop

The Pudding uses bubble charts in multiple articles when discussing music. These make use of artist images (to make them more visually appealing) as well as some sort of user interaction. The Most played Hits on Spotify graph includes a search function so users can find a particular song, whereas the Unique Words graph allows the user to filter the data, including by just Wu-Tang Clan members.

  • The Pudding: Newspapers: A Black & White Issue

This graph, produced in collaboration with Google News Labs, shows the diversity (or lack of) within newsrooms. It presents the data as a bubble chart, but does not indicate what the size of a bubble represents (an assumption is made that it is number of staff). Colour is also used, but this seems to represent the same variable as the x-axis (points of over-representation).

  • The Pudding: Why the Republican Party wins when robots take your job

This graph illustrates the likelihood of jobs being automated. When the user selects or searches for an occupation from the dropdown, the relevant node is highlighted. As the user scrolls down the page, the graph transforms into a list, making it easier to see and compare each of the occupations included.

  • Visualizing and predicting Spotify genre characteristics

This is someone’s personal project rather than a professional visualisation. The graph depicts different genres and their audio features. The description above the graph explains what the size of the bubble represents, but this could be better illustrated with a key. Colour, position and size all represent different variables simultaneously, while genre (the main variable being compared) is only communicated on hover, which makes the visualisation slightly overwhelming. However, in the example above, it’s easy to see that valence (indicated by colour) has a correlation with danceability (indicated by x position), but it’s hard to know if this is a positive or negative correlation without a key showing which colours represent low/high values. The visualisation gives the user control over which attributes are being shown, making it possible for them to investigate their own interests in the data.

Looking at examples of data visualisations (not limited to music data) gave me an opportunity to consider different approaches to design.

Graphs are a common method of visualisation, but not the only option. The Trump deportation example is particularly memorable and uses a very different approach. I thought designing a similar visualisation would be interesting, but decided graphs seemed the most appropriate for the data in this project.

Area and bubble charts were frequently used throughout the examples I looked at, which could be appropriate for my design — particularly to illustrate how long different artists have spent at number 1.

Designing the Graphs

I was really excited to try out the scrollytelling approach I had seen in a few articles. Before committing to this idea, I did a bit more research, including looking at how to implement scrolling and best practices.
Bostock, author of D3, warns against ‘scrolljacking’ and emphasises using standard scrolling behaviour (which incrementally changes what’s visible, rather than a slight scroll affecting the entire page). Users should be in control of the scrolling at all times, able to move at a speed they are comfortable with. Scrolling should be reversible, meaning that users can scroll back up the page and undo the changes that have been made to the screen. He also asserts that standard keyboard controls should be supported (such as using the directional keys to move the page up and down).
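
One way to think about the ‘reversible’ rule is that whatever is shown on screen should depend only on the current scroll position, never on how the reader got there. Below is a minimal, language-neutral sketch of that idea in Python. It is not code from this project: the section offsets and the function name active_step are made up for illustration.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical pixel offsets at which each text section begins.
STEP_OFFSETS = [0, 600, 1200, 1800]


def active_step(scroll_y: float, offsets: Sequence[float] = STEP_OFFSETS) -> int:
    """Return the index of the section the reader has scrolled into.

    The result depends only on scroll_y, so scrolling back up
    automatically restores the earlier state of the graph.
    """
    # bisect_right counts how many section offsets are <= scroll_y.
    return max(0, bisect_right(offsets, scroll_y) - 1)


# Scrolling down and then back up gives the same step for the same position.
for position in (0, 650, 1300, 650):
    print(position, active_step(position))
```

Because the step is recalculated from the position each time, there is no hidden state to get out of sync if the reader jumps around with the keyboard or scrollbar.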

During my research into scrollytelling, I found an article discussing how different creators at The Pudding design data visualisations, which described using Keynote or PowerPoint to mock up designs. This seemed like a particularly good approach to prototyping a scrolly article, compared with using Adobe Photoshop or InVision (the tools I have previously used). It allowed me to think about each section separately, design in terms of what would be visible on the screen and easily move sections around to figure out the flow.

To assess whether the design would be appropriate, it was necessary to see if the data would work in certain graphical formats. To do this, I used Plot.ly and RStudio without much success. Plot.ly didn’t easily support the graph styles I wanted to try out, and using RStudio was too time-consuming (my R knowledge is improving, but not there yet). Instead, I found existing examples of similar graphs and adapted them to use my data. For example, to illustrate the number of number 1s each artist had had, and how long these had lasted at number 1, I thought a graph similar to those featured in The Pudding might be appropriate. To create this, I combined a D3 ‘swarm chart’ example with a ‘force cluster’ example to mock up the artist graph:

I spoke with a lecturer at my uni, experienced in data visualisation, about this graph. We discussed how the eye is drawn to the cluster at ‘1’ rather than the larger nodes to the right, due to the logarithmic scale used on the x-axis. Furthermore, the lack of labels (which was an oversight rather than a design decision) and the all-black colour scheme make it hard to read and understand what’s going on. The graph is described in the text to the left, but users are more likely to look at the graph before reading the text, as that is what draws the eye. These were issues I tried to improve upon during development.
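
To see why a logarithmic x-axis gives so much room to the low end, it helps to compare where the same value lands on a linear and a log scale. The sketch below is only an illustration: the domain of 1 to 100 and the 800 pixel width are assumed numbers, not the values used in my graph, and the function names are my own.

```python
import math


def linear_position(value, domain_min, domain_max, width):
    """Map a value to a pixel position on a linear axis."""
    return (value - domain_min) / (domain_max - domain_min) * width


def log_position(value, domain_min, domain_max, width):
    """Map a value to a pixel position on a base-10 logarithmic axis."""
    span = math.log10(domain_max) - math.log10(domain_min)
    return (math.log10(value) - math.log10(domain_min)) / span * width


# Assumed axis: values from 1 to 100 drawn across 800 pixels.
for value in (1, 2, 10, 50, 100):
    print(
        value,
        round(linear_position(value, 1, 100, 800)),
        round(log_position(value, 1, 100, 800)),
    )
```

With these assumed numbers, the values from 1 to 10 take up the first half of the log axis but less than a tenth of the linear one, so small values are spread out and given far more visual weight.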

We also discussed use of area in graphs to represent quantities — I realised that although you can see when an item is larger than another, it is hard to assess by how much. This was supported by some further reading, which suggested that circular graphs in particular are harder to read.
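
The difficulty with area can be shown with a little arithmetic. If a circle’s area is meant to be proportional to a value, its radius has to grow with the square root of that value. This is a minimal sketch of that calculation, with made-up numbers and a hypothetical function name rather than anything taken from my graphs.

```python
import math


def radius_for(value, max_value, max_radius):
    """Radius that makes the circle's area proportional to value."""
    return max_radius * math.sqrt(value / max_value)


# Assumed example: the largest value (100) is drawn with a 40 pixel radius.
small = radius_for(25, 100, 40)
large = radius_for(100, 100, 40)
print(small, large)
```

Here the larger value is four times the smaller one, but its circle is only twice as wide, which is part of why it is hard to judge by how much one bubble exceeds another.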

Despite the problems with this graph type, there are limited other appropriate styles. One could be a scatter plot, depicted below, but this is less visually engaging and when a ‘force’ is applied to stop nodes from overlapping, it may become too cluttered to easily read.

For the purposes of this project, the swarm chart seemed appropriate, particularly as similar charts have been used in so many of the examples I had seen during my research.

Taking inspiration from ‘The Differences in how CNN, MSNBC & Fox Cover the News’, the graph would update in line with the user scrolling through the text. I thought a good way to help the user understand what is being said would be to add labels and zoom in on the relevant points as they scroll:

For the ‘re-entries’ section, I thought a Gantt chart similar to the one used in the Financial Times’ ‘A Visual History of Women’s Tennis’ would be appropriate, as these are often used to depict varying lengths of time, including those which are not necessarily consecutive.
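
A Gantt chart needs each stint at number 1 as a start and end point, so non-consecutive weeks have to be grouped into separate runs first. The sketch below shows one way this could be done. It assumes each week is stored as a simple integer index, which is a hypothetical format chosen for illustration and not necessarily how the project data is structured.

```python
def consecutive_runs(weeks):
    """Group week numbers into (start, end) runs of consecutive weeks."""
    runs = []
    for week in sorted(set(weeks)):
        if runs and week == runs[-1][1] + 1:
            # This week follows straight on, so extend the current run.
            runs[-1] = (runs[-1][0], week)
        else:
            # There is a gap, so the song has re-entered: start a new run.
            runs.append((week, week))
    return runs


# Made-up example: a song at number 1 in weeks 1-3, 7-8 and 12.
print(consecutive_runs([1, 2, 3, 7, 8, 12]))
```

Each tuple would become one bar on the chart, and any song with more than one tuple is a re-entry.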

To keep it simple and easy for the reader to understand, a line chart might be most appropriate for the ‘audio features’ section.

The above graphs were created from the project data using RStudio and ggplot2.

My initial research showed the importance of showing no more than 4 variables on a graph, as this is the maximum most humans can comprehend. Therefore, there should only be 4 variables displayed on this graph at any one time, unless the quantity is controlled by the user.

This is the personalisation section. This graph allows the user to select two songs (one from their most played songs, if authenticated) to compare on a variety of audio features. This graph was created using Keynote.

Now that I had a basic prototype, I could move on to development! I would have to reassess my designs as I went, considering them in further detail, but this was enough to help me figure out what I was aiming for.

Back to Part 2- Getting down and dirty with data
Next Part- Delving into development

Becky Rush

Software Engineering Team Lead at BBC News Visual & Data Journalism. Hiker/adventurer/solo explorer/ dog parent / samba drummer. From Brighton, UK.