Visualizing Patterns

Project 3

Jesse Wilson
14 min read · Nov 3, 2016

In the Beginning

  • William Playfair — tried to find ways to make comparisons between mathematical data and made sketches.
  • Charles Minard — moved visualization a step further after Playfair. He looked at how diagrams could move beyond mathematical data to more comprehensive data. (We looked at a diagram showing Napoleon’s troops in the attempted conquest of Russia)
  • Better Life Index — Cool interactive interface that lets us look at different facets of a “better life.” Allows for user manipulation due to flexibility of design.
  • They Rule.com — Value of slow introduction to the access of data.

Some other points:

  • It’s important to think about displaying these visualizations and information in a way that is not reliant on text. Size, shapes, and colors can play a big part in these comparisons.
  • Keep in mind entry points. If you first pull something up, there has to be a way to introduce your audience to what you have created.
  • With data, we might not know exactly what we want to show until we actually start looking at the data. Which patterns might emerge?
  • Minimize split attention. Do you need to read and watch a visualization at the same time? Giving your attention to one takes it away from something else.

November 15

Display Data in Class

Over the past week, my group has collected data relating to our topic of creation.

Some initial questions we’ve come up with were:

  • How do we measure creativity?
  • What are the outputs of creativity?
  • What affects creativity?

We settled on first looking at businesses, namely startups, as they are conduits through which much gets created. The thinking was to look at data on startups and where they are located around the world. That data didn’t seem all that conclusive (or useful) to us with regard to creation.

How do you measure which startup is more creative than another? It seemed that most of these measurements would be in the form of financial statements, which don’t seem to really paint the full picture. That data would exclude things like what’s the culture like and how is creativity affected by happiness? How do you even measure happiness?

We then started to look at correlating our data between creativity, happiness, and corporate culture in terms of paid leave, maternity leave, and hours worked per year. We also found the aptly titled World Happiness Report. It uses various metrics (listed below) to calculate happiness for each country.

  • GDP per capita
  • social support
  • healthy life expectancy
  • freedom to make life choices
  • generosity
  • perceptions of corruption

Class Discussion

Some ideas presented in class to keep in mind while working through my data and determining how I want to introduce my audience to it.

  • How do you take someone through a story?
  • How do you attract someone to the content?
  • How do you orient someone to the information and slowly walk them through it?
  • Storytelling is organizing the data and providing a structure.
  • Presentation is the actual form that your data takes. But it’s important to represent it accurately.
  • If you don’t present cognitive models on your own, your users will come up with theirs on their own. Regardless.
  • Visual variables: color, size, shape, location/proximity, line weight, value/saturation.

Class Activity

In class, we took an interesting opportunity to examine ways to break down information as a visualization.

How do we use:

  • shape
  • color
  • line weight
  • saturation
  • text

to show differences between information? Using these forms, we began to create visualizations that range from abstract to concrete.

This example of squares shows the alteration of shape AND color to demonstrate two instances of data at the same time.

November 22

Our aim in this class was to sort out and to continue to develop our ideas further.

Ideas that take into account coordinate systems (cartesian, polar, and geographic) from the Yao reading.

We also looked more at our data and worked through ideas of how to organize and show it.

  • What’s the type of data we’re looking at? For us most were ordinal and linear, with a categorical data set for the various countries.
  • How do we want to group our data together?
  • And how do we want the narrative to flow? How do we introduce this data and what do we want to do with it?

Our group had a helpful chat with Stacie. Prior to that we each had a few ideas of what we liked, but I felt that she was able to help us hone our thoughts a little bit. Post-conversation, we (or I) realized a few things:

  • We don’t need to use all our data. We initially had a ‘creativity index,’ a ‘happiness index,’ and data on various sorts of time off given in each country. But we don’t need to combine everything with everything. Stacie pointed out that if we only showed how the data from the creativity index worked with a selection of countries, that would be enough.
  • We also went back to look at some of our data. The ‘happiness index’ for one, hasn’t proven to be a very reliable source. While there are numbers and figures for each country, we have been unable to make any sense out of what those numbers mean. We like the idea of just dropping that and if we replace it, finding something more tangible.
  • We also thought about removing the ‘creativity index’ as well. While we found data on how everything is calculated, it still didn’t make that much sense to us. Could we find something more visible to show and in a sense, come up with our own (but certainly less complex) ‘creativity index’ and compare that with some of our data on vacation time, hours worked, and maternity leave?

As far as visualizations go, we started to make a little progress:

Figure 1

Towards the bottom of Figure 1, we explored some ideas of shape. As countries are physical spaces, it would make sense to represent them as a shape. How would we affect that shape in order to show creativity then?

Our thought was: if we started off with a square representing a country, would it transform into another, more complex shape if it were ‘more creative’?

Another idea would be to start off with the square and add in various colors that correspond to our data and make a new shape overall (see Figure 2 for that example). For instance, green could represent one aspect of creativity, blue could represent days off, yellow could represent maternity leave, and so on.

Figure 2.

It’s possible to make this transition in one of two ways, more abstractly making the lines of a square more squiggly, or adding meaning to each line. “If it goes out this far, that must mean…” I would much prefer to add in that extra level of detail.

Another idea that I’d like to explore further is the polar length coordinate system.

Figure 3

I think the bars could be modified to store additional data, whether they change lengthwise or widthwise.

Overall, I feel that the project is moving forward and while there’s still a lot left to figure out, I feel that some positive steps have been made.

November 23

Since it’s the night before the break and I have a cold, I thought what better idea than to try to actually get some work done. At this stage of the project, it’s still my attempt at getting my ideas out.

I’m not moving away from the examples I posed yesterday, but I had inspiration and wanted to explore where that went. Basically, the idea was green squares of diminishing saturation representing the countries I’m going to consider, on a black background. I almost want an “outer space” feel, without going through and adding in stars. While it would be cool, I fear it may be distracting.

Anyhow, the shapes could bounce and spin, among other things. Perhaps the saturation/intensity of the green color of the square could represent the creativity. (Of note, though: the colors I’m using don’t typically correspond to the content. Unfortunately, not many colors do. Or, if we’re talking about creativity here, anything’s game, right?)

Then the idea came to me that the squares could spin. So the thinking was, the ones that spin faster are “working harder” and the ones that spin slowly are not — they would be the ones with fewer days worked and more time off. More “lazy” in a sense.

Then, when clicked on, the screen could zoom in and there would be 3 (or 4) little spinning boxes below the big daddy. They would represent the various aspects of time off (maternity leave, hours worked per year, paid holidays, and paid time off). The same spinning rules would apply. The main downside to this again is that the colors don’t really represent anything. What color reminds us of work? Of time off? Of maternity leave??

See the ideas below in Figure 4.

Figure 4

November 29

This week, my group decided to take a step back and re-examine some of our data. We felt the creativity index just wasn’t working. The sub categories which made it up (technology, talent, and tolerance) were provided as rankings for the countries — without any real data. We kind of thought that wasn’t as useful to us.

Our meeting with Stacie helped. One team member, Eunjung, was still interested in using the creativity index. Stacie cautioned that she would have to identify what the index’s data is and how it was produced. And suddenly, there was an idea in my head.

What if we used all or part of the creativity index and basically tested to see how accurate it might be? Then we’re in a sense saying, “I agree with you that these results may be questionable. I’m going to do some research of my own to see whether that’s true or not.”

And that’s where the challenge lies: determining what data to use that effectively gives us something to compare the index to, or at least a part of the index.

December 1

After more consideration of my goals, and working to condense all of the data down into one organized spreadsheet, I began to get a better sense of what precisely I was trying to accomplish.

Eunjung mentioned that she was going to look at the tolerance aspect of the index and compare it with same-sex marriage statistics and policy from each of those countries. I had decided on a similar format, yet wanted to take the data in an alternate direction, something relating to talent and education.

At this point, I had data on education levels, test scores, government R&D budgets, and information on patents granted. But which of these would be most effective? It’s easy to think, “why not put it all in?” but I understand that we don’t actually need to have all that much data, just several simple things to compare.

After more deliberation, I decided I’m going to rank the countries by the talent rankings of the creativity index. I think it will also be helpful to show the creativity index number here too. Because while we’re focusing mainly just on talent, the study was a lot broader here and this number gives a clue to the countries’ overall creativity levels beyond the scope of my more specific comparisons.

The Martin Prosperity Institute, which publishes the Global Creativity Index, defines talent as:

“Talent is a driver of economic growth in today’s creative economy. We measure talent two ways — by the share of the workforce in the creative class and the share of adults with higher education.”

They continue to describe their ideals of talent as:

“Our measure of educational attainment is based on the share of population that participates in tertiary education including universities, colleges, community colleges, and technical training institutes.”

I feel confident that looking at the data that measures the mathematical, scientific, and reading literacy of 15 year old students can be indicative of talent levels.

Figure 5: A snippet of my finalized data sources

I needed to narrow down the countries which were going to be shown. To explain how the countries were selected: I started with the top 20–30 countries in terms of population, which quickly changed to the top 20–30 on the creativity index, re-sorted according to talent. However, the data I found on student literacy levels didn’t include every country, so a few such as Singapore had to be removed.

Let’s see how this gets fleshed out… stay tuned!

December 6

I like where my data and direction were going last week, and I have some thoughts on specifics this week.

I want to start this off with the introduction of the countries, in the order of their talent rankings. While talent is a subset of the creativity index, I want to call that to light and make it the predominant figure. I still think it’s worth showing the creativity index ranking. In a sense, I want to say, “here’s how the countries rank on talent, but here’s a sense of the larger picture and how those other unnamed attributes stack up in terms of creativity.” So I’d like to be able to incorporate that info in some way.

But the heart of this really, is comparing the talent ranking to the educational scores.

How do the creativity index rankings on talent correlate to actual standardized test scores from each country?

As I’ve stated before, their talent ranking is, “… based on the share of population that participates in tertiary education including universities, colleges, community colleges, and technical training institutes.” So how does that stack up when looking at another aspect of education? This time we’re bringing in testing scores from the various countries to see if it holds up or not.
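One rough way to put a number on that question is a rank correlation between the two lists. The sketch below is an illustration only, not part of the project as built: the function names are made up here, and it assumes each country has a talent rank (1 is best) and an average test score (higher is better). It computes Spearman's rank correlation using only the Python standard library.

```python
def average_ranks(values, descending=False):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = shared
        pos = end + 1
    return result


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


def spearman(talent_ranks, test_scores):
    """Spearman correlation; positive means the two orderings agree.

    Talent ranks are kept as-is (1 = best). Test scores are ranked in
    descending order so that the highest score also gets rank 1.
    """
    return pearson(average_ranks(talent_ranks),
                   average_ranks(test_scores, descending=True))
```

A value near 1 would mean the talent ranking and the test scores order the countries in much the same way, a value near 0 would mean little relationship, and a negative value would mean they tend to disagree.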

Visualizations

This is where I’m currently working. I’ve been thinking hard about how to give the countries shapes. I started with squares representing each country. The order/position that they appear in corresponds to the talent ranking. Then their number and size correspond to the creativity index. (I know I still need to do some work here and make it so both factors are not working to represent the same data.) The colors (at this point) do not have any reference, but will likely change. Color and value can be great indicators of creativity. (Perhaps I will utilize that idea further in the next iteration.) See figure 6 below.

One area where I’m still lacking is just how to bring in the data for the test scores. It was suggested that I could use the vertical positioning of the shapes to reference the scores.

Figure 6

After presenting this idea to Stacie and discussing it, it appears that I still have some work to do. These ideas certainly do not seem to be as concrete as they could be and are feeling somewhat abstract still. I need to continue on the quest to find more appropriate letters/shapes.

Final Conclusions

I was at a loss for direction. Seeking inspiration, I googled “educational data” and performed an image search. The results that came up had a lot of images showing no. 2 pencils and test scores on those Scantron sheets. And there, my idea was born. I was going to use pencils to represent my data.

The next challenge was to figure out just what to use in order to represent each of the major aspects to this data which I was trying to convey:

  • countries sorted by their Talent rankings
  • their score on the Creativity Index
  • their average test scores (plus a breakdown of those scores in math, science, and reading)

I created a pencil to use as my base. From there, I decided that each pencil would represent a country, its height would represent the test scores (a higher score meant a taller pencil) and the line it would draw would represent the creativity index score.

My pencil design
An Illustrator line that I thought was a good representation for my Creativity Index line
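The mapping described above (test score to pencil height, Creativity Index score to line length) amounts to two separate linear scales. Here is a minimal sketch of that idea. The function names, the pixel ranges, and the score bounds passed in are all assumptions for illustration, not values from the project.

```python
def linear_scale(value, domain_min, domain_max, range_min, range_max):
    """Map a value from [domain_min, domain_max] onto [range_min, range_max]."""
    if domain_max == domain_min:
        raise ValueError("domain must have a non-zero width")
    t = (value - domain_min) / (domain_max - domain_min)
    return range_min + t * (range_max - range_min)


def encode_country(test_score, creativity_score, score_bounds, index_bounds):
    """Return hypothetical drawing sizes (in pixels) for one country's pencil.

    score_bounds and index_bounds are (low, high) pairs taken from whatever
    data is being shown. Each measure gets its own scale because the two
    data sets are not on the same scale.
    """
    pencil_height = linear_scale(test_score, *score_bounds, 80.0, 300.0)
    line_length = linear_scale(creativity_score, *index_bounds, 20.0, 200.0)
    return {"pencil_height": pencil_height, "line_length": line_length}
```

Starting the pencil range above zero (80 pixels here, an arbitrary choice) keeps the lowest-scoring country visible as a short pencil rather than shrinking it to nothing.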

The Next Step

I felt like I was making good progress. But one issue that I encountered in the project still had not been resolved; the linear progression of countries according to their Talent rankings wasn’t as easy to follow as I had hoped.

My solution was to break the countries down into groups. I settled on using the top 25 countries, and thus sought to divide them into groups of five. Continuing the “pencil and paper” effect, I thought about putting five pencils each on five different sheets of paper. That way, one can easily break down this linear list.

The template for countries 21–25, which would be revealed first.

I decided to eliminate the eraser part of the pencil. I wasn’t using that to represent any data and felt it just distracted from what was relevant.

So in presenting my data, five screens (or printed sheets of paper) would be shown, beginning with the last one (21–25) and working up to the first one (top 5).
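The grouping and reveal order can be sketched in a few lines. This is a simple illustration with placeholder country names and made-up function names. It assumes the list is already sorted by Talent ranking, best first.

```python
def group_into_cards(countries, card_size=5):
    """Split a ranked list into consecutive cards of card_size countries."""
    return [countries[i:i + card_size]
            for i in range(0, len(countries), card_size)]


def reveal_order(cards):
    """Return the cards in presentation order: lowest-ranked group first."""
    return list(reversed(cards))


# Placeholder names standing in for the 25 ranked countries.
ranked = [f"Country {n}" for n in range(1, 26)]
for card in reveal_order(group_into_cards(ranked)):
    print(", ".join(card))
```

With 25 entries this gives five cards, and the card holding ranks 21 to 25 comes out first, matching the reveal described above.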

Once all five pages are laid out, the lines representing the Creativity index would appear.

Countries 21–25, with green “pencil” lines representing the Creativity Index
All cards laid out

Legends

For the PISA scores and the GCI, the two legends to the left indicate the highs and the lows for each. I thought about putting tick marks along each “card” so that there could be a better sense of what precise number each data point was at, but didn’t for two reasons.

  1. The scales are different and this would have further confused the viewer.
  2. The precise score isn’t needed to verify the results here. The idea is to be able to compare one data set to the other.

Getting Specific

As noted, I have data for the math, science, and reading PISA scores. Once all of the cards and pencils are laid out on the screen, hovering the mouse over an individual pencil will darken the rest of the screen so that one pencil can be highlighted.

Then, the three visible sides of the pencil are used to show a line representing each score. Below are indicators for each: a “plus” sign for math, a “beaker” for science, and a “paper” for reading. Through this feature, a user can see the breakdown of test scores for each country.

As these indicators are not the most obvious on their own, a small tooltip pop-up will appear as the mouse hovers over these columns and indicators.

Breakdown of test scores by country

Conclusions (if any)

This project didn’t lead me to any strong conclusions regarding the validity of the Creativity Index and its Talent ranking.

Looking at the data, there is some correlation at the lower end: notably, Latvia and the Russian Federation score low on both the Creativity Index and the PISA scores.

A lot of countries seem to generally do well on the Creativity Index, regardless of their test scores. It’s also interesting to note that Singapore, while scoring the highest on test scores, did not have the highest GCI; that honor went to Australia.

While these findings don’t call the index’s results into question, they highlight the initial concerns of my group about working with subjective data. Which information is chosen for a study such as this has a major impact on the results, and that makes the results easier to manipulate. It’s not that the data or conclusions are incorrect, but it’s important to be aware of and understand where the data is coming from, and what angle (if any) the publishers are attempting to convey.
