CD Project 3 — Visualising Patterns

https://gradcdstudio.wordpress.com/project-3-intro/

Sanika Sahasrabuddhe
MDes CD Studio

--

A postcard from Giorgia Lupi and Stefanie Posavec’s Dear Data

Chapter 1: Introduction to the Project

  • Let the data tell us which relationships are worth highlighting.
  • What kind of data to look for.
  • For us: to understand information (inward).
  • For others: to communicate (outward).

E.g.: What if we had to investigate water quality in Pittsburgh?

What would you look at?

How to approach it —

  • Don’t look for a story in the beginning.
  • 3–5 different types of data
  • You will be visualizing some things.
  • Interactive visualisations.
  • Proof of concept.
  • 20–30 data points (not over 50)
  • Need to get data to the point where it starts comparing apples to apples.
  • Example: all data can be in terms of neighborhoods OR zip codes, but not both.
  • Data can be: amount, time, location, categories.
  • Reading — Richard Saul Wurman — Describe ways of organising information

Chapter 2:

Richard Saul Wurman

Organising Principles for Data

  • Wurman talks about the different ways of organising data. Every choice made while organising portrays a different understanding.
  • A clear presentation may come from having made data accessible and retrievable.
  • Knowing which category to use is important based on the attributes you want to highlight.
  • For example, location can be of different scales — the scale of a city, or that of the body — but we are still showing the information based on location.
  • Using Alphabetical ordering is helpful when location is not cognitively the easiest to understand. For example — dropdown menu to pick country.
  • Categories — when things are of similar importance. This ties back to what Stacie mentioned about ensuring the attributes of two data points are apples to apples.
  • Hierarchy is used when you want to show magnitude.
  • Conscious awareness = saving frustration of searching info
  • New arrangements = new relationships.

Yau — Data Points

What is the nature of representation?
  • Visualization is about finding the right fit between the dataset, its interpretation, and the geometry and colors we use. It is related to the goals you aim to achieve with your data.

A. Visual Cues

Keep this image in mind for Context: Context is important.
  • It is important to keep in mind how visual cues are positioned in terms of other objects and elements. This relationship is important.
  • While depicting direction, timescales can be dramatically emphasized by compressing or expanding. But stretching the scale just for dramatic effect without context is problematic.
  • When scaling multiple dimensions of a shape up or down, watch the resulting size so that the shape does not end up too big or too small.
  • Combining consistent visual cues with color is helpful to make your visualization inclusive of audiences with visual impairment.
  • Co-ordinate systems: Cartesian, Polar, Geographic (you may use one of these)
  • Which data categories are assigned to which axis is important.
  • Change in sorting can shift focus.
  • Keep in mind the use of color values and transparency to show the difference between two points in the same category.
  • Use of color value in choropleths shows relationships and differences.

Rohrbach — Patterns in Scientific Information

  • Pattern Detection —
  • Representation
  • Interaction and Experience
  • Experience
  • types of patterns,
  • numbers & hierarchy,
  • grouping (for larger number of things),
  • temporal building — People compare subsequent images to their predecessors, finding similarities and forming categories.
  • Categorization and Appropriateness
  • Notice that scale and progression affect how you see parts in context of the whole.
  • “Experiential cognition is aided when the properties of the representation match the properties of the thing being represented.”
  • Pacing and Simultaneity
  • Expectation and Perception
  • Customization, Personalisation
  • Mimicking Known Behaviors
  • Social Cognition. They explain that enabling people to answer questions based on schemas helps them anticipate the types of interaction, making the situation more predictable and controllable.
  • Recall and Engagement
  • Discovery and Critical Thinking
In context of the Periodic Table

The context for understanding the data is the city of Pittsburgh, and nothing beyond Allegheny County.

After going back and forth between Crime and Transportation, I picked the latter. Crime was interesting to me because one may be able to correlate crime rates with many environmental factors, like temperature and accessibility, and also with infrastructural issues like street lighting.

The theme of autonomous vehicles and its relation to traffic violations as crime is interesting. Crime and Transportation are usually measured to understand the livability of a city, and what interests me is understanding how that is reflected in the day-to-day of citizens. And what does that mean for overall safety?

It is interesting how crimes are categorized: most of the categories are not perpetrated by individual agents with intent but are incidents and accidents, and through them we can see the infrastructure of a city.

However, I think similarly interesting stories can be found in transportation, including traffic offenses, speed of development and the changing nature of a city, emerging commercial districts, environmental degradation, etc.

What questions can you ask about the data you find and its context?

One of the key questions after reading Yau is about the context of data, and finding relevance between the questions we ask and the data.

Organizing Questions:

Data typically responds to one or more of the following questions. Thus, you may find it helpful to record the type of questions your data answers as a means of aiding your understanding and comparison of data types:
What kind? (category)
How many/much? (amount)
When? (time)
How long? (time/duration)
How often? (time/frequency)
Where? (location)

Chapter 5: Questions

(Still looking at both Crime and Transport as themes)

1. Is there a relationship between traffic incidents and temperature or environmental conditions?

2. Are (traffic related) crime rates related to temperature and environmental conditions?*

3. How accident-prone are autonomous vehicles?

4. How liveable is Pittsburgh?

5. Is there a relationship between private car purchases

Visual Examples:

First guys to visualize: William Playfair, Charles Joseph Minard (Napoleon’s March)

Storytelling: General >> Specific

Chapter 6: Types of Data I’m Collecting:

As I looked at datasets, I found that many of them have multiple variables whose labels may not be the best for comprehending them. For example, neighbourhood data is not in zip codes but in lat-long.

(You don’t need more than 3–5 types of data)

I think there are two kinds of data:

Yellow: For Context | Blue : Lending to Story

Data that gives context:

This is basic geographical data like neighborhoods and demographics that helps anchor the story in the context of Pittsburgh

Data that tells a story:

This would be data that I use to frame a question and eventually find a pattern.

I started building columns into an excel sheet to understand what variables I would like to look at —

Next Steps:

  • Write your questions
  • What do you want to poke at?
  • Make note of Latch components
  • Do you see a co-ordinate system emerging?
  • Begin to jot down the sequence in which you want to move with the data

Very Initial Ideas on the form —

Something interactive but also something in print? Will be interesting to see the scale of things.

Useful Links

Chapter 8: Framing Questions

To understand patterns in the data, it is important to frame questions so that they allow a range to study, instead of a yes/no question.

Based on looking at data available, what is interesting is —

How does seasonal change affect the nature of crimes and offences committed or reported in Pittsburgh/Allegheny County?

Arranging Data in Excel:

Temperature Data: Avg Temperature and Snow Depth
Crime Incidents (timestamp, intensity of crimes, neighborhood)

Yau, in-class discussion:

Scales: (in-bold are the scales I will look at)

  • Logarithmic (powers of 10)
  • Linear (integers etc.) (Cartesian scale)
  • Categorical (eg.: cloudy, sunny)
  • Time (day, month; seasons, cyclic, linear)
  • Percentage (part of whole)
  • Ordinal (Good, bad, neutral; hierarchy)
  • Hierarchy
  • Location (Geographical or Polar)

Chapter 7: Analysing the Data set—

Analyzing Data:

  1. The two levers of my collected data are crime incident reports, and average temperature and snow depth. After analysing the long list of incidents from the police blotter data, the best way to break it up into smaller chunks was by the hierarchy of the incident.
  2. I also chunked the data according to time of day (day divided into three parts: 12am–8am, 8am–4pm, 4pm–12am).
Contains timestamp, neighbour address, description of crime, and XY location.
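
The time-of-day chunking described above can be sketched in a few lines of Python. This is only an illustration of the idea: the project itself arranged the data in Excel, and the function name, bucket labels, and sample timestamps here are assumptions, not values from the police blotter data.

```python
from datetime import datetime


def time_of_day_bucket(timestamp: datetime) -> str:
    """Return which of three 8-hour parts of the day a timestamp falls in."""
    hour = timestamp.hour
    if hour < 8:
        return "12am-8am"
    if hour < 16:
        return "8am-4pm"
    return "4pm-12am"


# Hypothetical timestamps for illustration only (not real incident records).
examples = [
    datetime(2018, 1, 15, 2, 30),
    datetime(2018, 1, 15, 8, 0),
    datetime(2018, 1, 15, 23, 59),
]
buckets = [time_of_day_bucket(t) for t in examples]
print(buckets)
```

Each boundary hour belongs to the later bucket (8:00 counts as 8am–4pm), which is one possible convention; the original spreadsheet may have drawn the line differently.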

Chapter 8 - Storyline and Narrative?

Crime data form references and inspiration:

https://flowingdata.com/2009/06/23/20-visualizations-to-understand-crime/

Chapter 8: Framing Questions, LATCH & Visual Cues

In class, we started organizing data into data types, based on the Yau, Wurman, and Rohrbach readings, to match each data type with appropriate scales and the right visual cues.

The order of matching was Data type → Question → Ranges → Scales & Visual Cues

The question that I composed from knowing existing data was —

How did seasonal changes in temperature and snow depth affect the nature of crimes in Allegheny County in 2018?

Inspiration to show density of crime across time of the day.
Ways of categorizing data (in-class activity)

Using the in-class learnings, below are the visual form explorations using the visual cues and LATCH points —

Chapter 9: Layering Information

After the in-class crit with Stacie and Brett, it was helpful to see how some of my data types can be prioritized over others. For example, my question is —

How does seasonal change affect the nature of crimes and offences committed or reported in Pittsburgh/Allegheny County?

Looking at the phrases highlighted in the question, a geographical scale and looking at patterns spatially may not highlight the right information.

Another piece of feedback that I will work on next is to reduce the number of ranges in each data type.

For example, temperature can be on an ordinal scale like low, medium, high, instead of 0–10, 10–20, 20–30, …
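
As a rough sketch of that reduction, a numeric reading can be collapsed into three ordinal levels with two cut-offs. The function name and the default cut-off values below are placeholders I am assuming for illustration; the actual boundaries and units would have to come from looking at the data.

```python
def temperature_level(avg_temp: float,
                      low_max: float = 40.0,
                      medium_max: float = 70.0) -> str:
    """Collapse a numeric temperature into an ordinal level.

    low_max and medium_max are assumed placeholder cut-offs, not values
    taken from the project.
    """
    if avg_temp < low_max:
        return "low"
    if avg_temp < medium_max:
        return "medium"
    return "high"


levels = [temperature_level(t) for t in (25.0, 40.0, 85.0)]
print(levels)
```

The same pattern would work for snow depth, and a fourth level such as "severe" could be added with one more cut-off.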

I could also take the approach of not looking at the charge at all, but only looking at the hierarchy of crimes.

On the left is the charge (description), and in the middle is the hierarchy of the crime and the number of incidents listed in that category. The rightmost image is a screenshot of a Uniform Crime Reporting manual to see how crimes are categorised.
This visualisation trial uses a Cartesian scale, where the X axis is time of day and the Y axis is hierarchy of the crime. The right-hand side is the inspiration, from Nathan Yau’s blog FlowingData.
Other inspiring visualizations

In-class Progress Update

  1. What question are you exploring? Make it as clear as possible and write it in a manner where your data supports answering this.
  2. What types of Data are you using? My data is about temperature and crime categories; through these I am trying to look at relationships between the two. Does the nature of crimes change?
  3. What co-ordinate system are you using? I am going to start with a Cartesian system, to see relationships between time of day and intensity of crime. My other option is a polar system, because of the cyclic nature of time. Geographical is the least relevant at the moment.
  4. Scales? I am looking at linear, ordinal, hierarchical, and time scales.
  5. Ranges? For crime intensity (hierarchy), I have too many scales right now, so I am going to reduce them to no more than 5 buckets. Snow depth is low, medium, or high. So is temperature (might add a ‘severe’ to it).
  6. Do you propose a narrative or indexical structure? It would be a narrative structure and reveal nuances of crime through the day and seasons through interactivity (indexical, show and hide)
Narrative Structure

7. Pathway of Data? What will the audience see and experience at each stage?

Example of narrative structure trial. Pathway through the data

Temperature + Snow Depth → Crime Density → Time of Day → Hierarchy → Charges (Filter) → Location

8. Visual Cues you’ll use?

  • Size for the intensity of crime
  • Position for density
  • Color Hues for seasonality
  • Color Values for temperature
  • Weight for snow depth

NEXT STEPS:

  • Finding Goodness of Fit between form and information topic
  • Thinking of aural channels and interactivity.
  • Thinking of and laying out the story.
  • Think of audience
  • Things That Make Us Smart — Don Norman, Appropriateness Principle (close cognitive collection)

GOALS (Why are we doing this):

  • Data to information
  • Storytelling & Narrative

Theory

  1. Patterns & Detection (To see Patterns)
  • Numbers & Hierarchy
  • Temporal Building (Bring slowly someone into the visualization).

2. Representation

  • Categorization
  • Pacing & Simultaneity
  • Narrative + Indexical Structure
  • Prior Expectations (Schemas) + Perception

3. Interaction (Shedroff)

  • Customization — how much control are we giving the audience?
  • Mimicking known behaviors is good, but may be detrimental: it might not make way for new ways of looking, and can create bias towards one kind of representation.

4. Experience

  • Recall + Engagement
  • Discovery + Critical Thinking

Interim Presentation

Interim Process Feedback

Chapter 9a: MapBox Experiments

My plan was to design the narrative in two bigger parts —

  1. When did the crime occur?
  2. Where did it occur?

For the 2nd part, I plotted the location of the crime and tried to understand patterns of the location of occurrence, based on type of crime. I was able to see that burglaries are most common, while arson and homicide are less common. Denser dots showed that some neighbourhoods had more burglaries than others.
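
The same reading can be cross-checked without a map by counting incidents per crime type and per neighbourhood. The sketch below is a generic counting example, not the Mapbox workflow; the rows and the neighbourhood names are invented placeholders.

```python
from collections import Counter

# Invented placeholder rows: (crime type, neighbourhood).
incidents = [
    ("Burglary", "Neighbourhood A"),
    ("Burglary", "Neighbourhood A"),
    ("Burglary", "Neighbourhood B"),
    ("Arson", "Neighbourhood B"),
]

# How many incidents of each type are there overall?
by_type = Counter(crime for crime, _ in incidents)

# Where do burglaries cluster?
burglaries_by_area = Counter(
    area for crime, area in incidents if crime == "Burglary"
)

print(by_type.most_common())
print(burglaries_by_area.most_common())
```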

Chapter 10: Wrapping up the Visualisation

After the interim presentation and desk crits, as I explored the patterns in my data, the way I approached it evolved.

The question I have now explored through my data is —

How did seasonal changes affect the nature of crimes and their frequency in Pittsburgh in 2018?

The process of working with a Cartesian scale was challenging, especially because I was trying to balance multiple aspects of one type of data.

For example, for time, I had time of day and day of year. So in the final version, I have used only day of year and changed the question to look at frequencies and amounts of crime.

Interesting insights

  • The rate of crime tends to go up on hotter days.
  • Clear relationships between precipitation and snow depth did not arise.
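
One way to check an insight like the first one numerically is to count incidents per day and correlate the counts with that day's average temperature. The sketch below uses made-up toy rows purely to show the mechanics; it does not reproduce my dataset or its result, and a correlation on its own would not show that heat causes crime.

```python
from collections import Counter
from datetime import date
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy data: one entry per reported incident, and one average
# temperature per day. None of these values come from the real dataset.
incident_dates = [
    date(2018, 1, 5),
    date(2018, 4, 10), date(2018, 4, 10),
    date(2018, 7, 4), date(2018, 7, 4), date(2018, 7, 4),
]
avg_temp_by_day = {
    date(2018, 1, 5): 20.0,
    date(2018, 4, 10): 55.0,
    date(2018, 7, 4): 85.0,
}

incidents_per_day = Counter(incident_dates)
days = sorted(avg_temp_by_day)
temps = [avg_temp_by_day[d] for d in days]
counts = [incidents_per_day.get(d, 0) for d in days]

r = pearson(temps, counts)
print(counts, round(r, 3))
```

Days with no reported incidents get a count of 0 rather than being dropped, so quiet days still contribute to the comparison.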

Final Proto Images

Narrative Structure: a combination of narrative and indexical. I took the approach of letting the audience filter the data in many ways, therefore highlighting more of the indexical structure.

Excel Tools most useful:

  • The LOOKUP function
  • Creating filters
  • Changing strings to numbers and vice versa.
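
For anyone doing the same steps outside Excel, the three operations have rough Python equivalents: a dictionary stands in for a lookup table, a list comprehension for a filter, and `int()` / `str()` for type conversion. This is a loose analogy rather than a translation of my spreadsheet; the column names, charges, and hierarchy codes below are illustrative placeholders.

```python
# Illustrative lookup table: charge description -> hierarchy code.
# The codes are placeholders, not copied from a reporting manual.
hierarchy_by_charge = {"Burglary": 5, "Arson": 8}

# Placeholder rows, with temperature stored as text the way a CSV
# export often delivers it.
rows = [
    {"charge": "Burglary", "avg_temp": "71"},
    {"charge": "Arson", "avg_temp": "34"},
    {"charge": "Burglary", "avg_temp": "18"},
]

for row in rows:
    # Lookup-style step: attach the hierarchy code for each charge
    # (None when the charge is missing from the table).
    row["hierarchy"] = hierarchy_by_charge.get(row["charge"])
    # String -> number, so the value can be sorted and compared.
    row["avg_temp"] = int(row["avg_temp"])

# Filter: keep only the burglary rows.
burglaries = [row for row in rows if row["charge"] == "Burglary"]

# Number -> string, e.g. for axis labels.
labels = [str(row["avg_temp"]) for row in burglaries]
print(labels)
```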

Learnings!

  • Focus on quality and relationships before finding volume.
  • It is challenging to visualise patterns without adding several filters. In my next try, I would think of ways to use less filtering.
  • Learnt Mapbox

Link to submission files:

Next Steps:

I think looking at this data spatially would be important. There are relationships that may come up between crime types, location, and neighborhood income!

In the future, while working with data, reflecting on this process will help to make data more concise, and frame questions more critically.

--


Graduate Studies in Design for Interactions @ Carnegie Mellon University, School of Design