Visualizing Patterns

Anna Boyle
Nov 3 · 13 min read

Project 3 | Education

Carnegie Mellon University
Fall 2019
Communication Design Studio 1, Stacie Rohrbach & Brett Yasko

10.31.19

Today in class we discussed Project 3: Visualizing Patterns, in which we will dive into the process of conceiving and designing visuals that communicate information in ways that are useful, usable, and desirable by crafting the form to intuitively match the content of the piece.

We started class by discussing:

  • Data > asking questions around that data.
  • Discussing visual strategies
  • How these visual strategies help us understand and how we communicate that information outwardly.

After talking through the different ways data can be considered, we were encouraged to pick a topic area of interest to look into for our project.

I decided to move forward with Education considering my own background in Higher Ed.

After our topics were picked we moved into groups where we started to develop lists of areas of interest surrounding the topic we chose.

Over the weekend we have been asked to look into our topic and pull the following:

  • 3–5 Types of Data
  • 20–30 Points (Not over 50)
  • 🍎 to 🍎 comparisons

We were also asked to consider the following questions:

  • What kind? (category)
  • How many/much? (amount)
  • When? (time)
  • How long? (time/duration)
  • How often? (time/frequency)
  • Where? (location)

11.07.19

Since last Thursday, I have dived into reviewing two data sets related to my topic area, Education. Both data sets are from the Western Pennsylvania Regional Data Center (WPRDC).

The first data set I focused on was school enrollment data in the City of Pittsburgh from 2015, and the second focused on playground equipment locations throughout the city. Through these data sets, I was originally trying to see whether children who have access to playground equipment perform better in school. Unfortunately, this project direction led to a dead end, and I was challenged to rethink my question and direction further over the weekend.

11.12.19

Over the weekend I looked into the following data sets pulled from the Pennsylvania Department of Education and the Research for Action organization.

Research for Action pulls Pennsylvania School Outcome data from a conglomerate of Department of Education data, Civil Rights data, Common Core data, and PA Safe Schools data.

Image Courtesy of Research for Action.

The Outcome Data set that I pulled specifically reviews:

  • Attendance rates
  • Standardized test participation rates and scores
  • Algebra I passing rates
  • Suspensions and expulsions
  • Graduation rates
  • Dropout rates

From this data set, I pulled data on High School Seniors from 2016–17 in Allegheny County, in combination with the 2016–2017 enrollment history from a Department of Education data set.

After cleaning and refining this data further in an effort to see comparisons over the course of the same year (2016–2017), I defined the following categories.

  1. Top 30 largest High Schools in Allegheny County
  2. The School’s Final Academic Score, derived from the Pennsylvania School Performance Profile.
  3. The % of Seniors Scoring 22 or Higher on the ACT
  4. The % of Students that met the SAT/ACT College Ready benchmark

This then prompted me to ask the question:

“Of the top 30 High Schools in Allegheny County, how do the High Schools’ Final Academic Scores correlate with the % of Seniors that are prepared for college?”

Four of the five LATCH components (Location, Alphabet, Time, Category, and Hierarchy) are outlined in the following areas.

  • Location — Allegheny County
  • Time — 2016–17 Academic Year
  • Category — The Top 30 Largest High Schools in Allegheny County
  • Hierarchy — The School’s Academic Score compared to the % of students who met the SAT/ACT College Ready Benchmark.
  • Alphabet — Not a factor in my LATCH components. Is that OK?

In thinking through how to represent this data, I am considering a Cartesian coordinate system with two bar graphs, one for the School’s Academic Score and one for the % of students who met the SAT/ACT College Ready Benchmark.

When thinking through how to move through this data, I am interested in starting at a geographic level with the state of PA and then zooming in on:

Allegheny County >

Allegheny top 30 High Schools >

The Final Academic Score of Each High School >

% of Students that meet the SAT/ACT College Ready Benchmark

Ideally, I would like to represent this information digitally in some kind of application or interactive web page.

11.12.19 — In Class Notes

YAU Scales
- Logarithmic (powers of 10: 1, 10, 100)
- Linear (0, 1, 2, 3) Alphabetical, Time, Hierarchy
- Categorical (Cloudy, Sunny)
- Time (Day/Month/Year; Linear/Cyclical; Seasonal)
- Percentage (%, parts of a whole)
- Ordinal (Good, Neutral, Bad) Hierarchy, Location

- Education Breaks
- 18/21
- Working Age
- Development
- 10/20/30/40
- Birth year

Thinking about how you would turn things on and turn things off.

Polar — Parts of a whole, critical element or cyclical information — months
Cartesian — Comparing data on an x & y-axis
Geographical — Related to a map in some way, general location

Notes: Reduced buckets prematurely > Go back one step

Before the Next Class

There may not be significant changes, and that means you need to look for another set of data?

🌈 color code your different buckets, 4–6 buckets per column

Things to put on Medium before Thursday

  • Image of your Google sheet
  • 3–5 columns
  • 4–6 row buckets out of 50 data points

Data

- What are the scales you are looking at
- What are the buckets you are proposing
- What coordinate system are you considering as an anchor
- Iterate on the question — you can change it

Form

  • What are you representing?
  • How do you make something tangible enough to make the abstract concrete?
  • Diving into data through distance > 2 Dimensional
  • Depending on the data you have you can take a more comical or
  • How can your story be meaningful — and show subtlety
  • How do you use visual, temporal, physical, audible

11.14.19 — HW

  • Total High School Population
  • Total Seniors in the High School Population
  • Number of Senior Students Scoring 22 or Higher on the ACT (2016–2017)
  • Number of Senior Students who took the PSAT
  • Percent of Students who met the SAT/ACT College Ready Benchmark
  • Final School Academic Score — How the school is ranked

I have decided to organize my groups by quintile, dividing the total of 48 high schools into 5 groups and comparing the relationships of my scales from high quantities to low quantities.
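
As a rough illustration of the quintile grouping described above, here is a minimal Python sketch. The function name `assign_quintiles` and the sample values are made up for illustration and are not the actual school data. It assumes group 1 holds the lowest values and group 5 the highest. With 48 schools the groups cannot be exactly equal, so this approach yields sizes of 10, 10, 9, 10, and 9.

```python
from collections import Counter


def assign_quintiles(values):
    """Return a group number (1-5) for each value, ranked from low to high."""
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        # Map rank 0..n-1 onto groups 1..5 as evenly as possible.
        groups[i] = rank * 5 // n + 1
    return groups


# Hypothetical stand-in for 48 schools' senior counts (not real data).
senior_counts = [(i * 7) % 48 + 100 for i in range(48)]
groups = assign_quintiles(senior_counts)
group_sizes = Counter(groups)
```

Ranking first and then splitting by rank keeps the groups close to equal in size, which is what makes the color-coded buckets comparable across columns.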

I would like to start with geographic and then go into cartesian.

An update to my probing question below >

“Looking at the High Schools in Allegheny County, how do the Final Academic Score rankings of each school correlate with the number of Seniors that are prepared for college?”

Initial grouping of categories in my 5 different buckets.
Group 1, Group 2, Group 3, Group 4, Group 5

Final Data Set with 🌈 color coding of Groups/Buckets.

IN-CLASS

Visual Cues — Key

L — Outlier
H — Hierarchy
C — Category
A — Amount

Connect Scales to Visual Cues > Pay attention to a close cognitive connection and consider the layering — consider how these items can be consistent throughout.

Area — Space
Only look at the layering of the data between items.

For Tuesday — Answer all the following questions and bring your attempts at mapping the cues to your content > Print them out.

11.19.19 — HW

DATA TYPES

Location

Linear

Linear

Linear

Percent

Value

RESEARCH QUESTION

Of the High Schools in Allegheny County, how does the number of Seniors that are prepared for college correlate with the school’s Final Academic Score?

COORDINATE SYSTEM

Geographic/Cartesian

Over the weekend I worked on wrapping my head around the different data points I had collected, in addition to thinking about how they could be layered.

Because my data points are primarily linear scales of numbers, I am unsure of the best way to represent them so that they are distinguished from one another.

11.21.19

Since class, as recommended, I re-bucketed my different categories, but I am now concerned that I may not have enough data to tell a compelling story.

I also transitioned from geographic location to Cartesian.

Re-bucketed data: some of these correlations aren’t connecting in a way that tells a compelling story.

Reset Check-In

Of the High Schools in Allegheny County, how do the Seniors that are prepared for college correlate with the school’s Final Academic Score?

  1. 48 High Schools in Allegheny County
  2. Percent of Students Scoring 22 or Higher on the ACT
  3. Percent of Students that meet the College Ready Benchmark
  4. School’s Final Academic Score

Cartesian

Data Types — High School Name & Size > Location

Total Enrollment — Seniors > Linear

Seniors Scoring 22 or Higher on the ACT > Linear

Seniors who took the PACT > Linear

% of Students Who Met the College Ready Benchmark > Percent

  1. 48 High Schools in Allegheny County
  2. Percent of Students Scoring 22 or Higher on the ACT
    0–10
    11–20
    21–30
    31–40
    41–50
    51–60
  3. Percent of Students that meet the College Ready Benchmark
    0–20
    21–40
    41–60
    61–80
    81–100
  4. School’s Final Academic Score
    0–20
    21–40
    41–60
    61–80
    81–100
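
The fixed-width buckets listed above can be assigned with a small helper. This is only a sketch: `bucket_label` is a hypothetical name, it assumes values are rounded to whole percents, and it follows the list's convention that the first bucket starts at 0 while later buckets start at 11, 21, and so on. Labels use a plain hyphen for simplicity.

```python
def bucket_label(value, width, top=100):
    """Return a label such as '21-40' for a percent value.

    width: size of each bucket (20 for the 0-20, 21-40, ... scheme).
    top:   upper end of the last bucket (100, or 60 for the ACT column).
    """
    v = round(value)  # assumption: buckets are defined on whole percents
    if not 0 <= v <= top:
        raise ValueError(f"value {value} is outside 0-{top}")
    if v <= width:
        # The first bucket includes 0, so it is one unit wider than the rest.
        return f"0-{width}"
    lower = ((v - 1) // width) * width + 1
    upper = min(lower + width - 1, top)
    return f"{lower}-{upper}"


# Hypothetical values, for illustration only.
benchmark_bucket = bucket_label(47, width=20)    # 20-point buckets
act_bucket = bucket_label(12, width=10, top=60)  # 10-point buckets up to 60
```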

What do you see at any given time > are there points when you enter the data

  • I would like to start entering the data showing the size of each school — Narrative
  • After seeing the size of each School you would see the Final Academic Score
  1. School Size — Size
  2. School’s Final Academic Score — I would like to show this in terms of color value.
  3. ACT Score — Shape and direction
  4. College Ready Benchmark — Shape and Direction

These questions are what we have to present on Tuesday ^

Why > Transferring data into information

  • Information and how it can be understood
  • Developing a narrative
  • How to add order and understanding to dense information and communicate with others
  • Interaction > How do you help the viewer understand the information?

Pattern + Detection

  • Number and Hierarchy — * Typically people can only interpret 7 different kinds
  • Temporal Building — Building a narrative over time > What will be introduced 1, 2, 3…What steps do you want someone to go through to understand the data.

Representation

  • Categorization — Similarity
  • Pacing & Simultaneity — Where do you want relationships to be shown within the data. Where is it important to see information at the same time?
  • Narrative & Indexical Structure
  • Expectations + Perception — “Things that make us smart,” appropriateness principles — Thinking of the entire structure
    What do you think that the users will take away from the experience
  • Semantic Differential

Interaction

  • Customization
  • Mimicking Known Behavior — there is value in this but there are issues with this as well.
  • How much information are you providing them?
  • How many options are you going to give them?
  • How can you build on what people know, but then build their curiosity?

Experience

  • Recall and engagement — how do you leverage past experiences? How do you think about the experience?
  • Discovery and critical thinking — Try not to include text at least at the start. How do you frame your question and get your viewer to think critically?
  • Temporal Building
  • Expectations & Perceptions
  • Narrative & Indexical Structures
  • Pacing + Simultaneity > Layering

11.26.19

Over the weekend I reworked some of my data to better fit the following narrative.

Preparing a high school student for college can be difficult, and parents look to High School academic performance ratings to find a school that will prepare their students for the rigor and demands of a college education.

This data will examine Seniors in Allegheny County (2015–2016), and how their college prep scores and college preparedness correlate with the Final Academic Score of the High School. I changed the data by year to better align with my question.

  1. How many public high schools are located in Allegheny County
  2. Of these high schools, how many Seniors does each school have?
  3. How did Seniors from these schools perform on the SAT?
  4. What percent of these Seniors, by school, are college-bound?
  5. What is the relationship between students that are college-bound and the Final Academic Score of the school?
  6. Do these trends communicate the quality of education being received, and how these students are ultimately prepared for future success?
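
Question 5 above asks about a relationship between two columns. One common way to put a number on that kind of relationship is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is an illustration only: `pearson_r` is a hypothetical helper, and the sample values are invented, not taken from the Allegheny County data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / (spread_x * spread_y)


# Invented example values (not real school data).
college_bound_pct = [35, 48, 52, 61, 70, 83]
final_academic_score = [58.0, 63.5, 61.0, 72.0, 80.5, 88.0]
r = pearson_r(college_bound_pct, final_academic_score)
```

A value near 1 would suggest the two measures rise together, a value near 0 would suggest little linear relationship, and schools that sit far from the overall trend are the outliers worth a closer look.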

I also continued to play around with form and layering.

12.03.19

Over the holiday, I worked on refining my form and thinking through how I would walk through presenting and representing this data in an interactive web page. The bucketing of information is working well; however, representing the final form and the relationship between the two data points (the % of Seniors that are College Bound vs. the High School Final Academic Score) is a bit challenging. How is this information layered on top of each other?

12.05.19–12.12.19

Last Thursday Stacie was really able to help provide some clarity around how to represent my x & y-axis and transition them smoothly. This led me to work on my final prototype over the weekend and into the week. Thank you, Stacie!

After playing around with my new layout and orientation I ultimately decided to change my color palette in an effort to open up visual cue opportunities with my Final Academic Score Data. I also took design inspiration from an academic standardized testing service called Magoosh which seemed like a nice modern approach to conceptualizing this data.

Image from Magoosh

FINAL PROTOTYPE

The final prototype here.

Takeaways

Overall, this project was really challenging for me, but I learned SO much. Ultimately, I could see this data being used by parents, legislators, and administrators.

  1. This data could help new families that are moving to the area to select a School and School District to best meet their children’s goals or needs.
  2. State Legislators, PA Department of Education Administrators, and Allegheny County school district representatives could look to this data to evaluate the Final Academic Score against actual performance and better understand what is needed to prepare students for the future.

Identifying outliers within the data could also aid in evaluating and refining the Final Academic Score rubric.

If I were to explore this project further, future considerations would include evaluating other data points surrounding family and regional demographics.

Anna Boyle

Written by

Carnegie Mellon University | School of Design | Masters Student
