Visualizing Data Patterns

Parks & Recreation facilities in Pittsburgh

Post written for Communication Design Studio, Fall 2017 | Stacie Rohrbach | Carnegie Mellon School of Design. This article is an ongoing documentation of in-class activities, assignments, and reflections.

For our third project, we are focusing on visualizing data patterns. We are using public datasets available for the city of Pittsburgh to create these visual representations. The objective is to use these visualizations to explore correlations and dependencies among various data points and analyze their impact on a larger scale. The aim is to translate the empirical information into something visceral, more relatable, and more meaningful.

Stacie provided us with some areas to choose from and shared relevant datasets for them. I guess a fair share of my excitement died when I looked at the Excel sheets. The data is pretty overwhelming and I am having a really difficult time making sense of it.

11.09.2017 | Finding an area of Interest

Find an area of interest and define a prompt. Pick 5 datasets that would support the prompt.

Today, we got together in groups and discussed the aspects of Pittsburgh that interest us. We helped each other formulate questions about our areas of interest. The goal was to formulate a question that would encapsulate the different datasets we are trying to investigate and emphasize the correlations in the information.

We also discussed the different layers of information that would be required to effectively demonstrate the correlation between these different datasets.

Here are some of the areas that got me interested:

Transport: How does the convenience of commute affect the demographics of an area? Do people decide to stay in an area on the basis of how accessible it is?

Transport and Food: What creates a food desert? How do the availability of food (convenience stores & restaurants) and access to it (transport) factor into creating food deserts?

Public facilities and Neighborhoods: How might the presence of parks, public facilities, and restaurants affect the demographics of a neighborhood?

Education, Ethnicity & Poverty: Does the ethnic makeup of a neighborhood affect the education and poverty levels of that neighborhood?

Since all the data is on Pittsburgh and I haven’t had the chance to explore the city much, I am thinking of using this project to get to know the city better. In my short trips around the city, I have noticed small lending libraries and share gardens outside people’s homes, signs that welcome neighbors of all religions and ethnicities, and community centers and shelters open to all. This always makes me think: the people in this neighborhood care. But how do you measure the care quotient of a community? Can I use the presence of public facilities in a neighborhood as a measure of its residents' involvement in their civic and social responsibilities? I will use this exercise to explore:

How might the size, population, and education level of a neighborhood shape the public facilities in the area?

The datasets I might need to effectively demonstrate the correlation between public facilities and the demographic makeup of a neighborhood:

• Parks in each neighborhood (amount of space; how much)

• Public facilities in each neighborhood (how many)

• Schools (do school districts play a crucial role in defining the character of a neighborhood?) (how many)

• Restaurants in each neighborhood (will the type of restaurant be relevant?) (how many? what kind?)

• The age of residents in the neighborhood (how many in each age group)

• Household types (is this relevant?) (what kind?)

• Is the ethnicity of a neighborhood dependent on the public facilities present in the area? (Do people prefer staying near restaurants that serve cuisines from their culture, or near community or religious centers that they associate with?)


The two readings talked about ways of organizing. While Yau focused on ways of organizing data, Richard Saul Wurman (Information Anxiety 2) emphasized ways of organizing information.

These ways of organizing data and information would be helpful in making sense of the data that I would look at for this project.

Yau (ways of organizing data)


  • Location
  • Alphabet
  • Time
  • Category
  • Hierarchy

The arrangement and rearrangement of items reveals different kinds of relationships in the information that you get.

14.09.2017 | Revision

Today, we revisited the objectives we’d summarised in the first class.

Stacie asked us to answer the following questions to help us make sense of the project.

What interests you?

How does a neighborhood become a community? Who decides the common ground where people come together? What factors in the makeup of a neighborhood contribute to its civic infrastructure?

What do you want to learn?

What affects or governs the formation of public utility infrastructure in a neighborhood? Is it defined by the size or the population of a neighborhood? Or do the overall education and income levels of the neighborhood play a role in the development of these public utilities? Or is it people of a certain age group who take more interest in the civic infrastructure of their community?

What is your project question?

What type of data might help you explore your interests?

I am using the data on public facilities in each Pittsburgh neighborhood as a measure of a neighborhood’s involvement in its civic infrastructure. I am comparing the following datasets against it to see if there are any correlations between them.

1. Percentage of land area in each neighborhood designated as park space

  • LATCH–Hierarchy: What is the ratio of land under park space in each Pittsburgh neighborhood?
  • Scale: Percentage
  • Range: Share of land under park space, 0–60%
  • Buckets: (0–5%)(5.1–10%)(10.1–20%)(20.1–30%)(30.1–40%)(40.1–50%)(50.1–60%)
  • Coordinate system: Cartesian, X/Y (Neighborhoods on one axis and percentage of land under park space on another). Polar? (Do I need to show the land area under parks against the total land area for each neighborhood?)

2. Population by Age demographic of each neighborhood

  • LATCH–Hierarchy: What is the age of the population in each Pittsburgh neighborhood?
  • Scale: Categorical
  • Range: Different age groups in each neighborhood, from under 5 years to 70 years and above
  • Buckets: (Under 5 years of age)(5 to 15 years of age)(15 to 20 years of age)(21 to 29 years of age)(30 to 39 years of age)(40 to 49 years of age)(50 to 59 years of age)(60 to 69 years of age)(Age 70 and above)
  • Coordinate system: Cartesian, X/Y (Neighborhoods on one axis and different age groups on another). Polar (For each age range over all the neighborhoods, or to show the different age groups within a neighborhood to show the whole picture?)

3. Household type in Pittsburgh neighborhoods

  • LATCH–Category: What are the different types of households in each Pittsburgh neighborhood?
  • Scale: Categorical
  • Range: Different types of households in neighborhoods
  • Buckets: (Family households)(Non-family households)
  • Coordinate system: Polar (For each household type over all the neighborhoods, or to show the different household types within a neighborhood to show the whole picture)

4. Levels of educational attainment in different Pittsburgh neighborhoods

  • LATCH–Hierarchy: What is the level of educational attainment of each Pittsburgh neighborhood?
  • Scale: Percentage/Categorical
  • Range: Different level of educational attainment (0–100%)
  • Buckets: (Less than high school)(High school)(Bachelor’s degree)(Professional degree)(Postgraduate degree)
  • Coordinate system: Cartesian, X/Y (Neighborhoods on one axis and the different educational attainment levels on another). Polar (For easy comparison of each category of educational attainment over all the neighborhoods, or to show the different levels within a neighborhood to show the whole picture?)

5. Median incomes of Pittsburgh neighborhoods

  • LATCH–Hierarchy: What is the median income of each Pittsburgh neighborhood, ordered lowest to highest?
  • Scale: Linear
  • Range: Median income from $0 to above $100,000
  • Buckets: ($0–10,000)(10,000–20,000)…(90,000–100,000)(Above 100,000)
  • Coordinate system: Cartesian, X/Y (Neighborhoods on one axis and the different median income categories on another). Polar (For easy comparison of each category of median income level over all the neighborhoods, or to show the different levels within a neighborhood to show the whole picture)
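
To make the bucketing idea concrete, here is a minimal Python sketch that assigns a park-space percentage to one of the buckets listed under dataset 1. The function name `park_bucket` is hypothetical, and sending in-between values such as 5.05% to the higher bucket is my assumption, since the listed buckets leave small gaps between 5 and 5.1, 10 and 10.1, and so on.

```python
from bisect import bisect_left

# Upper edge of each park-space bucket, in percent (taken from dataset 1 above).
UPPER_EDGES = [5, 10, 20, 30, 40, 50, 60]
LABELS = ["0-5%", "5.1-10%", "10.1-20%", "20.1-30%",
          "30.1-40%", "40.1-50%", "50.1-60%"]


def park_bucket(percent: float) -> str:
    """Return the bucket label for a park-space share between 0 and 60 percent."""
    if not 0 <= percent <= 60:
        raise ValueError("park share is expected to lie between 0 and 60 percent")
    # bisect_left returns the index of the first upper edge that is >= percent,
    # so 5 stays in "0-5%" while 5.1 moves on to "5.1-10%".
    # Assumption: values in the gaps (e.g. 5.05) fall into the higher bucket.
    return LABELS[bisect_left(UPPER_EDGES, percent)]
```

For example, `park_bucket(12.5)` returns "10.1-20%". The same pattern would work for the other bucketed datasets by swapping in their edges and labels.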

11.16.2017 : Visualisation

I am finally beginning to get the hang of this project. I was trying to find answers in the data, and it really bothered me that the various datasets do not align neatly or support clear inferences. But the more I studied datasets and data visualization, the more I realized that I wasn’t approaching this exercise in the right way. I had drawn some hypotheses based on my general assumptions and was expecting the data to prove me right in some way.

The purpose of data visualization is not only to provide answers; it also helps us visualize a space and enables us to ask questions. If things don’t line up like they should, why don’t they? These questions can help us evaluate a system.

I had decided to study public facilities because I am curious to understand how the city decides what gets built where. Does the demographic makeup of its residents play a role, or is it decided on the basis of population density and available area? How does the presence of civic amenities and recreational facilities in a neighborhood affect its residents?

11.16.2017 : In-class Peer Review

I presented the geographical approach to the visualization in more depth for peer review in class. I also presented a small portion of a list view with layered datasets, and asked for opinions on both.

Brendon thought that the selection area on the map is not intuitive.

Angela thought that there is value in the map system. She suggested it is useful for evaluating neighborhoods in proximity to each other.

The different arm lengths are difficult to interpret because of the radial orientation.

It was difficult to remember the different icons without an index.

The gears are difficult to interpret and can be simplified.

Stacie’s feedback:

You’re tackling the integration of lots of information, which is admirable. I look forward to seeing the types of relationships that emerge. Although it’s good to see you thinking about interaction issues, I think the function of the piece is overshadowing form exploration right now. Perhaps set the toggles aside for the time being?

In terms of representation, some of the forms connect more closely to the content than others. For example, size aligns nicely to scale. However, the meaning of the prongs is difficult to decipher and remember without referencing a key. Therefore, how might you use visual variables more effectively to match the content? What does color link well to? What might shape connote? I also encourage you to look for ways of integrating information rather than adding components to the visual form. Those are the only issues I thought would be useful to bring to your attention right now.

Next Steps:

I am still juggling a lot of datasets. I need to streamline my data further.

Should I just explore one kind of public facility over all? Parks & recreation Facilities, maybe?

Explore other ways of representation for public facilities.

Could do both grid and geographical view.

11.28.2017 : Visual Explorations

Visualization exploration for various datasets and how they overlay together
Icon for representing Parks & Recreation facilities- Linear hierarchy
Explorations for Icon for representing Parks & Recreation facilities

I am using trees to represent parks & recreation facilities. After exploring different visualization ideas (you can see them above), I think the current form is communicative and clear. The little patch of ground adds character to the representation. What I am still struggling with is whether to use different values of green to distinguish between the trees in the icon. As you can see here, the graphic on the left uses shades of green to distinguish between the different trees. The difference in color value doesn’t connote any difference between the various categories and is mostly for aesthetics. It is also useful because the trees represent different kinds of facilities (as shown in the individual neighborhood active state).

I am using color value to distinguish between different population density groups as well (see below). Though I don’t foresee it being a problem, I wonder whether it could cause any confusion.

11.30.2017 | Visual Development

I still have to clean up the graphics and work on the alignment, etc. This is the basic narrative of my data visualization. I wanted to show all the different neighborhoods together so that people can make overall comparisons in addition to studying individual neighborhoods. I am still debating whether I should put all the neighborhoods together on one screen or use scrolling. For demonstration purposes I am detailing just 32 neighborhoods; the final interaction would probably need scrolling, since the data would otherwise get too crammed to be meaningful. For the sake of space efficiency, I am considering a floating overlay toolbar.

I found a data visualization that uses a similar style of representation and scroll.

The data there is comparable within each neighborhood, whereas I am also comparing the datasets across all the neighborhoods. Would a scroll be detrimental to the interaction?

Visualization of P&R Facilities with an individual NH hover state

The visualization shows the P&R facilities at the start of the interaction. The data is sorted in descending order by default. The final interaction would give the user the option to sort it in ascending order too.
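
The default and optional sort orders could be sketched like this in Python. The function name `sort_neighborhoods` and the counts are placeholders I made up for illustration; they are not the real Pittsburgh data or the prototype's actual logic.

```python
def sort_neighborhoods(facility_counts: dict[str, int],
                       descending: bool = True) -> list[tuple[str, int]]:
    """Order (neighborhood, facility count) pairs by count.

    Descending is the default, matching the default view described above.
    """
    return sorted(facility_counts.items(),
                  key=lambda pair: pair[1],
                  reverse=descending)


# Placeholder counts, not real data.
sample = {"Neighborhood A": 2, "Neighborhood B": 5, "Neighborhood C": 1}

default_view = sort_neighborhoods(sample)                      # most facilities first
ascending_view = sort_neighborhoods(sample, descending=False)  # fewest facilities first
```

Keeping the direction as a single flag means the toggle in the interface only has to flip one value rather than maintain two separate lists.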

Visualization of P&R Facilities with an individual NH active state. It shows the breakdown of the facilities.

Should the individual active state show the number of different facilities, or should it be a list of the names of the different P&R facilities?

Visualization of P&R Facilities and NH land area with an individual NH hover state

The individual active state shows the total land area and the land area under natural parks, to give a better understanding of the topology of a neighborhood and the area likely available for developing these facilities. I have used color to demarcate the area, but now I think I should use texture or pattern instead.

Visualization of P&R Facilities and Land area with an individual NH active state. It shows the information with more details.
Visualization of P&R Facilities and Population density

Quick Observation: Both the more and the less densely populated neighborhoods have fewer P&R facilities, with the exception of Hazelwood. Could we use this to learn more about what makes a neighborhood unique?

Visualization of P&R Facilities, Land area and Population density

Can we use this to compare different neighborhoods? Shadyside and North Oakland are approximately the same size and have the same population density, so what accounts for the difference in the development of P&R facilities between these two neighborhoods?

Visualization of P&R Facilities and Money
Visualization of P&R Facilities, Land area, Population density and Money

I could see no direct correlation between the median income of a neighborhood and the number of P&R facilities in it.

About the visualization: although fruit has no direct connection to money or income, I think that for the purpose of this visualization it is a fairly easy link to remember and recall without constantly referring to the legend. Should I look for a more direct, easier-to-interpret graphic to represent money?
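
One way to check an impression like this numerically is a Pearson correlation coefficient between the two columns. The sketch below is only an assumption about how such a check might look, not how the observation above was made; the function name `pearson` is hypothetical, and a value near 0 only suggests there is no linear relationship.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length lists (e.g. income and facility counts)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    # Result lies between -1 (inverse) and +1 (direct); near 0 means no linear link.
    return cov / sqrt(var_x * var_y)
```

Fed one list of median incomes and one list of facility counts, in the same neighborhood order, it would return a single number to set beside the visual impression.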

To Figure out:

Representation of Household type: Buckets- Family and non-family households.

Age: What would be the best way to represent age? Explore movement. Can the trees sway? Can I add an icon to show the overall age? This seems plausible. Will this make the representation too graphic-heavy, with too many elements, or would the icon aid quick interpretation? (To explore)

In the compare view, the narrative unfolds in a similar fashion, except that at the start of the interaction all neighborhoods are deselected and the user can choose the neighborhoods they’d like to compare. (To do: visualize the interaction for more clarity)

Explore the geographical view too to see if any geographical patterns emerge as a result of neighborhood proximity.


What should be the flow of the interaction? How do I want people to interact with this and access the information? What would be the most efficient and engaging way?

What should be a default screen?

The geographical view shows the details of one neighborhood and lets the user compare it with one other neighborhood at a time, while the grid view lets you compare all or any number of neighborhoods. Clicking on any one shows all the details of that neighborhood.

12.13.2017 | Next Steps

I need to figure out my transitions better. I had thought through my narrative but somehow couldn’t incorporate it well in my presentation. I think my visuals are in a good place now, but the transitions aren’t supporting them well.

I think prototyping in Keynote has its constraints, and it restricted some of the transitions I had initially planned.

The comparative view in the geographical view still needs work. The comparative view allows for spatial comparison within an area; I should mention that.

Explore button on home screen. More info for on-boarding?

Narrative needs work big time.