Atlas of Me: Personalized Spatial Analogy Maps for Unfamiliar Measurements
We created Atlas of Me, a Chrome plugin that generates personalized spatial analogy maps for distances and areas. Read about how we did it below!
How often do you encounter measurements while reading online news articles or other text documents? Imagine that today you encounter an article that describes forest fires breaking out in Spain just 4.3 miles from a well-known city. The reporter informs you that the fires have already consumed 5,000 acres of land. The fire sounds close to the city, and destructive, but how close, and how much destruction? Can you really relate to these statistics?
Or, imagine that the forest fires described in the article are in your state, and are just 4.3 miles from the edge of your city. You want to tell a close relative in another state about the fire, but will they understand how close it is, and how large it is, well enough to grasp the danger?
“We frequently encounter spatial measurements online, in the news, and in everyday life, but rarely have a clear idea of the quantity that the measurement expresses.”
Designers and journalists create spatial analogies to address problems in understanding: re-expressions of an unfamiliar measurement in terms of the measurements of one or more located objects that are more familiar. For example, to give readers a sense of how far 4.3 miles is, a journalist for a New York City-based paper might say it is about the distance between the Empire State Building and the Brooklyn Bridge. If you live in New York and drive or walk by these landmarks on a daily basis, then you are likely to find the analogy helpful. But if you are from another location, say, Detroit, the New York analogy may not give you a good sense of how long 4.3 miles is. Instead, you are likely to be helped more by an analogy related to your own location, because our experience of space is individualized.
Creating human-generated spatial analogies customized for readers in different locations is beyond the capability of most news organizations. We wondered, could we automatically generate measurement analogies, so that a larger number of readers could benefit?
Interactive Browser Plug-in: Atlas of Me
To scale spatial measurement analogies to the many contexts where these measurements are encountered, we developed a Chrome extension, Atlas of Me. It generates spatial analogies based on the user’s location as they read text articles online. The application identifies measurements and country references, and provides a spatial analogy when the user clicks on the tagged measurement.
Sometimes, areas are not directly given in a text article, but are implied through references to locations. For example, an article might mention a foreign country. Our tools can be applied to generate an analogy for the area of the country, or for the distance between point locations in articles.
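To make the tagging step concrete, here is a minimal sketch of how measurements might be detected in article text. It is written in Python for readability rather than as extension code, and the unit table, the regular expression, and the function name find_measurements are all our illustrative assumptions, not the actual Atlas of Me implementation.

```python
import re

# Hypothetical unit table: the post does not list which units the extension tags.
UNIT_KINDS = {
    "mile": "distance", "miles": "distance",
    "kilometer": "distance", "kilometers": "distance", "km": "distance",
    "acre": "area", "acres": "area",
    "square mile": "area", "square miles": "area",
}

# Try longer unit names first so "square miles" wins over "miles".
_UNITS = sorted(UNIT_KINDS, key=len, reverse=True)
MEASUREMENT_RE = re.compile(
    r"(?P<value>\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?)\s*"
    r"(?P<unit>" + "|".join(re.escape(u) for u in _UNITS) + r")\b",
    re.IGNORECASE,
)

def find_measurements(text):
    """Return (value, unit, kind, (start, end)) for each measurement found."""
    results = []
    for match in MEASUREMENT_RE.finditer(text):
        value = float(match.group("value").replace(",", ""))
        unit = match.group("unit").lower()
        results.append((value, unit, UNIT_KINDS[unit], match.span()))
    return results

if __name__ == "__main__":
    sentence = "Fires just 4.3 miles from the city have consumed 5,000 acres of land."
    for value, unit, kind, span in find_measurements(sentence):
        print(value, unit, kind, span)
```

The character spans are what a tagging step would need in order to wrap each measurement in a clickable element.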
Building Atlas of Me
How people experience space
To design Atlas of Me, we needed to understand what makes a spatial analogy useful. What kind of object works best in an analogy? We consulted research on the psychology of space, also known as cognitive cartography: the study of how we experience and build mental models of spatial information. This work indicates that landmarks are especially memorable locations in a person’s mental map. Landmarks are important because we rely on them for orienting and navigating in space [7,9].
A landmark can become personally significant to a person in a few different ways. One is that the person interacts with the landmark on a regular basis: for example, the Starbucks in your neighborhood that you drop by every day. We might expect personal familiarity to be difficult to predict or even define in a general way, since it involves personal experiences, values, and habits. However, research indicates that one piece of information, a person’s proximity to a location, can predict how likely it is that the location will be a part of their personal cognitive map [3,5,9]. In other words, people’s mental maps tend to be more detailed near where they spend time. As a result, studies have shown that people are able to more accurately estimate distances around places they are in regularly, like their home or workplace.
Landmarks can also be generally or culturally significant to many people. For example, the Space Needle is a landmark that is widely recognized as a symbol of the city of Seattle. We wanted our tools to use both forms of familiarity in order to find the right analogy. But first we needed a set of landmarks.
A database of landmarks
Luckily, the web contains several publicly accessible sources of landmark data. We designed a pipeline to scrape and combine geo data from different online sources.
Our first step was to seed our database with a large set of landmarks across the U.S. We scraped all landmarks from Yelp, which is a fairly comprehensive U.S.-based collection of businesses and other locations. From Yelp, we obtained around 240,000 locations. Yelp provides the name, latitude, and longitude of each location. These properties give us a set of distance landmarks: landmarks that can be used to generate distance analogies. Given an input measurement, say 4.3 miles, our goal was to develop tools that could re-express that distance in terms of the distance between the user’s location and a landmark.
To re-express areas through analogies, we needed landmark areas that are likely to be recognizable to people. A person may be familiar with many locations near her, but it is not necessarily easy to recognize the area of any nearby location. For instance, consider a department store down the street from a person’s home. Its area can be hard to conceive of, because the department store has multiple floors and publicly inaccessible areas. On the other hand, some areas have clearer boundaries, like parks. Using the definition of area landmarks from cognitive cartography [6,8], we use a heuristic to identify area landmarks by filtering our Yelp set to certain categories of places: Parks, Stadiums & Arenas, Amusement Parks, Botanical Gardens, and Campgrounds. We added U.S. states to the set of area landmarks as well, since most U.S. residents have some understanding of the sizes of different states, especially their own and others near them. Finally, we needed the areas for our area landmarks, and the footprint polygons for presenting them on a map as part of an analogy. We scraped these from OpenStreetMap (OSM), a large user-generated geospatial dataset.
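The category heuristic can be sketched in a few lines of Python. The Landmark record and its field names are hypothetical stand-ins for whatever the scraped Yelp data looks like; only the five category names come from the description above.

```python
from dataclasses import dataclass

# Category names as listed in the post.
AREA_CATEGORIES = {
    "Parks", "Stadiums & Arenas", "Amusement Parks",
    "Botanical Gardens", "Campgrounds",
}

@dataclass
class Landmark:
    # Hypothetical record layout for a scraped location.
    name: str
    lat: float
    lon: float
    categories: tuple

def is_area_landmark(landmark):
    """True if any of the landmark's categories is an area category."""
    return any(c in AREA_CATEGORIES for c in landmark.categories)

def select_area_landmarks(landmarks):
    """Every location can serve as a distance landmark; this keeps the area subset."""
    return [lm for lm in landmarks if is_area_landmark(lm)]

if __name__ == "__main__":
    sample = [
        Landmark("Example Park", 47.0, -122.0, ("Parks",)),
        Landmark("Example Coffee", 47.1, -122.1, ("Coffee & Tea",)),
    ]
    print([lm.name for lm in select_area_landmarks(sample)])
```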
Modeling personal and general familiarity
We had constructed a database of landmarks. But to generate a personalized spatial analogy for a distance or area for a specific user, we needed a way to predict the personal and general familiarity of a landmark to that user. If, given a user, we could quantify the personal and general familiarity of each landmark in our database, we could then find the landmark that best fulfills both properties.
Recall that proximity can predict personal familiarity. Using tools to detect a user’s location, we quantify the personal familiarity of each landmark for that user as the Vincenty distance between the user’s location and the landmark, so a smaller distance indicates a more familiar landmark. We used Vincenty distance because it takes into account the curvature of the earth, modeling it as an ellipsoid rather than a flat plane.
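For readers curious about what a Vincenty distance involves, here is a self-contained Python sketch of Vincenty's inverse formula on the WGS-84 ellipsoid. The function name, the iteration limit, and the example coordinates (approximate positions for the UW campus and the Space Needle) are our assumptions for illustration; a production system would more likely call an existing geodesy library.

```python
import math

WGS84_A = 6378137.0              # semi-major axis in meters
WGS84_F = 1 / 298.257223563      # flattening
WGS84_B = (1 - WGS84_F) * WGS84_A
METERS_PER_MILE = 1609.344

def vincenty_distance_m(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in meters between two lat/lon points given in degrees."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    f, a, b = WGS84_F, WGS84_A, WGS84_B
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.sqrt((cos_u2 * sin_lam) ** 2 +
                              (cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam) ** 2)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # On the equator cos_sq_alpha is 0 and this term is defined as 0.
        cos_2sm = cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha if cos_sq_alpha != 0 else 0.0
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge (near-antipodal points)")
    u_sq = cos_sq_alpha * (a * a - b * b) / (b * b)
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2) -
        big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    return b * big_a * (sigma - delta_sigma)

if __name__ == "__main__":
    # Approximate coordinates, used only as an example.
    meters = vincenty_distance_m(47.6553, -122.3035, 47.6205, -122.3493)
    print(round(meters / METERS_PER_MILE, 2), "miles")
```

The iteration can fail to converge for nearly antipodal points, which is why the sketch raises an error in that case instead of returning a guess. That case should not arise for landmarks near a user.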
Next we considered how we could quantify general familiarity. We considered what types of traces people leave when they interact with landmarks. If we could find a data set of such traces created by a large set of people, we could infer the frequency of interactions with a given landmark, capturing general familiarity. Flickr, as a massive user-generated photo collection, is a good source for signaling landmark familiarity across many people. Using a fuzzy matching strategy, we got the count of photos tagged with a specific landmark’s name and coordinates from the Flickr photo collection. We keep the count for each landmark as a signal of general familiarity.
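One possible form of the fuzzy matching step is sketched below in Python. It is not the matching strategy we actually used: the similarity threshold, the coordinate tolerance, and the photo record layout are all illustrative assumptions, and the sketch works on an in-memory list instead of calling the Flickr API.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def count_matching_photos(landmark, photos, min_similarity=0.8, max_offset_deg=0.005):
    """Count photos taken near the landmark that carry a tag resembling its name.

    landmark: (name, lat, lon)
    photos:   iterable of dicts with "tags", "lat", "lon" (hypothetical layout)
    """
    name, lat, lon = landmark
    count = 0
    for photo in photos:
        is_near = (abs(photo["lat"] - lat) <= max_offset_deg and
                   abs(photo["lon"] - lon) <= max_offset_deg)
        if is_near and any(name_similarity(name, tag) >= min_similarity
                           for tag in photo["tags"]):
            count += 1
    return count

if __name__ == "__main__":
    space_needle = ("Space Needle", 47.6205, -122.3493)  # approximate coordinates
    photos = [
        {"tags": ["space needle", "seattle"], "lat": 47.6206, "lon": -122.3490},
        {"tags": ["spaceneedle"], "lat": 47.6204, "lon": -122.3495},
        {"tags": ["space needle"], "lat": 40.0, "lon": -74.0},   # right tag, wrong place
        {"tags": ["pike place market"], "lat": 47.6206, "lon": -122.3491},
    ]
    print(count_matching_photos(space_needle, photos))
```

Requiring both a similar tag and nearby coordinates guards against photos that reuse a landmark's name elsewhere, and against unrelated photos taken at the same spot.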
How people reason with numbers
Knowing that the user’s personal and general familiarity with the landmark is important, a naive approach might simply find the most generally familiar landmark that is close to the user’s location to re-express a measurement. For example, I live near the University of Washington campus. If I want a more familiar analogy for an area, say 7 acres, it could be re-expressed in terms of the size of the Red Square, a well-known paved square on the UW campus. Using the naive approach in this case gives me an analogy like “7 acres is about 2 times the area of the UW Red Square.” It’s not a bad analogy; I get an immediate sense of the size. But what if the measurement I need help with is 80 meters squared? The same approach might generate an analogy like “80 meters squared is 0.0055 times the area of the UW Red Square.” Or consider 1,047 acres: “1,047 acres is 297 times bigger than the area of the UW Red Square.” Neither of these examples is very helpful for conveying the area! This is because very small or very large multiplicative factors make the analogy hard to relate to.
To better understand what makes the multiplier in an analogy easier to interpret, we consulted research on number sense, or how well people reason with different numeric magnitudes [2,4]. This research suggests that people are best at understanding numbers between 1 and 3. People are okay at quickly recognizing numbers between 3 and 10, but the accuracy with which we can reason with numbers drops off quickly as numbers get bigger than 10 or smaller than 1. Number sense research also indicates that people are likely to round numbers to integers where possible. We wanted to capture both of these insights in our tools.
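The arithmetic behind the naive approach is just a ratio of areas in a common unit. The Python sketch below assumes a Red Square area of 3.5 acres, a figure inferred from the “7 acres is about 2 times” example and not a surveyed value, so its multipliers land close to, but not exactly on, the ones quoted above.

```python
SQ_METERS_PER_ACRE = 4046.8564224
RED_SQUARE_ACRES = 3.5  # assumption inferred from the post's example, not a measured area

def naive_multiplier(measurement_acres, landmark_acres=RED_SQUARE_ACRES):
    """How many copies of the landmark's area fit in the measurement."""
    return measurement_acres / landmark_acres

def sq_meters_to_acres(sq_meters):
    return sq_meters / SQ_METERS_PER_ACRE

if __name__ == "__main__":
    examples = [
        ("7 acres", 7.0),
        ("80 square meters", sq_meters_to_acres(80.0)),
        ("1,047 acres", 1047.0),
    ]
    for label, acres in examples:
        print(f"{label}: {naive_multiplier(acres):.4g} times the Red Square")
```

The same landmark gives a comfortable multiplier for 7 acres and awkward ones for the other two measurements, which is why the choice of landmark has to depend on the measurement as well as on familiarity.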
Generating spatial analogies
Given a user and an unfamiliar measurement, we’d like to choose one final landmark from our database to appear in an analogy, using our measures of personal and general familiarity, and making sure that the multiplier is reasonably easy to understand. To do this, we designed an equation composed of three terms: personal familiarity, general familiarity, and multiplier.
Let’s take a closer look at each term. The personal familiarity term takes the distance between the user and the landmark, so that a closer landmark gets a smaller value. The general familiarity term takes a log inverse value of the Flickr photo count for the landmark, so that a more famous landmark (i.e., one with a higher count) results in a smaller value. Our multiplier term penalizes the multiplier in proportion to how far it is from the range of understandable numbers. The penalty is least for numbers 1 to 3, slightly higher for 3 to 10, and increases steeply as the multiplier dips below 1 or above 10. We set the weights of each term based on the implied importance of the factors in prior research.
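To illustrate how three such terms could combine, here is a Python sketch of a weighted score in which lower is better. It is not our actual equation: the penalty slopes, the reading of “log inverse” as 1 / log(count + 2), the equal default weights, and the candidate values are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    distance_km: float   # personal familiarity: distance from the user
    photo_count: int     # general familiarity: matched Flickr photos
    size: float          # landmark's own area or distance, in the target's unit

MILD_SLOPE = 0.1    # assumed gentle penalty growth between 3 and 10
STEEP_SLOPE = 5.0   # assumed steep penalty growth below 1 and above 10

def multiplier_penalty(m):
    """Zero for 1 to 3, mild for 3 to 10, steep below 1 or above 10."""
    if m <= 0:
        raise ValueError("multiplier must be positive")
    if m < 1:
        return STEEP_SLOPE * math.log10(1 / m)
    if m <= 3:
        return 0.0
    if m <= 10:
        return MILD_SLOPE * (m - 3)
    return MILD_SLOPE * 7 + STEEP_SLOPE * math.log10(m / 10)

def analogy_score(candidate, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms; smaller means a better analogy."""
    w_personal, w_general, w_multiplier = weights
    personal = candidate.distance_km
    general = 1.0 / math.log(candidate.photo_count + 2)  # +2 avoids dividing by zero
    multiplier = target / candidate.size
    return (w_personal * personal + w_general * general
            + w_multiplier * multiplier_penalty(multiplier))

def best_landmark(candidates, target, weights=(1.0, 1.0, 1.0)):
    return min(candidates, key=lambda c: analogy_score(c, target, weights))

if __name__ == "__main__":
    # Made-up values for a 7-acre target; sizes are in acres.
    candidates = [
        Candidate("Red Square", 0.5, 5000, 3.5),
        Candidate("Tiny Plaza", 0.2, 50, 0.02),
        Candidate("Famous Far Park", 40.0, 100000, 3.0),
    ]
    print(best_landmark(candidates, target=7.0).name)
```

In this toy example the nearest landmark loses because its multiplier is in the hundreds, and the most famous one loses because it is far away. In practice the weights would need tuning, since distance in kilometers and the other two terms are on different scales.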
Visualizing the analogies
Finally, we visually present the best analogy for each measurement and location that Atlas of Me identifies in an article. But first, to preserve the intuition that rounded numbers are easier to grasp, we devise a rounding strategy that tries to round multipliers to the nearest integer. Then, we display the location of the user and the selected landmark. For distance analogies, we use Leaflet to generate the map. For area analogies, we present the landmark’s area and footprint polygon using d3.js. We use d3 because it lets us use a different projection method — the Lambert Azimuthal equal-area projection — that minimizes the distortion of areas on the globe.
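A rounding strategy along these lines might look like the Python sketch below. The 15% tolerance and the two-significant-figure fallback are our assumptions for illustration; the post only states that multipliers are rounded to the nearest integer where possible.

```python
import math

def round_multiplier(m, max_rel_error=0.15):
    """Round to the nearest integer when that changes the value only slightly.

    Otherwise fall back to two significant figures, so that small multipliers
    such as 0.0055 are not rounded down to zero.
    """
    if m <= 0:
        raise ValueError("multiplier must be positive")
    nearest = math.floor(m + 0.5)  # round half up, avoiding Python's banker's rounding
    if nearest >= 1 and abs(m - nearest) / m <= max_rel_error:
        return nearest
    return float(f"{m:.2g}")

def describe_area(measurement_label, multiplier, landmark_name):
    shown = round_multiplier(multiplier)
    return f"{measurement_label} is about {shown} times the area of {landmark_name}"

if __name__ == "__main__":
    print(describe_area("7 acres", 1.99, "the UW Red Square"))
    print(round_multiplier(1.5), round_multiplier(0.0055))
```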
Deeper understanding of personalized analogies
We have conducted initial user studies and found that people relate to measurements better using Atlas of Me. But we think there is much more to learn about personalized analogies! For example, comments that users have provided on what makes a helpful analogy lead us to believe that the trade-off between personal familiarity and the multiplier can be complex. We are also exploring whether individual differences can affect the best relative importance to assign to each term in our approach. Other next steps include integrating richer user models that incorporate information about a user’s preferences from their online behavior and digital traces, and developing a similar approach to create analogies for other common measurements (e.g., weights, heights, volumes) using a database of familiar objects.
This post was authored by Yea-Seul Kim and Jessica Hullman. Work is in collaboration with Maneesh Agrawala and Francis Nguyen.
[1] Caduff, D., & Timpf, S. (2008). On the assessment of landmark salience for human navigation. Cognitive processing, 9(4), 249–267.
[2] Campbell, J. I. (2005). Handbook of mathematical cognition. Psychology Press.
[3] Couclelis, H., Golledge, R. G., Gale, N., & Tobler, W. (1987). Exploring the anchor-point hypothesis of spatial cognition. Journal of Environmental Psychology, 7(2), 99–122.
[4] Feigenson, L., Dehaene, S., & Spelke, E. (2004). Core systems of number. Trends in cognitive sciences, 8(7), 307–314.
[5] Golledge, R. G. (1997). Spatial behavior: A geographic perspective. Guilford Press.
[6] Hansen, S., Richter, K. F., & Klippel, A. (2006). Landmarks in OpenLS: A data structure for cognitive ergonomic route directions. In Geographic information science (pp. 128–144). Springer Berlin Heidelberg.
[7] Lynch, K. (1960). The image of the city (Vol. 11). MIT press.
[8] Richter, K. F., & Klippel, A. (2004). A model for context-specific route directions. In Spatial cognition IV. Reasoning, action, interaction (pp. 58–78). Springer Berlin Heidelberg.
[9] Tversky, B. (2003). Structures of mental spaces: How people think about space. Environment and behavior, 35(1), 66–80.