Angels or Demons? Airbnb, open data, and the sharing economy.

Innovation will drive us forward.

But at what unseen cost? Web-based housing and lodging services like Airbnb, the $20 billion vacation rental website, catalyze market forces in our cities by improving access to market information. Airbnb has made the world more accessible than ever by allowing locals to share not only their homes but also their networks and local knowledge, while providing a modest supplementary income to those who make their spaces available.

However, there are negative impacts resulting from these services as well. They can affect sensitive market balances in places where housing is exceptionally unaffordable. By exposing a place’s local market dynamics to global willingness-to-pay, these services create new unforeseen pressures, adding to the challenges faced by residents concerned about the future of their homes. These services are, at once, angels and demons, opening the world to our keyboards and cellular devices while shifting established processes of market exchange in difficult and unexpected ways.

For those of us who do research and build technology in the pursuit of a better world, how do we keep up our work of studying the social impact of policy and services in this rapidly evolving digital space?

One method that is becoming both more popular and more feasible is data scraping. Data scraping is a technique in which data presented on a website in a human-readable format, but not necessarily offered as a structured, machine-readable download, is collected by a script. Once collected, the data can be cleaned and refined for analysis.
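To make the idea concrete, here is a minimal sketch of the parsing half of a scraper, written in Python with only the standard library. The HTML snippet, the `listing` class, and the `data-room-type` and `data-price` attributes are all hypothetical stand-ins invented for illustration; they are not Airbnb's actual page structure, and a real scraper would also need to fetch pages and respect the site's terms.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a listings page.
# This is NOT Airbnb's real HTML; the class and attribute names are assumptions.
SAMPLE_HTML = """
<ul>
  <li class="listing" data-room-type="Entire home/apt" data-price="150">Sunny loft</li>
  <li class="listing" data-room-type="Private room" data-price="65">Room near the square</li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects one dictionary per <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # the listing being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "listing":
            self._current = {
                "title": "",
                "room_type": attrs.get("data-room-type"),
                "price": float(attrs["data-price"]),
            }

    def handle_data(self, data):
        # Text inside the current listing element becomes its title.
        if self._current is not None:
            self._current["title"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            self.records.append(self._current)
            self._current = None


def scrape_listings(html):
    """Turn human-readable HTML into a list of structured records."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.records


records = scrape_listings(SAMPLE_HTML)
for record in records:
    print(record)
```

The result is a list of plain dictionaries, which can then be cleaned, written to a spreadsheet, or mapped.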

Last year at a public meeting in Somerville, I presented an analysis of Airbnb based on data scraped from the service’s website. The results showed nothing particularly polarizing. Based on estimates from the data, if Somerville introduced a city ordinance collecting something similar to a lodging tax on Airbnb listings, the city might collect a few hundred thousand dollars in additional annual revenue, depending on enforcement and the popularity of the service. My analysis showed that Somerville had the third-highest number of Airbnb postings in Metro Boston, after Cambridge and Boston, and it was clear that community members were interested in pursuing some form of taxation on the service. The information I presented grounded the city’s Airbnb conversation in data and helped residents and policy makers think through their approach. It was an opportunity to see how easily data, and scraped data in particular, can inform the public process as it grapples with the new technological forces that characterize our transitioning economy.
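The shape of such a revenue estimate can be sketched in a few lines of Python. This is an illustrative back-of-the-envelope formula, not the method used in the Somerville analysis, and every input in the example call is a placeholder rather than a figure from that analysis. The `compliance_rate` parameter is a hypothetical way to represent how much the result depends on enforcement.

```python
def estimate_annual_tax_revenue(num_listings, avg_nightly_rate,
                                avg_booked_nights, tax_rate,
                                compliance_rate=1.0):
    """Rough annual revenue from a lodging-style tax on short-term rentals.

    All arguments are assumptions supplied by the analyst:
      num_listings       number of active listings in the city
      avg_nightly_rate   average price per night, in dollars
      avg_booked_nights  average nights booked per listing per year
      tax_rate           tax as a fraction of the nightly price (0.05 = 5%)
      compliance_rate    fraction of taxable bookings actually collected on
    """
    taxable_spending = num_listings * avg_nightly_rate * avg_booked_nights
    return taxable_spending * tax_rate * compliance_rate


# Placeholder inputs chosen only to show the arithmetic.
example = estimate_annual_tax_revenue(
    num_listings=100,
    avg_nightly_rate=100.0,
    avg_booked_nights=50,
    tax_rate=0.05,
    compliance_rate=0.5,
)
print(f"Estimated annual revenue: ${example:,.0f}")
```

Varying the compliance rate and the number of booked nights shows how strongly the estimate swings with enforcement and with the popularity of the service.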

Data scraping is not a new technique, but it can be complex, both technically and, when there are terms attached to the data of interest, legally. There are times, however, when it is the only method available for collecting information critical to understanding a problem.

Scraping Airbnb for data is a perfect example. The question, “Is there a causal link between Airbnb and housing affordability?” cannot begin to be answered without data. Thankfully, Airbnb makes its data readily scrapable. The City of San Francisco’s Office of Budget and Legislative Analysis scraped data from Airbnb for an in-depth study on the impact of short-term rentals on affordable housing in San Francisco. The study investigates whether the service squeezes already limited options for affordable housing or provides a means of supplemental income for cash-strapped households.

The conclusion in San Francisco, much like the conclusion of my analysis in Somerville, is that it’s complicated. But both analyses provide metrics for communities to consider when thinking through policy interventions or regulations that could make Airbnb an asset for their community instead of a contentious divider. Whether the data is scraped or subpoenaed, there is a need for public access to information. Because a subpoena can be a time-consuming and difficult process, especially for smaller communities, data scraping may be one of our remaining options.

An excerpt from MAPC’s annual calendar, using scraped Airbnb data to map distribution of rentals and distinguish full unit rentals from private room rentals in Boston’s inner core.