Wait… Is It A Trap?

Simon Li
Data Mining the City
Dec 10, 2018

Datamining the City, Project B

Columbia University, GSAPP

Chengqi Tian, Lingyu Li, Xuantong Zhang, Zhengzhe Jia

Introduction

In our final paper, we are building a program that simulates the impact of a new Amazon headquarters on the surrounding area’s property values. Specifically, we predict the trend of property value change in Crystal City, Arlington, Virginia, focusing on the census tracts surrounding the construction site of the new headquarters. The simulation allows the user to manipulate the number of positions provided by the new headquarters and the year. The simulated result is the average predicted property value of the headquarters’ surrounding neighborhood.

Background

Amazon has announced that its second US headquarters will be split between the New York City borough of Queens and the Crystal City area of Arlington, Virginia. Amazon has promised 50,000 jobs and $5 billion of capital spending for the so-called HQ2, which will be split equally between the two chosen locations. Crystal City and its surrounding areas and Long Island City will each ultimately get more than 25,000 Amazon employees as part of the deal.

According to Jeff Bezos, “These two locations will allow us to attract world-class talent that will help us to continue inventing for customers for years to come. The team did a great job selecting these sites, and we look forward to becoming an even bigger part of these communities.”

The benefits start with the increase in jobs and taxes that comes with a large company and the economic boost provided to the many local businesses that support a corporate headquarters. A magnetic company like Amazon will also help the region transform into a tech hub by encouraging other firms to set up or invest in the area, according to the company. Building such an HQ can lead to higher annual wages, job growth, and salary growth.

However, the new HQ will also bring potential problems such as traffic congestion and a spike in rents and property prices. Although Amazon has encouraged its employees to take public transit rather than drive, the large number of employees will cause a lot of problems, especially during workdays and holidays. In terms of housing prices and rents, the new headquarters could double prices within years. This would displace lower-income residents in the neighborhoods and cause adverse social problems in the area.

Logic behind the simulation:

By reviewing articles on the determinants of housing price change, we identified four major determinants: housing demand, housing supply, the economy, and demographic factors. Housing demand includes factors such as population growth and household growth; economic factors include the inflation rate and interest rates; demographic factors include real income growth and other socioeconomic indicators. In our study, the entrance of the new headquarters will mainly affect the demand factors and demographic factors. Therefore, we focus on these two areas when discussing our research question and forming assumptions about the pattern of housing price growth before and after the construction of Amazon HQ2. Since we cannot control the economy and supply factors, we are not taking them into consideration.

The new Amazon headquarters will affect housing demand and demographic factors. The HQ will attract young people and employees to Crystal City for employment. The added jobs will also increase the real income of the city and reduce unemployment. Therefore, the entrance of Amazon will create higher demand for housing and more income for people to spend on housing. Specifically, there will be four stages by which the new HQ will affect the real estate market, as demonstrated in the diagram below. In stage 1, once Amazon has announced that it will build its new HQ in Crystal City, locals who have been considering buying will do so because they expect higher competition later on, homeowners who don’t need to sell immediately will hold their properties longer, and speculators will purchase properties because they expect Amazon’s arrival to boost their investments. All of these drive up housing demand and therefore cause housing prices to rise. In stage 2, once the HQ is constructed, HQ2 employees and millennials will come in for employment and opportunities. However, they will rent at the beginning to test whether they like the city. In stage 3, these millennials have enough purchasing power and are ready to make an investment. In stage 4, growth in the inner tracts close to the HQ will slow down and spread to the entire DC metro area.

As shown above, the housing price index over time as Mercedes-Benz entered Atlanta follows a logistic pattern.

Based on the above discussion, we conclude that housing price growth will follow a logistic pattern: a slower growth rate before the announcement, rapid growth right before and after construction, and a rate that slows down and eventually flattens out in the years after construction.

Assumption: The housing price will follow a logistic growth pattern before and after the construction of the headquarters.
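To make the assumed shape concrete, here is a minimal sketch of a generic logistic curve. The floor, ceiling, midpoint, and steepness values are illustrative placeholders and are not estimated from our data; the function name is our own.

```python
import math

def logistic_price(t, floor, ceiling, midpoint, steepness):
    """Price at time t on a logistic curve rising from floor to ceiling."""
    return floor + (ceiling - floor) / (1.0 + math.exp(-steepness * (t - midpoint)))

# Hypothetical parameters, chosen only to show the shape of the curve.
FLOOR, CEILING, MIDPOINT, STEEPNESS = 600000.0, 900000.0, 5.0, 1.0

# Price for years 0..10 and the gain from each year to the next.
prices = [logistic_price(t, FLOOR, CEILING, MIDPOINT, STEEPNESS) for t in range(11)]
yearly_gain = [later - earlier for earlier, later in zip(prices, prices[1:])]
```

The yearly gains are small at both ends of the range and largest around the midpoint, which is the slow, rapid, then flat pattern described in the assumption.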

Methodology

To predict the median housing price of the regions surrounding Amazon HQ2, we study the housing price growth patterns that followed other large companies’ entrance into a city. Using the data from those other company HQs, we build a model that can be used to predict Amazon’s impact on Crystal City’s median housing price over time. In particular, we collect the median housing value of neighborhoods in six cities where a large public company opened a new headquarters in the past. We collect data for the years 2011 to 2016.

For each year, we collect the data listed below:

  • The median housing value
  • Time and Location of the HQ built
  • The change in housing value from the previous year to the current year, expressed as a ratio

Growth Ratio = Value 2012 / Value 2011 (a ratio of 1.03 corresponds to a 3% increase)
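As a small illustration of this step, the sketch below computes the year-over-year ratio for one census tract. The median values in the sample are made up for the example, and the function names are our own.

```python
def growth_ratios(values_by_year):
    """Return {year: value / previous year's value} for consecutive years."""
    years = sorted(values_by_year)
    ratios = {}
    for prev, curr in zip(years, years[1:]):
        if curr == prev + 1:  # skip gaps in the series
            ratios[curr] = values_by_year[curr] / float(values_by_year[prev])
    return ratios

def percent_change(ratio):
    """Convert a growth ratio such as 1.025 into a percent change such as 2.5."""
    return (ratio - 1.0) * 100.0

# Made-up median housing values for one tract, for illustration only.
sample = {2011: 400000, 2012: 410000, 2013: 430500}
ratios = growth_ratios(sample)
```

Keeping the ratio rather than the percent change makes it easy to compound several years by multiplication.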

Existing Data

Sites:

  • Google Mountain View Office, 2004, 26,000 employees
  • Nike World Headquarters in Oregon, 2005, 8,000 employees
  • Lenovo Morrisville Office, 2006, 1,450 employees
  • PayPal La Vista Office, 2007, 18,000 employees
  • Microsoft in Redmond, Washington, 2009, 30,000 employees
  • Google Kirkland Office, 2013, 3,000 employees
  • Facebook in Menlo Park, California, 2015, 2,800 employees

We first searched the Internet for other companies similar to Amazon. We selected the census tracts around these companies’ sites as our research objects and looked up the median housing price of each census tract in each of the six years after the company entered. Dividing each year’s housing price by the previous year’s price gives the housing price growth rate. Taking company size as the independent variable and growth rate as the dependent variable, we imported the data into R and obtained a growth rate estimation model for each year.

Model:

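Here X is the company size (number of employees) and Y is the housing price growth rate (the current year’s price divided by the previous year’s price).

After one year: Y = (-5.92) * 10^(-7) * X + 1.026

After two years: Y = 5.117 * 10^(-7) * X + 1.003

After three years: Y = 4.783 * 10^(-7) * X + 1.016

After four years: Y = (-1.067) * 10^(-7) * X + 1.025

After five years: Y = (-4.899) * 10^(-7) * X + 1.093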
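The sketch below shows one way to evaluate these five models and compound them into a projected value. The coefficients are copied from the equations above; the function names are our own, and the 40,000-employee input is the hqScale value from the code listing at the end of this post.

```python
# (slope, intercept) of Y = slope * X + intercept for years 1 to 5,
# copied from the fitted models above.
MODELS = [
    (-5.92e-7, 1.026),
    (5.117e-7, 1.003),
    (4.783e-7, 1.016),
    (-1.067e-7, 1.025),
    (-4.899e-7, 1.093),
]

def growth_ratio(year, employees):
    """Year-over-year price ratio predicted for the given year (1 to 5)."""
    slope, intercept = MODELS[year - 1]
    return slope * employees + intercept

def projected_value(value_now, years_from_now, employees):
    """Compound the yearly ratios from year 1 up to years_from_now."""
    value = value_now
    for year in range(1, years_from_now + 1):
        value *= growth_ratio(year, employees)
    return value

# Yearly ratios for an HQ of 40,000 employees.
ratios = [growth_ratio(y, 40000) for y in range(1, 6)]
```

For 40,000 employees the ratios come out at roughly 1.002, 1.023, 1.035, 1.021, and 1.073. The listing at the end hard-codes slightly rounded versions of the same terms, so its results can differ by a small amount.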

Animations or GIFs of the simulation:

The program allows us to select a “Census Tract” and “Years From Now”, and gives a prediction of the median housing value based on the selections, as shown in the GIF below.

Housing Price Simulation Around Crystal City

Code Highlights:

Code part 1

In this function, we determine where the mouse is. If the mouse is over a census tract label, the overSelection value will be the tract ID; if the mouse is over a year button, the overSelection value will be the year chosen.
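The same idea can be sketched in plain Python, outside of Processing. The two labels below use coordinates from the listing at the end of this post scaled by 0.6; the constant and function names are our own, and this version returns a value instead of setting a global.

```python
# Tract labels as (tract_id, x, y), with (x, y) the bottom-left corner of a
# 65 x 15 px label box. Only two of the seven tracts are listed here.
LABELS = [(103402, 285.0, 420.0), (103501, 60.0, 240.0)]
LABEL_W, LABEL_H = 65, 15

# Year buttons: five 40 px squares in a row, 50 px apart.
BUTTON_X0, BUTTON_Y0, BUTTON_SIZE, BUTTON_STEP = 630, 650, 40, 50

def over_selection(x, y):
    """Return the tract ID or year (1 to 5) under (x, y), or 0 if neither."""
    for tract_id, lx, ly in LABELS:
        if lx <= x <= lx + LABEL_W and ly - LABEL_H <= y <= ly:
            return tract_id
    if BUTTON_Y0 < y < BUTTON_Y0 + BUTTON_SIZE:
        for year in range(1, 6):
            left = BUTTON_X0 + (year - 1) * BUTTON_STEP
            if left < x < left + BUTTON_SIZE:
                return year
    return 0
```

Computing the button bounds from the start position and step avoids writing out five separate x ranges.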

Code part 2

In this part, once overSelection has a value, we determine whether the value represents a tract or a year. Since the tract IDs are six-digit numbers and the year chosen is a single-digit number, we use the size of the value to decide where it goes. If the overSelection value is less than 100, it represents the year chosen and is assigned to the yearChosen variable; if it is higher, it represents the tract and is assigned to the tractChosen variable.
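A minimal sketch of this rule as a pure function, assuming the same thresholds as the listing at the end of this post (above 600 for tracts, between 0 and 100 for years). The function name is our own, and it returns the new selection instead of updating globals.

```python
def apply_click(over_selection, tract_chosen):
    """Return the (tract, year) selection after a click."""
    year_chosen = 0                 # every click clears the previous year
    if over_selection > 600:        # six-digit tract IDs
        tract_chosen = over_selection
    if 0 < over_selection < 100:    # year buttons 1 to 5
        year_chosen = over_selection
    return tract_chosen, year_chosen
```

For example, apply_click(103402, 0) selects the tract, and a following apply_click(3, 103402) keeps the tract and sets the year to 3.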

Code part 3

In this part, we use the year selected by the mouse click, which represents the number of years after the construction of the headquarters, to trigger the if statement. For example, if the year chosen is 1, the function will return the housing value of the chosen census tract and the average housing value after one year.

Results:

Learnings and Conclusions:

Our assumption was that median housing value growth roughly follows a logistic pattern, with slow growth at the beginning, rapid growth in the middle, and no growth at the end. However, we did not specify a time frame for the assumed pattern. Our actual simulated curve indicates a roughly exponential growth pattern of median housing value. This indicates that the impact of the new headquarters will not flatten out within five years of the announcement. As shown below, the growth rate is 0.2% from 2018 to Year 1, 2.4% from Year 1 to Year 2, 3.5% from Year 2 to Year 3, 2% from Year 3 to Year 4, and 7% from Year 4 to Year 5. That is, the five years from 2018 to 2023 can be the rapid growth period (acceleration period) in our assumed pattern. After this period of rapid growth, the impact of the new Amazon HQ could start to weaken and eventually flatten out.

According to some reports, the median housing value has surged 12.5% so far this year. This growth far exceeds the prediction from our model: our simulated results show an average growth of around 3 to 4% per year. Such a gap signals that housing values could collapse later on. That is, high expectations from investors and locals could drive prices to a point well beyond their actual worth. This poses a lot of risk for a property investor in the Crystal City area. When the buzz for real estate fades, the median housing value could start to decline. Investors who see the price reach a point where they can make enough money may therefore start to sell their properties.
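As a quick arithmetic check, the sketch below averages the five yearly growth rates reported above and compares the result with the 12.5% figure. The variable names are our own.

```python
# Simulated year-over-year growth rates in percent, Year 1 to Year 5.
simulated_rates = [0.2, 2.4, 3.5, 2.0, 7.0]
reported_rate = 12.5  # reported surge so far this year, in percent

# Simple average of the five yearly rates.
mean_rate = sum(simulated_rates) / len(simulated_rates)

# Total growth over five years when the yearly rates are compounded.
cumulative = 1.0
for rate in simulated_rates:
    cumulative *= 1.0 + rate / 100.0

# How many times larger the reported rate is than the simulated average.
gap = reported_rate / mean_rate
```

The average works out to about 3% per year and the compounded five-year growth to roughly 16%, so the reported one-year surge is about four times the simulated yearly average.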

Recommendations and application in practice

The simulation will be useful for property investors, local government, and the company itself. Investors can use it to time the purchase and sale of properties: they can purchase at a low point and sell after the property price has finished accelerating. Using this simulation, investors can avoid holding a property for too long and sell before the price starts to decline. It gives them something to refer to when they buy or sell a property.

For the company, the simulation can be used as a guide when making offers to cities. For the city government, the simulation can be used to predict the future gain in tax income from the construction of the HQ. That is, the city knows what to expect from the new HQ in terms of its impact on housing prices and property tax revenue.

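import csv

sizeControl = 0.6      # scale applied to the base map and the label coordinates
a = 1030 * sizeControl # displayed map width
b = 1213 * sizeControl # displayed map height
tractChosen = 0        # ID of the selected census tract (0 = none)
xCor = [475, 100, 300, 400, 300, 300, 400]  # label x positions before scaling
yCor = [700, 400, 500, 580, 900, 700, 640]  # label y positions before scaling
tractList = [103402, 103501, 103502, 103503, 103700, 103601, 103602]
yearChosen = 0         # selected "Years From Now" (1 to 5, 0 = none)
overSelection = 0      # tract ID or year button currently under the mouse
futurePrice = 0
priceChange = 0
hqScale = 40000        # HQ size in employees; the yearly factors in draw() are
                       # approximately the models evaluated at this value
tractChosen2 = 0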
def setup():
    size(880, 728)
    global crystalCity
    crystalCity = loadImage("Map1.jpg")


def draw():
    background(220)
    image(crystalCity, 0, 0, a, b)

    noStroke()
    textSize(15)

    # Draw a white label box with the tract ID in red for each census tract.
    for i in range(7):
        m = xCor[i]
        n = yCor[i]
        tract = tractList[i]
        fill(255)
        rect((m - 5) * sizeControl, (n - 15) * sizeControl, 65, 15)
        fill(255, 0, 0)
        text(tract, m * sizeControl, n * sizeControl)

    # Draw the five "Years From Now" buttons, labelled 1 to 5.
    m = 630
    n = 650
    w = 1
    for i in range(5):
        fill(63, 68, 71)
        rect(m, n, 40, 40)
        fill(200)
        text(w, m + 15, n + 25)
        w += 1
        m += 50

    # Record which tract label or year button the mouse is currently over.
    overButton(mouseX, mouseY)

    textSize(14)
    fill(19, 57, 119)
    text("Census Tract Selection:", 630, 50)
    if tractChosen > 1:
        text(tractChosen, 630, 90)
    else:
        text("No Selection", 630, 90)

    text("Median Housing Value 2018:", 630, 180)

    # 2018 median housing value for the selected census tract.
    houseValue = 0
    if tractChosen == 103402:
        houseValue = 434000
    elif tractChosen == 103501:
        houseValue = 348500
    elif tractChosen == 103502:
        houseValue = 620100
    elif tractChosen == 103503:
        houseValue = 588100
    elif tractChosen == 103601:
        houseValue = 775600
    elif tractChosen == 103602:
        houseValue = 775600
    elif tractChosen == 103700:
        houseValue = 829700

    averageValue = 624514  # average of the seven tract values above

    if tractChosen > 1:
        text("$", 630, 220)
        text(houseValue, 638, 220)
    else:
        text("", 630, 220)

    # Year-over-year growth ratios for years 1 to 5 (intercept plus size term).
    yearFactors = [1.026 - 0.02368, 1.003 + 0.021, 1.016 + 0.0194,
                   1.025 - 0.0042, 1.093 - 0.0196]

    if 1 <= yearChosen <= 5:
        # Compound the yearly ratios up to the chosen year.
        growth = 1.0
        for factor in yearFactors[:yearChosen]:
            growth *= factor

        text("Median Housing Value in Year:", 630, 310)
        fill(230, 83, 0)
        text(yearChosen, 803, 310)
        fill(19, 57, 119)
        text("$", 630, 350)
        houseValue1 = houseValue * growth
        text(houseValue1, 640, 350)
        priceChange = houseValue1 - houseValue
        text("Median Housing Value Change:", 630, 420)
        text("$", 630, 450)
        text(priceChange, 640, 450)
        averageHV = averageValue * growth
        text("Average Housing Value:", 630, 510)
        text("$", 630, 540)
        text(averageHV, 640, 540)

    text("Years From Now:", 630, 630)


def mousePressed():
    global tractChosen, yearChosen
    # Every click clears the year; the tract selection is kept.
    yearChosen = 0
    if overSelection > 600:
        # Tract IDs are six-digit numbers.
        tractChosen = overSelection
    if overSelection > 0 and overSelection < 100:
        # Year buttons are numbered 1 to 5.
        yearChosen = overSelection

def overButton(x, y):
    global overSelection
    overSelection = 0
    # Tract labels: 65 x 15 px hit box above the scaled label position.
    for i in range(7):
        m = xCor[i] * sizeControl
        n = yCor[i] * sizeControl
        if m <= x <= m + 65 and n - 15 <= y <= n:
            overSelection = tractList[i]

    # Year buttons: 40 px squares starting at x = 630, spaced 50 px apart.
    if y > 650 and y < 690:
        if x > 630 and x < 670:
            overSelection = 1
        elif x > 680 and x < 720:
            overSelection = 2
        elif x > 730 and x < 770:
            overSelection = 3
        elif x > 780 and x < 820:
            overSelection = 4
        elif x > 830 and x < 870:
            overSelection = 5
