Neighbourhood Construction Simulator

Simon Li
Data Mining the City
Oct 17, 2018

Data Mining the City Section A Final Project

Group Members: Chengqi Tian, Lingyu Li, Xuantong Zhang, Zhengzhe Jia

Introduction:

Crown Heights, Brooklyn, New York

In this project, we pick a neighbourhood in NYC and build a simulator around it that predicts real property prices from the availability of public amenities such as parks, hospitals, schools, and public transportation infrastructure. The program lets users build new infrastructure and remove or change the function of existing facilities, and it then predicts the resulting change in property value.

To build the program, we have to determine how property values in this community fluctuate as a function of factors such as the availability of certain amenities and transportation facilities. To accomplish this, several of our group members are responsible for researching and establishing a mathematical model. After establishing the model, we focus on creating the environment and agents by visualizing the neighbourhood, the buildings, and the amenities with graphics and pictures. Next, we define the behaviours and parameters by adding interactive features to the program. The inputs and outputs of the program are dictated by that mathematical model.

Overall, the program will allow urban planners, developers, real estate investors, and policy makers to make better decisions and answer the questions below:

Will different changes in land use increase or decrease the overall housing price in a neighbourhood?

What characteristics of a neighbourhood or a city make it a better place to invest in?

Model

Agents: Residential property units. In our study, we only consider condo properties.

Environment: The neighbourhood of Crown Heights, located in the central portion of the New York City borough of Brooklyn.

Behavior: The behavior of our model is a set of rules that define how property prices change when parks, metro stations, and hospitals are added, removed, or moved. These rules are governed by the mathematical model shown below, which calculates housing prices from the distances to nearby parks, metro stations, and hospitals.

Parameters: Our agents (condo properties) have only one parameter: the price of the property.

Input: The parameters that globally drive our model are the number and location of nearby parks, metro stations, and hospitals.

Output: The outcomes of running our model are the total residential area, the housing price per square foot, and the total real estate value of the neighbourhood.

Base Frame

Constructing the mathematical model

We searched StreetEasy for real estate price data in Crown Heights and selected 20 housing prices from the last month as our sample. Because the sample was relatively small, we manually removed some values that differed greatly from the average housing price, to protect the accuracy of the model. We then processed the sample. First, we calculated its average value, which was 970. Based on observation, we divided the prices into $50 intervals and assigned each sample to an interval. With the price sample processed, we selected the parks, hospitals, and transportation facilities within the community as the variables for this study. We calculated the distance from each property to the parks, hospitals, and transportation facilities, and divided these distances into 0.4-mile intervals. This gave each property a distance score for parks, hospitals, and transportation facilities.

Next, we imported the processed data into SPSS. Of the models we tried, the multiple linear regression model had the greatest significance, so we obtained our model by multiple linear regression. The R-squared of the model passed the test, so we consider the model valid.
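The distance and scoring step can be sketched in Python as follows. This is a minimal illustration, not the procedure we actually ran: the haversine formula, the helper names `haversine_miles` and `interval_score`, and the 1-based scoring rule are all assumptions made for the example. The two coordinates are the Brower Park and St. John's Park positions from the marker comments in the simulator code.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def interval_score(value, width):
    """1-based index of the interval of the given width containing value.

    Assumed scoring rule: [0, width) -> 1, [width, 2 * width) -> 2, ...
    """
    return int(value // width) + 1


# Coordinates taken from the marker comments in the simulator code.
brower_park = (40.674118, -73.943220)
st_johns_park = (40.674404, -73.934739)

distance = haversine_miles(brower_park[0], brower_park[1],
                           st_johns_park[0], st_johns_park[1])
distance_score = interval_score(distance, 0.4)  # 0.4-mile intervals
price_score = interval_score(970, 50)           # $50 intervals
```

The same `interval_score` helper serves both the $50 price intervals and the 0.4-mile distance intervals, since both are plain fixed-width binning.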

Model

Y = -0.83X1 + 0.375X2 + 0.105X3 - 6.359

Y= housing price

X1= distance from the hospital

X2= distance from transit

X3= distance from park
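Our coefficients came out of SPSS. For readers without SPSS, the sketch below shows what a multiple linear regression fit does, using ordinary least squares solved through the normal equations. The input rows are synthetic distance scores, not our sample, and the targets are generated from the equation above, so a correct fit should simply recover the published coefficients. The function name `fit_linear_regression` is hypothetical.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares: returns [intercept, b1, b2, ...]."""
    # Prepend 1.0 to every row so the first coefficient is the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(design[0])

    # Build the augmented normal-equation matrix [X^T X | X^T y].
    aug = []
    for i in range(n):
        line = [sum(d[i] * d[j] for d in design) for j in range(n)]
        line.append(sum(d[i] * y for d, y in zip(design, targets)))
        aug.append(line)

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] for i in range(n)]


# Synthetic (hospital, transit, park) distance scores, for illustration only.
scores = [(1, 1, 1), (2, 1, 3), (3, 2, 1), (1, 3, 2), (2, 3, 3), (3, 1, 2)]
# Targets generated from the article's equation, so the fit is noise-free.
prices = [-0.83 * h + 0.375 * t + 0.105 * p - 6.359 for h, t, p in scores]

intercept, b_hospital, b_transit, b_park = fit_linear_regression(scores, prices)
```

With real data the targets would be the observed price scores, and the fit would also need a goodness-of-fit check such as the R-squared that SPSS reports.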

Learnings

Parks alone: keeping the current number of parks is the best option.

Hospitals alone: 2 more hospitals benefit the neighbourhood most, resulting in a $35 million increase in value.

Subway stations alone: 4 more stations benefit the neighbourhood most, resulting in a $131 million increase in value.

Together, building 2 more hospitals and 4 more subway stations is the best option for the neighborhood. It results in a $164 million increase in overall property value.
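These figures can be reproduced outside Processing with a few lines of plain Python. The sketch below copies the price formula, the baseline counts (4 hospitals, 7 stations, 5 parks), and the floor-area deductions from the simulator code at the end of this article. The function names are hypothetical, and the value change is computed against the baseline directly rather than with the hard-coded offsets used in the simulator.

```python
BASE_HOSPITALS = 4
BASE_STATIONS = 7
BASE_PARKS = 5
BASE_RESIDENTIAL_SQFT = 1333000


def price_per_sqft(hospitals, stations, parks):
    """Housing price per square foot, same formula as the simulator."""
    return (0.83 * (-2 * hospitals ** 2 + 25 * hospitals)
            + 0.375 * (-3 * stations ** 2 + 68 * stations)
            + 0.105 * (-5 * parks ** 2 + 96 * parks)) * 5 - 109


def total_value_billion(extra_hospitals=0, extra_stations=0, extra_parks=0):
    """Total real estate value in billions of dollars."""
    # Each new facility takes residential floor area away, as in the simulator.
    sqft = (BASE_RESIDENTIAL_SQFT
            - extra_parks * 50000
            - extra_hospitals * 10000
            - extra_stations * 2000)
    price = price_per_sqft(BASE_HOSPITALS + extra_hospitals,
                           BASE_STATIONS + extra_stations,
                           BASE_PARKS + extra_parks)
    return price * sqft / 1e9


def value_change_million(**extra):
    """Change in total value versus the current neighbourhood, in millions."""
    return (total_value_billion(**extra) - total_value_billion()) * 1000


hospital_gain = value_change_million(extra_hospitals=2)
station_gain = value_change_million(extra_stations=4)
combined_gain = value_change_million(extra_hospitals=2, extra_stations=4)
park_gain = value_change_million(extra_parks=1)
```

Because each facility both raises the price per square foot (up to a point) and removes residential area, the gains are not additive, which is why the combined figure is slightly below the sum of the two individual ones.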

The simulation can help to answer the questions we proposed. Specifically, we are trying to find out how changes in land use affect the overall housing price in a neighborhood. The simulation will benefit different parties including urban planners, policy makers, investors, and developers.

1. For urban planners and policymakers

First of all, the simulation can help urban planners analyze a variety of issues and social implications involved in the construction of amenities and transportation facilities, such as the displacement of lower income tenants and landlords, fluctuation in crime rate, and segregation in terms of income and race. With this simulation in place, urban planners can gain a deeper understanding of these changes when they are making decisions regarding whether and where to build new amenities.

Secondly, a change in property values also implies a change in the property taxes collected. This is important for policy makers and planners. When they decide whether to construct a new metro station or a new public amenity, they can consider whether the future increase in property tax revenue can compensate for the construction costs of the new station or park and, if so, how many years it would take to recover those costs. Although the benefits of urban amenities go well beyond the increase in tax revenue, policy makers can at least get some sense of the financial implications and viability of constructing new urban amenities and transportation facilities.
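As a rough illustration of that payback calculation, here is a minimal Python sketch. The function name `payback_years` is hypothetical, and the tax rate and construction cost in the example are placeholder numbers, not researched figures; only the $35 million value increase comes from our simulation.

```python
def payback_years(construction_cost, value_increase, tax_rate):
    """Years of extra property tax needed to cover the construction cost.

    Returns None when the project generates no extra tax revenue.
    """
    annual_extra_tax = value_increase * tax_rate
    if annual_extra_tax <= 0:
        return None
    return construction_cost / annual_extra_tax


# $35 million value increase from the simulation; the 1% effective tax rate
# and the $7 million construction cost are hypothetical placeholders.
years = payback_years(construction_cost=7e6, value_increase=35e6, tax_rate=0.01)
```

This ignores discounting, operating costs, and assessment lag, so it should be read as a first-pass screen rather than a financial model.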

2. For investors and developers

A growing body of research indicates that people tend to pay more to have access to urban amenities. Parks, hospitals, and transit are amenities that make for a rich urban experience. A package of urban amenities can increase the desirability of, and competition for, real estate in an area, which in turn increases the rent that property owners can achieve. Higher achievable rents allow developers to consider larger-scale, higher-quality real estate projects. The increased investment activity results in an upward cycle of even more desirability and investment. It is therefore important for real estate developers to consider how public and private investment can change the desirability of an area, and what location and number of amenities added to a neighbourhood would maximize its property values.

Limitations and Future Implications

Because of time and funding constraints, we limited our research and simulation to three specific types of urban amenities: parks, metro stations, and hospitals. This is in no way a comprehensive simulation of the impact of amenities on property value. Urban amenities are often not present in isolation; areas often have several different amenities that work together to improve the desirability of nearby properties. A well-functioning package of amenities may have more impact than the sum of its parts. For example, sidewalks, street trees, parks, and transit can all be present in an urban neighborhood and work together to improve surrounding property values. This synergy among many different amenities is difficult for us to measure in this project because of those time and money constraints.

However, if we had ample funding, plenty of time, and the group of experts needed to make our simulation great, we would study in detail exactly how urban amenities work in synergy to improve the values of nearby properties. The simulation could cover all urban areas in the United States to give people a comprehensive understanding. The resulting simulation could be used by urban planners, developers, investors, and policymakers to make better and faster decisions and to create a positive upward cycle of investment in many neighborhoods in the long run. It would give different parties a clear sense of where to build amenities or start new projects to create the most value for surrounding areas.

In addition, large datasets and scholarly articles are difficult for people outside the field of urban planning to comprehend. For home buyers, our simulation makes the relation between nearby amenities and property values easier to understand, so they can make better home-buying decisions. Scholars who want to delve deeper into the study of property value change and nearby amenities do not have to go through the hassle of studying the dataset with statistical tools. The simulation therefore gives people access to data that was previously out of reach for those who did not know where to obtain the data or how to use it.

import spatialpixel.mapping.slippymapper as slippymapper
import csv
# global check, drawColor, drawSize,
check = 0
drawColor = color(0)
drawSize = 0
drawButton = 10
list0x = []
list0y = []
list1x = []
list1y = []
list2x = []
list2y = []
#context of value determination
tra = 0
hos = 0
par = 0
totalH = 1333000
ttra = 7
thos = 4
tpar = 5
priceS = 976.4
priceT = priceS*totalH/1000000000
def setup():
    size(1400, 800, P2D)
    global nyc
    nyc = slippymapper.SlippyMapper(40.671610, -73.943295, 15, 'carto-lightall', 1400, 600)

    # corner points of the Crown Heights boundary in screen coordinates
    global x1, y1, x2, y2, x3, y3, x4, y4, x5, y5, x6, y6, x7, y7, x8, y8, park_x, park_y
    x1 = nyc.lonToX(-73.964398)
    y1 = nyc.latToY(40.681063)
    x2 = nyc.lonToX(-73.952268)
    y2 = nyc.latToY(40.678490)
    x3 = nyc.lonToX(-73.919259)
    y3 = nyc.latToY(40.676696)
    x4 = nyc.lonToX(-73.920016)
    y4 = nyc.latToY(40.668301)
    x5 = nyc.lonToX(-73.930781)
    y5 = nyc.latToY(40.663612)
    x6 = nyc.lonToX(-73.933094)
    y6 = nyc.latToY(40.663529)
    x7 = nyc.lonToX(-73.945461)
    y7 = nyc.latToY(40.664224)
    x8 = nyc.lonToX(-73.960960)
    y8 = nyc.latToY(40.663289)
    park_x = nyc.lonToX(-73.943220)
    # latitude must go through latToY (the original called lonToX here)
    park_y = nyc.latToY(40.674118)
    # nyc.addMarker(40.674118, -73.943220, "Brower Park")
    # nyc.addMarker(40.674404, -73.934739, "St. John's Park")
    # nyc.addMarker(40.666704, -73.927212, "Lincoln Terrace / Arthur S. Somers Park")
    station = loadImage("station.png")
    hospital = loadImage("hospital.png")

    # existing subway station entrances
    with open('StationEntrances.csv') as f:
        reader = csv.reader(f)
        header = next(reader)  # skip the header row
        for row in reader:
            latitude = float(row[3])
            longitude = float(row[4])
            nyc.addMarker(latitude, longitude, station)

    # existing hospitals
    with open('HHC.csv') as f:
        reader = csv.reader(f)
        header = next(reader)  # skip the header row
        for row in reader:
            latitude = float(row[2])
            longitude = float(row[3])
            nyc.addMarker(latitude, longitude, hospital)
    nyc.render()

    #buttons information
    global parkX, parkY, hosX, hosY, resX, resY, subX, subY, buttonSize
    buttonSize = 100
    resY = hosY = subY = parkY = 650
    resX = 150
    parkX = 350
    hosX = 550
    subX = 750


def draw():
    background(255)
    nyc.draw()
    worldSetup()

    overButton(mouseX, mouseY)
    if check > 0:
        # newObject(mouseX, mouseY)
        # preview of the selected facility under the cursor
        fill(drawColor)
        noStroke()
        if drawButton == 1:
            rect(mouseX, mouseY, drawSize, drawSize)
        elif drawButton == 2 or drawButton == 0:
            ellipseMode(CENTER)
            ellipse(mouseX, mouseY, drawSize, drawSize)
        loadTest(check, mouseX, mouseY, drawButton)
    # print(list0x)

def worldSetup():
    # draw boundary
    stroke(255,0,0)
    strokeWeight(2)
    line(x1, y1, x2, y2)
    line(x2, y2, x3, y3)
    line(x3, y3, x4, y4)
    line(x4, y4, x5, y5)
    line(x5, y5, x6, y6)
    line(x6, y6, x7, y7)
    line(x7, y7, x8, y8)
    line(x8, y8, x1, y1)

    # text
    fill(190)
    stroke(190)
    rect(50, 525, 220,30)
    textSize(30)
    fill(0)
    text("Crown Heights", 50, 550)
    textSize(20)
    fill(58, 92, 28)
    text("Brower Park", 670, 245)

    #Control panel
    fill(174, 179, 186)
    rect(0, 600, 1000, 200)
    #park button
    fill(66, 206, 45)
    rect(parkX, parkY, buttonSize, buttonSize, 7)
    textSize(18)
    fill(20)
    text("Park", 380, 700)
    #reset button
    fill(206, 190, 10)
    rect(resX, resY, buttonSize, buttonSize, 7)
    fill(20)
    text("Reset", 175, 700)
    #hospital button
    fill(206, 47, 45)
    rect(hosX, hosY, buttonSize, buttonSize, 7)
    fill(20)
    text("Hospital", 565, 700)
    #subway station button
    fill(9, 134, 206)
    rect(subX, subY, buttonSize, buttonSize, 7)
    fill(20)
    text("Subway", 766, 700)

    #results panel
    fill(250)
    stroke(189, 206, 9)
    rect(1000, 600, 400, 200)
    fill(0)
    text("Results:", 1020, 635)

    global hos, par, tra, totalH, thos, tpar, ttra, priceS, priceT
    text("Total Residential Area:", 1020, 670)
    text(totalH, 1220, 670)
    text("(Sqft)", 1330, 670)

    text("Housing Price per sqft:", 1020, 705)
    text(priceS, 1215, 705)
    text("($)", 1330, 705)

    text("Total Real Estate Value:", 1020, 740)
    text(priceT, 1230, 740)
    text("(Billion $)", 1300, 740)

    # change relative to the baseline total value, in millions of dollars
    priceC = (priceT - 1.3015) * 1000 - 0.1075
    fill(204, 30, 56)
    text("Value Change:", 1020, 775)
    text(priceC, 1150, 775)
    text("(Million $)", 1300, 775)

    #introduce value dependent
    par = len(list0x)
    hos = len(list2x)
    tra = len(list1x)
    # each new facility takes residential floor area away
    totalH = 1333000 - par*50000 - hos*10000 - tra*2000

    #calculate the housing price
    ttra = 7 + tra
    tpar = 5 + par
    thos = 4 + hos
    priceS = (0.83*(-thos*thos*2 + 25*thos) + 0.375*(-3*ttra*ttra + 68*ttra) + 0.105*(-5*tpar*tpar + 96*tpar))*5 - 109

    #calculate the total value
    priceT = priceS*totalH/1000000000

    # load previous constructions
    # parks
    for i in range(len(list0x)):
        x = list0x[i]
        y = list0y[i]
        fill(66, 206, 45, 100)
        noStroke()
        ellipseMode(CENTER)
        ellipse(x, y, 50, 50)
    # subway stations
    for i in range(len(list1x)):
        x = list1x[i]
        y = list1y[i]
        fill(9, 134, 206, 100)
        noStroke()
        rect(x, y, 20, 20)
    # hospitals
    for i in range(len(list2x)):
        x = list2x[i]
        y = list2y[i]
        fill(206, 47, 45, 100)
        noStroke()
        ellipseMode(CENTER)
        ellipse(x, y, 30, 30)

def mouseClicked():
    global drawColor, drawSize, check, drawButton
    check = 1
    print(check)
    # clicking a facility button selects what the next map click will place
    if overSelection == 0:
        drawColor = color(66, 206, 45, 100)
        drawSize = 50
        drawButton = 0
    if overSelection == 2:
        drawColor = color(206, 47, 45, 100)
        drawSize = 30
        drawButton = 2
    if overSelection == 1:
        drawColor = color(9, 134, 206, 100)
        drawSize = 20
        drawButton = 1
    # clicking outside the buttons places the selected facility
    if overSelection == 3:
        check = 2
    # clicking the reset button clears all placed facilities
    if overSelection == 4:
        check = 3

def overButton(x, y):
    global overSelection, drawColor
    #select park
    if parkX < x < parkX + buttonSize and parkY < y < parkY + buttonSize:
        overSelection = 0
    #select Subway
    elif subX < x < subX + buttonSize and subY < y < subY + buttonSize:
        overSelection = 1
    #select hospital
    elif hosX < x < hosX + buttonSize and hosY < y < hosY + buttonSize:
        overSelection = 2
    #select clear
    elif resX < x < resX + buttonSize and resY < y < resY + buttonSize:
        overSelection = 4
    #select other
    else:
        overSelection = 3
    # print(overSelection)

# def newObject(x,y):
#     global drawColor
#     fill(drawColor)
#     if overSelection == 1:
#         rect(x,y, drawSize,drawSize)
#     elif overSelection == 2 or overSelection == 0:
#         ellipseMode(CORNER)
#         ellipse(x,y, drawSize, drawSize)

def loadTest(m, x, y, t):
    global list0x, list0y, list1x, list1y, list2x, list2y, check
    # m == 2: place a new facility of type t at (x, y)
    if m == 2:
        if t == 1:
            list1x.append(x)
            list1y.append(y)
        elif t == 2:
            list2x.append(x)
            list2y.append(y)
        elif t == 0:
            list0x.append(x)
            list0y.append(y)
        check = 1
    # m == 3: reset, remove every facility placed so far
    if m == 3:
        list0x = []
        list0y = []
        list1x = []
        list1y = []
        list2x = []
        list2y = []
        check = 1
