Finding a new home is a big, daunting task. Buyers are bombarded with information. Apart from the usual data provided by the brokers, buyers usually like to know more about the neighborhood: the so-called geospatial data, or “geo”. This ranges from basic, general questions such as “how many supermarkets and schools are around?” to more personal ones like “how far is it from my workplace, and how long does it take to drive there?”.
If the list is short, it is possible to answer all these questions with a few searches in Google Maps or even a few on-site viewings. But often enough, buyers start with a screening on a real estate website. There, they filter by common parameters such as price, area, orientation and number of rooms. The geospatial data should really be part of this process. So the challenge is: how do we get the neighborhood information for a lot of properties at once? Luckily, with a bit of Python and the help of Google Cloud, it is quite easy to do so (Figure 1).
In this tutorial, I will show you how to get two important pieces of geospatial data: first, the schools within a 1 km radius, and second, the distance and drive time from work.
We need Google Cloud, or three Google APIs to be exact, in this tutorial:
- “distancematrix” can give us the distance and driving time between two addresses, the addresses of work and of the property in our case.
- “geocoding” converts a literal address into a latitude and longitude (lat-lon) pair.
- “nearbysearch” takes the lat-lon pair, the type of establishment and optionally a search radius, and returns a list of matching establishments.
I have created a GitHub repository to host the project code. However, before we can use these functions, we need to create a Google Cloud project, enable the APIs, get the credential key and put this key in config.py.
Google Cloud Setup
Head over to Google Cloud console and click on the project dropdown right next to “Google Cloud Platform” and create a project called “neighbor-check”. After its initialization, click into the project.
Click “APIs & Services” -> “Dashboard” -> “+ ENABLE APIS AND SERVICES”. In the search bar, search and enable the following APIs:
- Geocoding API
- Distance Matrix API
- Places API
So, after adding these APIs, your “APIs” page should look like:
Afterwards, click “Credentials” -> “+ CREATE CREDENTIALS” -> “API key”.
A popup with the title “API key created” shows up. Copy the “Your API key” with the “copy” button, and paste it into the API_KEY variable in “config.py”.
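For orientation, a config.py along these lines would be enough. This is only a minimal sketch: the article confirms nothing beyond the API_KEY variable, and the environment-variable fallback (with the hypothetical name GOOGLE_MAPS_API_KEY) is my own assumption, added so that the key does not have to be committed to Git.

```python
# config.py (sketch): the only name confirmed by the article is API_KEY.
import os

# Assumption: prefer a hypothetical GOOGLE_MAPS_API_KEY environment variable
# if it is set; otherwise fall back to a placeholder that you replace with
# the key copied from the "API key created" popup.
API_KEY = os.environ.get("GOOGLE_MAPS_API_KEY", "PASTE-YOUR-API-KEY-HERE")
```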
Finally, we need to enable billing for this project. Head over to the billing page, select the project you just created, select the “Billing account” in the popup and click “SET ACCOUNT”.
Be aware that Google Cloud by default allows only five projects to be associated with a billing account. So if you get a message about “Unable to enable billing”, you can either request an increase of your quota or disable billing in one of your old projects to make room.
Use Python to extract geospatial information
Once set up, it is quite easy to interact with these Google APIs and extract the geospatial information we need. Since they are REST APIs, virtually all programming languages can interact with them. Here, I use Python to showcase the three functions mentioned in Figure 3.
In my repository, I have included three Python files. “config.py” stores the API key, and “functions.py” contains the three functions that communicate with the Google APIs. Finally, in “demo.py”, I set my location to “Herzog Anton Ulrich Museum” and my workplace to the address of DSMZ in Braunschweig, Germany, my actual employer.
As you can see, the coordinates, the first five schools around the museum and the distance between the museum and DSMZ are returned successfully:
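To give an idea of what such functions can look like, here is a minimal sketch that uses only the Python standard library. The function names, the helper split and the returned values are my assumptions and not necessarily what “functions.py” in the repository does; only the three endpoints and their query parameters come from the Google Maps web-service APIs. Error handling (for example checking the “status” field of the response) is left out for brevity.

```python
import json
import urllib.parse
import urllib.request

API_KEY = "PASTE-YOUR-API-KEY-HERE"  # in the repository this lives in config.py
BASE_URL = "https://maps.googleapis.com/maps/api"


def build_url(endpoint, **params):
    """Build the request URL for one of the Google Maps web-service endpoints."""
    query = urllib.parse.urlencode({**params, "key": API_KEY})
    return f"{BASE_URL}/{endpoint}/json?{query}"


def call_api(endpoint, **params):
    """Send a GET request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_url(endpoint, **params), timeout=10) as resp:
        return json.load(resp)


def parse_geocode(payload):
    """Extract (lat, lon) of the first match from a Geocoding API response."""
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def parse_nearby(payload):
    """Extract the place names from one page of a Nearby Search response."""
    return [place["name"] for place in payload.get("results", [])]


def parse_distance(payload):
    """Extract (distance in metres, drive time in seconds) for the first pair."""
    element = payload["rows"][0]["elements"][0]
    return element["distance"]["value"], element["duration"]["value"]


def geocode(address):
    """Convert a literal address into a (lat, lon) pair."""
    return parse_geocode(call_api("geocode", address=address))


def nearby_search(lat, lon, place_type, radius_m=1000):
    """List establishments of one type around a coordinate (first page only)."""
    payload = call_api("place/nearbysearch", location=f"{lat},{lon}",
                       radius=radius_m, type=place_type)
    return parse_nearby(payload)


def distance_and_drive_time(origin, destination):
    """Return (metres, seconds) for driving from origin to destination."""
    payload = call_api("distancematrix", origins=origin,
                       destinations=destination, mode="driving")
    return parse_distance(payload)
```

With a valid key, a demo in the spirit of “demo.py” would call geocode("Herzog Anton Ulrich Museum"), pass the resulting pair to nearby_search(lat, lon, "school") and finish with distance_and_drive_time between the two addresses. Keeping the parsing separate from the network call makes it possible to check the logic against saved responses without spending API calls.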
Add geospatial data to the house hunt
Armed with these functions, we are able to enrich our house search. From your brokers or their websites, you can get data such as prices, areas, orientations and, most importantly, the addresses of the properties. After setting the addresses of both the property and your workplace, you get the information about the neighborhood and the distance to work. I have not provided the scripts here because both the data gathering and the later data integration are too individual and website-specific.
But in the figure below, you can see that I already used my functions in my own house hunt in Guangzhou, China. The basic property data were provided by Lianjia. I have added “latitude”, “longitude”, “school_count”, “supermarket_count”, “drive time to work” and “distance to work” to the table.
In fact, after geocoding, I could even plot the properties with folium (Figure 9). With this plot and the numbers, I could get a spatial sense of a few candidate properties and narrow down my search considerably.
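Although the real integration depends on your data source, the core step could look roughly like the sketch below. Everything in it is an assumption for illustration: the “address” column, the units of the new columns, and the three lookup callables, which are passed in as arguments so that the example runs with stubs instead of real (billed) API calls.

```python
def enrich_property(row, work_address, geocode, nearby_search, drive_info):
    """Return a copy of one property row with the geospatial columns added.

    Assumed signatures of the injected callables:
      geocode(address) -> (lat, lon)
      nearby_search(lat, lon, place_type) -> list of place names
      drive_info(origin, destination) -> (metres, seconds)
    """
    lat, lon = geocode(row["address"])
    metres, seconds = drive_info(row["address"], work_address)
    return {
        **row,
        "latitude": lat,
        "longitude": lon,
        "school_count": len(nearby_search(lat, lon, "school")),
        "supermarket_count": len(nearby_search(lat, lon, "supermarket")),
        "distance to work": round(metres / 1000, 1),   # kilometres
        "drive time to work": round(seconds / 60),     # minutes
    }


# Stub lookups with made-up values, so the sketch runs offline.
def fake_geocode(address):
    return (23.13, 113.26)


def fake_nearby_search(lat, lon, place_type):
    return ["A", "B", "C"] if place_type == "school" else ["D"]


def fake_drive_info(origin, destination):
    return (12345, 1500)


enriched = enrich_property(
    {"address": "Example Road 1", "price": 100},
    "Example Office Street 2",
    fake_geocode, fake_nearby_search, fake_drive_info,
)
print(enriched)
```

Replacing the stubs with functions that call the Google APIs turns this into a loop over the whole candidate table; note that each property then costs one geocoding call, one distance call and at least two Nearby Search calls.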
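A plot of this kind can be produced with only a few folium calls. The following is a sketch rather than the script I used: it assumes each property is a dict with the “latitude”, “longitude” and “school_count” columns from the table above plus a hypothetical “address” column, and the popup text and output file name are arbitrary choices.

```python
def marker_specs(properties):
    """Turn property rows into (location, popup text) pairs for plotting."""
    return [
        ([p["latitude"], p["longitude"]],
         f'{p["address"]}: {p["school_count"]} schools nearby')
        for p in properties
    ]


def plot_properties(properties, out_path="properties.html"):
    """Draw one marker per property and save the map as a standalone HTML file."""
    import folium  # third-party package: pip install folium

    specs = marker_specs(properties)
    # Centre the map on the mean coordinate of all properties.
    centre = [sum(loc[i] for loc, _ in specs) / len(specs) for i in (0, 1)]
    fmap = folium.Map(location=centre, zoom_start=12)
    for location, popup in specs:
        folium.Marker(location=location, popup=popup).add_to(fmap)
    fmap.save(out_path)
    return out_path
```

Calling plot_properties(rows) writes an HTML file that can be opened in any browser. folium is imported inside the function only so that marker_specs stays usable on machines where folium is not installed.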
However, all these benefits do not come for free. Be extra careful with Nearby Search. Google gives us a generous $200 monthly credit for “Places” and “Routes”. That is, the credit covers either the first 40,000 calls to “Distance Matrix” or the first 5,000 calls to “Nearby Search”, or a mix of the two. But once the monthly credit is used up, you will be charged $5 per thousand calls to “Distance Matrix” and a whopping $40 per thousand calls to “Nearby Search” (Pricing sheet). The gotcha is that, more often than not, Nearby Search returns several pages of results, and looking into each page counts as a call (Stack Overflow discussion here and Reddit discussion here). So a careless “for loop” can rack up the cost really fast.
I have to admit that I was served quite a huge bill when I did my house hunt. My list had around 1300 properties and I did some reruns, so I made around 7700 Nearby Search API calls in total. As a result, I ended that month with an eye-popping bill of €135.02.
But don’t be discouraged by my experience, because this was totally avoidable. I suggest using these functions sparingly. First and most importantly, shorten your candidate list based on other criteria, and only then run these Google APIs on the remaining properties. Second, make use of the monthly credit by distributing your calls across different months. But the latter runs the risk that your candidate property may be sold while you are waiting for your next monthly credit. So it is important to balance your options.
I must admit this whole thing is quite a geeky way of doing an age-old task like a house hunt. Most if not all people can find their next dream home without knowing what Google Cloud is. But hey, in this age, everything is “data-driven”. So there is nothing wrong with us geeks making the most of what Google can provide. In a house hunt, just as in other markets for lemons, the more data we get, the more we can even the odds against our brokers, and the better the decisions we can make.
This project shows just that. With Google APIs and a bit of data wrangling, it is possible to enrich our initial data gathering. I believe that the geospatial data from Google are instrumental for a good purchase. They save us a lot of visiting time and help us to compare a short list of properties. In fact, this technique can also be used when you are searching for a new office or even a new factory.
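Two small safeguards would have helped here: capping the number of result pages that are read, and estimating the bill before starting the loop. The sketch below illustrates both under stated assumptions. The prices and the $200 credit are the figures quoted above; fetch_page is a hypothetical callable that requests one page of Nearby Search results (passing the “next_page_token” of the previous response back as the “pagetoken” parameter), and the 1,300 properties × 2 place types × 3 pages scenario is invented for illustration.

```python
MONTHLY_CREDIT_USD = 200.0
PRICE_PER_1000_USD = {"distance_matrix": 5.0, "nearby_search": 40.0}


def count_places(fetch_page, max_pages=1):
    """Count results over at most max_pages pages.

    fetch_page(token) must return one decoded response page; token is None
    for the first page. Every page fetched is one billed call, so the
    function also returns how many calls it made.
    """
    total, calls, token = 0, 0, None
    while calls < max_pages:
        payload = fetch_page(token)
        calls += 1
        total += len(payload.get("results", []))
        token = payload.get("next_page_token")
        if token is None:  # no further pages available
            break
    return total, calls


def estimated_bill(calls):
    """Estimate the monthly charge in USD for a dict of {api: number of calls}."""
    gross = sum(PRICE_PER_1000_USD[api] * n / 1000 for api, n in calls.items())
    return max(0.0, gross - MONTHLY_CREDIT_USD)


# Hypothetical scenario: 1,300 properties, 2 place types, 3 pages per search.
planned_calls = 1300 * 2 * 3
print(planned_calls, estimated_bill({"nearby_search": planned_calls}))
```

With max_pages=1 a search reports at most the results of the first page, which is often enough to compare neighborhoods. The estimate ignores taxes, currency conversion and any other billed services, so treat it as a rough lower bound rather than a prediction of the invoice.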
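Both suggestions are easy to put into code. The sketch below is only an illustration under assumptions: the column names “price”, “area” and “rooms” and the thresholds are hypothetical, and the 5,000 free Nearby Search calls per month is the figure quoted earlier in this article.

```python
FREE_NEARBY_CALLS_PER_MONTH = 5000  # figure quoted in this article


def shortlist(properties, max_price, min_area, min_rooms):
    """Keep only the properties that pass the cheap, non-geospatial criteria."""
    return [
        p for p in properties
        if p["price"] <= max_price
        and p["area"] >= min_area
        and p["rooms"] >= min_rooms
    ]


def monthly_batches(properties, calls_per_property,
                    free_calls=FREE_NEARBY_CALLS_PER_MONTH):
    """Split the list into batches that each fit into one month's free calls."""
    per_month = max(1, free_calls // calls_per_property)
    return [properties[i:i + per_month]
            for i in range(0, len(properties), per_month)]


# Made-up candidates, purely to show the two steps in sequence.
candidates = [
    {"address": "A", "price": 300, "area": 90, "rooms": 3},
    {"address": "B", "price": 500, "area": 120, "rooms": 4},
    {"address": "C", "price": 280, "area": 60, "rooms": 2},
]
selected = shortlist(candidates, max_price=350, min_area=80, min_rooms=3)
batches = monthly_batches(selected, calls_per_property=6)
print(len(selected), len(batches))
```

Here calls_per_property=6 stands for two place types with up to three pages each; with one page per search it drops to 2, and far more properties fit into a single month.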
Happy house hunting!
The scripts for the three functions can be found here.