Data Mining — How to Use Yelp API
Last time, I shared how to scrape Instagram photos based on Yelp restaurant lists. The list doesn’t have to come from Yelp, but in my view Yelp business information is more reliable than other sources. I didn’t cover how to get Yelp data through the API they provide because I didn’t want people to focus only on Yelp, but for anyone who does want Yelp data, I think sharing how to use the Yelp API will be helpful. The Yelp API is pretty powerful because it is provided by Yelp itself, so the information should be better than what third-party APIs give us. So let’s get started!
First of all, in order to use Yelp API, you have to sign up as a developer.
Yelp Developer Website: https://www.yelp.com/developers/
Once you sign up with Yelp, you will receive your Client ID and Client Secret. These can be found in the “My App” tab.
To use the Yelp API, we need to authenticate with the Client ID and Client Secret, which gives you a token for API access.
To get an access token, you have to send a POST request. There are various ways to do this, but I am using a tool called ‘Postman’. You can definitely do it with ‘curl’ or ‘wget’ in a Linux environment, but Postman makes it easier to send POSTs and GETs. Yelp also recommends it!
Get Postman! -> https://www.getpostman.com/
You can download Postman from the link above.
Let’s get the access token!
What we need is a URL and parameters for the POST request.
If you read the Yelp Fusion documentation, it gives you a URL for the access token (https://api.yelp.com/oauth2/token). The request takes three arguments, grant_type, client_id, and client_secret, but for grant_type only client_credentials is supported. Now we have everything we need to request the access token. Simply fill in the information you already have.
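If you would rather script this step than click through Postman, here is a minimal Python sketch of the same POST. It only builds the request and does not send it. The function name build_token_request and the placeholder credentials are my own, and it assumes the token endpoint accepts the three arguments as a form-encoded body, as described above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Token URL taken from the Yelp Fusion documentation mentioned above.
TOKEN_URL = "https://api.yelp.com/oauth2/token"


def build_token_request(client_id, client_secret):
    """Build (but do not send) the POST request for an access token.

    Hypothetical helper: assumes the three arguments go in a
    form-encoded request body.
    """
    body = urlencode({
        "grant_type": "client_credentials",  # the only supported grant type
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    # Passing data makes this a POST; method is set explicitly for clarity.
    return Request(TOKEN_URL, data=body, method="POST")


# Placeholder credentials; use the ones from your "My App" tab.
token_request = build_token_request("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
```

To actually send it, you could pass token_request to urllib.request.urlopen and parse the JSON response to read the token fields.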
Now you have the access_token and token_type! Let’s get Yelp data with them!
I am using the Search API (https://www.yelp.com/developers/documentation/v3/business_search)
There are many options to choose from, but this time I am going to get restaurant data sorted by review count. So what I need now are the ‘location’ and ‘sort_by’ parameters. Of course, you can pass more options from the parameter list. Last time, when I scraped Instagram photos, the point was to get as many photos as possible, which is why I sort the search by review count. A higher review count likely means more visitors, so I can expect more pictures with these location tags.
For this request, you have to add an Authorization header. Its value is ‘token_type + space + access_token’. Then you add the parameters separately. I just set location to ‘santa monica’, but you can specify the location by latitude and longitude instead if you can’t name a specific place.
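The same GET request can be sketched in Python as well. This is just an illustration, not the only way to do it: build_search_request is a name I made up, the token value is a placeholder, and I am assuming the search endpoint is https://api.yelp.com/v3/businesses/search and that ‘review_count’ is the sort_by value for sorting by review count.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the Search API linked above.
SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(token_type, access_token, location,
                         sort_by="review_count"):
    """Build (but do not send) the GET request for a business search.

    Hypothetical helper: the header value is token_type + space +
    access_token, and the parameters go in the query string.
    """
    query = urlencode({"location": location, "sort_by": sort_by})
    headers = {"Authorization": f"{token_type} {access_token}"}
    return Request(f"{SEARCH_URL}?{query}", headers=headers)


# Placeholder token; use the values returned by the token request.
search_request = build_search_request("Bearer", "YOUR_ACCESS_TOKEN",
                                      "santa monica")
```

Passing search_request to urllib.request.urlopen would send it. To search by coordinates instead, you would swap the location entry for latitude and longitude entries in the query dictionary.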
If you send this GET request, you will receive the Yelp results as JSON.
I converted this JSON to CSV to scrape the Instagram photos, but you don’t have to. You can read the JSON directly and send the instagram-scraper queries with each business’s name and address.
I hope this helps! I will come back next time with more fun stuff!
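For anyone who does want the CSV step, here is a rough sketch of one way to do the conversion in Python. It is not the exact script I used. The function name businesses_to_csv is made up, and it assumes the response has a top-level ‘businesses’ list whose items carry ‘name’, ‘review_count’, and a ‘location’ object with an ‘address1’ field, so check those names against the JSON you actually receive.

```python
import csv
import io
import json


def businesses_to_csv(response_text):
    """Convert a search response (JSON text) into CSV text.

    Assumed layout: {"businesses": [{"name": ..., "review_count": ...,
    "location": {"address1": ...}}, ...]}. Missing fields become empty.
    """
    data = json.loads(response_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "address", "review_count"])
    for business in data.get("businesses", []):
        location = business.get("location") or {}
        writer.writerow([
            business.get("name", ""),
            location.get("address1", ""),
            business.get("review_count", ""),
        ])
    return out.getvalue()
```

The name and address columns are the two values the instagram-scraper queries need, so you could also skip the CSV and loop over the ‘businesses’ list directly.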