Writing your data connector from Facebook ads to Google BigQuery

Andrey Osypov
Beards Analytics 🇺🇦
9 min readMay 27, 2020

--

If you work with Facebook ads then you will understand that to see the statistics in Facebook interface is not enough and you need to use other systems.

For example, if you use App+Web Google Analytics you have a free connector for Google BigQuery. We can do the classic reports by using Google Data Studio (in the last article I described a few tricks about it) or other visualization systems. We need data from advertising activities for that.

Everything is simple with Google Ads. We can use a BigQuery transfer, which can send Google BigQuery a lot of data from the advertising and you can set it for several minutes. Working with other data sources are difficult.

There are many paid and partially free services on the market that solve this problem, but why pay money if we have cool Cloud Functions in Google Cloud.

And besides visualizing data on Facebook Ads in Data Studio, we can upload this data to Google Analytics from BigQuery, or automate marketing in general.

Creating Google Cloud Project

If you haven’t got your Project in Google Cloud it is time to create that. For that we go to console.cloud.google.com and create an account.

Next, we need to create a new project.

You get $300 for the year, so first year will be free.

You need to fill all the necessary billing data to create a new project:

Yes! We have a customized project:

To publish functions, we have the opportunity to work with gcloud (command-line tool, in the terminal) or work with web UI, which we just got access to.

If you do not have experience with the terminal, this can lead to difficulties, so I recommend that you configure everything in the web version.

To install gcloud go to the address, and, depending on your operating system, follow the instructions.

Don’t forget to check the boxes next to beta commands, we need it.

Facebook ads ETL

Getting Facebook credentials.

At first, we need create app on https://developers.facebook.com/

Choose, Create App:

Here you can get app_id, app_secret and access_token:

APP_ID will show directly on the main screen:

Next, APP_SECRET, we can select from the Settings menu — Basic section

The most interesting thing is to take a token. To do this, click on the last item — Product on the left side of the menu and select the Marketing API, click on Set up:

Furthermore, after adding the Marketing API, we go to the Tools section, where we select the read_insights and ads_read checkmarks, click on the “GET TOKEN” and save the token.

The token that was obtained this way is live for 60 days, after that we will need to generate it again. We can go through Facebook verification to take the long live token.

We will need these three variables to load scheduled data.

Also, we will need account_id. We can find it in Ads Manager:

Installation via web UI

First, create a topic in Pub/Sub. For this, go to the Pub/Sub section in the left menu and select topics:

Click on create topic and type facebook_run:

Now, go to the repository https://github.com/snegireff/facebook_ads_bigquery_etl and click on the clone or download button and download the zip file yourself:

Since the zip file contains a folder with files with source code, you need to unzip the zip file, go to the folder and make a zip file archive:

Then, go to the cloud function section in the left menu and click on create function:

The name can be arbitrary, in memory we allocated 1GiB, as a trigger we specify cloud pub/sub, and the topic is the one that we created in the last step.

After that, specify the type of ZIP upload:

Runtime we will have python 3.7, select our archive file and specify any bucket available in the storage bucket, the function should be called get_facebook_data.

Next, click on more:

And change the maximum execution time to 540sec:

And click on Create, then, after a couple of minutes, our function is published:

After that, we need to make sure that our function runs on schedule. For this, cloud scheduler is perfect for us, since attributes cannot be set via web UI, yet, we will use Cloud Shell, for this, click on the terminal icon:

gcloud beta scheduler jobs create pubsub facebook — time-zone “Europe/Kiev” — schedule “0 5 * * *” — topic facebook_run — message-body “get_facebook” — attributes project_id=PROJECT_ID,dataset_id=DATASET_ID,table_id=TABLE_ID,account_id=ACCOUNT_ID,app_id=APP_ID,app_secret=APP_SECRET,access_token=ACCESS_TOKEN

Where

PROJECT_ID = your google cloud PROJECT_ID

DATASET_ID = for example “facebook”

TABLE_ID = for example “fb_stat”

ACCOUNT_ID = your facebook account id without act_

APP_ID, APP_SECRET, APP_TOKEN = from apps developers.facebook.com

And at 5am EST. (the runtime is encoded with the — schedule key, and the time zone — time-zone) every day. The data will be written to our table in Google BigQuery.

To download data from yesterday, go to the Cloud Scheduler, in the Google Cloud console and click Run now:

Then, after a couple of minutes, in BigQuery we will have a new dataset and a table with data for yesterday:

Installation via terminal

Now, go to the repository https://github.com/snegireff/facebook_ads_bigquery_etl and click on

Or we can download the code from github repository. To do this, create a new folder. In the terminal, in this new folder we run the command:

git clone https://github.com/snegireff/facebook_ads_bigquery_etl.git

Furthermore, we can start work. Install the project for our gcloud. Project id is displayed on the home page in the Google Cloud Console in the Project info section. To do this, in the terminal, run:

gcloud config set project YOUR_PROJECT_ID

To communicate with our function, the pubsub mechanism from Google Cloud is used, so create a new topic. To do this use the terminal, run:

gcloud pubsub topics create facebook_run

To publish the cloud function, which includes the ability to collect statistics on Facebook and data on the exchange rate, in the terminal, in the source folder, we run the command (we answer Y for all questions):

gcloud functions deploy get_facebook_data — runtime python37 — trigger-topic facebook_run — timeout=540 — memory=1024MB

After that, we need to make the launch of our functions on a schedule, for Facebook, in the terminal we run a command where we fill in the values (we answer Y for all questions, I usually use zone 13):

PROJECT_ID = your google cloud PROJECT_ID

DATASET_ID = for example “facebook”

TABLE_ID = for example “fb_stat”

ACCOUNT_ID = your facebook account id without act_

APP_ID, APP_SECRET, APP_TOKEN = from apps developers.facebook.com

gcloud beta scheduler jobs create pubsub facebook — time-zone “Europe/Kiev” — schedule “0 5 * * *” — topic facebook_run — message-body “get_facebook” — attributes project_id=PROJECT_ID,dataset_id=DATASET_ID,table_id=TABLE_ID,account_id=ACCOUNT_ID,app_id=APP_ID,app_secret=APP_SECRET,access_token=ACCESS_TOKEN

The data will be written to our table in Google BigQuery at 5am EST. (the runtime is encoded with the — schedule key, and the time zone — time-zone) every day.

To download data for yesterday, go to Cloud Scheduler, in the Google Сloud Сonsole and click Run now:

Then, after a couple of minutes, in BigQuery, we will have a new dataset and a table with data from yesterday:

Currency converter

It is not enough just uploading data from Facebook. The costs data on Facebook is displayed in USD only, the other metrics such as revenue, can be displayed in other currencies. We make the second function, which collects daily data on the exchange rate and inputs it in another table in the same dataset.

There is a good service for this — currencylayer.com, which allows you to download data up to 250 times a month for free. It is more than enough for us.

We are registering on the free plan:

In the https://currencylayer.com/dashboard section, copy and save the token, we need it to download data on a schedule.

To create a scheduled download, in the terminal, run the following command, where we fill the values:

PROJECT_ID = your google cloud PROJECT_ID

DATASET_ID = for example “facebook”

TABLE_ID = for example “exchange_rate”

API_KEY = key value from currencylayer.com

From_currency= leave USD

To_currency = indicate your currency, for example UAH

gcloud beta scheduler jobs create pubsub converter — time-zone “Europe/Kiev” — schedule “0 5 * * *” — topic facebook_run — message-body “get_currency” — attributes project_id=PROJECT_ID,dataset_id=DATASET_ID,table_id=TABLE_ID,api_key=API_KEY,from_currency=USD,to_currency=UAH

A few words about dynamic links

In case you use App + web or Firebase for mobile devices, we have the opportunity to see the application’s installation (or rather first open) from Facebook in BigQuery.

We can see not only the installation, but also the opening of the application. For this we use dynamic links from Firebase.

Here are detailed instructions on how to implement them in the application. In Facebook’s case, you should pay attention to a couple of points.

Firstly, the presence of og tags on the site, without them, Facebook ads, give an error, and doesn’t allow placing a deep link in the ad settings.

Secondly, we can collect the links ourselves, then we can register all the necessary parameters (description of the parameters) and the link will look something like this:

https://domain.page.link/?link=https://raketaapp.com/&apn=package_name&ibi=package_name&afl=https://play.google.com/store/apps/details?id=package_name&ifl=https://itunes.apple.com/us/app/id________&utm_source=facebook&utm_medium=cpc&utm_campaign=install_google_play|23843962811990104|23843962819200104|23844002612500104

Now, through dynamic links we can only pass three parameters, namely utm_source, utm_medium and utm_campaign, and we want to see statistics data before each ad, we can write all necessary values with a separator in utm_campaign for each ad, for example |.

So utm_campaing parameters would look like this:

utm_campaign=CAMPAIGN_NAME|CAMPAIGN_ID|ADSET_ID|AD_ID

We can use this following query in BigQuery:

select

concat((select value.string_value from t0.event_params where key = ‘source’), ‘ / ’, (select value.string_value from t0.event_params where key = ‘source’)) as sourceMedium,

split((select value.string_value from t0.event_params where key = ‘campaign’), ‘|’) [safe_offset(0)] campaign_name,

split((select value.string_value from t0.event_params where key = ‘campaign’), ‘|’) [safe_offset(1)] fb_campaign_id,

split((select value.string_value from t0.event_params where key = ‘campaign’), ‘|’) [safe_offset(2)] fb_adset_id,

split((select value.string_value from t0.event_params where key = ‘campaign’), ‘|’) [safe_offset(3)] fb_ad_id

from `project.dataset.table` t0

where event_name like ‘%dynamic%’

You only need to join them.

Join our channel in telegrams: https://t.me/firebase_app_web_bigquery

--

--