Extracting Facebook Insights via Fivetran and Snowflake

David Hrncir
Hashmap, an NTT DATA Company
8 min read · Mar 17, 2021

By David Hrncir and Chinmayee Lakkad

I recall “in the old days,” a marketing company would come to your house, pay your parents a fee for 3–6 months, and attach a little black box to your TV. The box would gather a limited set of statistics on the channels watched, time of day, and duration. When the 3–6 months were over, the marketing company would return to your home, collect the device, and be on their merry way. We thought nothing of it at the time, but the data gathered on that box was used to shape TV programming for the foreseeable future. Of course, the data gathered from my house was probably skewed since my dad was a channel changer!

Fast forward a “few” years. It’s no longer just TV shows and commercials that are monitored. It’s, well, everything. It was soon discovered that people love to make comments…on everything. Therefore, platforms to provide this “commenting” service were inevitable. Enter social media, and for our purposes in this post, Facebook. Facebook needs no introduction. It gives almost every person and every company in the world a means to share information about themselves via posts, pictures, videos, ads, and so on. And vice versa, Facebook gives almost anyone the ability to provide feedback on that shared information. This combination of “post/view/response” produces a wealth of information.

Facebook Insights

Facebook provides a window into this information called “Insights.” Insights present information in various groupings and aggregations, with many graphical representations, via Facebook’s Insights portal. But what happens when data users need additional views, consolidations, etc.? Organizations will ultimately need to slice and dice the data differently than the default views allow. Plus, granting broad access to the default views creates a security headache, and assembling the overall picture means juggling multiple browser tabs. Then…what happens when data users want to add the next social media platform and incorporate that data into existing views or data repositories?

There lies our challenge(s):

  • How can we access the raw Facebook Insights data?
  • How do we pull, store, and refresh the data in an automated fashion?
  • How can we build custom data views (aka dashboards)?
  • How do we track developmental changes to these views?
  • AND…how quickly can we get this done?

Before we start implementing views into the data, we need data. Before we start getting data, we need a plan. When constructing a plan, we try to design responsibly using as many S’s as possible: simple, speedy, sustainable, self-serve, secure, scalable, and savings.

Of course, there are many ways to address the challenges listed above. In most cases, a diagram of the plan is created first to help us understand the overall goal(s). In our case, pull data from a source, maintain it in a repository, reach a consensus on data needs, and build visualizations based on those needs. As an added note, this post focuses more on data integration than trend analysis or defining portions of Facebook’s data catalog. So the basic plan resembles the image below.

Facebook Insights plan/architecture

Now that we have a plan, we can create a solution. Our solution…is a rather simple one. First, utilize Fivetran to extract data from the Facebook Insights API into a Snowflake database. The Fivetran “raw” data tables in Snowflake will be queried for Part 1 of the project (spoiler, Part 2). Second, build a dashboard with the initial agreed-upon views in Looker, where the Looker project is integrated into GitLab to track our progress/changes. So…is the architecture truly that simple? Yes! Let’s map the technology stack to our plan.

Planned tech stack

Before describing the implementation process, here are a few key points for choosing this particular data integration stack:

Fivetran

  • Fully automated and managed SaaS data integration service
  • Setup is quick and simple
  • Automated data model (schema) build
  • Automated data refresh (CDC)
  • Built-in logging
  • Secure data transport and storage
  • Industry-proven connectors for many data sources
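The “automated data refresh (CDC)” bullet above can be pictured as a keyed upsert: each sync, new or changed source rows replace their matching rows in the destination table. A minimal Python sketch of that merge logic follows; the row shapes and column names are hypothetical, not Fivetran’s actual Facebook Pages schema.

```python
def merge_incremental(destination, changes, key="id"):
    """Upsert changed source rows into the destination by primary key,
    mimicking how a CDC-style sync keeps a warehouse table current."""
    merged = {row[key]: row for row in destination}
    for row in changes:
        merged[row[key]] = row  # insert new rows, overwrite updated ones
    return list(merged.values())

# Hypothetical "post" rows: an update to post 1 and a brand-new post 3.
destination = [
    {"id": 1, "message": "Hello", "like_count": 10},
    {"id": 2, "message": "World", "like_count": 4},
]
changes = [
    {"id": 1, "message": "Hello", "like_count": 25},
    {"id": 3, "message": "New post", "like_count": 0},
]

result = merge_incremental(destination, changes)
```

The point of a managed connector is that this bookkeeping (plus schema drift, retries, and logging) is handled for you; the sketch only shows the shape of the work being automated.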

Snowflake

  • SaaS and an industry leader in cloud data platforms
  • Virtually unlimited scale and elasticity
  • Pay as you go (consumption-based) platform
  • Fivetran has a destination connector for Snowflake

Looker

  • Browser-based interface
  • Extensible modeling paradigm/language
  • Customizable and exportable reports
  • Easily build views and dashboards

GitLab

  • Secure
  • Wide-range of features
  • Easily manage and track issues
  • Simple Looker integration

We will describe the overall implementation from a 30,000-foot view. Please watch the Hashmap Megabytes video below for a more detailed walkthrough of the implementation process:

Extracting Facebook Insights API via Fivetran & Snowflake — Hashmap Megabytes — Ep 15

To get access to Facebook Insights, we created a Facebook user and gave the user the “Analyst” role via Facebook’s page configuration settings (you will need to be a Facebook page administrator to accomplish this task). The Analyst role provides access to the Insights data. Then we reviewed the Fivetran and Facebook developer documentation to understand the data model better.

Adding the Analyst role to a user

Fivetran can be configured in a few steps. We already had Snowflake set up as a destination in Fivetran, so step 1 is complete. If you do not have an existing destination, the Fivetran Snowflake destination setup will quickly guide you through the process.

In step 2 (shown below), we added a Facebook Pages connector to our destination using the Facebook credentials mentioned above (analyst user/role). We configured the job to pull all pages and all schema data tables on a 24-hour refresh interval.

New connector
Facebook Pages
All data selected

Lastly, we ran the synchronization process to make sure there were no issues. At this point, the Facebook Insights data is now loaded into Snowflake and will be refreshed daily.
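With a 24-hour refresh interval, “no issues” boils down to one question: did the last successful sync happen within the expected window? The Fivetran dashboard answers this visually, but the same health check can be scripted. The sketch below inspects a connector-status payload; the JSON shape is loosely modeled on Fivetran’s REST API connector response, and the field names and values here are illustrative assumptions, not captured output.

```python
import json
from datetime import datetime, timezone

def sync_is_healthy(payload: dict, max_age_hours: float = 24.0) -> bool:
    """Return True if the connector's last successful sync happened
    within the expected refresh window."""
    succeeded_at = payload["data"].get("succeeded_at")
    if succeeded_at is None:
        return False  # never succeeded yet
    last_success = datetime.fromisoformat(succeeded_at.replace("Z", "+00:00"))
    age = datetime.now(timezone.utc) - last_success
    return age.total_seconds() <= max_age_hours * 3600

# Illustrative payload; a real check would fetch this from Fivetran's API.
response = json.loads("""
{"data": {"id": "facebook_pages", "succeeded_at": "2021-03-16T06:00:00.000+00:00"}}
""")
healthy = sync_is_healthy(response)
```

A check like this can feed an alerting job, so a silently stalled connector does not leave the dashboards serving stale numbers.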

Tables created in Snowflake

GitLab Integration

GitLab integration requires a few steps to be performed within Looker and GitLab:

  1. First, create a GitLab project.
  2. Second, add a GitLab service account with “developer” access to the GitLab project or use an existing service account.
  3. Third, add developers to the GitLab project who will be accessing the GitLab project from Looker.
  4. Lastly, add the SSH key created in Looker, during the configure git repo process, to the service account SSH Keys section (shown below) in GitLab.

The Looker “Setting Up and Testing a Git Connection” help documentation will guide you through the entire process. Now any modifications to the Looker project are tracked via GitLab.
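A common stumble in step 4 is pasting a truncated or mangled public key into GitLab. A quick sanity check of the pasted string can catch that before the git connection test fails; this is a generic OpenSSH-format check written for illustration, not part of Looker’s or GitLab’s tooling, and the sample key below is synthetic.

```python
import base64
import struct

def looks_like_ssh_public_key(key: str) -> bool:
    """Sanity-check a pasted OpenSSH public key (key type + base64 body)
    before adding it to a GitLab account's SSH Keys section."""
    parts = key.strip().split()
    if len(parts) < 2:
        return False
    key_type, body = parts[0], parts[1]
    if key_type not in {"ssh-rsa", "ssh-ed25519", "ecdsa-sha2-nistp256"}:
        return False
    try:
        decoded = base64.b64decode(body, validate=True)
    except Exception:
        return False
    # The decoded blob begins with a length-prefixed copy of the key type.
    return decoded[4:4 + len(key_type)] == key_type.encode()

# Build a syntactically valid (synthetic) ed25519 key for demonstration.
blob = struct.pack(">I", 11) + b"ssh-ed25519" + struct.pack(">I", 32) + bytes(32)
sample_key = "ssh-ed25519 " + base64.b64encode(blob).decode() + " looker@instance"
valid = looks_like_ssh_public_key(sample_key)
```

If the check fails, re-copy the key from Looker’s git configuration screen in one piece, including the leading key type.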

Adding Looker SSH key to GitLab user

Looker is where we spent most of our time, primarily on R&D into the data. First, we built a new Looker project and linked it to our GitLab project. We used the Looker SQL Runner interface to build and visualize various queries. We settled on three queries related to posting statistics, three related to audience statistics, and one to display data freshness. Our goal was to create custom views, or “Looks” as Looker calls them, rather than simply mimicking the views already available in the Facebook Insights portal (which are a great start, by the way).
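The posting-statistics looks are, at heart, simple GROUP BY aggregations over the raw tables. A pure-Python sketch of one such aggregation is below; the row layout and column names are hypothetical stand-ins, not Fivetran’s actual Facebook Pages schema, and in practice this logic lives in the SQL behind the Look.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows resembling a synced "post" table.
posts = [
    {"created": date(2021, 3, 1), "impressions": 120, "engaged_users": 14},
    {"created": date(2021, 3, 1), "impressions": 300, "engaged_users": 40},
    {"created": date(2021, 3, 2), "impressions": 90,  "engaged_users": 9},
]

def impressions_by_day(rows):
    """Total impressions per posting day, equivalent to a
    GROUP BY created-date query in Looker's SQL Runner."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["created"]] += row["impressions"]
    return dict(totals)

daily = impressions_by_day(posts)
```

Prototyping the query in SQL Runner first, as described above, lets you confirm the aggregation looks right before committing it to the Looker project.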

SQL Runner sample look

Once we were satisfied with the looks, we added the looks to a dashboard. This dashboard is, of course, by no means a definitive list as the insights data could be parsed in a virtually unlimited number of ways. Analysts and BI developers can coordinate with business users to create as many dashboards as deemed necessary for any department or business level. Here is our first Facebook Insights dashboard…nice!

Looker dashboard

Final Thoughts

We hope you gained some “insight” into how easily data integration can be performed using Fivetran and Snowflake. The primary focus of this project was data movement/integration. Still, we felt covering the code repository and visualization aspects would help you understand the full system development life cycle. Our biggest challenge was understanding the data catalog and building use cases (looks) from it. This was a fun implementation, and we gained some experience in how Facebook Insights operates and the breadth of data available.

We plan to do a Part 2 for this project, in which we will implement additional data integration concepts such as dbt (data build tool) to transform the raw data into reporting-friendly data structures. This will let us properly separate the staging area from the information delivery area. Please share any ideas regarding the project; we’d be happy to review them and possibly introduce them in a later post. Again, if a more hands-on view is needed, please watch this Hashmap Megabytes video.

Extracting Facebook Insights API via Fivetran & Snowflake — Hashmap Megabytes — Ep 15

Need Help with Your Cloud Initiatives?

If you are considering the cloud for migrating or modernizing data and analytics products and applications, or if you would like help, guidance, and a few best practices in delivering higher-value outcomes in your existing cloud program, please contact us.

Hashmap offers a range of enablement workshops and assessment services, cloud modernization and migration services, and consulting service packages as part of our Cloud, dbt, and Snowflake service offerings.

Hashmap’s Data & Cloud Migration and Modernization Workshop is an interactive, two-hour experience for you and your team to help understand how to accelerate desired outcomes, reduce risk, and enable modern data readiness. We’ll talk through options and make sure that everyone has a good understanding of what should be prioritized, typical project phases, and how to mitigate risk. Sign up today for our complimentary workshop.

Other Tools and Content for You

David Hrncir and Chinmayee Lakkad are Regional Technical Experts at Hashmap, an NTT DATA Company, and provide Data, Cloud, IoT, and AI/ML solutions and expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers.


David Hrncir is a Regional Technical Expert at Hashmap, Houston. He specializes in building modern data warehouse solutions.