Google Analytics Dashboard Project

Hongri Jia
Passion for Data Science
6 min readMar 27, 2018

In this project, I use google analytics to obtain data related to website traffic of our client and create a dashboard with Apache Superset and Tableau to make visualizations. The client is planning to launch online courses abroad and open a new branch in another Canadian city. The purpose of the project is to analyze where their largest potential international and domestic markets are.

Project Background

Our client is a Canadian startup company focusing on data science skills training. Because of its excellent business performance in Canada, the founders are considering to expand the business to other countries. They are planning to provide online courses to people who have interests in data science and want to know where they should start from to do marketing. Additionally, they would like to establish a new branch in Canada. The objective of this project is to solve these issues based on the website traffic data from Google Analytics.

Project Content

Tools I used here are Google Analytics, python, MySQL, Apache Superset and Tableau. Google Analytics is a web analytics service offered by Google that tracks and reports information about website traffic. Business owners can use it for insights into the performance of their websites and make decisions on business strategies according to the website traffic data. If you want to know more about Google Analytics and how it can help you with your purposes, you can visit this link: https://www.google.com/analytics/

Apache Superset and Tableau are both data exploration and visualization web applications which provide an intuitive interface to create interactive dashboards. When people are preparing reports, they always do some data visualization to make sure the results of their analysis are understood by their managers or clients more easily. They work on this purpose very well. Other than making visualizations easier, another advantage of them is that users can connect to the most commonly used databases such as MySQL and Oracle. They can make SQL queries directly on the interface, which is very convenient to select data based on their purposes.

Here is the pipeline of the project:

  • Use python to access to Google Analytics and get the web traffic data
  • Save the data into csv files and import the file into MySQL database
  • Connect Apache Superset and Tableau to MySQL database and make the data visualization

Initially, we need to get access to Google Analytics Reporting API (see Figure-1). To do this properly, we should install the client API via pip install — upgrade google-api-python-client, download the client secret file and place it in the same folder as your Jupyter notebook.

Figure-1 Python Code for Accessing to Google Analytics

Then we are going to create a function to extract dimensions and metrics information and build a report (see Figure-2). All dimensions and metrics of Google Analytics are listed in the following link. https://developers.google.com/analytics/devguides/reporting/core/dimsmets

Figure-2 Python Code for Building Function of Creating Report

Here, I choose the data of the last whole year and put the data into separated lists in terms of dimensions and metrics, and import the pandas library to create a data frame. Then place the data from Google Analytics into the data frame and save it into a csv file.

Figure-3 Python Code for Data Extraction

Next step is importing the .csv file just saved into MySQL database (see Figure-4). Here, I create a new database and a new table named “location” to load the data.

Figure-4 SQL Code for Creating New Database and Table

Finally, we need to connect Apache Superset with MySQL to access the data. Use pip install superset and pip install mysqlclient in command prompt to install the application and connector to MySQL (Windows). Other questions related to installation and configuration can be solved on the official website of Apache Superset: http://superset.apache.org

The most important thing is to make sure you can connect to the database properly. Besides that, remember to create a table in Superset to load the data in after the database is connected.

Figure-5 Apache Superset Database Connection Instance

After the setting up, we can connect to the database just created and start to create some fancy graphs like bar charts and word clouds on your own dashboard.

Figure-6 Apache Superset Dashboard (1)

International Market Analysis

At first, I create a distribution map and a word cloud to show which country the users of the client’s website come from. In Figure-6, it’s shown that people from all over the world visited the website last year, which means the there is a huge market potential of offering online courses about data science. The word cloud is created excluding Canada since the purpose is to conduct business outside Canada. From it, we can see the country with the largest font size is the United States, which means USA has the most amount of users in the past year. Thus, USA is our first target now and then we should find a more specific location for the marketing.

Figure-7 Apache Superset Dashboard (2)

The pie chart above also proves that USA is a proper choice and tells us the exact proportion of all the users of the website. Then I create a bar chart to find which city in America has more users. According to it, it’s easy to say New York is the best choice.

Domestic Market Analysis

Tableau is used in this part. The procedures for installation and database connection of Tableau are much simpler than Superset. We just need to download the installation package and follow the instruction. And connecting to databases is shown in Figure-8.

Figure-8 Tableau Database Connection Instance

In the Tableau, I also create a distribution map and a bar chart to show people from which city in Canada pay more attention on our client’s website. In Figure-9, it’s shown that most people are from Ontario and Quebec. Since the head office of our client is in Ontario, they prefer to open the new branch in other provinces. According to the bar chart, Montreal is their best choice.

In addition, the client want to see how the users find their website. I query the source data from Google Analytics and make the bar chart at the lower right corner of the dashboard. Based on the plot, most users visit the website directly or through google. It is worth mentioning that a large part of users who visit the website directly is a positive indicator for our client’s objective. This means there are many people who already know their company in Montreal instead of just finding the website through google search with some keywords. In other words, Montreal already has a mutual user base for our client and is a reasonable target location for a new branch.

Figure-9 Tableau Dashboard

Conclusion

To make the final decision, we still need more analysis on the city we choose such as quantity demanded of people understanding data science, amount of job positions available, average education level and so on. These further works won’t be displayed in this blog. Here I just show some outputs that we can achieve with Google Analytics.

If you are interested in my work or have some problems about it, please feel free to contact me. At the meantime, if you want to know more about what students learn from WeCloudData’s data science courses, check out this website:

www.weclouddata.com

--

--