Inspect Email Logs using BigQuery

Kevin Sanghvi
4 min readApr 11, 2020

--

Inspecting Email Logs is one of the key task performed by an Enterprise Admin on a day to day basis, whether it’s knowing, How many Spam emails have arrived, Top Sending domains, Domain wide Subject based search, Top Rules triggered by an email and many others.

This is where Google Cloud — BigQuery can help to inspect large amount of data in a very short amount of time with flexibility of exporting to popular analysis tools like Data Studio, Sheets. This would ideally take lot of hour efforts using Email Log Search feature and that too with pre-defined fields like Subject, Sender, Recipient, IP, Subject and Message-ID. Email Log search retention is for 30 days.

In this blog, I would share how to setup this feature with some examples and I am using Google Cloud Platform (GCP) Free Trial of $300 and there is a limited always free Quota for Big Query usage.

How to setup GCP Free Trial & Always Free Quota Info

Step 1

Create a new Google Cloud Console Project once the Free Trial is activated, select the project and add a Project Editor from IAM & Admin section.

Step 2

Login to G Suite — Super Admin console and go to Apps > Gmail > Setup and enable the option by mentioning your GCP Project you created in Step 1 with a Dataset name and save changes. After few minutes, you would your Dataset is created in GCP under your project.

Step 3

This is the Dataset created in Step 2.

We would need to verify whether this is also shared with required Service accounts, for that, click on Share Dataset.

Step 4

We would need to wait for approximately an hour, so that a table called daily_date would be created on which SQL queries would be performed. We’re all set to Drive! :)

Example: To know details of Spam emails using Big Query Editor.

SELECT TIMESTAMP_MICROS(event_info.timestamp_usec) as timestamp,
message_info.subject,
message_info.source.address as source,
destination.address as destination,
message_info.rfc2822_message_id
FROM myBQdataset.daily_20200411 d, d.message_info.destination
WHERE message_info.is_spam
LIMIT 1000

myBQdataset — Name of the dataset & daily_20200411 is the Name of table with limit of 1000 results.

Example: To know detail of Spam emails using Big Query connector using Google Sheets.

Reference for Example SQL Queries & Schema for Custom SQL queries.

FAQ(s)

What is BigQuery?

BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google’s infrastructure.

What is BigQuery Dataset?

A dataset is contained within a specific project. Datasets are top-level containers that are used to organize and control access to your tables and views. A table or view must belong to a dataset, so you need to create at least one dataset before loading data into BigQuery.

Which G Suite Edition supports Gmail Logs to BigQuery?

G Suite Enterprise & G Suite Enterprise for Education.

What is the default table expiration time with this feature?

It’s 60 days and can be extended.

What if my table gets deleted?

It can be restored within 2 days.

How to export BigQuery Data?

Note: It’s important to note that BigQuery has a maximum row size limit of 1MB. For this reason, some fields are truncated to make the log shorter than 1MB — 1KB, so that it can be inserted successfully into BigQuery. The 1KB is intentionally left as a buffer.

This analysis is also possible via Security Investigation Tool however the flexibility of SQL queries customization and analysis on your favorite editors is also very comfortable.

Thank you. I hope this would provide you a Basic level understanding on using this feature. Feel free to share your feedback which would always motivate me for doing more :)

Stay Safe, Stay Indoors !!

--

--