A Comprehensive Guide to Building Gmail Mail Synchronization Functions in NestJS

Abdullah Irfan
13 min read · Nov 17, 2023


This is the seventh story in the series Building a Robust Backend: A Comprehensive Guide Using NestJS, TypeORM, and Microservices. Our purpose is to build an email sync system for Gmail. With OAuth2 for email accounts set up, we can proceed with introducing features in our application. Gmail offers individual email sync and thread sync. In threads, related messages are grouped together like a conversation; more about this can be read in Google’s managing threads docs. This story will be long, since it covers the basic aspects of email synchronization.

Let’s start with syncing Gmail with our local database. For this, we need to create a migration and decide which columns the table will have.

  1. account_id: This column stores UUIDs that reference an id in the gmail_accounts table, establishing a foreign key relationship. This setup implies that each record in gmail_threads is linked to a record in gmail_accounts. The isNullable: false ensures this field cannot be left empty.
  2. subject: A text column intended to store the subject of a Gmail thread. Being of type text, it can store strings of any length. It’s nullable, so it can be left empty.
  3. from: This text column is used to store information about the sender of the email. Like the subject column, it can contain text of any length and is nullable.
  4. to: Similar to the from column, this stores the recipient(s) of the email. It’s a text column that can be left empty.
  5. cc: This column is for storing the carbon copy recipients of the email. It is a text column and can be null.
  6. bcc: Stands for blind carbon copy. This column is also for email recipients but in a way that other recipients cannot see who is BCC’d. It’s a text column and nullable.
  7. date: Stores the date and time when the email was sent or received. It’s a timestamp with time zone, allowing for precise timekeeping that considers time zone differences. This column is also nullable.
  8. body: This is intended for the email’s body content. As a text column, it can contain long strings of text and is nullable.
  9. attachments: A JSONB column designed to store structured JSON data, which is ideal for attachment details like file names, types, sizes, etc. It’s nullable and offers the flexibility of storing complex structured data.
  10. label_ids: This column is for storing an array of text values. It’s designed to hold labels or tags that might be assigned to an email thread. This column is also nullable.
  11. thread_id: This column stores a text identifier for the Gmail thread. It’s marked as non-nullable (`isNullable: false`), meaning it must always have a value.

Each of these columns is designed to capture different aspects of an email thread, mirroring the kind of data you would expect to find in an email client like Gmail. The schema allows for a comprehensive representation of email threads, including details about the participants, timing, content, and categorization.
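For reference, the columns above can be mirrored as a plain TypeScript shape. This is only a sketch: the names GmailThreadRow and AttachmentInfo are hypothetical and do not come from the repository, and the attachment fields are an assumption about what the JSONB column might hold.

```typescript
// Hypothetical attachment shape; the article stores attachments as JSONB.
interface AttachmentInfo {
  filename: string;
  mimeType?: string;
  url?: string;
}

// Hypothetical row type mirroring the gmail_threads columns listed above.
interface GmailThreadRow {
  id?: string; // uuid, generated by the database
  account_id: string; // uuid, references gmail_accounts.id (required)
  subject: string | null;
  from: string | null;
  to: string | null;
  cc: string | null;
  bcc: string | null;
  date: Date | null; // timestamp with time zone
  body: string | null;
  attachments: AttachmentInfo[] | null; // jsonb
  label_ids: string[] | null; // text[]
  thread_id: string; // Gmail thread identifier (required)
}

// Only account_id and thread_id must carry a value; the rest may be null.
const exampleRow: GmailThreadRow = {
  account_id: '00000000-0000-0000-0000-000000000001',
  subject: 'Hello',
  from: 'alice@example.com',
  to: 'bob@example.com',
  cc: null,
  bcc: null,
  date: new Date(1699896950000),
  body: null,
  attachments: [{ filename: 'report.pdf', mimeType: 'application/pdf' }],
  label_ids: ['INBOX', 'UNREAD'],
  thread_id: 'thread-1',
};
```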

Let’s create the migration. Run npm run migration:create --name=gmail-threads to generate the gmail-threads migration file, add the migration code below, and then run npm run migrate to create the corresponding table in the database.

import {
  MigrationInterface,
  QueryRunner,
  Table,
  TableForeignKey,
} from 'typeorm';

export class GmailThreads1699896950683 implements MigrationInterface {
  async up(queryRunner: QueryRunner): Promise<void> {
    const table = new Table({
      name: 'gmail_threads',
      columns: [
        {
          name: 'id',
          type: 'uuid',
          isPrimary: true,
          isGenerated: true,
          generationStrategy: 'uuid',
        },
        { name: 'account_id', type: 'uuid', isNullable: false },
        { name: 'subject', type: 'text', isNullable: true },
        { name: 'from', type: 'text', isNullable: true },
        { name: 'to', type: 'text', isNullable: true },
        { name: 'cc', type: 'text', isNullable: true },
        { name: 'bcc', type: 'text', isNullable: true },
        { name: 'date', type: 'timestamp with time zone', isNullable: true },
        { name: 'body', type: 'text', isNullable: true },
        { name: 'attachments', type: 'jsonb', isNullable: true },
        { name: 'label_ids', type: 'text[]', isNullable: true },
        { name: 'thread_id', type: 'text', isNullable: false },
      ],
    });

    // The second argument (ifNotExists) skips creation if the table exists.
    await queryRunner.createTable(table, true);

    const foreignKey = new TableForeignKey({
      columnNames: ['account_id'],
      referencedColumnNames: ['id'],
      referencedTableName: 'gmail_accounts',
      // account_id is NOT NULL, so 'SET NULL' would fail when an account is
      // deleted; remove the account's threads along with it instead.
      onDelete: 'CASCADE',
      onUpdate: 'CASCADE',
    });
    await queryRunner.createForeignKey('gmail_threads', foreignKey);
  }

  async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.dropTable('gmail_threads');
  }
}

Before we dive into the functions, let’s clarify a few things:

  1. The functions aren’t necessarily presented in the order in which they are called.
  2. Syncing by date does not work precisely on its own: even when we call gmail.users.threads.list with after:${sinceDateTimestamp}, the result still includes the email that sits exactly at sinceDateTimestamp, so we fetch the message data and filter it ourselves.
  3. I haven’t thrown errors in every scenario since it wasn’t my requirement, but errors should be thrown in general sync scenarios.
  4. Some existing interfaces and functions have changed and some new interfaces have been introduced, so check the GitHub repo for the complete operational code.

First, let’s create a function called syncMail to update Gmail threads for a specific account. This function will take the account identifier, page number, and number of threads per page as inputs. When we are on the first page, this function will update with the latest threads; if we are on any other page, it will update with older threads. Depending on the page number, it will call either syncLatestThreads or syncOlderThreads. After updating, it returns a custom message with the status. If there's an error, it catches it and returns an error message.

  async syncMail(
    id: string,
    page = 1,
    pageSize = 50,
  ): Promise<responseMessageInterface> {
    try {
      const oAuth2Client = await this.prepareOAuthClient(id);
      if (!oAuth2Client) throw new Error('OAuth2 client preparation failed');

      // Page 1 pulls newer threads; any other page backfills older ones.
      const data =
        page === 1
          ? await this.syncLatestThreads(id, oAuth2Client, pageSize)
          : await this.syncOlderThreads(id, oAuth2Client, page, pageSize);
      return customMessage(
        HttpStatus.OK,
        'Gmail account updated successfully.',
        data,
      );
    } catch (error) {
      console.error('Error in syncMail:', error);
      return customMessage(HttpStatus.BAD_REQUEST, MESSAGE.BAD_REQUEST);
    }
  }

To set up OAuth, let’s create a function prepareOAuthClient that prepares an OAuth2 client for Gmail API access. It takes the account ID as input and checks whether the user's token is valid. If the token is valid, it creates and returns an OAuth2 client. If not, or if there's an error, it returns null.

  async prepareOAuthClient(id: string): Promise<OAuth2Client | null> {
    try {
      const token = await this.validToken(id);
      return token ? getOAuthClient(token) : null;
    } catch (error) {
      console.error('Error in prepareOAuthClient:', error);
      return null;
    }
  }

Next, we will create a function called syncLatestThreads. This function is responsible for updating the most recent email threads from a Gmail account. It will require the account ID, an OAuth2 client for authentication, and the number of threads to fetch per page. The function first determines the date of the latest thread and then synchronizes threads based on this date. If it encounters any issues during this process, it logs the error and returns an empty array.

  async syncLatestThreads(
    id: string,
    oAuth2Client: OAuth2Client,
    pageSize: number,
  ): Promise<GmailThreads[]> {
    try {
      const latestThread = await this.getLatestThread();
      // Gmail date queries use epoch seconds, so convert from milliseconds.
      const lastThreadDate = latestThread?.date
        ? new Date(latestThread.date).getTime() / 1000
        : undefined;

      return this.syncThreadsCommon(
        id,
        oAuth2Client,
        lastThreadDate,
        true,
        1,
        pageSize,
      );
    } catch (error) {
      console.error('Error in syncLatestThreads:', error);
      return [];
    }
  }
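The date handling above can be isolated into two small helpers. This is a sketch, not code from the repository: toEpochSeconds and buildDateQuery are hypothetical names. Rounding down to whole seconds is a precaution I am assuming is harmless, since the value ends up in a Gmail search query and the Date header normally has second precision anyway.

```typescript
// Convert a Date (or a parseable date string) to whole epoch seconds.
function toEpochSeconds(value: Date | string): number {
  return Math.floor(new Date(value).getTime() / 1000);
}

// Build the Gmail search query used for date-bounded thread listing:
// "after:" when fetching newer threads, "before:" when backfilling.
function buildDateQuery(epochSeconds: number, isLatest: boolean): string {
  return `${isLatest ? 'after' : 'before'}:${epochSeconds}`;
}

// 1699896950683 ms is truncated to 1699896950 s.
const lastThreadSeconds = toEpochSeconds(new Date(1699896950683));
const latestQuery = buildDateQuery(lastThreadSeconds, true); // "after:1699896950"
const olderQuery = buildDateQuery(lastThreadSeconds, false); // "before:1699896950"
```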

To sync older threads, we’ll create a function named syncOlderThreads. This function updates older email threads from a Gmail account, specifically for pages other than the first one. It needs the account ID, OAuth2 client, the current page number, and the number of threads per page. The function finds the oldest thread and its date, then synchronizes threads based on this date. In case of an error, it logs the error and returns an empty array.

  async syncOlderThreads(
    id: string,
    oAuth2Client: OAuth2Client,
    page: number,
    pageSize: number,
  ): Promise<GmailThreads[]> {
    try {
      const oldestThread = await this.getOldestThread();
      const oldestThreadDate = oldestThread?.date
        ? new Date(oldestThread.date).getTime() / 1000
        : undefined;

      if (oldestThreadDate) {
        return this.syncThreadsCommon(
          id,
          oAuth2Client,
          oldestThreadDate,
          false,
          page,
          pageSize,
        );
      }

      // Nothing stored yet: just return whatever the database has.
      return this.getThreadsByAccountAndPage(id, page, pageSize);
    } catch (error) {
      console.error('Error in syncOlderThreads:', error);
      return [];
    }
  }

We also need a common function for thread synchronization, named syncThreadsCommon. This function is used for both the latest and older threads. It takes several parameters: the account ID, OAuth2 client, a reference date for synchronization, a flag indicating if the latest threads are being fetched, the current page number, and the number of threads per page. This function first lists the thread IDs based on these parameters, fetches their details, processes them, and then retrieves the threads by account and page.

  async syncThreadsCommon(
    id: string,
    oAuth2Client: OAuth2Client,
    referenceDate: number | undefined,
    isLatest: boolean,
    page: number,
    pageSize: number,
  ): Promise<GmailThreads[]> {
    try {
      // 1. List the thread IDs on the requested side of the reference date.
      const threadIds =
        (
          await this.listThreadIds(
            oAuth2Client,
            undefined,
            referenceDate,
            isLatest,
          )
        ).threads || [];

      // 2. Fetch and process the messages of each thread.
      const threadDetails = await this.fetchThreadDetails(
        threadIds,
        oAuth2Client,
        id,
        referenceDate,
        isLatest,
      );

      // 3. Persist them and return the requested page from the database.
      await this.createBulk(threadDetails);
      return this.getThreadsByAccountAndPage(id, page, pageSize);
    } catch (error) {
      console.error('Error in syncThreadsCommon:', error);
      return [];
    }
  }

The listThreadIds function is essential for fetching thread IDs from a Gmail account. It can filter threads based on a date and a flag indicating whether to fetch the latest or older threads. It uses the OAuth2 client for authentication and optionally a page token for pagination. This function returns a list of thread IDs after applying the specified filters.

  async listThreadIds(
    oAuth2Client: OAuth2Client,
    pageToken?: string,
    sinceDateTimestamp?: number,
    isLatest?: boolean,
  ): Promise<gmail_v1.Schema$ListThreadsResponse> {
    try {
      const gmail = google.gmail({ version: 'v1', auth: oAuth2Client });

      const params: ThreadListParams = {
        userId: 'me',
        maxResults: 50,
        pageToken: pageToken,
      };

      // Restrict the listing to one side of the reference timestamp.
      if (sinceDateTimestamp) {
        if (isLatest) {
          params.q = `after:${sinceDateTimestamp}`;
        } else {
          params.q = `before:${sinceDateTimestamp}`;
        }
      }

      const response = await gmail.users.threads.list(params);
      return response.data;
    } catch (error) {
      console.error('Error in listThreadIds:', error);
      return { threads: [] };
    }
  }
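listThreadIds accepts a page token, but the sync code shown here only requests the first page. If more than one page of results is needed, the list response's nextPageToken can be followed in a loop. The sketch below assumes a hypothetical ListPage function (for example, listThreadIds with the client and filters already bound) and uses a fake two-page lister so it runs without the Gmail API; listAllThreadIds and the maxPages cap are illustrative names, not part of the repository.

```typescript
// Minimal shapes for this sketch; the real types come from googleapis.
interface ThreadRef {
  id: string;
}
interface ThreadPage {
  threads?: ThreadRef[];
  nextPageToken?: string;
}

// Hypothetical: a lister that takes only the page token.
type ListPage = (pageToken?: string) => Promise<ThreadPage>;

// Follow nextPageToken until it is absent or maxPages is reached.
async function listAllThreadIds(
  listPage: ListPage,
  maxPages = 10,
): Promise<ThreadRef[]> {
  const all: ThreadRef[] = [];
  let pageToken: string | undefined = undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: ThreadPage = await listPage(pageToken);
    all.push(...(page.threads ?? []));
    pageToken = page.nextPageToken;
    if (!pageToken) break;
  }
  return all;
}

// Fake lister with two pages, standing in for the Gmail API.
const fakeListPage: ListPage = async (token) =>
  token === 'p2'
    ? { threads: [{ id: 'c' }] }
    : { threads: [{ id: 'a' }, { id: 'b' }], nextPageToken: 'p2' };

const allThreadsPromise = listAllThreadIds(fakeListPage);
```

The cap on pages keeps a single sync call from walking an entire mailbox; whether that trade-off suits your application is a design decision.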

For detailed information on specific email threads, we’ll create fetchThreadDetails. This function will require an array of thread identifiers, OAuth2 client, account ID, a timestamp for message filtering, and a flag for the latest messages. It processes each message in the threads and returns detailed thread information.

  async fetchThreadDetails(
    threads: gmail_v1.Schema$Thread[],
    oAuth2Client: OAuth2Client,
    id: string,
    sinceDateTimestamp: number | undefined,
    isLatest: boolean,
  ): Promise<GmailThreads[]> {
    try {
      const threadDetails = [];
      for (const thread of threads) {
        const threadData = await this.getThreadDetails(thread.id, oAuth2Client);
        // getThreadDetails returns null on failure; skip that thread
        // instead of aborting the whole batch.
        if (!threadData?.messages) continue;

        const relevantMessages = this.filterMessages(
          threadData.messages,
          sinceDateTimestamp,
          isLatest,
        );
        const messages: threadInterface[] = await this.processMessages(
          relevantMessages,
          oAuth2Client,
          id,
          thread.id,
        );
        // extractMailInfo returns null for messages it could not parse.
        threadDetails.push(...messages.filter(Boolean));
      }
      return threadDetails;
    } catch (error) {
      console.error('Error in fetchThreadDetails:', error);
      return [];
    }
  }

The filterMessages function is designed to filter messages in a thread based on a timestamp and a flag for fetching the latest messages. It will take an array of messages, a timestamp, and a flag, and will return the filtered array of message objects.

  filterMessages(
    messages: gmail_v1.Schema$Message[],
    sinceDateTimestamp: number | undefined,
    isLatest: boolean,
  ): gmail_v1.Schema$Message[] {
    if (!sinceDateTimestamp) {
      return messages;
    }
    return messages.filter((message: gmail_v1.Schema$Message) => {
      // internalDate is epoch milliseconds as a string; compare in seconds.
      const messageTimestamp = parseInt(message.internalDate, 10) / 1000;
      return isLatest
        ? messageTimestamp > sinceDateTimestamp
        : messageTimestamp < sinceDateTimestamp;
    });
  }
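To see the effect of the strict comparison, here is a standalone version of the same filter applied to three sample messages. MessageLike and filterByDate are stand-ins I introduce for illustration; the logic mirrors filterMessages above. The message sitting exactly on the reference timestamp is dropped in both directions, which is what keeps an already-synced boundary email from being stored again.

```typescript
// Stand-in for gmail_v1.Schema$Message: internalDate is epoch ms as a string.
interface MessageLike {
  id: string;
  internalDate: string;
}

function filterByDate(
  messages: MessageLike[],
  sinceSeconds: number | undefined,
  isLatest: boolean,
): MessageLike[] {
  if (!sinceSeconds) return messages;
  return messages.filter((message) => {
    const seconds = parseInt(message.internalDate, 10) / 1000;
    return isLatest ? seconds > sinceSeconds : seconds < sinceSeconds;
  });
}

const referenceSeconds = 1699896950;
const sampleMessages: MessageLike[] = [
  { id: 'older', internalDate: '1699896000000' },
  { id: 'boundary', internalDate: '1699896950000' },
  { id: 'newer', internalDate: '1699897000000' },
];

// Only messages strictly after the reference survive: ['newer']
const newerIds = filterByDate(sampleMessages, referenceSeconds, true).map((m) => m.id);
// Only messages strictly before the reference survive: ['older']
const olderIds = filterByDate(sampleMessages, referenceSeconds, false).map((m) => m.id);
// Without a reference timestamp nothing is filtered out.
const allIds = filterByDate(sampleMessages, undefined, true).map((m) => m.id);
```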

For processing an array of Gmail messages, we will develop processMessages. It requires the messages, OAuth2 client, account ID, and thread ID. This function will process each message to extract detailed information and return an array of these detailed messages.

  async processMessages(
    messages: gmail_v1.Schema$Message[],
    oAuth2Client: OAuth2Client,
    id: string,
    threadId: string,
  ): Promise<threadInterface[]> {
    // Fetch and parse all messages of the thread concurrently.
    return Promise.all(
      messages.map(async (message: { id: string }) => {
        return this.extractMailInfo(
          await this.getEmailDetails(message.id, oAuth2Client),
          id,
          threadId,
        );
      }),
    );
  }
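Promise.all starts every request at once, which may be more concurrency than you want for long threads. One possible alternative, sketched below under my own assumptions, processes the items in fixed-size batches and drops null results. mapInBatches is a hypothetical helper, and fakeFetch stands in for the getEmailDetails plus extractMailInfo pair so the example runs without the Gmail API.

```typescript
// Run fn over items in batches of batchSize, skipping null results.
async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  fn: (item: T) => Promise<R | null>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Only the items of the current batch are in flight together.
    const settled = await Promise.all(batch.map((item) => fn(item)));
    for (const result of settled) {
      if (result !== null) results.push(result);
    }
  }
  return results;
}

// Fake fetcher that records how many calls overlap.
let inFlight = 0;
let maxInFlight = 0;
async function fakeFetch(id: string): Promise<string | null> {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await Promise.resolve(); // simulate asynchronous work
  inFlight--;
  return id === 'bad' ? null : `mail:${id}`;
}

const batchedPromise = mapInBatches(['a', 'b', 'bad', 'c', 'd'], 2, fakeFetch);
```

Results keep the input order, and the failed item is simply absent from the output.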

We will also implement two functions for retrieving threads: getLatestThread for the most recent thread and getOldestThread for the oldest one. These functions will return the respective thread or null if none is found.

  async getLatestThread(): Promise<GmailThreads | null> {
    try {
      return await this.gmailThreadRepository.findOne({
        where: {},
        order: {
          date: 'DESC',
        },
      });
    } catch (error) {
      console.error('Error in getLatestThread:', error);
      return null;
    }
  }

  async getOldestThread(): Promise<GmailThreads | null> {
    try {
      return await this.gmailThreadRepository.findOne({
        // Explicit empty where, matching getLatestThread; recent TypeORM
        // versions reject findOne calls without selection conditions.
        where: {},
        order: {
          date: 'ASC',
        },
      });
    } catch (error) {
      console.error('Error in getOldestThread:', error);
      return null;
    }
  }

The getThreadDetails function is for fetching detailed information for a specific email thread using the Gmail API. It requires the thread ID and OAuth2 client and returns the details of the specified email thread.

  async getThreadDetails(
    threadId: string,
    oAuth2Client: OAuth2Client,
  ): Promise<gmail_v1.Schema$Thread | null> {
    try {
      const gmail = google.gmail({ version: 'v1', auth: oAuth2Client });
      const params = {
        userId: 'me',
        id: threadId,
      };
      const response = await gmail.users.threads.get(params);
      return response.data;
    } catch (error) {
      // Callers must handle the null returned on failure.
      console.error('Error in getThreadDetails:', error);
      return null;
    }
  }

The createBulk function is responsible for saving a collection of Gmail thread details in bulk to a repository. It takes an array of Gmail thread objects and saves them, completing once all thread details are successfully saved.

  async createBulk(threadDetails: GmailThreads[]): Promise<void> {
    try {
      await this.gmailThreadRepository.save(threadDetails);
    } catch (error) {
      console.error('Error in createBulk:', error);
      throw error;
    }
  }

Now we need to create a function named extractMailInfo to pull detailed information from an email message. This function will take a message object, the Gmail account ID, and optionally, the thread ID. It extracts details like subject, sender, recipients, date, body, and attachments. If a thread ID is provided, it includes that as well. It uses other helper functions to extract attachments and headers and then constructs a detailed mail information object. If there's an error during this process, it catches it and returns null.

  extractMailInfo(
    message: MessageInterface,
    id: string,
    threadId?: string,
  ): ThreadInterface {
    try {
      const attachments = this.extractAttachments(message);
      const headers = this.extractHeaders(message);

      const mailInfo = this.constructMailInfoObject(
        id,
        message,
        headers,
        attachments,
      );

      if (threadId) {
        mailInfo.thread_id = threadId;
      }

      return mailInfo;
    } catch (error) {
      console.error('Error in extractMailInfo:', error);
      return null;
    }
  }

We need a function to extract attachment details from email messages, so we’ll create extractAttachments. This function will take a message object and return an array of attachment details, including filenames and URLs. It checks if the message has attachments and, if so, processes each attachment to extract its details.

  private extractAttachments(
    message: MessageInterface,
  ): AttachmentsResponseInterface[] {
    const attachments: {
      filename: string;
      url?: string;
    }[] = [];

    if (message.payload.parts) {
      message.payload.parts.forEach((part: gmail_v1.Schema$MessagePart) => {
        // Parts with a non-empty filename are treated as attachments.
        if (part.filename && part.filename.length > 0) {
          const attachment = {
            filename: part.filename,
            mimeType: part.mimeType,
            data: part.body.data,
            url: `https://www.googleapis.com/gmail/v1/users/me/messages/${message.id}/attachments/${part.body.attachmentId}`,
          };
          attachments.push(attachment);
        }
      });
    }

    return attachments;
  }

The next function to create is extractHeaders, which pulls header information from an email message. It simply takes the message object and returns an array of all headers found in the message, like subject, sender, and recipient details.

  private extractHeaders(
    message: MessageInterface,
  ): gmail_v1.Schema$MessagePartHeader[] {
    return message.payload.headers;
  }

Now, let’s build constructMailInfoObject. This function constructs a detailed mail information object. It needs the Gmail account ID, message object, headers, and attachments. It pulls various pieces of information from these inputs, like subject, sender, recipient, date, and attachments, and assembles a comprehensive mail information object.

  private constructMailInfoObject(
    id: string,
    message: MessageInterface,
    headers: gmail_v1.Schema$MessagePartHeader[],
    attachments: AttachmentsResponseInterface[],
  ): ThreadInterface {
    return {
      account_id: id,
      subject: headers.find(
        (header: { name: string }) => header.name === 'Subject',
      )?.value,
      from: headers.find((header: { name: string }) => header.name === 'From')
        ?.value,
      cc: headers.find((header: { name: string }) => header.name === 'Cc')
        ?.value,
      to: headers.find((header: { name: string }) => header.name === 'To')
        ?.value,
      bcc: headers.find((header: { name: string }) => header.name === 'Bcc')
        ?.value,
      date: new Date(
        headers.find((header: { name: string }) => header.name === 'Date')
          ?.value ?? '',
      ),
      body: this.getBody(message),
      attachments: attachments,
      label_ids: message.labelIds,
      // Filled in by extractMailInfo when a thread ID is supplied.
      thread_id: null,
    };
  }
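The repeated headers.find calls could be folded into one lookup helper. The sketch below is an optional refactoring under my own assumptions, not the repository's code: getHeader and parseDateHeader are hypothetical names, and the Header type is a local stand-in for gmail_v1.Schema$MessagePartHeader. It matches header names case-insensitively, since email header names are not case-sensitive, and returns null rather than an Invalid Date when the Date header is missing or unparseable (new Date('') yields an Invalid Date).

```typescript
// Local stand-in for gmail_v1.Schema$MessagePartHeader.
interface Header {
  name?: string | null;
  value?: string | null;
}

// Case-insensitive header lookup; undefined when the header is absent.
function getHeader(headers: Header[], headerName: string): string | undefined {
  const wanted = headerName.toLowerCase();
  const match = headers.find((h) => (h.name ?? '').toLowerCase() === wanted);
  return match?.value ?? undefined;
}

// Parse the Date header, returning null instead of an Invalid Date.
function parseDateHeader(headers: Header[]): Date | null {
  const raw = getHeader(headers, 'Date');
  if (!raw) return null;
  const parsed = new Date(raw);
  return isNaN(parsed.getTime()) ? null : parsed;
}

const sampleHeaders: Header[] = [
  { name: 'subject', value: 'Hello' },
  { name: 'From', value: 'alice@example.com' },
  { name: 'Date', value: 'Mon, 13 Nov 2023 17:35:50 +0000' },
];

const subjectValue = getHeader(sampleHeaders, 'Subject'); // 'Hello'
const bccValue = getHeader(sampleHeaders, 'Bcc'); // undefined
const parsedDate = parseDateHeader(sampleHeaders); // a valid Date
const missingDate = parseDateHeader([{ name: 'Subject', value: 'No date' }]); // null
```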

For extracting and decoding the body of an email, we’ll implement getBody. This function takes a message object and finds the encoded body. It then decodes this body from its base64URL format and returns the decoded string. If there's an error, it logs it and returns an empty string.

  getBody(message: MessageInterface): string {
    try {
      const encodedBody = this.findEncodedBody(message);
      return this.decodeBody(encodedBody);
    } catch (error) {
      console.error('Error in getBody:', error);
      return '';
    }
  }

We also need a function named findEncodedBody to locate the encoded body of an email message. It checks whether the body is in 'parts' or directly in the payload. Depending on where it finds the body, it extracts and returns the encoded body string.

  private findEncodedBody(message: MessageInterface): string | null {
    if (message.payload.parts) {
      // Multipart message: prefer the HTML part, fall back to plain text.
      const part =
        this.findBodyPart(message.payload.parts, 'text/html') ||
        this.findBodyPart(message.payload.parts, 'text/plain');
      return part ? part.body.data : '';
    } else {
      // Single-part message: the body sits directly on the payload.
      return message.payload.body.data;
    }
  }
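findEncodedBody looks only at the top-level parts. Messages with attachments often nest the text parts one level deeper (for example, inside a multipart/alternative part), in which case a top-level search finds nothing. A recursive search is one way to handle that; the sketch below uses a local Part type as a stand-in for gmail_v1.Schema$MessagePart, and findPartDeep is a hypothetical name, not a function from the repository.

```typescript
// Local stand-in for gmail_v1.Schema$MessagePart.
interface Part {
  mimeType?: string;
  body?: { data?: string };
  parts?: Part[];
}

// Depth-first search for the first part of the given MIME type with data.
function findPartDeep(parts: Part[] | undefined, mimeType: string): Part | undefined {
  for (const part of parts ?? []) {
    if (part.mimeType === mimeType && part.body?.data) return part;
    const nested = findPartDeep(part.parts, mimeType);
    if (nested) return nested;
  }
  return undefined;
}

// A message with an attachment: the text parts are nested one level down.
const payloadParts: Part[] = [
  {
    mimeType: 'multipart/alternative',
    parts: [
      { mimeType: 'text/plain', body: { data: 'cGxhaW4' } },
      { mimeType: 'text/html', body: { data: 'aHRtbA' } },
    ],
  },
  { mimeType: 'application/pdf', body: {} },
];

const htmlPart = findPartDeep(payloadParts, 'text/html');
const topLevelOnly = payloadParts.find((p) => p.mimeType === 'text/html'); // undefined
```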

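The findBodyPart function is essential for finding specific parts of an email message, like 'text/html' or 'text/plain'. It takes the parts array from the email payload and a MIME type to look for, then searches through the parts and returns the part that matches the specified MIME type. The decodeBody helper that follows it converts the base64URL string to standard base64 and decodes it to UTF-8, returning an empty string when there is nothing to decode.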

  private findBodyPart(
    parts: gmail_v1.Schema$MessagePart[],
    mimeType: string,
  ): gmail_v1.Schema$MessagePart {
    return parts.find(
      (part: gmail_v1.Schema$MessagePart) => part.mimeType === mimeType,
    );
  }

  private decodeBody(encodedBody: string): string {
    if (!encodedBody) return '';
    // Map the URL-safe alphabet back to standard base64 before decoding.
    const buff = Buffer.from(
      encodedBody.replace(/-/g, '+').replace(/_/g, '/'),
      'base64',
    );
    return buff.toString('utf-8');
  }
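A quick round trip shows the decoding step in isolation. This sketch assumes a Node.js runtime (for Buffer); decodeBase64Url repeats the decodeBody logic as a free function, and encodeBase64Url is a hypothetical helper added only to produce test input in the URL-safe, unpadded form.

```typescript
// Same logic as decodeBody, as a standalone function.
function decodeBase64Url(encoded: string | null | undefined): string {
  if (!encoded) return '';
  const base64 = encoded.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(base64, 'base64').toString('utf-8');
}

// Hypothetical helper: encode text as unpadded, URL-safe base64.
function encodeBase64Url(text: string): string {
  return Buffer.from(text, 'utf-8')
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

const originalBody = '<p>¿Qué tal? Everything synced ✓</p>';
const encodedBody = encodeBase64Url(originalBody);
const decodedBody = decodeBase64Url(encodedBody);

// The decoded text matches the original, including non-ASCII characters.
const roundTripOk = decodedBody === originalBody;
```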

For fetching detailed information for a specific email message using the Gmail API, we’ll outline getEmailDetails. This function needs the message ID and an OAuth2 client. It queries the Gmail API for the message's details and returns them. If an error occurs, it logs the error and returns null.

  async getEmailDetails(
    messageId: string,
    oAuth2Client: OAuth2Client,
  ): Promise<MessageInterface> {
    try {
      const gmail = google.gmail({ version: 'v1', auth: oAuth2Client });
      // format: 'full' returns the parsed payload with headers and parts.
      const response = await gmail.users.messages.get({
        userId: 'me',
        id: messageId,
        format: 'full',
      });
      return response.data as MessageInterface;
    } catch (error) {
      console.error('Error in getEmailDetails:', error);
      return null;
    }
  }

Lastly, let’s describe getThreadsByAccountAndPage. This function retrieves a paginated list of email threads for a specific Gmail account from the repository. It takes the account ID, page number, and number of threads per page as inputs. It calculates the offset for pagination and returns an array of threads for the specified account and page. In case of errors, it logs them and returns an empty array.

  async getThreadsByAccountAndPage(
    accountId: string,
    page: number,
    pageSize: number,
  ): Promise<GmailThreads[]> {
    try {
      // Pages are 1-based, so page 1 starts at offset 0.
      const offset = (page - 1) * pageSize;

      return await this.gmailThreadRepository.find({
        where: { account_id: accountId },
        take: pageSize,
        skip: offset,
        order: { date: 'DESC' },
      });
    } catch (error) {
      console.error('Error in getThreadsByAccountAndPage:', error);
      return [];
    }
  }
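The offset arithmetic is simple, but a page number of 0 or below would produce a negative skip value. A small guard, sketched here with a hypothetical pageOffset helper that is not part of the repository, clamps the page to 1 before computing the offset.

```typescript
// Compute the row offset for a 1-based page number, clamping to page 1.
function pageOffset(page: number, pageSize: number): number {
  const safePage = Math.max(1, Math.floor(page));
  return (safePage - 1) * pageSize;
}

const firstPageOffset = pageOffset(1, 50); // 0
const thirdPageOffset = pageOffset(3, 50); // 100
const clampedOffset = pageOffset(0, 50); // 0, instead of -50
```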

Now we have an operational system to sync our Gmail with the local DB. In the next story we will set up methods to move mails to and from the trash and inbox. As usual, this story’s code is available on GitHub in the feature/email-sync-functions branch. If you appreciate this work, please show your support by clapping for the story and giving the repository a star.

