Mage AI: Building a Powerful Pipeline with MySQL Integration and Email Export (Part 2)

Jai-Techie
6 min read · Mar 11, 2024


Introduction:

In Part 1 of our Mage AI series, we delved into the fundamentals of Mage, providing an overview of its concepts, guiding you through the installation process on your machine, and getting started with the dashboard. Now, let’s take the next step in this journey as we explore the creation of a new pipeline, focusing on three crucial stages: Data Load, Transformation, and Export.

1. Data Load — Connecting with MySQL:

The foundation of any data-driven pipeline lies in the ability to seamlessly load data from various sources. In this case, we’ll be connecting Mage with MySQL, a popular relational database management system. Follow these steps to integrate MySQL into your Mage pipeline:

Configure Database Connection:

  1. I have configured XAMPP on my system and imported a sample database. With that in place, we can dive straight into configuring MySQL for Mage AI.
Sample DB- Classic Models

2. Navigate to Project -> Files -> io_config.yaml, and update the MySQL configs.

io_config.yaml — Mage AI

3. If you see a "MySQL configs not found" error, refer to this document (https://docs.mage.ai/getting-started/setup#install-mage-dependencies-optional) to install the MySQL dependency in your Mage project.

pip install "mage-ai[mysql]"
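For reference, the MySQL section of `io_config.yaml` typically looks like the snippet below. The values are placeholders for a local XAMPP setup with the Classic Models database; the key names follow Mage's default `io_config.yaml` template, so double-check them against your own generated file:

```yaml
default:
  MYSQL_DATABASE: classicmodels   # placeholder: your database name
  MYSQL_HOST: 127.0.0.1           # placeholder: your MySQL host
  MYSQL_PORT: 3306                # default MySQL port
  MYSQL_USER: root                # placeholder: your MySQL user
  MYSQL_PASSWORD: your_password   # placeholder: your MySQL password
```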

Define Data Load Parameters:

  1. Create a new pipeline: Go to Dashboard -> Click New Pipeline -> Select the Standard (batch) option
Mage AI — New Pipeline

2. You will be redirected to the new pipeline, as shown below,

Pipeline — Dashboard

with various options such as Data Loader, Transformer, Data Exporter, DBT Model, Scratchpad, etc. In this article, we will focus on the Data Loader, Transformer, and Data Exporter.

3. Click Data Loader -> Python -> MySQL

Data Loader — MySQL

4. The Data Loader will be created and opened in edit mode.

Now, modify the query. Since we have already configured MySQL in the main `io_config.yaml` file, the block will use the default connection profile.

query = 'SELECT * FROM products'  # Modify to test the MySQL connection

Test and Validate:

1. Run a test extraction to ensure that Mage successfully retrieves data from the MySQL database.

2. Click the Run icon on the top right of the data loader edit panel.

3. Now, you can observe the data loaded from the MySQL ‘products’ table.

Products — Datum

2. Transformation — Data Manipulation:

With data successfully loaded into Mage, the next stage involves transforming and manipulating the data to meet your specific requirements.

Utilize Mage’s Transformation Functions:

  1. Here, I am using the ‘products’ dataset to count distinct values per group. Many other transformations are possible, but for this demonstration I am keeping the example small.

2. A Transformer block will be created with an aggregation code template. Below are the modifications I made to obtain the total aggregate value.

action = build_transformer_action(
    df,
    action_type=ActionType.COUNT_DISTINCT,
    action_code='',  # Enter filtering condition on rows before aggregation
    arguments=['productName'],  # Enter the columns to compute aggregate over
    axis=Axis.COLUMN,
    options={'groupby_columns': ['productLine']},  # Enter columns to group by
    outputs=[
        # The number of outputs below must match the number of arguments
        {'uuid': 'total_aggre_val', 'column_type': 'string'},
    ],
)

The code above configures a transformation action for the transformer block:

  1. action_type=ActionType.COUNT_DISTINCT: This specifies the type of action to be performed, which is counting distinct values. In this case, it's counting the distinct values in the specified column.
  2. action_code='': This parameter is left empty in this example, indicating that there are no additional filtering conditions applied to the rows before performing the aggregation.
  3. arguments=['productName']: This is a list of columns over which the aggregation will be performed. In this case, it's counting the distinct values in the 'productName' column.
  4. axis=Axis.COLUMN: It specifies that the aggregation is done along columns.
  5. options={'groupby_columns': ['productLine']}: This defines the grouping of data before applying the aggregation. The aggregation is performed separately for each group defined by the 'productLine' column.
  6. outputs: This is a list that defines the output of the transformation. In this case, it specifies that the result of the aggregation will be a new column with the UUID 'total_aggre_val' of type 'string'.
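To make the transformer’s behavior concrete, here is a plain-pandas sketch of what COUNT_DISTINCT with `groupby_columns` produces. The sample rows are hypothetical stand-ins for the Classic Models ‘products’ table:

```python
import pandas as pd

# Hypothetical rows standing in for the Classic Models 'products' table.
df = pd.DataFrame({
    'productLine': ['Classic Cars', 'Classic Cars', 'Motorcycles'],
    'productName': ['1952 Alpine Renault', '1968 Ford Mustang', '2003 Harley-Davidson'],
})

# COUNT_DISTINCT grouped by 'productLine', broadcast back onto every row,
# mirroring how the transformer appends a new 'total_aggre_val' column.
df['total_aggre_val'] = df.groupby('productLine')['productName'].transform('nunique')

print(df)  # Classic Cars rows get 2; the Motorcycles row gets 1
```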

Visualize and Preview:

  1. Run a test transformation using the Run icon at the top right
Total Aggregation Value

2. In the output above, you can see the aggregation value displayed in the last column.

3. Export Data — Sending via Email using smtplib:

The final stage of our pipeline involves exporting the transformed data, and what better way to do this than via email? Python’s built-in smtplib module lets Mage send data to recipients of your choice. Here’s how to set up the export functionality:

Create a Data Exporter with a custom template

  1. After performing the aggregation, click ‘Data Exporter,’ then select ‘Custom template,’ and create the block with the following content:
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data(data, *args, **kwargs):
    data = data.to_dict()
    dataPara = ""
    for i in data['productName']:
        dataPara += ('<p><b>Product Name:</b> ' + str(data['productName'][i])
                     + ' <b>Total Aggre:</b> ' + str(data['total_aggre_val'][i]) + '</p>')

    smtp_server = "smtp.example.com"  # Replace with your server details
    port = 587  # Standard SMTP submission port (STARTTLS)

    subject = "Jai - Mage Build Mail"

    # HTML content with styles
    message_html = """
    <html>
      <head>
        <style>
          body { font-family: 'Arial', sans-serif; background-color: #f4f4f4; color: #333; }
          h1 { color: #0088cc; }
          p { margin-bottom: 20px; }
        </style>
      </head>
      <body>
        <h1>Welcome, find below the aggregated values for Products.</h1>
    """ + str(dataPara) + """
      </body>
    </html>
    """

    # Create a multipart message and set the headers
    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    msg["From"] = "JaiTechie@example.com"
    msg["To"] = "JaiTechie@example.com"

    # Attach the HTML version of the message
    msg.attach(MIMEText(message_html, "html"))

    with smtplib.SMTP(smtp_server, port) as server:
        server.starttls()  # Start TLS for encryption
        server.login("JaiTechie@example.com", "JaiTechie@testPass")  # Replace with your credentials
        server.sendmail("JaiTechie@example.com", "JaiTechie@example.com", msg.as_string())
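The row-building loop at the top of the exporter is easy to sanity-check on its own. The nested-dict shape below mimics what pandas `DataFrame.to_dict()` returns (column -> {row index -> value}); the values themselves are made up for illustration:

```python
# Shape mimics pandas DataFrame.to_dict(): column -> {row_index -> value}.
data = {
    'productName': {0: 'Classic Cars', 1: 'Motorcycles'},  # hypothetical values
    'total_aggre_val': {0: 38, 1: 13},
}

dataPara = ""
for i in data['productName']:  # iterates over the row indexes
    dataPara += ('<p><b>Product Name:</b> ' + str(data['productName'][i])
                 + ' <b>Total Aggre:</b> ' + str(data['total_aggre_val'][i]) + '</p>')

print(dataPara)
```

Each row becomes one `<p>` element, so the email body renders one line per product group.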

2. Save the changes, then update the SMTP server, credentials, and recipient details for your environment.

Test and Verify:

  1. Execute the ‘Export Data’ block by clicking the top-right icon, and witness the magic as your aggregated values and product names are sent to the configured email.
Email Export Data — SMTPLIB

Finally, we have completed the workflow involving the Data Loader, Transformer, and Data Export with email integration.

Simple Data Pipeline — Tree

4. Configure the Pipeline Triggers:

As explained in Part 1, there are three types of triggers available: Schedule, Event, and API. Here, I am setting up a daily schedule trigger that runs the aggregation and shares the report via email.

Creating trigger with daily schedule — Mage AI

Save the changes and enable the trigger, and the pipeline is now ready. It will automatically execute daily, aggregating the data and sending it via email. Additionally, you can monitor the triggers in the Dashboard to track successes and failures. Each block can be configured with retry options and duration thresholds in case of failures.
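Besides the UI, Mage also supports defining triggers in code via a `triggers.yaml` file in the pipeline folder. The sketch below shows roughly what a daily schedule trigger might look like; the field names are my best recollection of the Mage trigger schema, so verify them against the official Mage documentation before relying on this:

```yaml
triggers:
- name: daily_aggregation_email   # hypothetical trigger name
  schedule_type: time
  schedule_interval: '@daily'
  status: active
```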

Conclusion:

In Part 2 of our Mage AI series, we’ve explored the crucial stages of creating a new pipeline, integrating MySQL for data loading, performing data transformations, and exporting the transformed data via email using SMTPLib. By following these steps, you can harness the power of Mage to build robust, efficient pipelines tailored to your specific data processing needs. Stay tuned for the next installment as we delve deeper into advanced features and optimization techniques within the Mage AI framework.


Jai-Techie

Tech enthusiast and software developer, driven by innovation and eager to explore the latest technologies for impactful solutions. 🚀