Synthetic Data Generation using Datalake by Autogon

Autogon AI
May 10, 2024

Synthetic data generation is gaining traction in the AI world. Acquiring datasets tailored to a specific problem statement can be quite challenging, which makes generating datasets on the fly from customized prompts a fascinating development.

It’s projected that by 2024, a significant portion (around 60%) of the data used for developing AI and analytics will be artificially produced, so the future looks interesting already.

Over the years, various tools have sprung up to aid the generation of synthetic datasets, from case-study-specific datasets to generic ones. In this discussion, however, we’ll focus on the Autogon AI approach to synthetic data generation (you can already sign up here to follow along with this tutorial).

The Autogon Way: The Right Way

Autogon AI is a no-code AI tool dedicated to demystifying the process of data analysis and model building. Through its Datalake tool, Autogon AI makes synthetic data generation a breeze. Let’s delve right into it!

Head over to https://console.autogon.ai/get-started to access the Autogon AI console and create an account.

Once you’re logged in, you’ll land on your dashboard.

Click on Datasets to open your datasets.

Then click on Generate Dataset to open the dataset generation form.

Here, you can specify a prompt describing the dataset and indicate the number of rows you want it to have. You can be as descriptive as you like. For example, you can specify a prompt like “Generate a dataset on bank transactions for 100 customers”.

Once generation completes, the resulting dataset is shown in the console.

You can then save the dataset and use it to build models within the Autogon console.

Auto-Generating Datasets via API

This auto-generation can also occur via API, should you wish to integrate it into standalone software or other applications.

API endpoint: https://api.autogon.ai/api/v1/services/generate-data/

To make successful HTTP calls, your request must carry an authorization header with the key “X-Aug-Key”, whose value is your API key. You can obtain the API key from the Autogon Console: simply click on the settings icon located in the top right corner of the dashboard.

The request body includes two parameters:

  1. Prompt: Specify the prompt for the generated dataset.
  2. Rows: Specify the number of rows the dataset should have.
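As a rough illustration, the request described above could be assembled in Python with only the standard library. This is a minimal sketch, not official Autogon client code: the POST method, the JSON body, and the lowercase field names "prompt" and "rows" are assumptions, and the helper names are hypothetical. Check the Autogon documentation for the exact request format.

```python
import json
import urllib.request

# Endpoint as given in this article.
API_URL = "https://api.autogon.ai/api/v1/services/generate-data/"


def build_generate_request(api_key: str, prompt: str, rows: int) -> urllib.request.Request:
    """Build (but do not send) a dataset-generation request."""
    if rows <= 0:
        raise ValueError("rows must be a positive integer")
    # Assumption: the body is JSON with lowercase keys "prompt" and "rows".
    payload = json.dumps({"prompt": prompt, "rows": rows}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",  # Assumption: the endpoint accepts POST.
        headers={
            "X-Aug-Key": api_key,  # API key from the Autogon Console settings.
            "Content-Type": "application/json",
        },
    )


def generate_dataset(api_key: str, prompt: str, rows: int, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_generate_request(api_key, prompt, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `generate_dataset("YOUR_API_KEY", "Generate a dataset on bank transactions for 100 customers", 100)` would then send the request. Keeping request construction separate from sending makes it easy to inspect the headers and body before spending an API call.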

The response will resemble the format below, consisting of two major parameters:

  1. Status: Indicates the status of the network request (true for successful, false for unsuccessful).
  2. Message: Provides an attached URL link to the generated dataset.
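To show how these two fields might be consumed, here is a small hedged sketch that checks the status and pulls out the dataset link. It assumes the response is JSON with lowercase keys "status" and "message", which the article does not confirm, and `extract_dataset_url` is a hypothetical helper name.

```python
import json


def extract_dataset_url(response_body: str) -> str:
    """Return the dataset URL from a response body, or raise on failure."""
    data = json.loads(response_body)
    # Assumption: the keys are lowercase "status" and "message".
    if not data.get("status"):
        # On an unsuccessful request, surface whatever message came back.
        raise RuntimeError(f"Dataset generation failed: {data.get('message')}")
    return data["message"]


# Made-up sample payload for illustration only, not real API output.
sample = '{"status": true, "message": "https://example.com/generated-dataset.csv"}'
dataset_url = extract_dataset_url(sample)
```

Raising on a false status keeps the calling code from treating an error message as a download link.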

And voilà, your dataset is generated 🎉!

Trust me, dataset generation can’t get any easier. You should give it a try yourself!

Until next time!

Autogon AI

We at Autogon are building an Artificial Intelligence platform for creatives. This is our blog where we share a few thoughts about AI and ML.