Airbyte: Simplifying Data Integration with ‘No-Code’

Published in Poatek
5 min read · Aug 2, 2023

Hi, everyone! If you love diving into the infinite pool of data as much as I do, then get ready for one of the most practical tools for the job: Airbyte.

Airbyte is an open-source tool for data integration and management. But don’t misunderstand me: it’s much more than just a data repository. Think of it as a data superhero that fights the messy and chaotic multiverse of information.
Hey, but why Airbyte? Good question. Airbyte is the new kid on the block and is already making a big difference in the data universe. Airbyte allows you to integrate data from various sources, making the data integration process almost as simple as making a cup of coffee!

One of the cool aspects of Airbyte is its ability to keep your data fresh. Yes, data friends, you heard right! No more waiting for the nightly batch update. Airbyte runs syncs on a schedule you choose, and many connectors can sync incrementally, copying only new or changed records, so your data gets refreshed about as regularly as you have your morning coffee.

Another thing that makes us shoot confetti for Airbyte is its incredible flexibility. It supports more than 50 native data sources and destinations, from SQL and NoSQL databases to SaaS APIs. And if there isn’t a connector for the data source you’re looking for, you can create your own.

“It really seems like a win on all fronts!”

However, as the saying goes, “You can’t make an omelet without breaking a few eggs”. Airbyte, although very promising, is still a project in development. Consider it a rough diamond, full of potential but still requiring a bit of polishing. Nonetheless, the Airbyte team works day and night to improve the tool and add new features.

Airbyte proves that a smart business should be based on precise and up-to-date data. By making the data management process simpler, more flexible, and closer to real time, Airbyte promises to give us the competitive advantage we all crave.

So, if you haven’t had a sip of Airbyte coffee yet, we suggest you start now. And if you have any questions or need help getting started, know you can count on me!

Now that you’re interested in Airbyte, how can you take action and start using this ‘no-code’ tool? Here’s a quick tutorial:

Making friends with Airbyte: An introductory guide!

Step 1: Accessing Airbyte

The first step is to visit the official Airbyte website and click on ‘Get Started’. Don’t worry, it’s completely free!

Step 2: Deploying Airbyte

After signing up, you’re ready to deploy Airbyte. This can be done in two main ways: using Docker or Kubernetes. We’ll follow the Docker path as it’s generally simpler.
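First, ensure that you have Docker installed. Then, open your system’s terminal and run the following command to start Airbyte:

docker run --rm -v /tmp:/config airbyte/seed:latest

In this command, --rm removes the container once it exits, and -v /tmp:/config mounts your machine’s /tmp directory at /config inside the container, which is where your Airbyte configuration data will be stored. Deployment steps change between Airbyte releases, so check the deployment tutorial listed in the references if this command does not work for you.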
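If you want to see exactly what each piece of that command means, here is a minimal Python sketch that splits it into its parts. It assumes the command is written with --rm and -v as two separate flags; the describe helper is a hypothetical name used only for illustration, and nothing here actually starts Docker.

```python
import shlex

# The command from this step, with --rm and -v as separate flags.
COMMAND = "docker run --rm -v /tmp:/config airbyte/seed:latest"


def describe(command: str) -> dict:
    """Split a simple `docker run` command into the parts discussed above."""
    argv = shlex.split(command)
    volume = argv[argv.index("-v") + 1]  # the value that follows -v
    host_path, container_path = volume.split(":", 1)
    return {
        "remove_on_exit": "--rm" in argv,
        "host_path": host_path,
        "container_path": container_path,
        "image": argv[-1],
    }


parts = describe(COMMAND)
print(parts)
```

Note that if the two flags are fused into a single token such as --rm-v, the shell hands Docker one unknown flag instead of two known ones, so the space between them matters.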

💡 Note that you can also use the cloud version of Airbyte, but it is a paid service.

Step 3: Configuring your first data source

Now that you’ve deployed Airbyte, it’s time to add your first data source. On the ‘Admin’ page, click on ‘Sources’ and ‘Add Source’. Below are some of the connectors available on Airbyte.

Choose the data source you want to add from the provided list and follow the instructions to finalize the setup. For this tutorial, we’ll use GitHub as a source.

Setup guide

  1. Name your source.
  2. Click Authenticate your GitHub account, or use a Personal Access Token for authentication. For Personal Access Tokens, refer to the list of required permissions and scopes.
  3. For Start date, enter the date you’d like to replicate data from.
  4. For GitHub Repositories, enter a space-delimited list of GitHub organizations or repositories.
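Before you type those four values into the form, it can help to sanity-check them. The sketch below is only an illustration: GitHubSourceForm and its field names are hypothetical stand-ins for the form fields, not part of any Airbyte API, and it assumes each repository entry looks like owner/repo, or owner/* for a whole organization.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class GitHubSourceForm:
    """Hypothetical stand-in for the four form fields; not an Airbyte API."""
    name: str
    personal_access_token: str
    start_date: date
    repositories: str  # space-delimited, as the form expects

    def repository_list(self) -> List[str]:
        # split() with no argument also tolerates repeated spaces
        return self.repositories.split()

    def problems(self) -> List[str]:
        found = []
        if not self.name.strip():
            found.append("name is empty")
        if not self.personal_access_token:
            found.append("token is empty")
        if self.start_date > date.today():
            found.append("start date is in the future")
        repos = self.repository_list()
        if not repos:
            found.append("no repositories listed")
        for entry in repos:
            owner, sep, repo = entry.partition("/")
            # assumed shape: exactly one slash with text on both sides
            if not (owner and sep and repo) or "/" in repo:
                found.append(f"unexpected repository entry: {entry}")
        return found


form = GitHubSourceForm(
    name="github-demo",
    personal_access_token="<your-token>",  # placeholder, never commit a real token
    start_date=date(2023, 1, 1),
    repositories="airbytehq/airbyte  airbytehq/*",
)
print(form.repository_list())
print(form.problems())
```

An empty list from problems() means the values at least have the expected shape; Airbyte itself still checks the token and repositories when you test the source.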

Step 4: Adding a destination

After setting up your source, you’ll need a destination for your data. Repeat the previous process, but this time, choose ‘Destinations’ instead of ‘Sources’.

Now, we will select BigQuery as the destination. After that, follow the instructions to finalize the setup.

Setup guide

  1. Enter the name for the BigQuery connector.
  2. For Project ID, enter your Google Cloud project ID.
  3. For Dataset Location, select the location of your BigQuery dataset.

⚠️ You cannot change the location later.

  4. For Default Dataset ID, enter the BigQuery Dataset ID.
  5. For Loading Method, select Standard Inserts or GCS Staging.

💡 Airbyte recommends using the GCS Staging option.

  6. For Service Account Key JSON (required for cloud, optional for open-source), enter the Google Cloud Service Account Key in JSON format.
  7. For Transformation Query Run Type (optional), select interactive to have BigQuery run interactive query jobs, or batch to have BigQuery run batch queries (https://cloud.google.com/bigquery).
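The rules in this setup guide (which fields are required, which values are allowed, and when the service account key is mandatory) can be summarized in a small Python sketch. The dictionary keys and the check_bigquery_settings helper are hypothetical names chosen for this example, not Airbyte’s actual configuration schema, and the project and dataset values are placeholders.

```python
import json
from typing import Dict, List, Optional

REQUIRED = ("name", "project_id", "dataset_location",
            "default_dataset_id", "loading_method")
LOADING_METHODS = ("Standard Inserts", "GCS Staging")
RUN_TYPES = ("interactive", "batch")


def check_bigquery_settings(settings: Dict[str, Optional[str]], cloud: bool) -> List[str]:
    """Return a list of problems; an empty list means the settings look complete."""
    problems = [f"{key} is required" for key in REQUIRED if not settings.get(key)]
    method = settings.get("loading_method")
    if method and method not in LOADING_METHODS:
        problems.append(f"unknown loading method: {method}")
    key_json = settings.get("service_account_key_json")
    if cloud and not key_json:
        # required for cloud, optional for open-source
        problems.append("service_account_key_json is required on cloud")
    if key_json:
        try:
            json.loads(key_json)
        except json.JSONDecodeError:
            problems.append("service_account_key_json is not valid JSON")
    run_type = settings.get("transformation_run_type")
    if run_type and run_type not in RUN_TYPES:
        problems.append(f"unknown run type: {run_type}")
    return problems


settings = {
    "name": "bigquery-demo",
    "project_id": "my-gcp-project",          # placeholder project ID
    "dataset_location": "US",                # cannot be changed later
    "default_dataset_id": "airbyte_github",  # placeholder dataset ID
    "loading_method": "GCS Staging",
    "transformation_run_type": "interactive",
}
print(check_bigquery_settings(settings, cloud=False))
print(check_bigquery_settings(settings, cloud=True))
```

With these values the open-source check comes back empty, while the cloud check reports the missing service account key.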

Step 5: Setting up your first sync

With the source and destination defined, click ‘Set Up Connection’ and follow the instructions to set up your first sync:

  1. Select GitHub as the source.
  2. Select BigQuery as the destination.
  3. Select the fields that you want to sync.
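To picture what selecting fields does to your data, here is a rough Python sketch. It is not how Airbyte implements a sync: the stream names, field names, and the apply_selection helper are all assumptions made up for this illustration. It only shows the idea that unselected streams are skipped and unselected fields never reach the destination.

```python
from typing import Dict, List, Optional

# Hypothetical selection: stream name -> fields chosen in the connection setup.
SELECTED_FIELDS: Dict[str, List[str]] = {
    "issues": ["id", "title", "state"],
    "commits": ["sha", "created_at"],
}


def apply_selection(stream: str, record: dict,
                    selection: Dict[str, List[str]]) -> Optional[dict]:
    """Keep only the selected fields, or return None if the stream is not synced."""
    fields = selection.get(stream)
    if fields is None:
        return None  # stream was not selected at all
    return {field: record[field] for field in fields if field in record}


issue = {"id": 1, "title": "Bug", "state": "open", "body": "long text"}
print(apply_selection("issues", issue, SELECTED_FIELDS))
print(apply_selection("stargazers", {"user": "octocat"}, SELECTED_FIELDS))
```

Here the issue keeps its id, title, and state but loses body, and the stargazers record is dropped entirely because that stream was never selected.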

That’s it! Welcome to the wonderful world of data management with Airbyte!

Well, this was a basic introduction to Airbyte! We did not cover all the details and deep features that Airbyte has to offer, but we hope this is enough to get you started.

To conclude, we can safely say that Airbyte is a sleeping giant that is awakening to revolutionize the way we manage and use data. Get ready to fly high with Airbyte!

Luciano Zembruzki

References:

1. Airbyte official website: https://airbyte.io/
2. Airbyte GitHub repository: https://github.com/airbytehq/airbyte
3. Airbyte documentation: https://docs.airbyte.io/
4. Airbyte Docker Hub: https://hub.docker.com/r/airbyte/seed
5. Tutorial on deploying Airbyte with Docker: https://docs.airbyte.io/tutorials/deployment/docker
6. Airbyte YouTube channel: https://www.youtube.com/@AirbyteHQ
