Introducing Conveyor: Build, Deploy & Scale Your Data Projects

Stijn De Haes · Published in datamindedbe · May 20, 2022 · 6 min read

Today, the team at Data Minded is proud to announce the launch of Conveyor, our managed, cloud-based data platform.

Yet another data platform on the market? Well, yes… but we believe this one is different. Conveyor takes your data projects from ideation to production.

Unlike many other platforms out there, Conveyor does not try to do everything in the vast data landscape. Instead, it focuses on removing the infrastructure and process barriers that are typically encountered in the lifecycle of a data project.

“The secret of getting ahead is getting started. The secret of getting started is breaking your complex, overwhelming tasks into small, manageable tasks, and then starting on the first one.”
— Mark Twain

Well, that’s exactly what we did with Conveyor. Conveyor breaks down the overwhelming complexity of building, deploying & scaling your data projects into small, manageable tasks.

The motivation for building Conveyor came from our day-to-day consulting experience. While delivering data products for and with our customers, we found ourselves building the same functionality over and over again. We started compiling all those building blocks (think pipeline authoring, orchestration, runtime, monitoring, CI/CD integration, batch/streaming, cost optimisation, security…) and immediately aimed for a challenging direction:

Our data platform should enable developers, not constrain them. It should encapsulate software engineering best practices while fostering creativity and freedom. It should allow an organisation to adopt the distributed nature of the Data Mesh philosophy while maintaining central governance.

Conveyor takes your data projects from ideation to production.

Over the last 2 years, Conveyor (formerly named Datafy) has matured significantly, with more than 500 data projects running daily in production at our customers.

Conveyor is built on top of state-of-the-art open source projects and is hosted on public clouds (AWS and Azure for now, more coming).

We distilled the feedback from our customers and shipped many new features in Conveyor, while remaining true to our north star. We are delighted with the result and we think you will be too:

Conveyor is a true game changer when it comes to getting your data projects from ideation to production. It unites an entire ecosystem of modern tools and services into a single, simple workflow for building maintainable and cost-effective data projects.

This is the beginning of an exciting journey…

Curious? Read on to get to know Conveyor, or start using it now.

What’s in a name? Bringing your data projects from ideation to production

To imagine what Conveyor can do for you, we invite you to picture your organisation as a factory.

To build a great product in a reliable and repeatable way, you will need an assembly line. In data/software terms, this means everything from data exploration to running in production, from local testing to continuous integration, from creating machine learning models to monitoring results and watching costs. Each step in this assembly process can be seen as a workstation.

Those workstations come in many different flavours as each task at hand requires specialised tools. Data quality, data warehousing, machine learning model repository, … you name it.

But one thing is clear: your data project will need to progress through those workstations in a controlled and reliable fashion, with clear observability along the line.

This is what Conveyor was built for. Just like a conveyor belt in a factory, it takes your data product through its different lifecycle stages. From ideation to production.

Once your data organisation grows in maturity, more assembly lines will be needed. More projects will see the light of day, and new talent will join your teams. To bring new projects to life efficiently, they will need structure, yet they will also need to be empowered and remain creative.

This is where Conveyor shines: it brings the same level of control and reliability to new projects as it did to your previous successes. The same assembly line, yet very different. The data factory manager (CDO, business owner, …) will appreciate the homogeneity and predictability of the factory, while the engineers will keep enjoying their freedom.

A conveyor belt for each of your data projects…

Making the invisible visible… and then out of the way.

When talking to customers, we see that they are typically at one of these two stages:

  • So far, they have only focused on proof-of-concept projects and one-offs (data science notebooks, small data pipelines, …), and are looking to bring those to the next level of data maturity. They have not yet tried to build such a data production line, and are therefore not always aware of how hard it is to build a reliable blueprint for reproducing the success of their first projects.
  • They have already tried to build such a data factory, and found themselves spending a significant amount of time on building infrastructure, APIs and SDKs. To do so, they had to hire data specialists whose profiles are typically quite different from those of the engineers actually using the platform to deliver the end products. In the process, frustration appeared on both the business and the engineering side: more time than anticipated was spent on work with no direct business value, while the engineers building the data products often felt overly constrained by the platform. (If you are interested in this topic, keep an eye on our blog, as we plan to write about it soon.)

In both cases, the underlying issue is the same: building a data platform involves many invisible steps. They typically only become apparent while building your first few data products:

  • First, you will need to explore the data and build a proof of concept (notebooks are a great tool for this).
  • After you have proven the business value of the idea, it is time to write production-grade code and tests, iterating many times along the way.
  • Once the code is ready, the product will need to run in production. Before you get there, you typically build and deploy to an acceptance environment where the product can be validated before being promoted to production.
  • The project code will need to be executed efficiently on a runtime engine, preferably one that scales on demand.
  • Some quality assurance will be required along the way. Once deployed, the data products will need to be monitored so that issues can be troubleshot efficiently.
  • Last but not least, security concerns come into play, as well as cost considerations.

The list keeps growing…

Building a data platform is a huge undertaking… and can sometimes be overwhelming.

Conveyor makes sure that no stone is left unturned in the lifecycle of a data project. It makes all those steps in the assembly line visible, and yet it makes sure you don’t have to build them yourself.

> Conveyor makes all the invisible parts of a data factory visible, and manages them for you so you can focus on what matters: delivering great data products.

Don’t take our word for it, try it yourself… For free!

We are passionate data engineers who have been building data products for the last 7 years. We put all the best practices we learned along the way into Conveyor, and we hope it makes the journey of other data engineers better, easier and more enjoyable. The good thing is, you don’t even have to talk to Sales to try it out: there is a free version that you can get started with today.

Visit our Pricing page to start your journey now: https://www.dataminded.com/conveyor-pricing

From there, you can install Conveyor on your own AWS account in a few easy, guided steps. Or you can 📕 browse the documentation.

Looking forward to seeing you there!

The Conveyor team:

Stijn De Haes, Pascal Knapen, Pierre Borckmans, Niels Claeys

…and the whole Data Minded team!
