Celebrating 3 Years of Trapheus with Hacktoberfest

Rohit Kumar
Published in Intuit Engineering
3 min read · Oct 2, 2023


In 2020, we wrote a blog post about releasing Trapheus to the open source community. We described how Trapheus automates the data storage and restoration process, a problem our team had tackled internally, and how we designed it with the help of AWS Step Functions to work with any relational database. We knew that others could benefit from a more generic approach.

Within five months, Trapheus was featured on the AWS Open Source Blog:

Designing simple automated solutions to solve hard problems

Looking back at the early days, we’re proud that durability, resiliency, retry mechanisms, efficient logging, role-based access, and security were designed in from the beginning. We debated many design options, ultimately concluding that we needed to build such a tool and make it easy for developers to use by providing clear guidance and documentation.

When we open sourced Trapheus, we had no idea how the community might react to it, as everything in the cloud was ever-changing and there were multiple ways to achieve the same goals. Three years later, Trapheus continues to power and inspire developers around the globe (like jamsan920), and our continued engagement with the community has been equally inspiring to us on our open source software journey.

Evolution of Trapheus over time…

Choosing favorites from a long list of features is difficult, but here are a few of the most advanced:

  • Minimal Lambda configurations without a VPC (virtual private cloud)
  • Exporting snapshots in Apache Parquet to Amazon S3
  • Support for sending Slack notifications on failure
  • Support for parallel execution of the state machine for optimisation
  • Scheduling capability to trigger the state machine at a later time
  • Support for isolated snapshot creation

Sneak peek into the future: TrapheusAI

We continue to profile, optimize, and review every part of Trapheus, and the developer community has spoken yet again. Today’s AI-driven world brings interesting opportunities in every field, and data is still at the center of everything.

We see three trends converging:

  1. A surge in unstructured data from the internet, which is driving huge advances in unsupervised training
  2. The evolution of model architectures toward transformer neural networks, which can take context into account
  3. The emergence of domain-specific hardware, which is making it possible to build deep learning neural networks on commodity hardware and chipsets

With those trends in mind, we’ve launched TrapheusAI, a next-generation data search and analysis assistant that can overcome the challenges of traditional keyword search by supporting powerful data analyses in natural language.

We’ll monitor this space closely and drive innovation with the open source community. If this inspires you and you want to contribute, please check out our contribution guidelines, or see Intuit’s Open Source website for more about how we’re getting involved in Hacktoberfest this year.
