Introducing Live Data Products with Quick
Today, data is ubiquitous, and for some time now it has been driving new businesses as well as new business models in existing organizations. A wide range of vendors and tools enables the community to create data-driven solutions. Traditional relational technologies continue to advance, and modern NoSQL and Big Data stacks have matured and can be found in production in many places. More recently, stream data processing has come into play and is being adopted in many organizations: some have progressed quite far, while others are just getting started.
At bakdata, we focus on stream data processing and Apache Kafka in particular. We are happy to frequently share accomplishments and lessons learnt here on Medium and to contribute back to open source. On our GitHub you can find many projects that enable us (and you) to efficiently create, re/deploy, test, and govern stream data applications and also deal with data quality issues.
With such projects and, of course, a selection of other open source tools, we build stream data platforms that empower organizations to create modern data products, which integrate current data at any point in time, so-called Live Data Products.
Delivering Live Data Products can be challenging: a common approach is to feed integrated data into some external store or to build custom Backends for Frontends (BFFs) with APIs. From there, data is consumed via the store's or API's specific interface and is often re-combined and converted into a final representation (for example, JSON) that fits its consumption.
What do we need to successfully create a Live Data Product?
We all know that there are many steps to be taken, including the final data integration and delivery to your frontends or devices. Diverse approaches have to be evaluated, selected, and implemented. Building all this from scratch carries many risks. Of course, the initial steps are easy. Over time, however, requirements change, and the implementation grows and becomes very complex, leading to repeated re-engineering and cleanup of technical debt. Most of us have experienced how this distracts engineering teams from the work that actually creates value.
With Quick, we streamline the development of Live Data Products. It frees developers from boilerplate development and lets them focus on data.
You can use Quick as the foundation for your Live Data Products. Quick …
- receives and integrates your data streams.
- runs and maintains your applications on your data streams.
- exposes production-ready APIs and connects your data streams to your frontends and devices.
- lets you design your API backends and query your data with GraphQL.
- handles live data re-combinations and conversions.
- implements GraphQL subscriptions such that your frontends and devices are always up-to-date.
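As a rough illustration of the last point, a frontend of the music streaming service described below could subscribe to new listen events with a GraphQL subscription. The operation, type, and field names in this sketch are illustrative assumptions, not taken from the Quick documentation:

```graphql
# Illustrative only: operation and field names are assumptions.
subscription {
  userListenEvents(userId: "user-42") {
    trackId
    timestamp
  }
}
```

Every new event on the underlying topic is then pushed to the subscribed client, so the UI stays current without polling.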
Build your first Live Data Product with Quick!
The Quick documentation contains references and several examples, such as a music streaming service: here, we have artist, album, and track information as master data. In addition, there are events whenever a user listens to a song. Even this simple case introduces complexity: APIs must be created, master data and events have to be combined, and analytics and recommendations must be integrated.
With Quick you can manage this complexity easily by designing a global GraphQL schema providing all this information. Note that this schema provides the single source of truth for your Live Data Product. The schema can evolve over time. You can transparently track it via GitOps.
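To make this concrete, a global schema for the music streaming example might look roughly like the following sketch. All type and field names here are illustrative assumptions, not copied from the Quick documentation:

```graphql
# Illustrative sketch of a global schema; names are assumptions.
type Query {
  getUserProfile(userId: String!): UserProfile
}

type UserProfile {
  userId: String!
  totalListenCount: Int
  favoriteArtists: [Artist]
}

type Artist {
  id: String!
  name: String
}
```

Evolving the product then means evolving this schema, which can be versioned and reviewed like any other file in a Git repository.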
Quick allows for easy data ingestion via REST APIs. Ingested data is stored in topics and can be used in query definitions. In the music streaming service, for example, one query returns a UserProfile that combines data from several topics via the @topic directive, while a getArtistRecommendation query integrates an external recommendation service via the @rest directive. Quick performs the data integration under the hood and resolves IDs transparently.
No coding needed. Maintaining your Live Data Product is as easy as maintaining the global GraphQL schema.
We created Quick as the essence of many data engineering projects we did with our customers. It contains the combined experience and countless engineering hours that you can save.
We recently open-sourced Quick. You can get it from GitHub and give it a try. We appreciate issues, pull requests, and feedback of any kind. Feel free to approach us: we are happy to discuss your challenges, and we follow a roadmap that reflects customer needs.
In the future, the bakdata team will collect more open source projects under the datanonstop umbrella. So follow us on d9p.io.
Disclaimer: I’m a co-founder of bakdata GmbH and happy to work with a strong engineering team on great projects like Quick.
Update as of August 2023
We take feedback seriously and are therefore rethinking Quick. As a consequence, we will shift from applications to composable libraries. For the time being, there will be less activity in the GitHub repos.
Going forward, we will focus on providing the tools to integrate live data from Kafka into your GraphQL server. The libraries will offer features like Quick does today, but allow for a more seamless integration with custom approaches. There will be libraries to create different indexes on data in Kafka; others will apply GraphQL to indexed Avro or Protobuf data. Finally, there will be clients in Python and Java to smoothly leverage all of this in your custom approach.