Flink Forward Berlin 2017 — An overview

Bertjan Broeksema
Published in bigdatarepublic
Sep 18, 2017

A small delegation of BigData Republic consultants went to the Flink Forward conference in Berlin. One of the key statements at this conference that resonated was: “ING is an IT company with a banking license.” In a world where data is increasingly important for staying ahead of the competition, excellent IT infrastructure and data-driven software are key to realizing a data-driven vision. This realization was also reflected in the call to view Flink jobs as applications in their own right that deliver business value. Most talks were more technical and did not always point clearly to the actual business value delivered. In this blog we give a brief overview of what we view as the highlights of this conference.

Kappa Architecture

Stephan Ewen from dataArtisans gave an update on how Flink has developed over the past twelve months. He went on to describe three phases he recognizes in distributed application development:

  1. Classical tiered architecture. One of the shortcomings of this approach is that infrastructure is shared and centrally managed. This leads to slow processes when it comes to data management (e.g. copying over data for a test, extending schemas).
  2. Microservice architecture. This improves on the earlier situation because control over infrastructure is now decentralized, which makes it easier to adapt parts of the application landscape to new data requirements. It still involves managing databases though, making every dev(ops) a mini-DBA.
  3. Stateful stream processing architecture, or microservices on steroids, as he likes to call it. State is now part of the application (i.e. in memory or abstracted away by the streaming platform).

In the third approach compute and state are tightly coupled, and modifications are local to the process. To keep track of state in the streaming platform, large blobs are written asynchronously across tiers. This contrasts with the more traditional synchronous reads and writes across tiers.
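The contrast between local state and asynchronous snapshots can be illustrated with a small Python sketch. This is a toy under stated assumptions, not Flink's API or its actual checkpointing mechanism: `CountingOperator`, `snapshot_async`, and the `durable_store` dictionary are hypothetical names, and the dictionary merely stands in for a remote blob store.

```python
import copy
import threading


class CountingOperator:
    """Toy stateful operator: counts events per key in local memory."""

    def __init__(self):
        self.counts = {}  # state lives inside the application process

    def process(self, key):
        # Reads and writes are local; no round trip to a database tier.
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def snapshot_async(self, store, checkpoint_id):
        # Copy the state synchronously so processing can continue,
        # then write the copy as one blob in a background thread.
        frozen = copy.deepcopy(self.counts)

        def write():
            store[checkpoint_id] = frozen

        writer = threading.Thread(target=write)
        writer.start()
        return writer

    def restore(self, store, checkpoint_id):
        # After a failure, resume from a completed snapshot.
        self.counts = copy.deepcopy(store[checkpoint_id])


durable_store = {}  # hypothetical stand-in for a remote blob store
op = CountingOperator()
for key in ["a", "b", "a"]:
    op.process(key)

writer = op.snapshot_async(durable_store, checkpoint_id=1)
op.process("a")  # processing continues while the snapshot is written
writer.join()

recovered = CountingOperator()
recovered.restore(durable_store, checkpoint_id=1)
```

The recovered operator sees the counts as they were when the snapshot was taken, while the original operator has already moved on. That is the essence of decoupling state updates from durable writes.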

Still, streaming applications are challenging when it comes to topics such as consistent stateful upgrades, migration of application state, and schema evolution. Here came one of the more interesting announcements: dataArtisans has been working on the dA Platform, which can be used to launch, monitor and update streaming jobs. The brief demo was impressive. This is definitely a piece of technology to keep an eye on if Flink is or will be part of your infrastructure.

Key takeaway:

  • Infrastructure is becoming more reliable and fault tolerant, which makes a more natural approach to dealing with streams of data possible.

Flink in containers

There are many advantages to running Flink in containers. It allows for efficient dynamic resource allocation and scaling, and it makes deployment application oriented instead of machine oriented. As such, there was some attention for different ways of running Flink jobs in containerized environments.

Dominik Brun from Relayr talked us through dockerizing Flink jobs for reproducible builds. He started with some requirements that are common to many environments: a single deployment artifact, configuration from an external source (i.e. not part of the build) for deployment in multiple environments, and deployment to a YARN-managed cluster. One observation was that Flink jobs behave like a service, and as such it makes sense to unify the packaging of standard microservices and Flink jobs. This has obvious practical benefits, as the same CI/CD infrastructure can be used for deploying both kinds of artifacts.
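The requirement of a single artifact with external configuration can be sketched in a few lines of Python. This is only an illustration of the idea, not the setup shown in the talk: the function `load_job_config` and the variable names `JOB_INPUT_TOPIC`, `JOB_PARALLELISM`, and `JOB_CHECKPOINT_DIR` are hypothetical.

```python
import os


def load_job_config(env=None):
    """Build job settings from environment variables, not from files in the build."""
    if env is None:
        env = os.environ
    return {
        "input_topic": env["JOB_INPUT_TOPIC"],                # required
        "parallelism": int(env.get("JOB_PARALLELISM", "1")),  # optional, defaults to 1
        "checkpoint_dir": env["JOB_CHECKPOINT_DIR"],          # required
    }


# The same artifact, configured differently per environment (hypothetical values).
staging = load_job_config({
    "JOB_INPUT_TOPIC": "events-staging",
    "JOB_CHECKPOINT_DIR": "/checkpoints/staging",
})
production = load_job_config({
    "JOB_INPUT_TOPIC": "events",
    "JOB_PARALLELISM": "8",
    "JOB_CHECKPOINT_DIR": "/checkpoints/production",
})
```

Because nothing environment-specific is baked into the build, the image that passed the tests in staging is the same one that runs in production.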

Patrick Lucas from dataArtisans gave an update on the state of the art in running Flink on Kubernetes. One of the interesting aspects he described is the change in the preferred way of running Flink: from one big Flink cluster running all jobs to a Flink cluster per job. This really makes Flink jobs first-class citizens in your architecture. It shifts the attention from caring about infrastructure to caring about jobs that implement actual business logic and value, and delegates the task of resource management to a platform that is good at it, such as Kubernetes. Patrick concluded his talk with the observation that there are still a number of issues to overcome before running Flink on Kubernetes is straightforward. However, as it fits the application-oriented approach, the Flink community seems to be pushing in this direction. This is also illustrated by the

Key takeaways:

  • Treat Flink jobs as single applications that implement part of a business process and deliver business value, rather than focusing on Flink infrastructure as an end in itself.
  • If your streaming job quacks like a service, looks like a service, and behaves like a service, then take some effort to deploy it as a service.

Complex Event Processing with Flink

One area in which progress was made last year is Complex Event Processing (CEP) on streams. CEP allows for the detection of event patterns over continuous streams, even when events arrive out of order. Flink now provides APIs to specify simple patterns and combine them into more complex ones. These patterns can be based on event properties, such as the type of the event or the value of a given property. Patterns can also be based on time windows, e.g. an event of type B should follow an event of type A within 24 hours. The patterns allow for non-determinism (e.g. some specified end event may never occur). This needs to be kept in mind and actively countered by restricting the pattern with, for example, a time window. Furthermore, work is ongoing to integrate CEP features with the SQL functionality of Flink. Another limitation being worked on is the lack of dynamic patterns: support for them would allow new patterns to be supplied as a stream, adapting the pattern matching of a Flink job at runtime. Although FlinkCEP offers more limited functionality than applying a regular expression or writing custom business logic, it is fully integrated with Flink’s state/checkpointing backend, allowing it to continue where it left off after an application failure.
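To make the "B follows A within 24 hours" example concrete, here is a minimal plain-Python sketch of such a matcher. It does not use the FlinkCEP API; the function `match_a_then_b`, the tuple layout of the events, and the sample keys are assumptions made for illustration. Sorting by event time is a simplification standing in for the buffering a real engine performs to handle out-of-order arrival.

```python
from datetime import datetime, timedelta


def match_a_then_b(events, window=timedelta(hours=24)):
    """Return (key, a_time, b_time) for each B that follows an A within `window`.

    `events` is an iterable of (timestamp, key, event_type) tuples.
    """
    pending = {}  # key -> timestamp of the most recent unmatched A
    matches = []
    for ts, key, event_type in sorted(events):
        # Drop partial matches whose window has passed, so an A whose B
        # never arrives does not linger in state forever.
        pending = {k: t for k, t in pending.items() if ts - t <= window}
        if event_type == "A":
            pending[key] = ts
        elif event_type == "B" and key in pending:
            matches.append((key, pending.pop(key), ts))
    return matches


t0 = datetime(2017, 9, 11, 9, 0)
events = [
    (t0 + timedelta(hours=30), "user-2", "B"),  # outside the 24-hour window
    (t0 + timedelta(hours=5), "user-1", "B"),   # listed before its A
    (t0, "user-1", "A"),
    (t0, "user-2", "A"),
]
matches = match_a_then_b(events)
```

Only `user-1` produces a match here. The A for `user-2` is expired by the time window before its B is seen, which is exactly the kind of restriction that keeps unfinished patterns from accumulating.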

Key takeaway: FlinkCEP provides reasonably mature functionality to detect and act upon patterns in event streams.

Conclusion

A number of talks showed us that Flink is a production-ready technology for dealing with streams in real time. Clearly there are technical hurdles to overcome when streaming data is not yet part of your application landscape. Streaming data requires a different way of thinking and a different kind of technology stack. Flink and the just-announced dA Platform are certainly worth a look when incorporating streaming data in your application landscape.
