Apache Kafka at Porsche — Writer meets car manufacturer

Heiko Scholtes
Published in PorscheDev
Jul 15, 2017

“Roads grow out of going them!” — Franz Kafka

As part of our digital transformation at Porsche, the mindset of the people plays the most important role. But agility, speed and organisational culture must also find their way into the product teams, and sooner or later the conversation turns to speed in the development teams.

In this article I will describe how Apache Kafka helped us to decouple microservices and led to a more independent software development process by dissolving dependencies between organisational units as well as between technical components. The result is an even better velocity in the teams and an even more reliable event backbone.

Circumstances

You might think that delivering software with ONE development team is no big deal. You are probably right, as long as the following requirements are met:

  • the team understands the requirements and acceptance criteria that are described in the epics and user stories
  • the scrum roles are fully established
  • the team members trust each other and have a good personal relationship
  • the team is passionate
  • the team uses the technology stack on which it performs best
  • the team has no dependencies on other teams

Basically, this is an ideal state, and in my 12 years of experience as a project manager I have never seen all of these conditions met perfectly.

Think of complex Connected Car projects: they range from R&D and the Electronic Control Units (ECUs) in a vehicle up to Vehicle Documentation Systems and various After Sales and Sales Systems (CRM System, Identity & Access Management Services, Shop Systems, License & Activation Management Systems for Connect Services, Picture Services, Messaging & Communication Services for customer notifications, etc.), to mention only a few. There are massive complexities and huge dependencies between the various development teams.

Melvin Conway was right when he observed in 1967 that the complexity of an IT landscape strongly depends on the complexity of the company’s organisation and communication structure (Conway’s law). The more organisational units, the more heterogeneous the IT systems.

The complexity described by Conway posed a challenge to our development teams: the vast number of dependencies between legacy systems, microservices and their organisational owners led to communication overhead, many meetings and too many developers tied up in harmonising interface integrations and test (data) management between these systems.

Long story short: the dependencies between teams and organisational units grew, and team members spent more time arguing in meetings than developing new features.

The result: the development teams’ velocity slowed down.

Of course you can rethink the size of your microservices and shift their organisational ownership to improve the communication channels and hence speed up the development of components. But you definitely want to avoid tailoring technical components to fit organisational units that might change in the future, or whose ownership of the component will eventually change. Remember Conway’s words!

Welcome Apache Kafka

Pretty early on, we at Porsche reached the point where we had to decide how to resolve these dependencies on a technical level.

At the same time we discussed “process-like” microservices that encapsulate the overall business process and in turn orchestrate downstream microservices. But this kind of architecture already collapsed on the question of who would own such a “process-like” microservice as a product, since processes usually span business domains.

Additionally, we doubted that introducing a new microservice orchestration layer would really speed up our development process, since it obviously increases complexity.

So we decided on a hybrid event-driven architecture. What does that mean?

Actions triggered by the frontends (e.g. portal- or webshop-based microservices) always result in synchronous transactions and synchronous communication with downstream components. This synchronous communication pattern guarantees that frontend users get a meaningful response to their CRUD (Create, Read, Update, Delete) operations and that data in the frontend widgets stays consistent during navigation.

The synchronous main transaction (e.g. establishing or releasing the car relationship between owner and vehicle) either succeeds or fails, i.e. there are no in-doubt transactions.

Microservices themselves might initiate several synchronous calls to downstream components and are hence responsible for the consistency of the data that is in turn returned to the frontend user. But this is all done synchronously.

The synchronous Create, Update and Delete main transaction usually results in a changed state from a business perspective, e.g.:

  • a car relationship between owner and vehicle was released successfully, compared to the previous state when the owner-vehicle relation existed
  • a connect service was successfully activated in a vehicle compared to the previous state when the service was not available in this specific vehicle

In these cases the microservice that successfully finished the main transaction publishes a document-based “state-changed-event” to which various microservices subscribe. Once it has published the “state-changed-event”, the microservice that initiated the main transaction has no further responsibility. Furthermore, the publishing microservice (event producer) does not even know whether any microservices have subscribed to the “state-changed-event”, let alone which ones.

Subscribing microservices (event consumers) are now in charge of consuming this event and of processing it in a new transaction. The event consumers act independently and are self-contained from both a technical and business point of view.

Between event producers and event consumers, Apache Kafka serves as the publish/subscribe event backbone.
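The flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not our production code and not the Kafka client API: `EventBackbone`, `release_car_relationship`, the topic name and the event fields are all hypothetical names chosen for this example. Unlike Kafka, where consumers pull events at their own pace, this toy backbone simply calls the handlers directly.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EventBackbone:
    """In-memory stand-in for the publish/subscribe backbone (not Kafka itself)."""
    subscribers: Dict[str, List[Callable[[dict], None]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The producer hands the event over and is done; it never sees the handlers.
        for handler in self.subscribers[topic]:
            handler(dict(event))  # each consumer works on its own copy


def release_car_relationship(relations: set, owner: str, vin: str,
                             backbone: EventBackbone) -> bool:
    """Synchronous main transaction: it succeeds or fails, then publishes the state change."""
    if (owner, vin) not in relations:
        return False  # main transaction failed, so nothing is published
    relations.remove((owner, vin))
    backbone.publish(
        "car-relationship-state-changed",  # hypothetical topic name
        {"owner": owner, "vin": vin, "state": "RELEASED"},
    )
    return True


backbone = EventBackbone()
notifications: List[str] = []
deactivated: List[str] = []

# Two independent consumers; the producer code above knows neither of them.
backbone.subscribe("car-relationship-state-changed",
                   lambda e: notifications.append(f"notify {e['owner']}"))
backbone.subscribe("car-relationship-state-changed",
                   lambda e: deactivated.append(e["vin"]))

relations = {("alice", "VIN123")}
ok = release_car_relationship(relations, "alice", "VIN123", backbone)
failed = release_car_relationship(relations, "alice", "VIN123", backbone)
```

The second call fails because the relationship no longer exists, so no event is published and neither consumer is triggered. Adding a third consumer would not require touching `release_car_relationship` at all, which is the decoupling we were after.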

The big advantage: both the technical components and hence the development teams are decoupled with regard to the “state-changed-events”. Thanks to the publish/subscribe pattern, this led to faster test results, less integration effort and hassle-free integrations.

Why Apache Kafka?

Apache Kafka is an open-source distributed streaming platform that we evaluated last year in a three-month Proof-of-Concept (PoC) phase. We decided against other queuing platforms since Apache Kafka offers more than just an implementation of the publish/subscribe pattern described above.

The main benefits for our use cases were:

  • continuously running real-time query capabilities
  • excellent topic/partition management
  • very high throughput
  • low administrative overhead
  • many settings are decentralised, i.e. the event consumer decides what rollback, timeout, etc. behaviour fits best for its specific use-case
  • both the decentralised settings and the administration capabilities suit a DevOps approach well
  • open-source software (and licence)
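To illustrate the point about decentralised settings, here is a small sketch of two consumers reading the same log, each with its own position and its own failure handling. It is a simplified model under stated assumptions, not the Kafka consumer API: `TopicLog`, `Consumer` and the `max_retries` parameter are hypothetical names, and the list stands in for a single topic partition.

```python
from typing import Callable, List


class TopicLog:
    """Append-only list standing in for a single topic partition."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def append(self, record: str) -> None:
        self.records.append(record)


class Consumer:
    """Hypothetical consumer that owns its offset and its retry policy."""

    def __init__(self, log: TopicLog, handler: Callable[[str], None],
                 max_retries: int) -> None:
        self.log = log
        self.handler = handler
        self.max_retries = max_retries  # chosen by the consuming team, not by the backbone
        self.offset = 0                 # read position, private to this consumer
        self.skipped: List[str] = []

    def poll(self) -> None:
        while self.offset < len(self.log.records):
            record = self.log.records[self.offset]
            for attempt in range(self.max_retries + 1):
                try:
                    self.handler(record)
                    break
                except ValueError:
                    if attempt == self.max_retries:
                        self.skipped.append(record)  # give up after the last retry
            self.offset += 1  # advance only once the record is handled or skipped


log = TopicLog()
for r in ["activated:VIN1", "broken", "activated:VIN2"]:
    log.append(r)

seen: List[str] = []
audit: List[str] = []


def strict(record: str) -> None:
    # This consumer rejects records it cannot parse.
    if ":" not in record:
        raise ValueError(record)
    seen.append(record)


fast = Consumer(log, strict, max_retries=0)        # skips bad records immediately
slow = Consumer(log, audit.append, max_retries=3)  # accepts everything
fast.poll()
offset_of_slow_before_poll = slow.offset
slow.poll()
```

Because each consumer tracks its own offset, `fast` can finish the whole log while `slow` has not read anything yet, and neither choice of retry behaviour affects the other consumer or the producer.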

After the successful PoC we decided that Apache Kafka is the perfect match for our event backbone: it decouples our microservices, is easy to handle and, most importantly, helps to speed up the overall development process. Additionally, it is a future-proof streaming platform for both vehicle events and event analysis.

Furthermore, we chose Apache Avro as the data serialisation system and exchange format for Kafka messages, and checked blueprint consumer and producer code into our Bitbucket so that other development teams can re-use it.
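To give an idea of what a document-based event looks like with Avro, the following sketch defines a schema for a “state-changed-event”. The record name, namespace and every field are assumptions made up for this example; they are not our actual schema. The helper is a plain standard-library check for missing fields, not real Avro validation or serialisation, which an Avro library would provide.

```python
import json

# Hypothetical Avro schema: the record name, namespace and fields are assumptions.
STATE_CHANGED_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "CarRelationshipStateChanged",
  "namespace": "example.events",
  "fields": [
    {"name": "eventId", "type": "string"},
    {"name": "vin", "type": "string"},
    {"name": "ownerId", "type": "string"},
    {"name": "newState",
     "type": {"type": "enum", "name": "RelationState",
              "symbols": ["ESTABLISHED", "RELEASED"]}},
    {"name": "occurredAt",
     "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "comment", "type": ["null", "string"], "default": null}
  ]
}
""")


def missing_required_fields(schema: dict, event: dict) -> list:
    """Return the names of fields without a default that the event does not supply."""
    return [
        f["name"]
        for f in schema["fields"]
        if "default" not in f and f["name"] not in event
    ]


incomplete = {"eventId": "1", "vin": "VIN123"}
missing = missing_required_fields(STATE_CHANGED_SCHEMA, incomplete)
```

Giving optional fields a default, as with `comment` here, is what allows a producer to add fields later without breaking consumers that still read with an older version of the schema.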

After six months of running a clustered Apache Kafka in production, more and more development teams within Porsche, and also from other brands in the corporate group, are asking about our experiences with the platform, and I usually highly recommend this setup.

In some cases a traditional queuing system definitely makes more sense and should be taken into consideration, but in our case at Porsche Connect, Apache Kafka was the first choice.

What comes next?

We are currently establishing domain-specific (e.g. After Sales, Sales, etc.) Apache Kafka hubs at Porsche and interconnecting these hubs with the help of Apache Kafka’s MirrorMaker. This will open up new scalable and flexible possibilities across business domains, especially if you think of moving some domains into the cloud.

The journey has just begun. Come along with us on this journey.
