What can we do with events?

Sandeep M D
Published in hubbleconnected
Mar 13, 2018

It’s not the events of our lives that shape us, but our beliefs as to what those events mean. — Tony Robbins

A significant change in state is often referred to as an event. In the context of an automated system, events are the fundamental driving force for completing workflows. What you do or do not do with your events can easily determine the soundness of an architectural design. This is a review of how events play a crucial role in current software architecture, and how basing designs on events has allowed companies to solve problems at every level, from data crunching to decision making.

The simplest way to understand any system is to follow the sequence of events it publishes, or is in turn responsible for propagating, within the system. The outcomes of such events can be correlated to several entities, and an inference can be drawn as to how we can, or cannot, use such a system. Eventually, this process of identifying events and spotting their consequences gives us a pathway toward how a participating entity can evolve beyond its current state, contribute more, obstruct less, and perform better. A systematic approach to this discipline is referred to as Event-Driven Architecture (EDA), and the underlying process of analyzing recurring streams of information is referred to as event processing.

Data analysis has constantly opened up new ways of understanding and assessing data, and several companies have adopted more than one innovative way to accomplish the task of ‘making sense out of data’. The field has ignited enough interest among the developer community that you will repeatedly see different frameworks chasing this problem with very different methods of computation. All of these methods eventually commence at the most simplistic, atomic question: where do we start? Since this is a crucial stage of analysis, the patterns adopted vary based on the type of analysis to be performed and the type of data stored. More often than not, any data, internal or external, is now treated as a potential treasure trove that can lead to new inventions, or to insights that exponentially increase product sales and acceptability.

So where do events come into this equation? Event data is the primary contributor to the data trove referred to earlier, and contributes significantly to the data collection phase of analysis. The fundamental questions of ‘what happens within a system’ and ‘what happens around it’ can be sufficiently answered and analyzed using event data.

An event workflow can be loosely mapped in the following way (a minimal code sketch follows the list):

  • An event occurs or is published in a system
  • The published event propagates the resulting data
  • Entities receiving the resulting data react
  • Reacting entities generate events of their own, creating a chain
  • The chain of events terminates within a given context on completion of a task, or continues based on a set of triggers/programs fired on consecutive state changes
Event workflow
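
Below is a minimal sketch of this chain in Python, assuming a simple in-process publish/subscribe bus; the EventBus class and the event names are illustrative, not part of any framework discussed here.

```python
# A minimal sketch of the workflow above, assuming an in-process pub/sub bus.
# All names (EventBus, "motion.detected", etc.) are illustrative.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, data):
        # Propagate the event's data to every entity that registered interest.
        for handler in self.handlers[event_type]:
            handler(data)

bus = EventBus()

# Reacting entities may publish further events, forming a chain.
bus.subscribe("motion.detected", lambda d: bus.publish("clip.recorded", {"camera": d["camera"]}))
bus.subscribe("clip.recorded", lambda d: print(f"notify user: new clip from {d['camera']}"))

# The chain starts with one published event and terminates when no
# handler publishes a follow-up event.
bus.publish("motion.detected", {"camera": "cam-42"})
```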

For companies chasing accuracy in understanding information, the constant stream of event data, historical event data, and previously analyzed event data form a big chunk of the building blocks for capturing the state changes of the system entities in context. Basing a system's understanding on the real-time information available through an event-driven structure gives several advantages in maintaining architectural agility, and by following state changes you have the opportunity to constantly learn the behavior of the system's components, leading to automation and incremental improvements in their performance.

The information around state changes of entities will in turn equip systems to:

  • handle distributed workflows
  • make definite decisions on what change each state transition brings
  • perform real-time analytics
  • detect and predict faults more effectively
  • trigger action programs

In a broader view of technical implementation patterns, events are processed using three fundamental methods:

  • Simple Event Processing: Maps to a real-time flow where each published state-change event is stored and its information is handed to the concerned entities. Being a simple workflow, it has the advantage of reducing response lag.
  • Event Stream Processing: Designed for real-time decision making based on a constant stream of data entering a system.
  • Complex Event Processing: Recognizes patterns across incoming events and takes action on the analyzed pattern, which begins with correlating the available data. Anomaly and threat detection are possible use cases for this method (a minimal sketch follows this list).
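
As an illustration of the complex case, here is a sliding-window correlation sketch in Python, assuming events carry a type, a device id, and an epoch timestamp; the rule (three missed heartbeats within a minute) and all field names are illustrative assumptions.

```python
# A minimal sketch of complex event processing: correlate an incoming event
# against recent history to recognize a pattern. Thresholds are assumptions.
from collections import defaultdict, deque

WINDOW_SECONDS = 60
THRESHOLD = 3

recent_failures = defaultdict(deque)  # device_id -> timestamps of failures

def on_event(event):
    """Raise an anomaly when THRESHOLD failures land inside the window."""
    if event["type"] != "heartbeat.missed":
        return
    window = recent_failures[event["device_id"]]
    window.append(event["ts"])
    # Drop events that fell out of the sliding window.
    while window and event["ts"] - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= THRESHOLD:
        print(f"anomaly: device {event['device_id']} missed "
              f"{len(window)} heartbeats in {WINDOW_SECONDS}s")

for ts in (0, 20, 35):  # three misses inside one minute -> anomaly fires
    on_event({"type": "heartbeat.missed", "device_id": "cam-7", "ts": ts})
```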

Looking at some of the interesting use cases around us gives a better perspective on how companies have used event data to detect fraud, predict opportunities, power analytics, and more. These use cases span multiple domains and push for continued development in this area.

Spotify’s Event Delivery system

Spotify Labs has detailed its entire event delivery system on the company's technology blog, and the system architecture speaks volumes about the work Spotify has been doing in this area. As the reference explains, the event delivery system they have built has many use cases and forms the backbone of the services on the Spotify platform, letting developers look at usage data and make decisions on improving service availability. Discover Weekly, Spotify Party, and Billboard are features that directly or indirectly use this system.

The key takeaway from their architecture is that it is designed for scale while ensuring every event is logged and the information from every event is used to improve the system's efficiency. Failure tolerance, timed data sharding, latency control, and an emphasis on data completeness are the key points to take from the case study.

Ref: https://labs.spotify.com/2016/02/25/spotifys-event-delivery-the-road-to-the-cloud-part-i/

MapR Stream processing

MapR is a data platform that provides multiple tools for use cases around storage, management, disaster recovery, analytics, and data science. This use case is discussed specifically in the context of IoT for real-time decision making, and applies to many production scenarios where real-time decisions play a crucial part. The Complex Event Processing (CEP) system built here with Kafka, Drools, and MapR Streams provides a good reference model for creating a streaming architecture.

The key takeaways from the reference architecture are CEP and its role in building decision-making systems using rule engines, and the decoupling of services into a microservices pattern to achieve a streaming architecture as part of CEP.

Ref: https://mapr.com/blog/better-complex-event-processing-scale-using-microservices-based-streaming-architecture-part-1/
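
To make the rule-engine idea concrete, here is a deliberately library-free Python sketch of the pattern: each rule pairs a condition with an action, and a stream of events is run through all rules. In the reference architecture this logic lives in Drools behind a Kafka/MapR Streams topic; the rules below are invented for illustration.

```python
# A library-free sketch of the rule-engine idea behind the Kafka/Drools design.
# Each rule is a (condition, action) pair; rule contents are illustrative.
RULES = [
    (lambda e: e.get("temperature", 0) > 80,
     lambda e: print(f"alert: overheating sensor {e['sensor']}")),
    (lambda e: e.get("battery", 100) < 15,
     lambda e: print(f"warn: low battery on sensor {e['sensor']}")),
]

def process_stream(events):
    # In the reference architecture this loop would sit in its own
    # microservice, consuming events from a streaming topic.
    for event in events:
        for condition, action in RULES:
            if condition(event):
                action(event)

process_stream([
    {"sensor": "s1", "temperature": 85},
    {"sensor": "s2", "battery": 9},
])
```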

Hubble Event processing

Hubble is a camera-based IoT PaaS platform. The platform handles over 100,000 RPM (requests per minute) of event data and has consistently maintained over 99% uptime across the services it offers.

Event processing on the platform has gone through multiple iterations, starting from a MySQL-based ingestion platform and moving to a Cassandra-based time-series system. We are currently exploring DynamoDB in place of the Cassandra/Redis system, as we have repeatedly found managing a 15–20 node Cassandra cluster with varying data TTLs to be an operational pain.

Hubble’s event processing engine is an ingest-based engine that reacts to each event trigger and spawns multiple processors to run their respective programs. The engine runs on Cassandra for recurring data storage, Redis for caching, and DynamoDB for summarization and analytics data. It is a hybrid of decoupled individual services and a pipelined service pattern.
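
Here is a minimal sketch of that ingest/fan-out idea, assuming each incoming event is handed to every registered processor concurrently; the processor functions are stand-ins for the real blocks described below, not Hubble's actual code.

```python
# A minimal sketch of ingest-based fan-out: one trigger, many processors.
# Processor names mirror the list below; the implementation is illustrative.
from concurrent.futures import ThreadPoolExecutor

def store_event(event):       # would write to Cassandra in the real engine
    print("stored", event["id"])

def summarize_event(event):   # would update DynamoDB rollups
    print("summarized", event["id"])

PROCESSORS = [store_event, summarize_event]

def ingest(event, pool):
    # Each trigger fans out to all processors without blocking ingestion.
    for processor in PROCESSORS:
        pool.submit(processor, event)

with ThreadPoolExecutor(max_workers=4) as pool:
    ingest({"id": "evt-1", "type": "motion"}, pool)
```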

The Hubble Event Processing engine primarily runs events through the following set of processors and provides the resulting output to the user:

  • Event Storage & Retrieval: A simple event processing block that stores sensor-triggered events and provides APIs to retrieve events based on filters over the event data.
  • Event Summarization: 5-minute, hourly, and daily summarization of event data is done in this processor (a minimal sketch follows this list). This data can be presented to the user depending on the feature exposed, e.g. daily metrics for a particular sensor alert, "busiest part of the day", "hottest part of the day", and so on.
  • Insights: Based on certain business rules and incoming sensor alerts, we provide insights on patterns recognized in the data. The data is first generated from predefined patterns and then presented to the user, e.g. how many hours of the day the baby was not active (asleep), or what parts of the day the baby was active.
  • Specific Analytics Triggers: Trigger hourly/daily video summarization for alerts that send videos to the cloud. Summarized videos are presented to the user as a summary consumable alongside the metrics.
Hubble Event Processing Framework
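
As a rough illustration of the summarization processor, here is a Python sketch that rolls timestamped events into hourly buckets and derives a "busiest hour" metric; field names, the bucket size, and the metric itself are illustrative assumptions.

```python
# A minimal sketch of event summarization, assuming events carry a UTC epoch
# timestamp and the rollup is a simple count per hourly bucket.
from collections import Counter
from datetime import datetime, timezone

def hourly_summary(events):
    buckets = Counter()
    for e in events:
        ts = datetime.fromtimestamp(e["ts"], tz=timezone.utc)
        # Truncate to the start of the hour to form the bucket key.
        buckets[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return buckets

events = [{"ts": t} for t in (1520899200, 1520899260, 1520902800)]
summary = hourly_summary(events)
busiest = max(summary, key=summary.get)  # e.g. "busiest part of the day"
print(f"busiest hour: {busiest.isoformat()} with {summary[busiest]} alerts")
```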

Learnings from our work in this area

  • Event data needs a flexible, timestamped data model to serve various applications; a rigid data structure is a problem (see the sketch after this list).
  • No single database/datastore is likely to cover a complete event data backend for ingest, retrieval, and analysis; most use cases will need a hybrid design.
  • Fault prediction, tolerance, and repair are essential parts of the engine. ‘Any system that can go down, will go down’; if you are not prepared for this, downtime is inevitable.
  • A definitive process for backup and restore is crucial to failure mitigation and recovery. Data loss is a critical problem and, in all probability, unacceptable.
  • Test, test & test. Testing against production workloads determines the performance of event-driven systems; without it, a sudden burst of uneven traffic breaks the workflows designed around business rules.
  • Correlating event data gives crucial insights into the behavior of entities.
  • Be very clear about which data is worth deleting and which is worth preserving. A lot of the data entering a system is noise, and an effective way to store only what is necessary goes a long way toward saving long computational cycles when drawing insights from the preserved data.
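
On the first point, here is a minimal sketch of what a flexible timestamped event model might look like: a fixed envelope (id, type, timestamp, optional per-event TTL) around a free-form payload, so new sensor fields never force a schema migration. All names are illustrative.

```python
# A minimal sketch of a flexible timestamped event model; names are
# illustrative assumptions, not Hubble's actual schema.
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class Event:
    event_id: str
    event_type: str
    ts: float                      # epoch seconds; every query keys on time
    payload: Dict[str, Any] = field(default_factory=dict)  # schema-free body
    ttl_seconds: Optional[int] = None  # per-event retention, not per-table

    def is_expired(self, now: float) -> bool:
        return self.ttl_seconds is not None and now > self.ts + self.ttl_seconds

e = Event("evt-1", "motion", ts=1520899200.0,
          payload={"camera": "cam-42", "zone": "door"}, ttl_seconds=86400)
print(e.is_expired(now=1520899200.0 + 90000))  # True: past its 24h TTL
```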

Where do we go from here?

CEP with a streaming component is the next architectural change we are working on. Event correlation, real-time analytics, anomaly detection, and media analytics are the use cases we are chasing with this change in design. As we get deeper into these problems, we are sure we will have plenty to publish. This is an introductory look at our work, and we will keep following up with more details in this area.
