Event Streaming and Journey Tracking at Bazaar

Muhammad Zuhair
Published in Bazaar Engineering · 4 min read · Mar 24, 2023

There are hundreds of Bazaar Shaheens (Riders) who serve thousands of Bazaar customers every day, performing several activities for customers such as support, platform encouragement, and service accessibility. Shaheens use the Bazaar App to assist customers. Therefore, to ensure that customers are facilitated every day, Bazaar uses each Shaheen's phone GPS data to track and record their daily journey with timestamps.

Purpose of Shaheen Tracking

One of the main purposes of implementing this feature is to gain insights into the Bazaar customer app's usability, engagement, accessibility, and interaction. The feature also brings many other benefits, such as reconstructing a Shaheen's daily journey, improving customer retention and platform adoption, supporting the system, rewarding the best-performing Shaheens, and optimising Shaheen journey routes.

Initial Challenges And Their Resolution

1. Data Collection Challenge:

Since location recording relied on the Shaheen's phone GPS, we faced the challenge of noise in the data. To resolve it, we passed the raw locations through an algorithm that transformed them into more accurate data points.
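The post does not describe the smoothing algorithm itself, so as a minimal illustration of the idea, here is a sliding-window moving-average filter in Python that damps GPS jitter while preserving timestamps. The `Point` type and `smooth` function are hypothetical names for this sketch, not Bazaar's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Point:
    lat: float
    lon: float
    ts: int  # epoch seconds


def smooth(points: list, window: int = 3) -> list:
    """Reduce GPS jitter with a simple sliding-window moving average.

    Each point's coordinates are replaced by the mean of the points in
    its window; timestamps are kept unchanged so the journey timeline
    is preserved.
    """
    smoothed = []
    for i, p in enumerate(points):
        lo = max(0, i - window // 2)
        hi = min(len(points), i + window // 2 + 1)
        neighbours = points[lo:hi]
        smoothed.append(Point(
            lat=sum(n.lat for n in neighbours) / len(neighbours),
            lon=sum(n.lon for n in neighbours) / len(neighbours),
            ts=p.ts,
        ))
    return smoothed
```

In practice, production systems often use more sophisticated filters (e.g. a Kalman filter) for this step; the moving average is only the simplest instance of the "transform raw locations into more accurate data points" idea.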

2. Data Storage Challenge:

Another challenge was storing huge numbers of data points, along with their timestamps, in a way that supports efficient queries. For this purpose, we explored both SQL and NoSQL approaches and evaluated them on storage and query costs for our use case. Eventually, we decided to store the data points of a Shaheen's single-day journey in a single file in JSON format.
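A one-file-per-Shaheen-per-day layout might look like the following sketch (stdlib Python; the key scheme, field names, and helper functions are assumptions for illustration, not the actual storage format).

```python
import json


def journey_key(shaheen_id: str, day: str) -> str:
    """One JSON document per Shaheen per day, addressable by a
    predictable key, e.g. 'journeys/shaheen-42/2023-03-24.json'."""
    return f"journeys/{shaheen_id}/{day}.json"


def serialize_journey(shaheen_id: str, day: str, points: list) -> str:
    """Serialize a day's journey to a single JSON document.

    `points` is a list of {lat, lon, ts} dicts, with `ts` an
    epoch-seconds timestamp.
    """
    return json.dumps({
        "shaheenId": shaheen_id,
        "day": day,
        "points": points,
    })
```

One document per day keeps a whole journey retrievable with a single key lookup, which matches the "efficient queries for our use case" goal better than one row per GPS point.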

Implementation #1

The First Solution:

The first version of tracking involved collecting location data points from the Bazaar App and persisting them in a scalable database at the backend. Thus, the concept of a database dump was a key factor in choosing an appropriate database.

Figure 1: Architecture diagram of first version of Shaheen Tracking

Bazaar App:

First, we implemented location tracking in the Bazaar App. This was comparatively easy, as we only had to integrate Bazaar's open-source location SDK into the app. We then stored the recorded data in the app's local database and scheduled a sync job that triggers a backend API after a specific interval.
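The buffer-locally-then-sync pattern described above can be sketched as follows (Python for illustration; the real app-side code would live in the Android app and use its local database and job scheduler, so the class and parameter names here are hypothetical).

```python
import time


class LocationBuffer:
    """Buffers recorded points locally and flushes them to the backend
    sync API once a fixed interval has elapsed, mirroring the app's
    scheduled sync job."""

    def __init__(self, sync_api, interval_s: float = 300.0):
        self.sync_api = sync_api      # callable that receives a batch of points
        self.interval_s = interval_s
        self.pending = []             # stand-in for the app's local database
        self.last_flush = time.monotonic()

    def record(self, lat: float, lon: float, ts: int) -> None:
        """Store a point locally; flush if the sync interval has passed."""
        self.pending.append({"lat": lat, "lon": lon, "ts": ts})
        if time.monotonic() - self.last_flush >= self.interval_s:
            self.flush()

    def flush(self) -> None:
        """Send all pending points to the backend and clear the buffer."""
        if self.pending:
            self.sync_api(self.pending)   # e.g. POST the batch to the sync API
            self.pending = []
        self.last_flush = time.monotonic()
```

Batching like this keeps the app responsive and reduces network chatter: each sync call carries many points instead of one request per GPS fix.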

Backend:

As soon as the Bazaar App calls the sync API, a collection of location objects is populated with the requested data points and persisted to the database using the relevant persistence methods.

Challenges after the first implementation

We faced an issue with non-appendable objects. We resolved it by refactoring the logic to check whether the requested object already exists; if it does, we retrieve its content, append the new data points to it, and persist it back to the designated memory block in the database after JSON conversion.

Another challenge was data-type inconsistency: many of the database persistence methods do not return JSON, and we wanted to store data in JSON format. To solve this, we converted the object content into JSON using the Gson package and appended it back to the JSON file. This gave us the advantage of querying data efficiently.
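The read-modify-write workaround for non-appendable objects can be sketched like this (Python's stdlib `json` stands in for Gson here, and the in-memory `store` dict stands in for the database; function and field names are assumptions for illustration).

```python
import json


def append_points(store: dict, key: str, new_points: list) -> None:
    """Read-modify-write append for a store whose objects cannot be
    appended to in place.

    If the object for `key` already exists, parse it, extend its
    points, and write the whole JSON document back; otherwise create
    a fresh document.
    """
    if key in store:
        doc = json.loads(store[key])       # retrieve the existing content
        doc["points"].extend(new_points)   # append the new batch
    else:
        doc = {"points": list(new_points)}
    store[key] = json.dumps(doc)           # persist after JSON conversion
```

The cost of this pattern is that every sync rewrites the whole daily document, which stays cheap as long as one Shaheen's single-day journey remains a modest number of points.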

The Final Solution

The final version introduced the concept of event streaming at the backend in order to build a more scalable solution.

Event Streaming and Apache Kafka:

Event streaming is a continuous flow of data records, each containing information about an event or a change of state. Apache Kafka is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol. It is primarily used to build real-time streaming data pipelines and applications that adapt to data streams.

Kafka Producer and Consumer:

A Kafka producer is the source of data in a Kafka stream: it publishes data to one or more topics in the Kafka cluster. Kafka consumers are applications that subscribe to Kafka topics in order to retrieve data from the Kafka servers.

Introduction of Event Streaming at Backend:

In the final phase of this feature, we introduced Apache Kafka at the backend. As soon as the sync API is triggered, the producer publishes an event containing the Shaheen's location data to a topic in the Kafka event stream, which is then picked up by a listener/consumer in the same service. The event is processed by its associated handler, and the location data points are persisted in the designated memory space in the database.
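To make the produce → consume → handle → persist flow concrete without requiring a running Kafka cluster, here is an in-process stand-in for the pattern in Python. `InProcessTopic`, the event shape, and the function names are all illustrative assumptions; a real deployment would use Kafka producer/consumer clients against an actual topic.

```python
from queue import Queue


class InProcessTopic:
    """Minimal stand-in for a Kafka topic: producers publish events,
    and a consumer drains them, passing each event to its handler."""

    def __init__(self):
        self.events = Queue()

    def publish(self, event: dict) -> None:
        self.events.put(event)

    def consume(self, handler) -> None:
        while not self.events.empty():
            handler(self.events.get())


db = {}  # stand-in for the backing database


def handle_location_event(event: dict) -> None:
    """Handler: persist the batch of points under the Shaheen's daily key."""
    key = f"{event['shaheenId']}/{event['day']}"
    db.setdefault(key, []).extend(event["points"])


def sync_api(topic: InProcessTopic, shaheen_id: str, day: str, points: list) -> None:
    """The sync API no longer writes to the database directly; it only
    publishes an event that the consumer in the same service picks up."""
    topic.publish({"shaheenId": shaheen_id, "day": day, "points": points})
```

The design win this models is decoupling: the sync API returns as soon as the event is published, and persistence happens asynchronously in the consumer, which can be scaled or retried independently.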

Figure 2: Architecture diagram of kafka version of Shaheen Tracking

Conclusion

The rollout of the final version of this feature helped us analyse the impact of Shaheens on customers, the journeys of Shaheens, and the Bazaar customer app's usability and accessibility, and it solved many other use cases as a by-product. The insights from this feature will help us make better decisions that will eventually contribute to the best possible customer experience.

Disclaimer:

Bazaar Technologies believes in sharing knowledge and freedom of expression, and it encourages its colleagues and friends to share knowledge, experiences and opinions in written form on its Medium publication, in the hope that some people across the globe might find the content helpful. However, the content shared in this post and other posts on this Medium publication mostly describes and highlights the opinions of the authors, which might or might not be the actual and official perspective of Bazaar Technologies.
