Exploring GUI in Event Streaming: Unpacking the Benefits with Cribl Stream

Danny Woo
5 min read · Jan 31, 2024


There are many open-source event streaming solutions available, such as Logstash, Vector by Datadog, Fluentd, and many others. If you’ve already integrated any of these into your data environment, you will understand why they are so popular among users. They are easy to implement, there are numerous resources available for reference, and perhaps most importantly, they are free to use.

However, these CLI-based event streaming solutions can make building data pipelines time-consuming because they lack graphical interfaces. They also require in-depth knowledge of each solution, since configuration options must be rechecked as versions change. In this post, I would like to show how Cribl Stream can mitigate these challenges.

Introduction to Cribl

[ https://cribl.io ]

Cribl Stream, a GUI-based event streaming solution, provides interfaces for everything from setting up plugins/connectors to monitoring data flow. With support for more than 90 Sources and Destinations, it is a highly vendor-agnostic solution. This means it can play a crucial role in collecting data from various sources to populate your data lake. Furthermore, you can design data streaming across a multi-cloud environment. There’s a lot more to it that can make your work more efficient.

Configuration

Let’s start with configurations. With CLI-based tools, all configurations, including inputs, outputs, and filters (transformations), must be written as text from scratch. The major pain point here is that a single typo in an option can prevent the system from restarting. Furthermore, as the number of functions grows, the overall structure becomes harder to grasp. These configurations are often written in a single file, which makes it hard to reuse an individual module elsewhere.
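To make the typo problem concrete, here is a minimal Python sketch of the kind of pre-restart check you might end up writing by hand for a text-based configuration. The option names (`type`, `path`, `codec`, `hosts`, `index`) and the `find_unknown_options` helper are hypothetical and are not taken from Logstash, Vector, or Fluentd; they only illustrate how a misspelled key could be caught before the service is restarted.

```python
import difflib

# Hypothetical set of option names a plugin accepts (an assumption for illustration).
ALLOWED_OPTIONS = {"type", "path", "codec", "hosts", "index"}


def find_unknown_options(config: dict) -> list[str]:
    """Return one message for every option name that is not in ALLOWED_OPTIONS."""
    problems = []
    for key in config:
        if key in ALLOWED_OPTIONS:
            continue
        # Suggest the closest known option name, if one is similar enough.
        guess = difflib.get_close_matches(key, sorted(ALLOWED_OPTIONS), n=1)
        hint = f" (did you mean '{guess[0]}'?)" if guess else ""
        problems.append(f"unknown option '{key}'{hint}")
    return problems
```

Calling `find_unknown_options({"type": "file", "codek": "json"})` would flag the misspelled `codek` key. A form-based interface removes the need for this kind of check because option names are selected rather than typed.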

Cribl provides a user-friendly interface that allows for easy set-up of options. This reduces the risk of a failed restart caused by a typo in the configuration.

[ Sources and Destinations are pre-set. Simply fill in the values for each platform’s options. ]

Cribl is structured in a way that makes it easy to check each function. This ensures that, even as the number of functions increases, understanding the structure doesn’t become more difficult.

[ Pre-set transformation functions can be managed in a list. Simply fill in or select the values for each option. ]

Cribl manages sources, destinations, and transformation tasks individually, so each of them can be reused. This not only enables users to reuse modules, but also makes it easier to track each module when an error occurs.

[ Because each module is managed individually, errors are easy to track when a data pipeline malfunctions. ]

References
https://docs.cribl.io/stream/sources/
https://docs.cribl.io/stream/destinations/
https://docs.cribl.io/stream/functions/

Debugging in Transformation

This is often the most time-consuming part. Unless the data is forwarded as-is, the transformation stage is vital for tasks such as masking personal information, reducing the data volume to manage system/license capacity, and parsing data for integration into the data lake.
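As a rough illustration of those three tasks, the following Python sketch masks, parses, and filters a single raw event. It is not Cribl's implementation or any tool's built-in function; the `transform` name, the `key=value` log format, and the rule of dropping `debug` events are all assumptions chosen for the example.

```python
import re
from typing import Optional

# Simplified patterns, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
KEY_VALUE_PATTERN = re.compile(r"(\w+)=(\S+)")


def transform(raw: str) -> Optional[dict]:
    """Mask, parse, and filter one raw event; return None to drop it."""
    # Masking: hide e-mail addresses before the event leaves the pipeline.
    masked = EMAIL_PATTERN.sub("<masked>", raw)
    # Parsing: turn key=value pairs into named fields.
    fields = dict(KEY_VALUE_PATTERN.findall(masked))
    # Volume reduction: drop debug-level events entirely.
    if fields.get("level") == "debug":
        return None
    return fields
```

For example, `transform("level=info user=alice@example.com action=login")` would return the parsed fields with the address replaced by `<masked>`, while a `level=debug` event would be dropped.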

To debug a transformation with a CLI-based tool, you would typically capture a sample of the incoming data and then upload it to a Playground (this is usually available as a web interface, though not always). This allows you to debug and ensure each function is applied correctly. Once the transformation is ready, you copy and paste the tasks into the data pipeline configuration file and restart the system.

Debugging transformation tasks within Cribl can be done without accessing another web interface or console. It’s possible to monitor how data is consumed at Sources and capture events by number or time. These captured events aid in debugging transformation tasks.

[ Live data from sources can be monitored and captured for debugging. ]

You can debug using the functions that will actually be employed in the pipeline (eliminating the need for copy-pasting), and the transformation of data can be checked using the captured events.

[ Captured data is saved as a file, which can be used to test transformation tasks. ]
[ Apply the transformation task to the file for a pre-check. If it’s suitable, use it in the data pipeline. ]
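The idea of pre-checking a transformation against captured events can be sketched in a few lines of Python. This is only an approximation of the workflow, not how Cribl does it internally: the JSON Lines capture format, the `preview` helper, and the `drop_debug_and_trim` transform are assumptions made for the example.

```python
import json
from typing import Callable, Iterable, Optional

Transform = Callable[[dict], Optional[dict]]


def preview(lines: Iterable[str], transform: Transform) -> dict:
    """Run a transform over captured events and summarize its effect."""
    events_in = events_out = size_in = size_out = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the capture
        event = json.loads(line)
        events_in += 1
        size_in += len(json.dumps(event))
        result = transform(dict(event))  # pass a copy so the capture stays untouched
        if result is not None:
            events_out += 1
            size_out += len(json.dumps(result))
    return {
        "events_in": events_in,
        "events_out": events_out,
        "size_in": size_in,
        "size_out": size_out,
    }


def drop_debug_and_trim(event: dict) -> Optional[dict]:
    """Hypothetical transform: drop debug events and remove a bulky field."""
    if event.get("level") == "debug":
        return None
    event.pop("stack", None)
    return event
```

Passing an open capture file (one JSON event per line) and a transform to `preview` gives a before/after summary, so you can judge the effect on event count and size before the transform goes anywhere near the live pipeline.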

References
https://docs.cribl.io/stream/data-preview/

Data Flow Monitoring

Vector has an interface for monitoring data flow, which saves a lot of work when checking that data is being consumed and transferred correctly. In most cases, however, it is difficult to find a solution with a monitoring interface, so you have to check the destination to verify whether the data is streaming.
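Without a monitoring interface, that manual check often turns into a small script that compares what went in with what came out. The Python sketch below shows one way to do it, assuming you can somehow read cumulative event counters for a source and its destination; the `FlowSample` shape and the `flow_status` helper are hypothetical and do not correspond to any product's API.

```python
from dataclasses import dataclass


@dataclass
class FlowSample:
    """Cumulative event counters read at one point in time (hypothetical shape)."""
    timestamp: float  # seconds
    events_in: int    # events received by the source so far
    events_out: int   # events delivered to the destination so far


def flow_status(previous: FlowSample, current: FlowSample) -> dict:
    """Compare two samples and report rates, backlog, and a stall flag."""
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    in_rate = (current.events_in - previous.events_in) / elapsed
    out_rate = (current.events_out - previous.events_out) / elapsed
    return {
        "in_per_sec": in_rate,
        "out_per_sec": out_rate,
        # Events received but not yet delivered; events dropped on purpose
        # by a transformation would also show up here in this simple model.
        "backlog": current.events_in - current.events_out,
        # Data is arriving but nothing is reaching the destination.
        "stalled": in_rate > 0 and out_rate == 0,
    }
```

Two samples taken a few seconds apart are enough to tell whether a destination has stopped receiving data, which is the kind of information an integrated monitoring view presents without any scripting.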

[ You can access detailed statistics from the Monitoring menu. ]

Cribl excels in monitoring data flows, providing an integrated interface for real-time checks. This saves users the trouble of verifying data at its destination. Cribl offers complete monitoring capabilities, including Sources, Destinations, and even Persistent Queues. Plus, its user-friendly interface makes understanding data flow a breeze for all users.

Beyond the advantages I’ve introduced, Cribl offers a multitude of features designed to streamline data management. These capabilities contribute to a seamless and efficient user experience. Check out the links below to explore more features.

References
https://docs.cribl.io/stream/monitoring/
https://docs.cribl.io/stream/knowledge-library/
https://docs.cribl.io/stream/notifications/
https://docs.cribl.io/stream/packs/

Conclusion

Comparing the CLI-based solutions mentioned at the outset to Cribl Stream or other GUI-based solutions is like comparing flat-pack furniture to pre-assembled furniture. Flat-pack furniture requires manpower and time for assembly, but it’s cheaper than pre-assembled furniture. With pre-assembled furniture, on the other hand, all you need to do is decide where to place it, though it might cost a little more.

A similar shift towards convenience has occurred in the IT environment. These days it’s common to write code in VS Code or Eclipse, whereas in the past people relied on vi or plain text editors.

If you’re looking to improve convenience and efficiency, reduce user burden, and save time, Cribl Stream can be a good choice. And since being free is one of the advantages of open-source solutions, it is worth noting that you can use Cribl Stream for free up to 1TB/day.

If you have any questions or need further information, don’t hesitate to contact us at cribl@megazone.com or leave your comments below.
Thank you for reading!

#MegazoneCloud #Cribl #CriblStream #CriblEdge #Cribl.Cloud
