Video: Leveraging Databricks AutoLoader: Better Visibility of CloudTrail Logs (Hebrew)

Riskified Tech

S3 logs generated by AWS CloudTrail provide organizations with essential visibility into user activity and resource utilization within their AWS infrastructure.
However, working with raw CloudTrail logs can be challenging due to their size, complexity, and the need for optimal storage and query performance. Our SecOps team had 180 TB of these logs in an S3 bucket, and queries against them were painfully slow, so they came to us looking for a better solution.

In this talk, we discussed our journey to find the best solution for this problem, and why we ended up using Databricks AutoLoader, an automatic and scalable data ingestion mechanism, to do it.
We talked about the various approaches we tried, the advantages and features of AutoLoader, and the lessons we learned along the way.

About the speaker:
Yoni Eilon is an accomplished DBA and Data Engineer with a wealth of experience across various database systems and platforms. His expertise spans relational databases, NoSQL, and cloud-based solutions, as well as DWH systems and Spark. When he’s not helping Riskified manage its Data Platform, Yoni enjoys spending time with his wife and three kids, playing basketball, and cooking for friends and family.
