Published in Airtel Digital
Extended Predicate Pushdown in Spark with Apache ORC
Why it is important, how we achieve it, and performance enhancements
May 25, 2022

Published in Airtel Digital
Designing and Scaling a PetaByte Scale System — Part 3: Even More Optimized ORC
At Airtel, when it comes to data storage, we have unique scalability challenges, as we receive 40’s PB (5 PB compressed) per year as Data…
Jul 17, 2020

Published in Airtel Digital
Faster transforming datasets with Spark Internal Row
While working on large datasets, our goal is to work with the best possible efficiency, using optimized code and saving CPU as much as possible.
Jul 1, 2020

Faster transforming dataset with Spark Internal Row
While working on large datasets, our goal is to work with the best possible efficiency, using optimized code and saving CPU as much as possible.
Jun 28, 2020