Rubens Sôto in Towards Dev
Mastering Spark: Solutions for Data Skewness in Query Execution
Data skewness occurs when a given key is unevenly distributed in a dataset. For example, consider the following transactions table:
3d ago
Rubens Sôto
Data Engineering Newsletter — Week 1
In the ever-evolving world of data engineering, staying updated with the latest trends and technologies is crucial. This week, we’ve…
4d ago
Rubens Sôto
My Personal Thoughts About the Present and Future of AI
AI is a highly discussed topic; people are constantly engaging in conversations about it. While some are dedicated to creating valuable…
May 15
Rubens Sôto
Handling Concurrency Problems in Databricks (Delta Lake)
Hey Guys,
Dec 29, 2023
Rubens Sôto in Data Hackers
Data Streaming with Kinesis Stream and Kinesis Firehose
Over the past few days I have been studying AWS a lot, and I think the best way to retain what I have learned and also go deeper…
May 10, 2020
Rubens Sôto in Data Hackers
Encapsulating DataFrame Transformations with PySpark
Today we will perform some simple transformations on a DataFrame and look at a way to make our code much more readable.
Apr 18, 2020