Sumair Sayani – Medium

Leveraging from Apache Parquet Predicate Pushdown feature using Apache Spark

Predicate pushdown is a technique used to optimize the performance of queries in systems that use columnar storage formats, such as Apache…

2 min read · Jan 11, 2023

Apache Spark 3.x — Data Skew Mitigation

Apache Spark is a powerful open-source data processing engine for big data workloads. One of the key challenges that Spark 3.x addresses…

2 min read · Jan 9, 2023

How to Secure Your REST API with RSA and AES Encryption

REST APIs are widely used in modern web development to expose server-side data and functionality to client-side applications, such as web…

3 min read · Jan 8, 2023

AWS Route 53 — Routing Policies

Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. One of the key features of Route 53 is its ability…

2 min read · Jan 8, 2023

Parquet — An optimal data format for AWS Athena

Parquet is a columnar data storage format that is designed to be efficient for both storage and querying. When storing data in a columnar…

2 min read · Jan 7, 2023

Why to run Apache Spark Applications on AWS EMR using Task Instance Fleet?

As a data scientist or data engineer, you know that running Spark jobs on Amazon Web Services (AWS) can be expensive. Not only do you have…

4 min read · Jan 7, 2023

Best Practices for optimizing Apache Spark Applications on AWS EMR

Spark is a powerful open-source data processing engine that is widely used for big data analytics. Amazon Elastic MapReduce (EMR) is a…

3 min read · Jan 3, 2023

Orchestration and Workflows — Apache Airflow vs AWS Step Function

Apache Airflow and AWS Step Functions are two popular tools for managing and orchestrating workflow processes within cloud environments…

2 min read · Dec 30, 2022

Founder and CEO at Algoryne, Data engineering & Cloud Computing Enthusiast.
