Manushree Gupta – Medium

Manushree Gupta

AUTO LOADER vs COPY INTO in Databricks

Auto Loader in Databricks is a feature designed to efficiently and incrementally load new data files as they arrive in cloud storage. It…

4d ago

Use Delta Lake in Azure Databricks

Delta Lake is an open-source relational storage area for Spark that you can use to implement a data lakehouse architecture in Azure…

Aug 24, 2023

Lambda Architecture in Big Data World

Lambda Architecture is a data processing architecture designed to handle both batch and real-time data streams while providing fault…

Aug 22, 2023

Spark Optimization Techniques

Today I am covering a very important feature of Spark: Spark optimization.

Aug 17, 2023

Spark DAG Visualization

When I was new to Spark, I struggled a lot to understand the DAG and how to monitor a running Spark job, and then after spending multiple…

Aug 7, 2023

CI/CD Pipeline: Continuous Integration/Continuous Delivery

AWS developers use these CI/CD pipelines to manage and automate application deployment.

Aug 7, 2023

ADLS Gen1 vs ADLS Gen2

Hello everyone…

Aug 7, 2023

DataLake vs DataWarehouse

Below I share the differences between a data lake and a data warehouse at a very high level.

Aug 7, 2023

Hadoop file formats

Several file formats are supported in the Hadoop file system. Let's look at the differences between a few of them.

Aug 7, 2023

Reading & Storing JSON Data in Hive

Introduction

Aug 7, 2023

Manushree Gupta

Big Data Enthusiast | Senior Data Engineer & Consultant | Technical Architect
