Manushree Gupta in Towards Dev · Compare Git Commits: Step-by-Step Guide to Compare the Current Git Commit with the Last Successful Commit · Jul 26
Manushree Gupta · Leveraging DataOps in Existing Platform · As data engineers, we often neglect best engineering practices; instead, we work on fixes and ad hoc processes that do not scale. · Jul 17
Manushree Gupta in Towards Dev · AUTO LOADER vs COPY INTO in Databricks · Auto Loader in Databricks is a feature designed to efficiently and incrementally load new data files as they arrive in cloud storage. It… · Jul 9
Manushree Gupta · Use Delta Lake in Azure Databricks · Delta Lake is an open-source relational storage area for Spark that you can use to implement a data lakehouse architecture in Azure… · Aug 24, 2023
Manushree Gupta · Lambda Architecture in Big Data World · Lambda Architecture is a data processing architecture designed to handle both batch and real-time data streams while providing fault… · Aug 22, 2023
Manushree Gupta in AWS in Plain English · Spark Optimization Techniques · Today I am covering a very important feature of Spark: Spark optimization. · Aug 17, 2023
Manushree Gupta in AWS in Plain English · Spark DAG Visualization · When I was new to Spark, I struggled a lot to understand the DAG and how to monitor a running Spark job, and then after spending multiple… · Aug 7, 2023
Manushree Gupta in Code Like A Girl · CI/CD Pipeline: Continuous Integration/Continuous Delivery · AWS developers use these CI/CD pipelines to manage and automate application deployment. · Aug 7, 2023
Manushree Gupta · DataLake vs DataWarehouse · Below I share the differences between a data lake and a data warehouse at a very high level. · Aug 7, 2023