shubham badaya

Understanding Big Data File Formats: Their Types and Importance
Introduction to File Formats in Big Data
Jul 3

Spark memory allocation for driver and executor — beginner-friendly
I have often been asked to debug Spark applications, and sometimes it gets quite complex to explain why someone encounters out-of-memory…
Jun 7

Understanding Normalized vs. Denormalized Schemas in Data Warehousing
While working on an analytics project and using data from table X, have you ever wondered why data from table Y is not included in table X…
Jun 1

Neural network intuition
My post here is inspired by my subsequent reading of various textbooks on machine learning and deep learning. The aim here is to express…
Mar 9

Data Wrangling in Pandas vs SQL: A Comprehensive Comparison
Frequently, I find myself navigating between SQL and Pandas in my work. On certain days, I find myself crafting queries directly on the…
Jan 13

Data science fundamentals often asked in interviews
In this article, I present and share the solution for theory and concept-related questions that were asked to me or I had asked in data…
Jan 7

Understand clustering in data science without mathematics
Frequently, we find ourselves questioning the nature of clustering, its distinctions from classification, and the general methods employed…
Dec 31, 2023

Asked Python interview questions for data science/engineering.
In this article, I present and share the solution for questions that were asked to me or I had asked in data science/engineering…
Dec 29, 2023

Why Logistic Regression?
Embarking on the journey of understanding logistic regression, I found myself pondering a series of questions that lingered persistently in…
Dec 28, 2023