Shyamal Akruvala: File Comparison Using PySpark and Pandas. While working on BigData projects for the past couple years, comparing files for data discrepancies has been a common task. This task… (Nov 12, 2021)
Shyamal Akruvala in Nerd For Tech: Spark Join Types Visualized. Joins are an integral part of any data analysis or integration process. Two sets of data, left and right, are brought together by comparing… (Jul 23, 2021)
Shyamal Akruvala in Nerd For Tech: Jupyter Notebook — Tips & Tricks. Tips and tricks to enhance your Jupyter Notebook experience. (Jul 18, 2021)
Shyamal Akruvala in Nerd For Tech: Apache Spark’s Logical and Physical Plans Using Explain() Method. Spark recommends using the structured APIs (DataFrame, DataSet, SQL) compared to low-level RDDs to leverage the awesome power of the… (Jun 28, 2021)
Shyamal Akruvala in Nerd For Tech: Overcoming Parquet Schema Issues. Couple approaches on how we overcame parquet schema related issues when using Pandas and Spark dataframes. (Aug 20, 2020)