Nick Hass · Correctly Size EMR Clusters for Spark Jobs · Ready to take your Spark application to the cloud? I will explain how to correctly size EMR clusters to run Spark jobs and correctly… · Jul 22
Nick Hass · How to set up an ssh key for a git repo · Make sure to follow all these steps in order! · Jun 17
Nick Hass · How to Send a Folder Too Large For Email · Let’s say you have a folder that is way too big to send over email, like a large Parquet table. It would probably be too big to email, as… · Oct 30, 2023
Nick Hass · How to Connect Local PySpark to AWS S3 and Read a Delta Table · While you could use AWS EMR and automatically have access to the S3 file system, you can also connect Spark to your S3 file system on your… · Oct 4, 2023
Nick Hass · College Degrees for a Career in Data Science · So you want to pursue Data Science in college? This article will explain the best undergraduate majors for a career in Data Science based… · Nov 21, 2022
Nick Hass · Regularized Regression and Multicollinearity · Multicollinearity is when some predictor variables are correlated with each other. High multicollinearity will result in bad… · Apr 13, 2022
Nick Hass · Machine Learning to Detect Mental Illness on Social Media · A Literature Review · Apr 13, 2022