A Practical Tip When Working With Random Samples On Spark

Published in Towards AI
4 min read · Dec 26, 2019


Introduction

In this article, I share a crucial tip for using Spark to analyze a random sample of a data frame. The code to reproduce the results can be found here. It is an HTML version of a Databricks notebook, so you only need to download the raw file and open it in a web browser.
