Managing access to data files in interactive computing

Photo by Hudson Hintze on Unsplash


Almost every notebook contains a pd.read_csv(file_path) or a similar command to load data. Dealing with file paths in notebooks, however, is troublesome: moving notebooks around becomes a problem, and every notebook has to know where the project keeps its files. Here, we discuss a couple of approaches to handle this problem.
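To make the problem concrete, here is a minimal sketch of why a relative path is fragile. The project location `/home/user/project`, the file `data/sales.csv`, and the helper `resolve_from` are hypothetical names chosen for illustration only. The sketch uses pure path arithmetic, so nothing has to exist on disk.

```python
from pathlib import PurePosixPath


def resolve_from(working_dir: PurePosixPath, relative_path: str) -> PurePosixPath:
    """Return the file a relative path points to when code runs from working_dir."""
    return working_dir / relative_path


# Hypothetical project root, used only for illustration.
project = PurePosixPath("/home/user/project")

# The same string passed to pd.read_csv, seen from two notebook locations.
relative = "data/sales.csv"

# Notebook at the project root: the path lands inside the data folder.
at_root = resolve_from(project, relative)

# Notebook moved into a notebooks/ subfolder: the same string now points
# to a location that most likely does not exist.
after_move = resolve_from(project / "notebooks", relative)

print(at_root)
print(after_move)
```

The string inside the notebook never changed, yet it refers to two different files. That is the coupling between a notebook's location and its data that the rest of this post tries to remove.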


Starting a notebook is always easy: you just write a couple of cells, which often contain little more than a df.head(). However, as the project grows (and in industry it always does), you will need to organize your folders. You will need one folder for the data and another for the notebooks. As the EDA progresses, you will need more folders representing different subsections of the main analysis. …


Nilo Araujo

Data scientist from Brazil, Master's at UFC - the university ;)

