How to Data Science
How to Data Science Using VSCode
Overview
To get started with data science and data engineering, you need an ergonomic development setup on your local workstation.
If you are on an M1 or M2 Mac, follow the instructions compiled here to get an efficient data science setup.
VSCode Extensions
Installing the right set of VSCode extensions is an important part of an optimized data science setup. The following VSCode extensions are recommended:
Python Extension Pack
Id: donjayamanne.python-extension-pack
Description: Popular Visual Studio Code extensions for Python
Version: 1.7.0
Publisher: Don Jayamanne
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=donjayamanne.python-extension-pack
Pylance
Id: ms-python.vscode-pylance
Description: A performant, feature-rich language server for Python in VS Code
Version: 2023.5.30
Publisher: Microsoft
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance
Jupyter
Id: ms-toolsai.jupyter
Description: Jupyter notebook support, interactive programming and computing that supports Intellisense, debugging and more.
Version: 2023.5.1001411100
Publisher: Microsoft
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
CodeSnap
Id: adpyke.codesnap
Description: 📷 Take beautiful screenshots of your code
Version: 1.3.4
Publisher: adpyke
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=adpyke.codesnap
Path Intellisense
Id: christian-kohler.path-intellisense
Description: Visual Studio Code plugin that autocompletes filenames
Version: 2.8.4
Publisher: Christian Kohler
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=christian-kohler.path-intellisense
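The extensions listed above can also be installed from a script rather than one by one through the Marketplace. The following is a minimal sketch, assuming the `code` command-line launcher is available on your PATH (on macOS it can be added from the Command Palette with "Shell Command: Install 'code' command in PATH"). The helper names `build_install_commands` and `install_extensions` are hypothetical and exist only for this illustration; the extension IDs are the ones listed above.

```python
import shutil
import subprocess

# Extension IDs copied from the list above.
EXTENSION_IDS = [
    "donjayamanne.python-extension-pack",
    "ms-python.vscode-pylance",
    "ms-toolsai.jupyter",
    "adpyke.codesnap",
    "christian-kohler.path-intellisense",
]


def build_install_commands(extension_ids, cli="code"):
    """Return one `code --install-extension <id>` command per extension."""
    return [[cli, "--install-extension", ext_id] for ext_id in extension_ids]


def install_extensions(extension_ids, cli="code"):
    """Run the install commands; raises if the CLI is missing or a command fails."""
    if shutil.which(cli) is None:
        raise FileNotFoundError(f"'{cli}' was not found on PATH")
    for command in build_install_commands(extension_ids, cli):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_install_commands(EXTENSION_IDS):
        print(" ".join(command))
```

Run as a script, this only prints the commands. Calling `install_extensions(EXTENSION_IDS)` would actually execute them, so it is left out of the default path on purpose.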
Conda
Conda is an open source package management system and environment management system that installs, runs, and updates packages and their dependencies.
Install Conda using brew
brew install miniforge
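Refer to the official conda documentation for more info on conda.
Run the commands below to verify the installation, create and activate a new environment, and install pandas and NumPy into it: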
conda --version
conda create --name env_myconda_env python=3.11
conda activate env_myconda_env
pip install pandas
pip install numpy
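After running the commands above, it can be useful to confirm that the active interpreter is the one from the new environment and that both packages are visible to it. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are the ones installed above.

```python
import sys
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The interpreter path should point inside the activated conda environment.
    print("Python", sys.version.split()[0], "at", sys.executable)
    for name, version in installed_versions(["pandas", "numpy"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If the printed interpreter path does not point into the environment you created, the environment is probably not activated in the current shell.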
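We can now create the following file, explore_data.py:
import pandas as pd

# read_table uses a tab separator by default, which matches the .tsv file
data = pd.read_table('chipotle.tsv')
# Show the first five rows
print(data.head())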
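The script above depends on chipotle.tsv being present in the working directory. To try the same pattern without that file, the sketch below reads a small in-memory tab-separated sample and takes a slightly longer first look at the data. The sample rows, the column names, and the helpers `load_tsv` and `summarize` are all hypothetical and are not taken from the real dataset.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample; column names are illustrative only.
SAMPLE_TSV = (
    "item\tquantity\tprice\n"
    "burrito\t2\t8.50\n"
    "tacos\t1\t7.25\n"
    "chips\t3\t2.15\n"
)


def load_tsv(source):
    """Read tab-separated data; read_table uses a tab separator by default."""
    return pd.read_table(source)


def summarize(frame):
    """Return a few basic facts that are useful as a first look at the data."""
    return {
        "rows": len(frame),
        "columns": list(frame.columns),
        "missing_values": int(frame.isna().sum().sum()),
    }


if __name__ == "__main__":
    data = load_tsv(io.StringIO(SAMPLE_TSV))
    print(data.head())    # first rows
    print(data.dtypes)    # inferred column types
    print(summarize(data))
```

`load_tsv` accepts a file path as well as a file-like object, so the same helper could be pointed at chipotle.tsv once the file is in place.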
In VSCode, open Settings -> User and check the checkbox labeled "When pressing shift+enter, send selected code in a Python file to the Jupyter interactive window as opposed to the Python terminal."
Running the above code in the interactive window with Shift+Enter gives the following results: