Jupyter Notebook — Tips & Tricks
In my current project we collect a lot of data and run analytics on it, ranging from identifying trends to answering very specific business questions. We use Jupyter Notebooks extensively because they provide an excellent interface for interactive analysis and make it easy to share findings and insights with others. This article documents some of the useful features that I use on a daily basis.
The official Jupyter homepage defines the Jupyter Notebook as:
An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
The name JuPyteR is derived from the three core programming languages the project set out to support, namely Julia, Python and R. The IPython kernel ships with Jupyter. Jupyter Notebook, JupyterLab and JupyterHub are the three products offered for interactive computing and analytics. For installation and usage, you can follow the steps available at Jupyter Home or Jupyter Docs.
Below are some useful tricks that helped me get the most out of the tool.
1. Screen Width
By default the Notebook leaves a lot of margin on the sides. These days, when working from home, I connect my laptop to an external monitor, and I prefer to use all the real estate available on my screen for development.
You can extend the width of the usable Notebook space by executing the command below.
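A minimal sketch of one common way to do this, assuming the classic Notebook interface, where the page content sits inside a .container element. The selector and the WIDE_CELLS_CSS name are my assumptions; JupyterLab and newer Notebook versions use different CSS classes, so the rule may have no effect there.

```python
from IPython.display import HTML, display

# Assumption: classic Notebook UI, where the page body lives in a ".container" div.
# The rule stretches that container to the full width of the browser window.
WIDE_CELLS_CSS = "<style>.container { width: 100% !important; }</style>"

# Injecting the <style> tag into the cell output applies it to the whole page.
display(HTML(WIDE_CELLS_CSS))
```

The change lasts only as long as the cell output is present, so clearing the output restores the default margins.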
After executing the command, your Notebook will look like the one below, giving you more cell space to write code.
2. Output for all commands in a cell
I tend to group several commands into one cell, for example a read followed by a count. By default, only the output of the last command is shown.
You can execute the commands below to display the output of every command in a cell.
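A short sketch of the setting I believe is meant here, IPython's ast_node_interactivity option. It assumes the code runs in an IPython kernel; the setting takes effect for cells executed after this one.

```python
from IPython.core.interactiveshell import InteractiveShell

# Show the result of every top-level expression in a cell, not only the last one.
# The default value is "last_expr"; set it back to restore the usual behaviour.
InteractiveShell.ast_node_interactivity = "all"
```

With this in place, a cell containing df.head() followed by df.shape (df being any DataFrame you have loaded) shows both results.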
After execution, all the relevant outputs from a cell are displayed in order.
3. Pandas settings
When working with a wide or large dataset, pandas hides some columns and rows so that the output fits the display. The "hidden" rows and columns are shown as dots (highlighted in green in the image below).
You can set the two options below, display.max_columns and display.max_rows, to make pandas show the required number of columns and rows.
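As a sketch, the two display options can be set with pd.set_option. The limits of 50 and 100 are illustrative values I picked, not recommendations.

```python
import pandas as pd

# Illustrative limits; choose values that suit your data.
# Passing None instead of a number removes the limit altogether.
pd.set_option("display.max_columns", 50)
pd.set_option("display.max_rows", 100)

# Read the values back to confirm they were applied.
print(pd.get_option("display.max_columns"), pd.get_option("display.max_rows"))
```

The settings apply to the current session only; pd.reset_option("display.max_columns") returns an option to its default.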
Additionally, a particular column may hold long values that pandas truncates when displaying them. Here you can use the max_colwidth option to set the display length.
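A small sketch of the max_colwidth option, using a made-up one-column DataFrame. The width of 200 characters and the column name note are assumptions for illustration.

```python
import pandas as pd

# Illustrative width in characters; None removes the limit entirely.
pd.set_option("display.max_colwidth", 200)

# A hypothetical frame with one long value, to see the effect of the option.
frame = pd.DataFrame({"note": ["x" * 120]})
print(frame)
```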
You can read about the available options in the pandas documentation.
4. Executing shell commands
Running shell commands directly from the Notebook is as easy as prefixing them with an exclamation mark (!). Any command that runs at the shell prompt should work.
You can use this approach to install any missing dependencies too.
The output of the shell command can be stored in a variable and used within the notebook for processing.
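The notebook forms are shown as comments below because the ! prefix is IPython syntax and is not valid in a plain .py file; the specific commands are only illustrations. The runnable part is a rough plain-Python analogue of the capture step using the standard subprocess module, with the Python interpreter standing in for a shell command so that the sketch behaves the same on every platform.

```python
import subprocess
import sys

# Notebook form (IPython syntax, works only inside a notebook or IPython):
#   !ls -l                  # run a shell command
#   !pip install pandas     # install a missing dependency
#   files = !ls             # capture the output lines in a variable

# Plain-Python analogue of the capture step. The child process prints two
# made-up file names, one per line, just as a directory listing would.
result = subprocess.run(
    [sys.executable, "-c", "print('a.csv'); print('b.csv')"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the command fails
)
files = result.stdout.splitlines()
print(files)
```

In both forms the captured output ends up as a list of lines that the rest of the notebook can loop over or filter.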
5. Magic Commands
For everyday mundane tasks, IPython provides shortcuts that are referred to as magic commands. You can refer to the documentation at IPython Docs.
These commands are prefixed by the % symbol: a single % (a line magic) if you want the command to operate on a single line, and a double %% (a cell magic) if you want it to operate on the whole cell.
A few commonly used ones are:
A) Executing an external script from within the notebook.
There may be a case where you have a script that prepares the data or performs some tasks as a precursor to running the current Notebook. You can call such scripts and/or notebooks from within the current Notebook using %run.
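The %run form appears as comments below, since magic commands are not valid outside IPython. The runnable part is a plain-Python sketch of the same idea using the standard runpy module; prepare_data.py and its contents are hypothetical stand-ins.

```python
import runpy
import tempfile
from pathlib import Path

# Notebook form (IPython syntax):
#   %run prepare_data.py          # hypothetical script name
#   %run other_notebook.ipynb     # hypothetical notebook name

# Plain-Python analogue: write a tiny stand-in script, then execute it.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "prepare_data.py"
    script.write_text("rows = [1, 2, 3]\ntotal = sum(rows)\n")
    # run_path executes the file and returns its global variables as a dict.
    namespace = runpy.run_path(str(script))

print(namespace["total"])
```

One difference worth knowing: %run places the script's variables directly into the notebook's namespace, whereas runpy hands them back in a dictionary.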
B) Measure Execution Time
You can use %time or %%time to measure the execution time of a single run of your code. For small pieces of code, %timeit or %%timeit is the better choice, as it runs the code multiple times and reports a summary of the timings.
There are extensions available to show the cell execution time automatically (yellow highlights).
There are many magic commands, and you can list them using %lsmagic.
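The magic forms are shown as comments; the runnable part is a rough plain-Python analogue built on the standard time and timeit modules. The work function is a made-up workload, and the repeat counts are arbitrary choices of mine.

```python
import time
import timeit

# Notebook form (IPython syntax):
#   %time total = sum(range(1_000_000))   # a single timed run
#   %timeit sum(range(1_000))             # many runs, summarised

def work():
    """A small made-up workload to time."""
    return sum(range(1_000))

# Single run, similar in spirit to %time.
start = time.perf_counter()
result = work()
elapsed = time.perf_counter() - start

# Repeated runs, similar in spirit to %timeit: 5 batches of 1,000 calls each.
timings = timeit.repeat(work, number=1_000, repeat=5)
best_per_call = min(timings) / 1_000

print(f"single run: {elapsed:.6f}s, best batch: {best_per_call:.9f}s per call")
```

A single run is easily skewed by whatever else the machine is doing, which is why repeated runs are preferable for short snippets.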
6. Additional functionality using extensions
Detailed documentation on the extensions is available here.
Execute the two commands below on your CLI to install the extensions.
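A hedged sketch of the two commands, assuming "the extensions" refers to the community jupyter_contrib_nbextensions package, which targets the classic Notebook interface and may not work with newer Notebook versions or JupyterLab. The install_extensions helper and its dry_run parameter are names I made up; by default the function only prints the commands so nothing is installed by accident.

```python
import subprocess
import sys

# Assumption: the extensions come from the jupyter_contrib_nbextensions package.
# Equivalent shell commands:
#   pip install jupyter_contrib_nbextensions
#   jupyter contrib nbextension install --user
INSTALL_COMMANDS = [
    [sys.executable, "-m", "pip", "install", "jupyter_contrib_nbextensions"],
    ["jupyter", "contrib", "nbextension", "install", "--user"],
]

def install_extensions(dry_run=True):
    """Print the two commands, and run them in order when dry_run is False."""
    for command in INSTALL_COMMANDS:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)

install_extensions(dry_run=True)  # prints only; pass dry_run=False to install
```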
Then, on your Jupyter homepage, you should see a new tab for extensions. From here you can enable and disable them. A few commonly used ones are highlighted in the image below.
7. Keyboard Shortcuts
There are many keyboard shortcuts in Jupyter that can speed up your development. To view them, press Esc to enter command mode and then press h to open the shortcut window.
8. Run the notebook from CLI
Finally, you can also execute a complete notebook from the command line interface using the command below.
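One way to do this is nbconvert's --execute flag; I am assuming that is the command meant here. The sketch below builds the command in Python and runs it only if the notebook file and the jupyter executable are both present. The file names analysis.ipynb and analysis_executed.ipynb are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical file names, used only for illustration.
notebook = Path("analysis.ipynb")
output_name = "analysis_executed.ipynb"

# Equivalent shell command:
#   jupyter nbconvert --to notebook --execute analysis.ipynb --output analysis_executed.ipynb
command = [
    "jupyter", "nbconvert",
    "--to", "notebook",
    "--execute", str(notebook),
    "--output", output_name,
]

if notebook.exists() and shutil.which("jupyter"):
    # Runs every cell top to bottom and saves the executed copy.
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
else:
    print("Skipping run; command would be:", " ".join(command))
```

Writing the result to a separate output file keeps the original notebook untouched, which is handy for scheduled runs.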
Jupyter Notebooks offer a lot more, and it’s worth reading their documentation.
Thank you for spending some time reading this post. Have a wonderful life!
Originally published at https://www.linkedin.com.