Debug Jupyter notebooks with PyCharm

Have you ever had to debug a large cell in a Jupyter notebook? Below I share my experience on the subject. We’ll review the classical methods for debugging notebooks, and then I’ll show how to set breakpoints in PyCharm for code being executed in a Jupyter notebook, so that you get the comfort of a real Python IDE for debugging.

Before I describe what PyCharm can do, let’s quickly review the Jupyter commands for debugging.

Catch exceptions

%pdb on is my favorite. It is a magic command that starts a debug shell whenever an exception is raised (deactivate this mode with %pdb off).
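To get a feeling for where such a post-mortem session starts, here is a minimal sketch in plain Python. It is only an illustration, not what %pdb runs internally: the helper name failing_frame is hypothetical, and it simply walks the traceback down to the frame that raised, which is the frame a post-mortem debugger shows first.

```python
import sys


def failing_frame(func, *args, **kwargs):
    """Call func; if it raises, return (function name, locals) of the frame that raised."""
    try:
        func(*args, **kwargs)
    except Exception:
        exc_tb = sys.exc_info()[2]
        tb = exc_tb
        # Follow the traceback chain down to the innermost frame.
        while tb.tb_next is not None:
            tb = tb.tb_next
        # pdb.post_mortem(exc_tb) would open an interactive prompt on this frame instead.
        return tb.tb_frame.f_code.co_name, dict(tb.tb_frame.f_locals)
    return None


def add(a, b):
    return a + b


name, local_vars = failing_frame(add, 'a', 2)
print(name, local_vars)
```

From that frame, the u(p) command of the debugger moves towards the cell that made the call.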

Breakpoint with set_trace

This is the Jupyter version of the classical Python import pdb; pdb.set_trace(). At the location of the desired breakpoint, insert

import IPython; IPython.core.debugger.set_trace()

Conditional breakpoints

T. Hoffmann proposes an elegant conditional breakpoint function:

from IPython.core.debugger import Pdb as CorePdb
import sys

def breakpoint(condition=True):
    """
    Set a breakpoint at the location the function is called if `condition == True`.
    """
    if condition:
        debugger = CorePdb()
        frame = sys._getframe()
        debugger.set_trace(frame.f_back)
        return debugger


def add(a, b):
    breakpoint(type(a) != type(b))
    return a + b

add('a', 2)
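One caveat: since Python 3.7, breakpoint is also the name of a built-in function, so a helper with that name shadows it. The variation below is a sketch under my own assumptions, not part of the original proposal: the name break_if and the debugger_factory parameter are hypothetical, and IPython is imported lazily so the helper can be defined even where IPython is not installed.

```python
import sys


def break_if(condition=True, debugger_factory=None):
    """Start a debugger in the caller's frame when `condition` is truthy."""
    if not condition:
        return None
    if debugger_factory is None:
        # Imported only when needed; assumes IPython is installed (true in a notebook).
        from IPython.core.debugger import Pdb as debugger_factory
    debugger = debugger_factory()
    # f_back is the frame of the function that called break_if.
    debugger.set_trace(sys._getframe().f_back)
    return debugger


def add(a, b):
    break_if(type(a) != type(b))
    return a + b


print(add(1, 2))  # the types match, so no debugger is started
```

Calling add('a', 2) in a notebook would open the same prompt as the original function.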

What you get with Jupyter debug commands

Below the cell under debug, you get an input line where you can enter pdb commands.

The most useful commands are:

  • q(uit) to exit the debugger and return to Jupyter
  • u(p) to go one frame up (in case you used %pdb on)
  • n(ext), s(tep), r(eturn), c(ontinue) to execute next line, step into function, execute until end of current function, or continue execution
  • p(rint) for printing the content of a variable.
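These commands can also be tried without typing anything. The sketch below uses the standard-library pdb (not the IPython debugger) and feeds it a scripted list of commands; the function and the values are made up for the illustration.

```python
import io
import pdb


def add(a, b):
    total = a + b
    return total


# Commands the debugger reads, one per line, instead of keyboard input.
commands = io.StringIO("p a + b\nn\nc\n")
transcript = io.StringIO()

# readrc=False ignores any personal .pdbrc; nosigint=True skips the Ctrl-C handler.
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False, nosigint=True)
result = debugger.runcall(add, 'foo', 'bar')

print(transcript.getvalue())  # what the debugger wrote: locations, prompts, values
print(result)
```

The debugger stops when add is entered, prints the value of a + b, moves on one line and then continues to the end of the function.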

A sample debug session looks like

Don’t forget to quit the debugger, otherwise cells won’t execute any more. If you forget, interrupt the kernel; in some cases you will recover a functional notebook.

How to debug a notebook with PyCharm

Now we head towards a more comfortable solution! It does require some configuration work the first time you use it, but from the second time on the operation will be very easy, believe me.

Configure PyCharm with the same Python interpreter as Jupyter

The following assumes you have PyCharm installed.

Identify the Python interpreter used in your Jupyter notebook with

import sys
sys.executable
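A few more details can help to recognise the right interpreter in PyCharm's list. This is a small sketch using only the sys module; the helper name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter():
    """Collect details that help to pick the matching interpreter in PyCharm."""
    return {
        "executable": sys.executable,       # path to the Python binary
        "prefix": sys.prefix,               # root of the (virtual) environment
        "version": sys.version.split()[0],  # version number without build details
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Run it in a notebook cell, so that it describes the kernel's interpreter and not another Python on your machine.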

Now create a PyCharm project at the same location as your Jupyter notebook. Configure the project to use the same interpreter as Jupyter:

Move the function under debug to a Python script

PyCharm only allows you to set breakpoints in Python modules and packages.

If you want to step through the code of a Python package, simply open the desired module in PyCharm and skip to the next section.

If you want to debug a function you wrote yourself in the notebook, move the function to a .py file. In this example I create a script.py file with the following content

def add(x, y):
    return x + y

We now load the autoreload extension with

%load_ext autoreload
%autoreload 2

The extension makes sure Jupyter always uses the latest version of the script (useful when you fix the bug).

Finally we import the desired function with

from script import add
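As a rough illustration of the idea behind reloading (not of how the autoreload extension is implemented), the sketch below writes a throwaway module to a temporary directory, imports it, edits the file and reloads it with importlib.reload from the standard library. The module name script_demo and its content are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caches, so a file rewritten within the same second is re-read from source.
sys.dont_write_bytecode = True

workdir = Path(tempfile.mkdtemp())
module_path = workdir / "script_demo.py"
module_path.write_text("def add(x, y):\n    return x - y  # bug\n")

# Make the temporary directory importable, then import the buggy version.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import script_demo

before = script_demo.add(2, 3)

# Fix the bug on disk and reload the module in the running interpreter.
module_path.write_text("def add(x, y):\n    return x + y  # bug fixed\n")
importlib.reload(script_demo)
after = script_demo.add(2, 3)

print(before, after)
```

A plain reload only refreshes what is looked up through the module object, which is why the sketch calls script_demo.add rather than a name imported with from ... import.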

Attach PyCharm to the Jupyter kernel

We identify the Python process used by the notebook with

%connect_info

which returns, among other lines, one like

if you are local, you can connect with just: 
jupyter <app> --existing kernel-d1b9a862-1f04-403b-82fb-5b820c0a0f89.json

Use the above information to attach the PyCharm debugger to the Python process. Click on Run / Attach to Local Process in PyCharm’s menu, and select the process identified by the kernel file:
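If several Python processes are running, the process ID of the kernel is a second way to pick the right one, assuming the attach dialog lists process IDs (it typically does). A minimal sketch, to be run in a notebook cell:

```python
import os

# The kernel executes the cell, so this is the process ID of the kernel itself.
kernel_pid = os.getpid()
print(kernel_pid)
```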

Interactive debug

We are ready for comfortable, interactive step-by-step debugging. In PyCharm, open the file (or package) where the breakpoint is desired, and right-click on the left border to add the breakpoint:

In Jupyter, execute the cell that calls the function under debug:

As you see, the execution does not return (yet). PyCharm’s breakpoint pauses execution, and offers a comfortable debugger and variable window:

Going further

I find the above very helpful for debugging and understanding the stack trace at specific code locations. But I would also love to

  • catch all exceptions in PyCharm (and reproduce %pdb on)
  • set PyCharm breakpoints directly in Jupyter cells.

Please let me know if you have any ideas on how to achieve this!


Originally published at gist.github.com.