Writing Math Equations in Jupyter Notebook: A Naive Introduction

Published in Analytics Vidhya · Mar 26, 2020 · 7 min read

Writing Math Equations on Jupyter Notebook — I wrote all the text, symbols (even the arrows!) and equations in the image above on Jupyter notebook’s markdown!

Without a doubt, documentation is an essential part of working on Data Science projects. It is especially important and useful if your work involves reading up on the latest research or coming up with new algorithms to solve problems.

However, the latter kind of work often involves writing Math equations in digital form. Except for people familiar with LaTeX, this is often unfamiliar territory.

In this post I’ll show you, with examples, how to write equations in Jupyter notebook’s markdown. I have selected these equations such that they cover the most recurring types of symbols and notations which you might encounter (at least I do). I also give links to useful resources I refer to. And then, there are many bonuses along the way :)

Markdown Mode in Jupyter

Very very quickly, this is how you can switch to markdown mode in Jupyter.

Select a cell in command mode. If you see a cursor in the cell and can type, then it’s in edit mode; press the Escape key to go to command mode. Usually, the thick border on the left of the selected cell is green in edit mode and bluish/grey in command mode.
Switch to markdown mode:

Option 1: Go to Cell => Cell Type => Markdown

Switching to Markdown from options in main menu

Option 2: Change mode from drop down as shown in the images below. On the right image, you can see I am currently in Code mode (checked) and Markdown mode is highlighted, which you can click for switching to Markdown mode.

Switching to Markdown mode from drop down

Option 3: Select a cell in command mode and press M for markdown. BONUS 1: press Y for switching back to code mode.

After switching to Markdown mode, we will edit the cell to write equations.
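If you are curious what "Markdown mode" means under the hood, here is a minimal sketch. It assumes the version 4 notebook JSON layout, where every cell carries a `cell_type` field; the cell contents below are made up for illustration and are not from this post's notebook.

```python
import json

# A Markdown cell: its "source" is rendered as text and Math, not executed.
markdown_cell = {
    "cell_type": "markdown",
    "metadata": {},
    "source": [r"The estimate of Y is $\hat{Y}$."],
}

# A code cell for comparison: it also stores an execution count and outputs.
code_cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": ["print('hello')"],
}

# Assumed minimal notebook skeleton (nbformat 4, minor version 4).
notebook = {
    "cells": [markdown_cell, code_cell],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Pressing M or Y in command mode is, roughly speaking, a way of flipping that `cell_type` value for the selected cell.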

Writing Your First Equation in Jupyter Notebook

Now, you will write the equation of a linear model. What is a linear model, you ask? This:

Linear model is one where relationship between dependent and independent variables is linear in parameters. — Linear Model

BONUS 2: A model is linear if the relationship between the dependent variable (Y) and the independent variables (X) is linear in the parameters (betas). The hat on the betas just means that they are values estimated from data (and we hope that they are close to the true values).

What are the symbols we need to write?

Hat: \hat
Subscript: _{}
Sum: \sum
Limits of sum: \limits _{} ^{}
Beta: \beta

The weird-looking thing next to each name is the LaTeX syntax, written inside Jupyter’s markdown, for the hat, subscript, sum, limits and beta in the equation of the linear model. Note that the rest of the equation is just letters and numbers, viz. Y, 0, j, 1, p, and X, which need no special syntax apart from, at times, being enclosed inside {}.

Using these building blocks, the complete syntax for linear model can be written as,

LaTeX Jupyter Markdown syntax for linear model with beta, hat, sum, limits of sum, subscript, superscript — $\hat{Y} = \hat{\beta}_{0} + \sum \limits _{j=1} ^{p} X_{j}\hat{\beta}_{j} $

When you run the cell with the markdown syntax shown above, you will get the equation of the linear model. You can double click on the cell to edit the markdown syntax. Remember that the cell has to be in markdown mode.

Explanation of the syntax

$ : All the Math you want to write in the markdown should be inside an opening and a closing $ symbol in order to be processed as Math.
\beta : Creates the symbol beta.
\hat{} : Puts a hat over anything inside the curly braces of \hat{}. E.g. in \hat{Y} the hat is created over Y, and in \hat{\beta}_{0} the hat is shown over beta.
_{} : Creates a subscript from anything inside the curly braces after _. E.g. \hat{\beta}_{0} will create beta with a hat and give it a subscript of 0.
^{} : (Similar to subscript) Creates a superscript from anything inside the curly braces after ^.
\sum : Creates the summation symbol.
\limits _{} ^{} : Creates the lower and upper limits for the \sum using the subscript and superscript notation.
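To see how the building blocks above snap together, here is a small sketch in plain Python that assembles the linear model syntax piece by piece. The helper names (`hat`, `subscript`, `sum_with_limits`) are made up for this illustration; they are not part of Jupyter or any library.

```python
def hat(symbol):
    """Put a hat over a symbol: \\hat{symbol}."""
    return r"\hat{" + symbol + "}"


def subscript(base, index):
    """Attach a subscript: base_{index}."""
    return base + "_{" + index + "}"


def sum_with_limits(lower, upper):
    """Summation symbol with lower and upper limits."""
    return r"\sum \limits _{" + lower + "} ^{" + upper + "}"


beta = r"\beta"

# Y-hat = beta-hat_0 + sum over j of X_j * beta-hat_j
linear_model = (
    hat("Y")
    + " = "
    + subscript(hat(beta), "0")
    + " + "
    + sum_with_limits("j=1", "p")
    + " "
    + subscript("X", "j")
    + subscript(hat(beta), "j")
)

# Wrap in $ ... $ so that a Markdown cell processes it as Math.
markdown_source = "$" + linear_model + "$"
print(markdown_source)
```

The printed string can be pasted into a Markdown cell as is. The raw strings (`r"..."`) keep Python from treating the backslashes as escape characters.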

BONUS 3: You can enclose the Math inside two $$ as well. The difference is inline mode vs display mode. “Inline mode is for math that is included within a line or paragraph of text, and display mode is for math that is set apart from the main text.” — https://tex.stackexchange.com/questions/410863/what-are-the-differences-between-and

Display math mode for LaTeX Jupyter Markdown of Linear Model — Display mode with Math inside two $$
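The inline versus display distinction comes down to which delimiter surrounds the Math. Here is a minimal sketch in plain Python; `wrap_math` is a hypothetical helper written for this post, not a Jupyter function.

```python
def wrap_math(latex, display=False):
    """Wrap a LaTeX expression in $...$ (inline) or $$...$$ (display)."""
    delimiter = "$$" if display else "$"
    return delimiter + latex.strip() + delimiter


linear_model = r"\hat{Y} = \hat{\beta}_{0} + \sum \limits _{j=1} ^{p} X_{j}\hat{\beta}_{j}"

# Inline mode: the equation flows inside the sentence.
inline_cell = "The fitted model is " + wrap_math(linear_model) + " for p predictors."

# Display mode: the equation is set apart from the text.
display_cell = "The fitted model is\n\n" + wrap_math(linear_model, display=True)

print(inline_cell)
print(display_cell)
```

Paste each printed result into its own Markdown cell to compare how the sum and its limits are laid out in the two modes.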

Congratulations, if you have followed the post and reached this far. You have successfully written your first non-trivial Math equation on Jupyter’s markdown.

So far, so good. But this is no time to stop. Let’s pick up the pace a bit.

Gradient Tree Boosting Algorithm

Let’s write all the Math for Gradient Tree Boosting Algorithm!

Take a deep breath, but you need not worry! Just take a glance at the Gradient Tree Boosting Algorithm above and you will find you already know how to write subscripts, a sum with its limits, Math in display mode (BONUS 3 above) and hats. These symbols, along with plain text, are literally most of the algorithm!

So what’s new?

*** : Creates a horizontal line.
&emsp; : Creates a space. (Ref: Space in ‘markdown’ cell of Jupyter Notebook)
\gamma : Creates the gamma symbol.
\displaystyle : Forces display mode (BONUS 3 above). (Ref: Display style in Math mode)
\frac{}{} : Creates a fraction, with the two pairs of curly braces holding the numerator and the denominator.
<br> : Creates a line break.
\Bigg : Helps create big parentheses and brackets. (Ref: Brackets and Parentheses)
\partial : Creates the partial derivative symbol.
\underset{}{} : Writes the content of the first pair of braces under the content of the second. E.g. gamma under arg min, instead of as a subscript. In the algorithm you’ll see both types.
\in : Creates the “belongs to” symbol, which is heavily used in set theory.

There are also some text formatting options: \text writes plain text with spaces, \mathbf writes Math in boldface, and \textrm writes text in roman font. These are things you will figure out with a quick Google search when you want to make things look a certain way, so no need to bother too much about them.
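Before the full algorithm, here are two small warm-up expressions that use only the new symbols listed above. They are illustrative fragments chosen for this purpose, not lines taken from the algorithm. The lines starting with % are notes for the reader; paste only the $ ... $ parts into a Markdown cell.

```latex
% A fraction of partial derivatives, forced into display style, inside big brackets
$\displaystyle \Bigg[\frac{\partial L}{\partial f}\Bigg]$

% gamma written under "arg min" with \underset, and "belongs to" written with \in
$\underset{\gamma}{\textrm{arg min}} \sum \limits _{x_{i} \in R} L(y_{i}, \gamma)$
```

Try removing \displaystyle from the first expression, or swapping \underset{\gamma}{...} for a plain _{\gamma} subscript in the second, to see what each command changes.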

Enough talk, let’s see the entire syntax now!

Gradient Tree Boosting Markdown Syntax for Jupyter

LaTeX Jupyter Markdown syntax for Gradient Tree Boosting Algorithm — Markdown syntax for Gradient Tree Boosting Algorithm and its output in a separate cell

But instead of writing all of that yourself, you may prefer to…

Won’t you?

Here is the Markdown syntax in plain text:

***
$\mathbf{\text{Gradient Tree Boosting Algorithm}}$<br>
***
1.&emsp;Initialize model with a constant value $$f_{0}(x) = \textrm{arg min}_{\gamma} \sum \limits _{i=1} ^{N} L(y_{i}, \gamma)$$
2.&emsp;For m = 1 to M:<br>
&emsp;&emsp;(a)&emsp;For $i = 1,2,...,N$ compute<br>
    $$r_{im} = - \displaystyle \Bigg[\frac{\partial L(y_{i}, f(x_{i}))}{\partial f(x_{i})}\Bigg]_{f=f_{m-1}}$$
&emsp;&emsp;(b)&emsp;Fit a regression tree to the targets $r_{im}$ giving terminal regions<br>
&emsp;&emsp;&emsp;&emsp;$R_{jm}, j = 1, 2, . . . , J_{m}.$<br><br>
&emsp;&emsp;(c)&emsp;For $j = 1, 2, . . . , J_{m}$ compute<br>
$$\gamma_{jm} = \underset{\gamma}{\textrm{arg min}} \sum \limits _{x_{i} \in R_{jm}} L(y_{i}, f_{m-1}(x_{i}) + \gamma)$$
<br>
&emsp;&emsp;(d)&emsp;Update $f_{m}(x) = f_{m-1}(x) + \sum _{j=1} ^{J_{m}} \gamma_{jm} I(x \in R_{jm})$<br><br>
3. Output $\hat{f}(x) = f_{M}(x)$
***

WOW! That was a lot of stuff you went through! Claps claps claps …

BONUS 4: Quick and Criminally Short Explanation of Gradient Tree Boosting Algorithm

Gradient Tree Boosting Algorithm combines Decision Trees in an additive and sequential manner to incrementally make better predictions on training data.

It starts with an initial constant prediction for all data points (which is the mean of the target values in the case of regression with squared-error loss).

In every subsequent iteration, it fits a tree to the negative of the gradient of the loss with respect to the predictions of the model learned so far (which, for regression with squared-error loss, turns out to be the error, i.e. the actual value minus the predicted value).

This new tree is then combined with the previous trees to get updated predictions for each data point.

You stop the algorithm at a preset number of iterations.

PS: This is a very very generic explanation diluting a lot of important details.
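The steps above can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not a real implementation: it uses squared-error loss (so the negative gradient is simply actual minus predicted), a single input feature, and trees with just one split (two terminal regions). The function names and the tiny data set are made up for this sketch.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals.

    Returns (threshold, gamma_left, gamma_right). For squared-error loss the
    best constant in each terminal region is the mean residual of that region.
    """
    best = None
    points = sorted(set(xs))
    for low, high in zip(points, points[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        gamma_left, gamma_right = mean(left), mean(right)
        sse = sum((r - gamma_left) ** 2 for r in left)
        sse += sum((r - gamma_right) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, gamma_left, gamma_right)
    return best[1], best[2], best[3]


def gradient_tree_boost(xs, ys, n_rounds):
    """Steps 1 and 2 of the algorithm, for squared-error loss only."""
    f0 = mean(ys)                      # step 1: constant initial prediction
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):          # step 2: for m = 1 to M
        # (a) negative gradient of squared-error loss = actual - predicted
        residuals = [y - p for y, p in zip(ys, preds)]
        # (b) and (c) fit a tree to the residuals and get its region values
        threshold, g_left, g_right = fit_stump(xs, residuals)
        stumps.append((threshold, g_left, g_right))
        # (d) add the new tree to the running predictions
        preds = [p + (g_left if x <= threshold else g_right)
                 for x, p in zip(xs, preds)]
    return f0, stumps


def predict(model, x):
    """Step 3: the initial constant plus the contribution of every tree."""
    f0, stumps = model
    return f0 + sum(g_left if x <= t else g_right for t, g_left, g_right in stumps)


def mse(model, xs, ys):
    return mean([(y - predict(model, x)) ** 2 for x, y in zip(xs, ys)])


# Made-up toy data: low targets for small x, high targets for large x.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

for rounds in (0, 1, 5):
    model = gradient_tree_boost(xs, ys, rounds)
    print(rounds, "rounds, training MSE:", mse(model, xs, ys))
```

Each extra round fits a new stump to what the previous rounds got wrong, so the training error shrinks as the rounds go up. Real libraries add details this sketch leaves out, such as deeper trees, a learning rate, and other loss functions.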

BONUS 5: Variance Covariance Matrix in Markdown