Writing an Interactive Book 📖 over the Threat Hunter Playbook 🏹 with the help of the Jupyter Book Project 💥

Roberto Rodriguez
Open Threat Research
Dec 18, 2019

Well, I decided to start writing a book 😆, but an interactive book 😎! Yes, a book where I can not only share detection concepts, but also let readers interactively run every analytic from my research in a public computing environment, all from a web browser. It is still a work in progress, but I believe it is time to start putting together some of the projects that I have built so far and share them in a more traditional reading experience, a book, and for free of course 🎄!

In a previous post, I shared how I was able to integrate detections from the Threat Hunter Playbook initiative and pre-recorded datasets from Mordor with the amazing BinderHub project. That was something that I had not seen before, but I felt that I still needed to package all those ideas and resources in a more practical and easy-to-read format.

How can I share the rich interactive Jupyter notebook experience in a book format?

In this post, I will show you how I took all the Jupyter notebooks available in the Threat Hunter Playbook project and some markdown files to create a static website leveraging the amazing Jupyter Book project originally developed by Chris Holdgraf and Sam Lau with support of the UC Berkeley Data Science Education Program and the Berkeley Institute for Data Science.

In addition, I will show you how I create Jupyter notebooks programmatically from YAML files, using a Python library named nbformat, to make it easier to update the Jupyter Book content and the documentation of the project.

Pre-Requirements

I highly recommend reading the following post to get a better understanding of most of the projects I mention here:

What is a Jupyter Book?

Jupyter Book is an open source project for building beautiful, publication-quality books and documents from computational material.

According to the official GitHub repository, Jupyter Book allows users to

  • write their content in markdown files or Jupyter notebooks,
  • include computational elements (e.g., code cells) in either type,
  • include rich syntax such as citations, cross-references, and numbered equations, and
  • using a simple command, run the embedded code cells, cache the outputs and convert this content into a web-based interactive book and a publication-quality PDF.

The result is a set of static HTML pages backed by Jupyter notebooks and markdown files 😉 It is also an online book whose notebook cells can be run interactively, either with a local JupyterHub deployment or online via a public computing environment such as BinderHub.

What is a Static Website?

The concept of a static website comes from taking static files, such as markdown files, and turning them into HTML files that can be deployed online. For example, this is something you can do with GitHub Pages. I find it a very flexible and easy way to deploy a website without needing to know how websites are developed.

A Jupyter Book Recipe

With Jupyter Book, all you need in order to build your own book are the following files:

Installing Jupyter Book CLI

Before continuing with this process, you will need to install the jupyter-book python library. This library allows you to create, build, upgrade, and otherwise control your Jupyter Book. You can install it via pip:

pip install jupyter-book

Or install the latest version from GitHub (still via pip):

pip install git+https://github.com/executablebooks/jupyter-book

The book building process

According to the official Jupyter Book docs, building a Jupyter Book consists of three steps:

1. Create your book template

You just need to run the following command to create a sample book that you can customize by adding your own table of contents, config file, etc. You can call it anything you want, but I use docs to integrate it with GitHub Pages 😉

jupyter-book create docs

After running that command, you will get a new directory in your current path named docs.

====================================================================

Your book template can be found at

    docs/

====================================================================

You will get the following files as part of your book template:

docs/
├── _config.yml
├── _toc.yml
├── content.md
├── intro.md
├── logo.png
├── markdown.md
├── notebooks.ipynb
├── references.bib
├── requirements.txt

It is up to you how you structure your project and how you modify everything.

The _config.yml File (Example)

This file controls the behavior of Jupyter Book, and allows you to define metadata for the book such as title, author, baseurl, and even enable interactive buttons such as a Binder one to interact with any notebook via BinderHub (open infrastructure for open research).

In my case, I opened the default _config.yml file in the book template:

# Book settings
title: My sample book
author: The Jupyter Book Community
logo: logo.png
latex:
  latex_documents:
    targetname: book.tex
and updated it with the following information:

# Book settings
title: "Threat Hunter Playbook"
logo: images/logo/logo.png
author: Roberto Rodriguez @Cyb3rWard0g
email: ""
description: >-
  This is the first interactive Jupyter book in the infosec community!
execute:
  execute_notebooks: cache
html:
  favicon: images/logo/favicon.ico
  home_page_in_navbar: false
  use_edit_page_button: true
  use_repository_button: true
  use_issues_button: true
  baseurl: https://threathunterplaybook.com/
repository:
  url: https://github.com/hunters-forge/ThreatHunter-Playbook
  branch: master
  path_to_book: docs
launch_buttons:
  notebook_interface: "classic"
  binderhub_url: "https://mybinder.org"
  colab_url: "https://colab.research.google.com"
  thebe: true

2. Convert each page of your book into HTML

You can use the jupyter-book library to convert each page of your book into HTML. This converts your .ipynb, .md, and other source files into HTML that a website can serve. See the “building your book” section of the official docs for more information. Run the following command:

jupyter-book build docs/

You will see a new folder named _build, which will host all your markdown and notebook files in HTML format, similar to the ones in here.

3. Publish Your Book Online

Once you have HTML files for each page, you will need to put them all together as a standalone HTML website that can be hosted online. See the “publish your book online” section of the official docs for more information. I built mine leveraging GitHub Pages 🍻.

One thing to remember is that we are not using Jekyll as our static site generator with Jupyter Book. We are using Sphinx, a Python documentation generator.

I use the ghp-import Python library to push my Jupyter Book content to GitHub and skip the Jekyll build. ghp-import copies the docs/_build/html folder (my Jupyter Book) and sends it over to a gh-pages branch. If the branch does not exist, it will be created for you.

You can install ghp-import via pip:

pip install ghp-import

Next, I run the following command at the root directory of the Threat Hunter Playbook repository:

ghp-import -n -p -c threathunterplaybook.com -f docs/_build/html
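For readers who prefer to script these two steps, the build and publish commands can be wrapped in a few lines of Python. This is only a minimal sketch, not the script used by the project: the helper names publish_command and build_and_publish are hypothetical, and it assumes that both jupyter-book and ghp-import are installed and on the PATH. The flags mirror the command above: -n adds a .nojekyll file, -p pushes the branch, -c writes a CNAME file for the custom domain, and -f forces the push.

```python
import subprocess
from pathlib import Path


def publish_command(html_dir, cname):
    """Return the ghp-import command line as a list of arguments."""
    return [
        "ghp-import",
        "-n",          # add a .nojekyll file so GitHub Pages skips Jekyll
        "-p",          # push the gh-pages branch after committing
        "-c", cname,   # write a CNAME file for the custom domain
        "-f",          # force the push
        str(html_dir),
    ]


def build_and_publish(book_dir="docs", cname="threathunterplaybook.com"):
    """Build the book, then publish the generated HTML (hypothetical helper)."""
    subprocess.run(["jupyter-book", "build", book_dir], check=True)
    html_dir = Path(book_dir) / "_build" / "html"
    if not html_dir.is_dir():
        raise FileNotFoundError(f"expected build output in {html_dir}")
    subprocess.run(publish_command(html_dir, cname), check=True)
```

Calling build_and_publish() from the root of the repository would run both steps in order and stop at the first command that fails.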

Finally, I go to my repository settings and make sure I set the source of my site to the gh-pages branch. I also add a custom domain for it 😉

Once the GitHub Pages feature is enabled successfully in my repository, I can go to the following site and see the whole project in a book format:

https://threathunterplaybook.com

That’s it 🙏?

Yeah, that’s pretty much it from a Jupyter Book perspective! However, I added some extra automation on top of this process to make contributing to the book and updating it much easier for future builds.

Automating the Creation of Notebooks 🏗

As you already know, the Threat Hunter Playbook project documents detection strategies in the form of interactive notebooks. They provide an easy and flexible way to visualize the expected output and to run the analytics against pre-recorded Mordor datasets through BinderHub cloud computing environments. This has been a great integration for the project; however, not everyone has the skills, resources or time to create Jupyter hunting notebooks from scratch. Therefore, I needed to find a way to automate the creation of Jupyter notebooks from a template with an easy-to-use format. What if I documented every Threat Hunter Playbook as a YAML file and converted them into notebooks via code? How would I do that?

Enter NBFormat

nbformat contains the reference implementation of the Jupyter Notebook format, and Python APIs for working with notebooks.
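To make the “notebook format” concrete: an .ipynb file is a JSON document, and nbformat is the library that creates and validates it. The sketch below builds a minimal version 4 notebook using only the standard library, purely to illustrate the structure. The helper name minimal_notebook is hypothetical, and in practice you would let nbformat generate and validate these fields for you.

```python
import json


def minimal_notebook(markdown_text, code_text):
    """Build a bare-bones notebook (format version 4) as a plain dictionary."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": markdown_text},
            {
                "cell_type": "code",
                "metadata": {},
                "source": code_text,
                "execution_count": None,  # the cell has not been run yet
                "outputs": [],
            },
        ],
    }


nb = minimal_notebook("# Remote Service Creation", "print('hello')")
notebook_json = json.dumps(nb, indent=1)  # this string is what an .ipynb file holds
```

Every notebook is a list of cells plus some metadata, which is why it lends itself to being generated from a structured source such as YAML.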

The Threat Hunter Playbook notebooks always follow a similar format and structure. Therefore, I started testing whether I could use the nbformat APIs and some Python code to create notebooks automatically from a YAML file.

Threat Hunter Playbook Format

The following document summarizes the main sections of every playbook provided by the project.

Threat Hunter Playbook Notebook

So far, this is how I document every single playbook to share with the community. However, how can you also create something similar without standing up your own Jupyter Notebook server?

Constructing Notebooks Programmatically 😎

You can use nbformat Python APIs to create markdown and code cells. The following is a basic example to show you how easy it is to create a notebook with a few lines of code:

Create Notebook Object

  • Import the nbformat library
  • Create a new notebook object
  • Initialize the notebook cells as an empty list

import nbformat as nbf

nb = nbf.v4.new_notebook()
nb['cells'] = []

Create a Markdown Cell

nb['cells'].append(nbf.v4.new_markdown_cell("# Remote Service Creation"))

Create a Code Cell

nb['cells'].append(nbf.v4.new_code_cell(
"""from openhunt.mordorutils import *
spark = get_spark()"""
))

Write Notebook to file

nbf.write(nb, "test.ipynb")

The result will be the following notebook 😱

All I need to do now is pass the Threat Hunter Playbook format as a YAML file through the nbformat APIs and create a Jupyter Notebook.

YAML File > Jupyter Notebook 💥😱

I took the Jupyter Notebook used as the main example in this post and translated it into the following YAML file:

Next, I created the following Python script, as a proof of concept, to convert the YAML file to a Jupyter Notebook with markdown and code cells derived from the YAML file.

The result was the following Jupyter Notebook with the same format as the one shown before:
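The conversion described above can be sketched roughly as follows. This is not the proof-of-concept script itself, only an illustration under stated assumptions: the playbook is shown as an already parsed dictionary (the kind of object yaml.safe_load would return), and the field names title, hypothesis, analytics, name and logic, as well as the sample query, are hypothetical rather than the project's actual YAML schema.

```python
import nbformat as nbf

# Hypothetical playbook, shaped like the dictionary a YAML parser would return.
playbook = {
    "title": "Remote Service Creation",
    "hypothesis": "Adversaries might be creating new services remotely.",
    "analytics": [
        {"name": "Analytic I", "logic": "SELECT * FROM mordorTable LIMIT 5"},
    ],
}


def playbook_to_notebook(playbook):
    """Convert a parsed playbook dictionary into a notebook object."""
    nb = nbf.v4.new_notebook()
    cells = [
        nbf.v4.new_markdown_cell("# " + playbook["title"]),
        nbf.v4.new_markdown_cell("## Hypothesis\n" + playbook["hypothesis"]),
    ]
    for analytic in playbook["analytics"]:
        cells.append(nbf.v4.new_markdown_cell("### " + analytic["name"]))
        # Each analytic becomes a code cell that runs its query with Spark SQL.
        code = 'df = spark.sql("""' + analytic["logic"] + '""")\ndf.show()'
        cells.append(nbf.v4.new_code_cell(code))
    nb["cells"] = cells
    return nb


nb = playbook_to_notebook(playbook)
nbf.validate(nb)                # raises if the notebook does not match the schema
notebook_json = nbf.writes(nb)  # serialize the notebook to a JSON string
```

Swapping the last line for nbf.write(nb, "some_name.ipynb") would save the notebook to disk, as in the basic example earlier.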

YAML Playbooks > Jupyter Book Notebooks 🍻

Finally, I took this concept and added it to the Python script I use to update my Jupyter Book. One difference from the proof-of-concept script is that, instead of converting a single file, I iterate over all the YAML files I created from the notebooks available in the Threat Hunter Playbook. All the notebooks created by the script are stored in folders inside the Jupyter Book docs>content>notebooks folder and categorized following the MITRE ATT&CK structure (Platform>Tactic).
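The folder layout described above can be sketched as a small helper that maps each playbook to its output path. This is an illustration only: the platform, tactic and id keys, the sample values, and the lowercase folder naming are assumptions, not the project's actual field names or directory names.

```python
from pathlib import PurePosixPath

NOTEBOOKS_ROOT = PurePosixPath("docs/content/notebooks")


def notebook_path(playbook, root=NOTEBOOKS_ROOT):
    """Return root/<platform>/<tactic>/<id>.ipynb for a parsed playbook."""
    platform = playbook["platform"].lower()
    tactic = playbook["tactic"].lower().replace(" ", "_")
    return root / platform / tactic / (playbook["id"] + ".ipynb")


# Hypothetical playbooks, as they might look after parsing the YAML files.
playbooks = [
    {"id": "playbook-001", "platform": "Windows", "tactic": "Lateral Movement"},
    {"id": "playbook-002", "platform": "Windows", "tactic": "Execution"},
]

paths = [str(notebook_path(p)) for p in playbooks]
```

A loop over the YAML files would compute a path like this for each playbook, create the parent folders if needed, and write the generated notebook there.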

How is this an Interactive Book Again 😆?

Great question! According to the Jupyter Book docs:

because Jupyter Books are built with Jupyter Notebooks, you can connect your online book with a Jupyter kernel running in the cloud. This lets readers quickly interact with your content in a traditional coding interface using either JupyterHub or BinderHub.

If you read this previous post, you already know that I use BinderHub public infrastructure in order to share my research with other hunters in the 🌎. Therefore, I enabled the Binder button at the top of every page that is backed by a Jupyter notebook as shown below:

All you have to do is click the Interact button, and you will be taken to BinderHub’s public infrastructure, where the notebook will be hosted by a server created from my own BinderHub repository (Dockerfile).

There you will be able to interactively run analytics provided in the notebooks as shown below:
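For reference, a Binder launch link on mybinder.org follows a predictable pattern. The sketch below assembles such a URL from the repository settings shown earlier in _config.yml. It is an approximation of what a launch button points to, not Jupyter Book's own code: the helper name binder_url and the notebook path are hypothetical, and it assumes the notebook is opened through the filepath query parameter from the same repository and branch.

```python
from urllib.parse import quote, urlencode


def binder_url(org, repo, branch, notebook_path, hub="https://mybinder.org"):
    """Build a Binder launch URL that opens one notebook from a GitHub repo."""
    query = urlencode({"filepath": notebook_path})
    return f"{hub}/v2/gh/{org}/{repo}/{quote(branch, safe='')}?{query}"


url = binder_url(
    org="hunters-forge",
    repo="ThreatHunter-Playbook",
    branch="master",
    notebook_path="docs/content/notebooks/example.ipynb",  # hypothetical path
)
print(url)
```

Opening a link of this shape asks Binder to build (or reuse) an image for the repository and then start a server with that notebook loaded.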

Experimental Feature: Interactive Code Directly on the Page

According to the Jupyter Book docs

If you’d like to provide interactivity for your content without making your readers leave the Jupyter Book site, you can use a project called Thebelab.

This provides you a button that, when clicked, will convert each code cell into an interactive cell that can be edited. It also adds a “run” button to each cell, and connects to a Binder kernel running in the cloud.

I enabled it in my Jupyter Book as shown below:

Remember that this is still an experimental feature! In the meantime, I still ❤️ the BinderHub redirection with the Interact button to use other features available in the Jupyter Notebook interface.

Future Work

  • Add more chapters to the Jupyter Book!
  • Add more analytics in other platforms!
  • I added a basic Jupyter Notebooks tutorial for this first release. However, I want to add more tutorials to show the power of notebooks in our industry.
  • I am preparing some material for a few workshops and training classes for 2020 using all the projects mentioned in this post, so stay tuned! 😃 🍻 💜

That’s it! I hope you enjoyed this post! I’m so happy I got the Jupyter Book to work and added other enhancements to the projects I have been working on and updating for the last few months. My goal is to write more about every open source project I have worked on in the past couple of years and put it all together in this Jupyter Book 🙏

Book Link: http://threathunterplaybook.com/

Finally, with this new project, I am also very happy and honored to share that I just got accepted to the GitHub Sponsors program! 😱 If you would like to fund my contributions and become a sponsor of my open source work, you can read about it here: https://github.com/sponsors/Cyb3rWard0g Thank you so much in advance 🙏 😊🍻

References

https://jupyterbook.org/intro.html

https://nbformat.readthedocs.io/en/latest/index.html

https://threathunterplaybook.com/
