Numba: Accelerating Python Code for Quantitative Finance

Jakub Polec
6 min read · Jun 5, 2024


In this post you will read about:

  • An introduction to Numba, a JIT compiler for Python, beneficial for optimizing quant financial calculations.
  • Examples showing how Numba’s @jit decorator accelerates functions, such as the Mandelbrot set and Monte Carlo simulations.
  • Tips for Python code optimization with Numba and its use in the QuantJourney Framework.

COMPLETE CODE AND TEXT AT https://quantjourney.substack.com/

Introduction

In quantitative work, we often need to optimize Python code to get results faster. Python has become a go-to language thanks to its simplicity, versatility, and vast ecosystem of libraries. However, for computationally intensive tasks such as financial data analysis, its interpreted nature can lead to performance bottlenecks. This is where Numba comes in. In this post, we’ll explore Numba and its main decorator, numba.jit, and show how it can significantly boost the performance of your Python code.

You may read more at https://github.com/numba

Please note all code of QuantJourney Framework is available to paid subscribers.

What’s Inside our framework?

The QuantJourney Trading Framework is an investing package designed to streamline your access to financial data, simplify data processing, and enhance data visualization. For data retrieval, it offers methods covering a wide range of asset classes: equities, ETFs, CFDs, cryptocurrencies, bonds, futures, forex, commodities, macro trends, REITs, and indices.

The framework offers a variety of efficient data connectors, including EOD, FMP, FRED, Quandl, OANDA, CCXT (Binance, Coinbase, etc.), SEC, Y-Finance, CNNFG, and TipRanks. By leveraging asynchronous access and threaded processing, QuantJourney accelerates the data acquisition process, saving you valuable time and resources.

The Data Manager module ensures seamless data storage and retrieval across various databases, such as ArcticDB, MongoDB, S3, KDB+, and Redis, giving you the flexibility to choose the database that best suits your needs.

The framework also includes a Backtesting Engine combined with an execution engine for IBKR, providing a suite of tools for developing and testing trading strategies.

You can browse the modules and methods of the QuantJourney Trading Framework at https://shorturl.at/MNU59

What is Numba?

Numba is an open-source JIT (Just-In-Time) compiler for Python, primarily used for numerical and scientific computing. It leverages the LLVM compiler infrastructure to compile Python code into machine code, significantly enhancing execution speed compared to interpreted Python code. Numba is especially efficient for code that includes loops, mathematical operations, and array manipulations.

Numba supports two compilation modes:

  1. Nopython mode (nopython=True, the default for @jit since Numba 0.59): In this mode, Numba compiles the function into machine code that runs without calling back into the Python interpreter. It offers the highest performance gains, but every operation in the function must be one that Numba can type and compile; arbitrary Python objects are not supported.
  2. Object mode (forceobj=True; before Numba 0.59, @jit also fell back to it automatically when nopython compilation failed): In this mode, Numba compiles the function to machine code that still handles values as Python objects through the Python runtime. This permits arbitrary Python objects and features, but it usually delivers much smaller speedups than nopython mode.

When you apply Numba’s @jit decorator to a Python function, nothing is compiled right away. On the first call, Numba analyzes the function's bytecode, translates it into its own intermediate representation (IR), infers the types of all variables from the arguments passed in, and lowers the typed IR to LLVM IR. LLVM then applies optimizations such as loop unrolling and vectorization and emits native machine code that can run much faster than the original Python code. The compiled version is kept in memory per argument-type signature, so later calls with the same types skip compilation.
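To make the compile-on-first-call behaviour concrete, here is a minimal sketch that times two calls to the same jitted function. The function name sum_of_squares and the problem size are illustrative assumptions, and the try/except fallback is an addition so that the snippet still runs as plain Python where Numba is not installed.

```python
import time

try:
    from numba import njit  # njit is shorthand for jit(nopython=True)
except ImportError:
    # Assumption for this sketch: no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Sum i*i for i in [0, n) with an explicit loop."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def timed_call(n):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = sum_of_squares(n)
    return result, time.perf_counter() - start


first_result, first_secs = timed_call(1_000_000)    # includes JIT compilation
second_result, second_secs = timed_call(1_000_000)  # reuses the compiled code
print(f"first call:  {first_secs:.4f} secs")
print(f"second call: {second_secs:.4f} secs")
```

With Numba installed, the first call is typically much slower than the second, which is why benchmarks of jitted code usually include a warm-up call before timing.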

Let’s look at a simple example:

import numpy as np
from numba import jit

# Logger and timing decorator from the QuantJourney Framework
from quantjourney.data.utils.data_logs import data_logger
from quantjourney.other.decorators import timer

logger = data_logger()


# Mandelbrot function with Numba JIT
@jit(nopython=True)
def mandelbrot_numba(x, y, max_iters):
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        # Escape test: |z|^2 >= 4
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i
    return max_iters


# Mandelbrot function without Numba
def mandelbrot_no_numba(x, y, max_iters):
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i
    return max_iters


# Mandelbrot set function with Numba JIT
# Note: the loops below use plain range, which Numba runs serially even with
# parallel=True; parallelizing them requires numba.prange.
@jit(nopython=True, parallel=True)
def mandelbrot_set_numba(xmin, xmax, ymin, ymax, width, height, max_iters):
    r1 = np.linspace(xmin, xmax, width)
    r2 = np.linspace(ymin, ymax, height)
    n3 = np.empty((width, height))
    for i in range(width):
        for j in range(height):
            n3[i, j] = mandelbrot_numba(r1[i], r2[j], max_iters)
    return n3


# Mandelbrot set function without Numba
def mandelbrot_set_no_numba(xmin, xmax, ymin, ymax, width, height, max_iters):
    r1 = np.linspace(xmin, xmax, width)
    r2 = np.linspace(ymin, ymax, height)
    n3 = np.empty((width, height))
    for i in range(width):
        for j in range(height):
            n3[i, j] = mandelbrot_no_numba(r1[i], r2[j], max_iters)
    return n3


@timer
def run_mandelbrot_set_numba(xmin, xmax, ymin, ymax, width, height, max_iters):
    return mandelbrot_set_numba(xmin, xmax, ymin, ymax, width, height, max_iters)


@timer
def run_mandelbrot_set_no_numba(xmin, xmax, ymin, ymax, width, height, max_iters):
    return mandelbrot_set_no_numba(xmin, xmax, ymin, ymax, width, height, max_iters)


# Running and timing the functions (the Numba timing includes JIT compilation)
run_mandelbrot_set_numba(-2.0, 1.0, -1.0, 1.0, 1000, 1000, 80)
run_mandelbrot_set_no_numba(-2.0, 1.0, -1.0, 1.0, 1000, 1000, 80)

In this example, we use Numba to accelerate the computation of the Mandelbrot set, a well-known fractal in the complex plane. The mandelbrot_numba function returns the number of iterations after which a given point escapes (|z|² ≥ 4), or max_iters if it never does. The mandelbrot_set_numba function builds the full image by evaluating every point of a width × height grid. Compiling both with Numba's @jit decorator and nopython=True speeds up the computation significantly. The run was timed both with and without parallel=True:

Finished 'run_mandelbrot_set_numba' in 0.4413 secs (with Numba, parallel=False)
Finished 'run_mandelbrot_set_numba' in 0.7587 secs (with Numba, parallel=True)
Finished 'run_mandelbrot_set_no_numba' in 3.2974 secs (without Numba)

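Using Numba makes this calculation roughly 7.5 times faster (3.30 secs vs. 0.44 secs). However, enabling parallel execution (parallel is False by default) yields a smaller gain here, about 4.3 times. Two likely contributors: each timing covers a single call, so it includes JIT compilation, which takes longer with parallel=True; and the loops use plain range, which Numba does not parallelize (that requires numba.prange).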
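A minimal sketch of what a parallel version could look like, assuming the outer loop is switched to numba.prange so that Numba can distribute the columns across threads. The names escape_iterations and mandelbrot_set_parallel are hypothetical, this is not the framework's code, and no timings are claimed for it. The try/except fallback is an addition so the snippet still runs serially without Numba.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumption for this sketch: serial fallback when Numba is unavailable.
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def escape_iterations(x, y, max_iters):
    """Iterations until |z|^2 >= 4 for c = x + yj, capped at max_iters."""
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        if z.real * z.real + z.imag * z.imag >= 4.0:
            return i
    return max_iters


@njit(parallel=True)
def mandelbrot_set_parallel(xmin, xmax, ymin, ymax, width, height, max_iters):
    xs = np.linspace(xmin, xmax, width)
    ys = np.linspace(ymin, ymax, height)
    out = np.empty((width, height))
    # prange marks the outer iterations as independent, so they may run in
    # parallel; each iteration writes only to its own row of `out`.
    for i in prange(width):
        for j in range(height):
            out[i, j] = escape_iterations(xs[i], ys[j], max_iters)
    return out


grid = mandelbrot_set_parallel(-2.0, 1.0, -1.0, 1.0, 200, 200, 80)
print(grid.shape)
```

Whether this beats the serial jitted version depends on grid size, core count, and compile time, so it is worth timing a second (warm) call on your own machine.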

Now, let’s optimize a Monte Carlo simulation with Numba, an example closer to quant finance:

import numpy as np
from numba import jit

# Logger and timing decorator from the QuantJourney Framework
from quantjourney.data.utils.data_logs import data_logger
from quantjourney.other.decorators import timer

logger = data_logger()


@timer
def monte_carlo_simulation_without_numba(S, K, r, sigma, T, num_simulations):
    payoffs = np.zeros(num_simulations)

    for i in range(num_simulations):
        # Terminal price under geometric Brownian motion
        epsilon = np.random.standard_normal()
        S_T = S * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * epsilon)
        # European call payoff
        payoffs[i] = max(S_T - K, 0)

    # Discount the average payoff back to today
    option_price = np.exp(-r * T) * np.mean(payoffs)
    return option_price


# @timer wraps the jitted function, so the timing includes JIT compilation
@timer
@jit(nopython=True)
def monte_carlo_simulation_with_numba(S, K, r, sigma, T, num_simulations):
    payoffs = np.zeros(num_simulations)

    for i in range(num_simulations):
        epsilon = np.random.standard_normal()
        S_T = S * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * epsilon)
        payoffs[i] = max(S_T - K, 0)

    option_price = np.exp(-r * T) * np.mean(payoffs)
    return option_price


# Parameters
S = 100.0  # Underlying stock price
K = 105.0  # Strike price
r = 0.05  # Risk-free interest rate
sigma = 0.2  # Volatility
T = 1.0  # Time to maturity (in years)
num_simulations = 10_000_000  # Number of simulations

# Perform the Monte Carlo simulations
option_price_without_numba = monte_carlo_simulation_without_numba(S, K, r, sigma, T, num_simulations)
option_price_with_numba = monte_carlo_simulation_with_numba(S, K, r, sigma, T, num_simulations)
print(f"Option Price (Without Numba): ${option_price_without_numba:.2f}")
print(f"Option Price (With Numba): ${option_price_with_numba:.2f}")

The Numba version runs over fifteen times faster (13.51 secs vs. 0.87 secs), mostly because the plain-Python version pushes ten million loop iterations through the interpreter, while Numba compiles the whole loop to machine code.

Finished 'monte_carlo_simulation_without_numba' in 13.5076 secs
Finished 'monte_carlo_simulation_with_numba' in 0.8714 secs
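Because the payoff here is a European call under geometric Brownian motion, the simulated price can be sanity-checked against the Black-Scholes closed form. The sketch below is an illustration under stated assumptions rather than the framework's code: the function names are hypothetical, it draws normals with Python's random module (which Numba supports in nopython mode), and it accumulates a running sum instead of storing every payoff. The seed is set inside the jitted function because Numba keeps its own random state, separate from the interpreter's. The try/except fallback is an addition so the snippet also runs without Numba.

```python
import math
import random

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def mc_call_price(S, K, r, sigma, T, num_simulations, seed):
    """Monte Carlo price of a European call under geometric Brownian motion."""
    random.seed(seed)  # seeds Numba's own generator when compiled
    drift = (r - 0.5 * sigma ** 2) * T
    vol = sigma * math.sqrt(T)
    payoff_sum = 0.0
    for _ in range(num_simulations):
        S_T = S * math.exp(drift + vol * random.gauss(0.0, 1.0))
        payoff_sum += max(S_T - K, 0.0)
    return math.exp(-r * T) * payoff_sum / num_simulations


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, K, r, sigma, T):
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


# Same parameters as the post, with fewer paths to keep the check quick
mc_price = mc_call_price(100.0, 105.0, 0.05, 0.2, 1.0, 200_000, 42)
bs_price = black_scholes_call(100.0, 105.0, 0.05, 0.2, 1.0)
print(f"Monte Carlo: {mc_price:.4f}  Black-Scholes: {bs_price:.4f}")
```

The two prices should agree to within Monte Carlo sampling error, which shrinks in proportion to one over the square root of the number of paths.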

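We use Numba heavily in the QuantJourney Framework, e.g., in the DataNum class, which provides pre-defined NumPy operations for quant finance. The excerpt below, which begins partway through the docstring, comes from a helper that applies a function along one axis of a 2D array: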

    Args:
        func (Callable): The function to apply.
        axis (int): The axis along which to apply the function.
        array (np.ndarray): The input 2D array.
    Returns:
        np.ndarray: The resulting array after applying the function.
    """
    assert array.ndim == 2
    assert axis in [0, 1]
    if axis == 0:
        result = np.zeros((1, array.shape[1]))
        for i in range(array.shape[1]):
            result[0, i] = func(array[:, i])
    else:
        result = np.zeros((array.shape[0], 1))
        for i in range(array.shape[0]):
            result[i] = func(array[i, :])
    return result
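Since the excerpt does not show the function's signature, here is a self-contained sketch of the same idea. The name apply_along_axis_2d, the argument order, and the helper slice_mean are assumptions made for illustration, not the framework's actual API. One point worth noting: in nopython mode, the function passed in must itself be Numba-compiled. The try/except fallback is an addition so the snippet also runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def slice_mean(values):
    """Mean of a 1D slice; jitted so it can be passed to another jitted function."""
    return values.mean()


@njit
def apply_along_axis_2d(func, axis, array):
    """Apply func to each column (axis=0) or each row (axis=1) of a 2D array."""
    assert array.ndim == 2
    assert axis == 0 or axis == 1
    if axis == 0:
        result = np.zeros((1, array.shape[1]))
        for i in range(array.shape[1]):
            result[0, i] = func(array[:, i])
    else:
        result = np.zeros((array.shape[0], 1))
        for i in range(array.shape[0]):
            result[i, 0] = func(array[i, :])
    return result


# Hypothetical sample data: 2 periods of returns for 3 assets
returns = np.array([[0.01, -0.02, 0.03],
                    [0.03, 0.00, -0.01]])
per_asset = apply_along_axis_2d(slice_mean, 0, returns)   # shape (1, 3)
per_period = apply_along_axis_2d(slice_mean, 1, returns)  # shape (2, 1)
print(per_asset.shape, per_period.shape)
```

Numba compiles a separate specialization for each distinct function passed in, so this pattern pays off most when the same few reducers are reused many times.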

For your reference, here are some hints for optimization with Numba:

  1. Use nopython mode whenever possible for maximum performance gains.
  2. Leverage parallel execution with parallel=True (together with numba.prange) for computationally intensive loops, but measure the result: for some functions it actually slows the calculation down.
  3. Avoid using Python objects and features inside Numba-compiled functions in nopython mode.
  4. Minimize the use of branches (if-else statements) and function calls inside compiled functions.
  5. Use Numba-compatible data types, such as NumPy arrays, for optimal performance.
  6. Profile your code to identify performance bottlenecks and focus optimization efforts on those areas.
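As an illustration of hints 1, 3, and 5, here is a minimal sketch of a loop-heavy calculation written for nopython mode: it takes a NumPy array and works only with scalars inside the loop. The max_drawdown function and the sample prices are illustrative assumptions, not part of the QuantJourney Framework, and the try/except fallback is an addition so the snippet also runs without Numba.

```python
import numpy as np

try:
    from numba import njit  # njit is shorthand for jit(nopython=True)
except ImportError:
    # Assumption for this sketch: no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def max_drawdown(prices):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = prices[0]
    worst = 0.0
    for i in range(1, prices.shape[0]):
        if prices[i] > peak:
            peak = prices[i]
        drawdown = (peak - prices[i]) / peak
        if drawdown > worst:
            worst = drawdown
    return worst


# Hypothetical price series: the peak is 120 and the later trough is 80
prices = np.array([100.0, 120.0, 90.0, 110.0, 80.0])
print(f"Max drawdown: {max_drawdown(prices):.2%}")
```

Passing a Python list instead of a NumPy array would force Numba to convert or reject the input, which is the kind of overhead hint 5 is meant to avoid.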
