First post: what and why

Published in decomplexify, Jan 3, 2021

Explaining topics from machine learning, decision science and model risk management via first principles and visual examples

decomplexify: the act of causing the complicated to become simpler

This publication is a collection of articles that attempt to clarify and simplify quantitative topics in an intuitive fashion. Often I see material that assumes technical knowledge as a prerequisite, even when it is explaining introductory concepts. This blog is intended as an alternative for those who prefer to learn visually and via examples.

The focus is mathematical modeling in finance, specifically applications in the quantitative investment industry. In this setting, the passage of time typically plays a critical role and affects how useful historical data is for modeling future outcomes.

This publication is intended for three different audiences:

My current students who may be meeting these topics for the first time. These posts will deliver additional intuition and visual explanations of concepts beyond those covered in class;
Experienced industry practitioners who may have met these topics many years ago but now want a simple refresher on the statistical foundations of ML and AI. These posts will also deliver observations on the practical challenges, assumptions and limitations of applying these techniques within quantitative investment management;
Finally, for myself: to organize my current teaching material and act as a pipeline of new topics.

The topics for these articles come from two sources:

Questions my previous students have asked;
Observations from experience in the quantitative investment industry.

Future posts will cover, in no particular order:

Measures of association:

Measures of association between variables: disentangling covariance, correlation and beta (OLS, CAPM, multi-factor models). Defined from first principles and defined visually. How are they related to each other? Properties of data versus properties of a model.
Covariance: what does it really show? How easily can one extreme data point affect it? If the mean is hard to predict (and affected by outliers), how is the covariance prediction affected?
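As a small preview of the covariance question above, here is a minimal sketch in plain Python. The data are made up for illustration, and `covariance` is a hypothetical helper written for this example (it computes the sample covariance, dividing by n - 1). It shows how one extreme data point can flip the sign of the estimate.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Made-up data: y falls as x rises, so the covariance is negative.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 4.0, 3.0, 2.0, 1.0]
cov_clean = covariance(x, y)

# Append a single extreme observation to both series.
cov_outlier = covariance(x + [50.0], y + [50.0])

print("without outlier:", cov_clean)
print("with one outlier:", cov_outlier)
```

With these numbers the estimate moves from negative to positive: the outlier shifts both means and then dominates the sum of cross-products.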

Measures of model fit:

Measures of model fit: R², RMSE, F-test, bias and other error measures (MAE, MAPE, etc.).
Root Mean Square Error: What it is and defined visually. What it is used for (from OLS to Deep Learning).
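To make the error measures above concrete, here is a minimal sketch with made-up numbers. The helper names `rmse` and `mae` are my own for this example. Both measures summarize the same forecast errors, but RMSE squares each error before averaging, so a single large miss counts for more.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square the errors, average, take the root."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: average of the absolute errors."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Made-up example: three perfect predictions and one large miss.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
print("RMSE:", rmse_value)
print("MAE: ", mae_value)
```

Here the one miss of 4 gives an MAE of 1 but an RMSE of 2, which is why RMSE is the more outlier-sensitive of the two.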

Probability distributions:

Central Limit Theorem: defined visually. What it means for portfolio construction. Specifically, when is the assumption of normally distributed returns acceptable and when is it not?
Log-Normal versus Normal Distributions: definitions, differences and practical use cases of each to model asset returns.
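As a rough preview of the Central Limit Theorem item above, the sketch below simulates averages of uniform random draws. The sample size, number of trials and seed are arbitrary choices for illustration, and independent uniform draws are an assumption that real asset returns may not satisfy. The spread of the sample means should be close to the theoretical value of sigma divided by the square root of n.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30          # observations per sample (arbitrary choice)
trials = 2000   # number of samples (arbitrary choice)

# Each entry is the mean of n independent uniform(0, 1) draws.
sample_means = [
    statistics.mean(random.random() for _ in range(n))
    for _ in range(trials)
]

observed_spread = statistics.stdev(sample_means)

# A uniform(0, 1) variable has variance 1/12, so the mean of n draws
# has standard deviation sqrt(1 / (12 * n)).
theoretical_spread = math.sqrt(1.0 / (12.0 * n))

print("observed spread of sample means:", observed_spread)
print("theoretical spread:             ", theoretical_spread)
```

A histogram of `sample_means` would look approximately bell-shaped even though each individual draw is uniform.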

Modeling systems in general:

Ergodicity vs Stationarity: properties of the system versus assumptions of a model
Types of model defined: Stochastic / Probabilistic / Deterministic
Linear regression using non-linear terms: x², abs(x), max(x,0), i.e. “rectilinear”!
From regression to deep learning in 3 simple steps. Step 1: OLS as a single-layer, single-neuron network with a linear activation function, then with several inputs (for multivariate OLS). Step 2: multiple layers. Step 3: non-linear activation functions.
Is deep learning still a model?
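The first step of the regression-to-deep-learning item above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the data are made up (they lie exactly on y = 2x + 1), and the learning rate and number of steps are arbitrary. A single neuron with a linear activation, trained by gradient descent on mean squared error, should arrive at the same slope and intercept as the closed-form OLS solution.

```python
# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
n = len(xs)

# Closed-form OLS for one input: slope = cov(x, y) / var(x).
mean_x = sum(xs) / n
mean_y = sum(ys) / n
ols_slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
ols_intercept = mean_y - ols_slope * mean_x

# Single neuron with linear activation: prediction = w * x + b.
w, b = 0.0, 0.0
learning_rate = 0.05  # arbitrary choice, small enough to converge here
for _ in range(5000):
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print("OLS:    slope", ols_slope, "intercept", ols_intercept)
print("neuron: weight", w, "bias", b)
```

The weight plays the role of the OLS slope and the bias plays the role of the intercept; the later steps add layers and non-linear activations on top of this building block.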

Modeling in finance:

Log returns and simple returns: worked example to illustrate converting price series to each return series and back to prices again. Summary stats for each. When each is appropriate (or not appropriate) to be used.
Portfolio Construction and Optimisation: mountain climbing analogy, convex versus non-convex objective functions, sensitivity to initial conditions.
Challenges of Deep Learning & the necessary conditions of systems for it to be useful.
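For the log-return and simple-return item above, here is a minimal sketch of the round trip from prices to returns and back. The price series is made up for illustration. Simple returns compound by multiplication, while log returns add up over time.

```python
import math

# Made-up price series.
prices = [100.0, 110.0, 99.0, 108.9]

# Simple return: r_t = P_t / P_{t-1} - 1
simple_returns = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

# Log return: l_t = ln(P_t / P_{t-1})
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Back to prices from simple returns: multiply by (1 + r_t) each period.
rebuilt_from_simple = [prices[0]]
for r in simple_returns:
    rebuilt_from_simple.append(rebuilt_from_simple[-1] * (1.0 + r))

# Back to prices from log returns: exponentiate the running sum.
rebuilt_from_log = [prices[0]]
running_sum = 0.0
for l in log_returns:
    running_sum += l
    rebuilt_from_log.append(prices[0] * math.exp(running_sum))

print("simple returns:", simple_returns)
print("log returns:   ", log_returns)
print("rebuilt (simple):", rebuilt_from_simple)
print("rebuilt (log):   ", rebuilt_from_log)
```

The sum of the log returns equals the log of the final price divided by the initial price, which is the additivity property that makes log returns convenient across time.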

Model Risk Management:

Model Risk as 3 components of uncertainty: [1] Process risk: the inherent uncertainty of a system; [2] Parameter risk: uncertainty in parameter estimation; [3] Model Specification risk: the risk the assumed model structure is wrong.
Model Risk Management for trading strategies: use and abuse of back-testing. Back-testing is not about the Sharpe Ratio! Sensitivity analysis, alpha decay, temporal analysis, random portfolios, and ongoing monitoring.
Knowing when not to build a model: characteristics of a system that make it easier or harder to model. The difference between precision and accuracy. Examples of modeling in the finance industry: collateralized bank loans (probability of default versus loss given default), claims on insurance policies (expected loss versus catastrophe modeling), and portfolio construction parameters in investment management (expected return versus volatility versus correlation between assets).
Data risk as a component of model risk: the difference between data governance and integrity (not model risk) and data representativeness (a key part of model risk).

Other topics will be added. Please comment with any suggestions or requests for anything you would like to see covered.

Written by Ben Steiner