Occam’s Razor in ML practice
Occam’s razor, in essence, states that a model should be as simple as possible, but no simpler.
In this post we explore the Occam’s razor principle in ML practice.
There are two questions we have to answer:
What do we mean when we say a model m1 is simpler than m2?
We can consider the number of learned coefficients to be a proxy for model complexity: when comparing models, count the learned coefficients; the fewer, the simpler.
Depending on the application, other metrics can also be considered: training/inference latency, memory required for training/inference, etc.
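As a minimal sketch of this complexity proxy, the following counts learned coefficients across a model’s parameter arrays. The function name and the linear-vs-MLP example are illustrative, not from the original post; parameters are assumed to be held as NumPy arrays.

```python
# A minimal sketch: count learned coefficients as a complexity proxy.
# Assumes model parameters are available as a list of NumPy arrays.
import numpy as np

def num_coefficients(params):
    """Total number of learned coefficients across all parameter arrays."""
    return sum(np.asarray(p).size for p in params)

# Hypothetical example: a linear model vs. a small two-layer network,
# both mapping 10 input features to a single output.
linear = [np.zeros(10), np.zeros(1)]                           # weights + bias
mlp = [np.zeros((10, 32)), np.zeros(32), np.zeros((32, 1)), np.zeros(1)]

print(num_coefficients(linear))  # 11
print(num_coefficients(mlp))     # 385
```

By this proxy, the linear model (11 coefficients) is simpler than the MLP (385), so under Occam’s razor it should be preferred unless the MLP’s performance gain justifies the added complexity.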
How do we know if model m1 is better than m2?
A simpler model has a better chance of being right; this is directionally correct, since more complex models are more prone to overfitting. Still, the best way to compare models is through performance metrics. We describe a bootstrap pairwise test below to compare m1 vs. m2.
Bootstrap Pairwise Test:
Terminology:
- labels and predictions denote the true-label array and the prediction arrays.
- eval_dataset is the test corpus on which m1 and m2 can be compared.
- primary metric, e.g. prAUC (it can be any metric).
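The test above can be sketched as follows. This is a minimal, illustrative implementation (the function and parameter names are assumptions, not from the post): resample eval_dataset with replacement, score both models on the *same* resampled indices each time, and look at the distribution of metric differences.

```python
# A minimal sketch of a bootstrap pairwise test, assuming labels and the two
# prediction arrays are NumPy arrays of equal length. The primary metric is
# any callable metric(labels, predictions) -> float (e.g. prAUC).
import numpy as np

def pairwise_bootstrap_test(labels, preds_m1, preds_m2, metric,
                            n_resamples=1000, seed=0):
    """Bootstrap the difference metric(m1) - metric(m2) on eval_dataset.

    Returns the observed difference and a one-sided bootstrap p-value:
    the fraction of resamples in which m1 does not beat m2.
    """
    rng = np.random.default_rng(seed)
    n = len(labels)
    observed = metric(labels, preds_m1) - metric(labels, preds_m2)
    diffs = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample the SAME indices for both models -- the test is pairwise,
        # so both models are always scored on an identical bootstrap sample.
        idx = rng.integers(0, n, size=n)
        diffs[i] = (metric(labels[idx], preds_m1[idx])
                    - metric(labels[idx], preds_m2[idx]))
    p_value = np.mean(diffs <= 0.0)
    return observed, p_value
```

A small p-value suggests m1 genuinely outperforms m2 on this corpus; if the difference is not significant, Occam’s razor says to keep the simpler model. Note that rank-based metrics like prAUC require each bootstrap sample to contain both classes, which may need a check for small or highly imbalanced eval sets.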
Conclusion:
- Report model complexity alongside performance metrics.
- Use bootstrap pairwise test to compare model performance.
- Use Occam’s razor as a guiding heuristic when selecting models.
Further reading:
[1] https://en.wikipedia.org/wiki/Occam_learning
[2] Learning from Data [http://amlbook.com/]