Occam’s Razor in ML practice

Jaideep Ray · Published in Better ML · Oct 10, 2021 · 2 min read


Occam’s razor, in essence, states that a model should be as simple as possible, but no simpler.

In this post we explore the Occam’s razor principle in ML practice.
There are two questions we have to answer:

What do we mean when we say a model m1 is simpler than m2?

We can treat the number of learned coefficients as a proxy for model complexity. When comparing models, count the coefficients each one learns: the fewer, the simpler.

Depending on the application, other metrics can also be considered: training/inference latency, memory required for training/inference, etc.
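As a quick illustration of coefficient counting (a sketch, not from the post: comparing a linear model against a polynomial model over the same features, with an assumed helper `poly_coeff_count`), the number of learned coefficients can blow up quickly with model class:

```python
from math import comb

def poly_coeff_count(n_features, degree, include_bias=True):
    """Number of learned coefficients in a polynomial model: one per
    monomial of total degree <= degree over n_features inputs, which
    is C(n_features + degree, degree) (bias term included)."""
    total = comb(n_features + degree, degree)
    return total if include_bias else total - 1

# m1: linear model over 10 features; m2: degree-3 polynomial model
m1_complexity = poly_coeff_count(10, 1)  # 11 coefficients (10 weights + bias)
m2_complexity = poly_coeff_count(10, 3)  # 286 coefficients
```

By this proxy, m1 is far simpler than m2, so m2 should have to earn its extra complexity with a clearly better primary metric.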

How do we know if model m1 is better than m2?

A simpler model has a better chance of being right. This is directionally correct: more complex models have a higher chance of overfitting. Still, the best way to compare models is through performance metrics. Below we describe a bootstrap pairwise test to compare m1 vs. m2.

Bootstrap Pairwise Test:

Terminology:

  • labels and predictions denote the true-label and prediction arrays.
  • eval_dataset is the test corpus on which m1 and m2 can be compared.
  • primary metric, e.g. prAUC (it can be any metric).
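The test can be sketched as follows (a minimal sketch: accuracy stands in for a primary metric like prAUC to keep the example dependency-free, and the function names are illustrative, not from the post):

```python
import random

def accuracy(labels, preds):
    """Fraction of correct predictions; stands in for any primary metric."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)

def bootstrap_pairwise_test(labels, preds_m1, preds_m2,
                            metric=accuracy, n_boot=1000, seed=0):
    """Paired bootstrap over eval_dataset: resample examples with
    replacement and recompute metric(m1) - metric(m2) on each resample.
    Returns the mean delta and the fraction of resamples in which m1
    failed to beat m2 (a one-sided p-value)."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resampled indices
        y = [labels[i] for i in idx]
        p1 = [preds_m1[i] for i in idx]
        p2 = [preds_m2[i] for i in idx]
        deltas.append(metric(y, p1) - metric(y, p2))
    mean_delta = sum(deltas) / n_boot
    p_value = sum(d <= 0 for d in deltas) / n_boot
    return mean_delta, p_value

# Toy usage: m1 is right on 90% of a small eval set, m2 on 70%
labels = [i % 2 for i in range(100)]
m1 = [1 - y if i < 10 else y for i, y in enumerate(labels)]
m2 = [1 - y if 10 <= i < 40 else y for i, y in enumerate(labels)]
delta, p = bootstrap_pairwise_test(labels, m1, m2)
# delta ~ 0.2 with a small p-value: m1 is reliably better here
```

Because both models are scored on the same resampled examples, per-example noise cancels out, making the paired version far more sensitive than bootstrapping each model's metric independently.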

Conclusion:

  • Report model complexity alongside performance metrics.
  • Use bootstrap pairwise test to compare model performance.
  • Use Occam’s razor as a guiding heuristic while selecting models.

Further reading:

[1] https://en.wikipedia.org/wiki/Occam_learning

[2] Learning from Data [http://amlbook.com/]
