Butterflies. Image created with Midjourney by the author.

Embracing the Unknown: Lessons from Chaos Theory for Data Scientists

Insights for understanding the limits of predictive models

Hennie de Harder
Published in bigdatarepublic
7 min read · Jun 23, 2023


Sometimes during a data science project, you find that your metric barely moves no matter what you do. You try everything: more complex models, more data, hyperparameter tuning, feature engineering, feature selection. Nothing helps. You can’t even beat the baseline, which was a simple moving average. What is happening? In such cases, it may be time to stop tuning and ask whether something more fundamental is going on.

In this post, I want to share why good predictions are not always possible. In some projects, you might be dealing with chaos: not chaos in the everyday sense of the word (complete disorder or randomness), but chaos in the scientific sense. A chaotic system follows deterministic rules, yet it is very hard or even impossible to predict, especially in the long term.
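To make the idea of long-term unpredictability concrete, here is a minimal sketch using the logistic map, a standard textbook example of a chaotic system. It is my own illustration and not taken from a specific project: the function name `logistic_map`, the parameter `r = 4.0`, the starting value `0.2`, and the perturbation `1e-10` are all assumptions chosen for the demonstration.

```python
def logistic_map(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by only 1e-10 (illustrative values).
a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-10)

# Absolute gap between the two trajectories at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (0, 10, 20, 30, 40, 50):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The rule is fully deterministic and the two starting points are almost identical, yet the gap between the trajectories grows until they have nothing to do with each other. A measurement error far smaller than anything you could realistically avoid is enough to ruin a long-term forecast, while short-term forecasts can still be reasonable.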

In the following three sections, you’ll get a good understanding of chaos and what it can mean for a data science project.

1. The Existence of Chaos

Before scientific chaos was discovered, Newton had a dream. He thought it was possible to understand all of nature through principles of physics that could be expressed mathematically. Newton did a great job in understanding the…


Hennie de Harder
bigdatarepublic

📈 Data Scientist with a passion for math 💻 Currently working at IKEA and BigData Republic 💡 I share tips & tricks and fun side projects