Data Science Collective

Advice, insights, and ideas from the Medium data science community

Bad, Good, and Great Practices in Machine Learning: Theory and Practice

6 min read · Mar 20, 2025

Machine learning models are often used in the real world to solve complex problems. However, the success of these models should be evaluated not only by statistical metrics, but also by real-world performance and impact. In this article, we will discuss common mistakes, best practices, and advanced approaches in machine learning. We will examine each section in detail and support it with practical examples.

CHAPTER 1: Bad Machine Learning Practices

1.1 Ignoring Imbalances in Data

Imbalanced data occurs when one class has far more samples than the other. It is especially common in fraud detection, where fraudulent transactions are a tiny fraction of the total. If you build a model without checking the target distribution and then evaluate it with accuracy alone, a high score can hide the fact that the model ignores the minority class entirely.

Imbalanced data example:

df['Class'].value_counts()
0    284315
1       492
Name: Class, dtype: int64
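
To see why accuracy misleads on a distribution like this, here is a minimal sketch in plain Python. It reuses the class counts shown above and assumes a hypothetical "model" that always predicts the majority class (0); the variable names are illustrative and not from any library.

```python
# Class counts taken from the value_counts() output above.
n_negative = 284315  # class 0 (majority)
n_positive = 492     # class 1 (minority)

# True labels, plus a hypothetical "model" that always predicts class 0.
y_true = [0] * n_negative + [1] * n_positive
y_pred = [0] * len(y_true)

# Accuracy: share of all predictions that are correct.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the minority class: share of actual positives that were found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
minority_recall = true_positives / n_positive

print(f"accuracy:        {accuracy:.4f}")
print(f"minority recall: {minority_recall:.4f}")
```

Under these assumptions the accuracy comes out at roughly 99.8%, yet the recall on class 1 is zero: not one of the 492 minority samples is detected. This is why a metric that looks at the minority class, such as recall, should be checked alongside accuracy.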

Published in Data Science Collective

Buse Şenol

Written by Buse Şenol

BAU Software Engineering | Data Scientist | The AI Lens Editor | https://www.linkedin.com/in/busesenoll/