Extreme Imbalanced Data — The Worst Data Scientist Nightmare

And the Accuracy Trap

Carla Martins


Photo by Luke Chesser on Unsplash

We say that data is imbalanced when one class of the target variable occurs far less often than the other(s). A common example is cancer detection. If we have 10,000 lab results and only 1% of them are positive for cancer, our data is extremely…
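To make the accuracy trap concrete, here is a minimal sketch using the numbers from the example above (10,000 lab results, 1% positive). The "model" is a hypothetical baseline that always predicts negative; it is an assumption made for illustration, not a method described in this article.

```python
# Numbers from the cancer example: 10,000 lab results, 1% positive.
n_total = 10_000
n_positive = n_total // 100          # 100 positive results
n_negative = n_total - n_positive    # 9,900 negative results

# True labels: 1 = positive for cancer, 0 = negative.
y_true = [1] * n_positive + [0] * n_negative

# Hypothetical baseline (an assumption for illustration):
# a "model" that predicts negative for every single sample.
y_pred = [0] * n_total

# Accuracy: share of all predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / n_total

# Recall: share of the truly positive cases that the model found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / n_positive

print(f"accuracy = {accuracy:.2%}")  # accuracy = 99.00%
print(f"recall   = {recall:.2%}")    # recall   = 0.00%
```

The baseline scores 99% accuracy while missing every one of the 100 positive cases, which is why accuracy alone can be misleading on data like this.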



Carla Martins

Compulsive learner. Passionate about technology. Speaks C, R, Python, SQL, Haskell, Java and LaTeX. Interested in creating solutions.