Data Science Case Study: EDA on Mystery Science

EDA using Data Science tools: Python, Data Visualizations, Machine Learning, Statistical Tests and Inference, a Custom-Built ML Tool, and more!

Alexander Barriga
Sep 6, 2018 · 1 min read

What’s Mystery Science?

The above gist is an exhaustive data analysis of Mystery Science (MS) data. MS is a San Francisco-based startup working to improve the K-5 science curriculum in the United States and around the world.


What’s in the Notebook?

The notebook demonstrates a typical EDA workflow on industry data. It combines data visualization, machine learning, and statistics to provide a thorough analysis and answer several business-critical questions.


Why should the reader care?

If you’re interested in learning how to:

  • Use the data science toolbox comprehensively
  • Understand the data and model error through iterative model building
  • Engineer the right features
  • Write efficient code for EDA workflows

then you can benefit from reading this notebook!


About the Author

Alexander is a data scientist based in Berkeley, CA. He enjoys learning, teaching, and applying data science, both in industry and through passion projects.

You can check out his work on GitHub and LinkedIn.

