The Machine Learning 101 — Day 2

Machine Learning Project

ZIRU
4 min read · Jun 14, 2023
Photo by Hitesh Choudhary on Unsplash

If you’re studying machine learning, it’s best to practice on real-world data rather than artificial datasets. The good news is that numerous open datasets are available, covering a wide range of domains. Here are some places where you can find them:

Well-known open data repositories:

  • OpenML.org
  • Kaggle.com
  • PapersWithCode.com
  • UC Irvine Machine Learning Repository
  • Amazon’s AWS datasets
  • TensorFlow datasets

Meta portals (that list open data repositories):

  • DataPortals.org
  • OpenDataMonitor.eu

Other websites that list many popular open data repositories:

  • Wikipedia’s list of machine learning datasets
  • Quora.com
  • The datasets subreddit

Select a Performance Measure

Root mean square error (RMSE)

RMSE is a way to measure how close a prediction model’s guesses are to the actual values in a dataset. For example, let’s say we are trying to predict how tall a…
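To make the idea concrete, here is a minimal sketch of RMSE in plain Python. It uses the standard definition: take the difference between each prediction and its actual value, square it, average the squares, then take the square root. The function name `rmse` and the height values are made up for illustration and are not taken from any dataset mentioned above.

```python
import math


def rmse(predictions, targets):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(predictions) == 0:
        raise ValueError("at least one prediction is required")

    # Square each error so positive and negative misses do not cancel out.
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]

    # Average the squared errors, then take the square root to get back
    # to the same unit as the values being predicted.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical heights in centimeters, invented for this example.
predicted_heights_cm = [170.0, 165.0, 180.0]
actual_heights_cm = [172.0, 165.0, 176.0]

print(round(rmse(predicted_heights_cm, actual_heights_cm), 2))  # 2.58
```

Because the errors are squared before averaging, one large miss raises the RMSE more than several small ones, so the score is sensitive to outliers.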
