Public Data Sets: Use these to train Machine Learning models on Mateverse

Mate Labs
Jul 8, 2017

To get you started with Machine Learning.

About Mateverse

Mateverse is an ML platform that lets you build and train customized models without writing a single line of code.

Public Image Datasets

  1. IRIS Flowers Data Set. One of the most common data sets used by beginners to get started with Pattern Recognition. It contains petal and sepal measurements (length and width) for three types of irises: Setosa, Versicolour, and Virginica. (Courtesy UCI).
  2. Breast Cancer Data Set. A data set for classification tasks that identify whether a mammogram shows cancer.
  3. Jewellery Data Set. A basic data set for testing your computer vision (CV) classification algorithms.
  4. Shapeset. Artificially generated images for training machines to identify different geometrical shapes.

Public Text/CSV Datasets

  1. Sentiment Analysis Data Set. One of the most basic text data sets, yet important for understanding customer reviews and feedback and acting on them.
  2. News Data Set. A collection of 20 newsgroups to train a text classifier on different types of news.
  3. Wine Quality Data Set. Covers red and white variants of the Portuguese “Vinho Verde” wine. This data set can be used for both Classification and Regression tasks. (Courtesy UCI).
  4. Diabetes Data Set. A categorical data set of diabetes records. (Courtesy UCI).
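
For readers curious about what "both Classification and Regression" means for the Wine Quality data set, here is a minimal Python sketch. It assumes each wine carries an integer quality score. The sample scores, the cut-off of 7, and the function names are made up for illustration and are not taken from the data set or from Mateverse.

```python
# Hypothetical quality scores, made up for illustration
# (these are not real rows from the Wine Quality data set).
quality_scores = [5, 6, 7, 4, 8, 6]

# Assumed cut-off for calling a wine "good"; choose whatever suits your task.
GOOD_THRESHOLD = 7


def regression_targets(scores):
    """Regression: the model predicts the numeric score itself."""
    return [float(s) for s in scores]


def classification_targets(scores, threshold=GOOD_THRESHOLD):
    """Classification: each score is bucketed into a discrete label."""
    return ["good" if s >= threshold else "not good" for s in scores]


print(regression_targets(quality_scores))
print(classification_targets(quality_scores))
```

The same column serves as the target in both cases; only the way it is framed changes.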

This is the first post in the series. We plan to make many more data sets public in the coming days, whether contributed by the community or created by us.

Share your thoughts or suggestions with us on Twitter or LinkedIn. Visit our website, Mate Labs, to learn more.

Stay tuned, and if you want to be the first to know, subscribe below.

More About Us

We have come a long way in making Mateverse a better platform. With Mateverse V1.0, we make the jobs of Analysts and Data Scientists easier with proprietary technologies, viz. complex pipelines, Big Data support, Automated Data Preprocessing (Missing Value Imputation using ML models, Outlier Detection, and Formatting), Automated Hyperparameter Optimization, and much more.

Previously, we’ve shared:

  1. Mateverse Public Beta Announcement.
  2. Why do we need the Democratization of Machine Learning?
  3. A very crisp explanation of All-CNN’s implementation. How these researchers tried something unconventional to come up with a smaller yet better image recognition model.
  4. Our Vision. What Everyone is not Telling You about Artificial Intelligence

Mate Labs

We’re trying to bring Machine Learning and Deep Learning to one and all, irrespective of whether a user knows how to code.