Annotation from FaceMask Dataset by Humans In The Loop

Picsell.ia Open Dataset Hub launch

Thibaut Lucas
4 min read · Jun 3, 2020


Today is a big day for us at Picsell.ia: we have just released a brand-new version of our platform. We chose to focus on the tech during the recent crisis in order to build a new and more complete solution than what we had before (an image annotation platform).

Although image annotation is still available on the platform, we now also include what we call the “Open Datasets Hub”.

TL;DR: You can find every free public dataset on our platform at www.picsellia.com

It’s a place where every user on the platform can publish a Dataset and make it instantly available to everyone! This means that if you find an interesting Dataset in our Hub, you can “clone” it to use in your own project, and you can even get the annotations that come with it if they are available.

As we launch our platform, we want to show you what you can expect from it and which Datasets are available in the Hub so far.

VisDrone 2019

This dataset contains aerial images of urban areas in China, taken under many different lighting conditions and at different altitudes.

It is composed of 6341 images in 1360x765 resolution, but we also uploaded a “lite version” with a subset of only 150 images.

The annotations that come with it are bounding boxes across 13 classes.

A total of 363,000 objects have been annotated.

Here are all the classes and their distribution in the dataset:

You might wonder what the “Quality” value is referring to.

It indicates how balanced your Dataset is: if there is a huge difference in the number of objects between your classes, you will have a low “quality”, but if you have the exact same number of objects for each class, you will see a quality of 100%.
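The exact formula behind the “Quality” value isn't spelled out here, so take the sketch below as just one possible way to get a score that behaves as described. It assumes the score is the normalized Shannon entropy of the class distribution, which reaches 100% when every class has the same number of objects and drops as the classes become more unequal. The function name `balance_score` and the sample labels are made up for illustration and are not part of the platform.

```python
import math
from collections import Counter


def balance_score(labels):
    """Return a 0-100 balance score for a list of class labels.

    Assumption: balance is measured as normalized Shannon entropy.
    This is an illustrative stand-in, not the platform's own formula.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    if len(counts) == 1:
        # A single class has nothing to be unbalanced against.
        return 100.0
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Dividing by log(number of classes) maps the entropy into [0, 1].
    return 100.0 * entropy / math.log(len(counts))


# Hypothetical label lists: one perfectly balanced, one heavily skewed.
balanced = ["car", "bus", "truck"] * 100
skewed = ["car"] * 980 + ["bus"] * 10 + ["truck"] * 10

print(f"balanced: {balance_score(balanced):.1f}%")
print(f"skewed:   {balance_score(skewed):.1f}%")
```

Other measures, such as the ratio between the smallest and the largest class, would have the same "100% when equal" property, so the choice of entropy here is purely a convenience.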

This value matters most for Datasets where you don’t have a lot of objects and under-sampling is not an option for equalizing your classes.

As you can see, the Dataset is not well balanced, but since the least represented class (awning-tricycle) still comes with 3396 labels, you can train a decent neural network by doing some under-sampling.
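To make the idea of under-sampling a bit more concrete, here is a minimal sketch. It assumes annotations are available as simple `(object_id, class_name)` pairs and randomly keeps, for every class, only as many objects as the rarest class has. The `undersample` helper and the toy data are hypothetical. In a real detection dataset one image usually holds several objects of different classes, so you would typically sample at the image level instead, which this simplified version does not attempt.

```python
import random
from collections import Counter, defaultdict


def undersample(annotations, seed=0):
    """Randomly drop objects so every class ends up with the same count.

    `annotations` is a list of (object_id, class_name) pairs; this flat
    layout is an assumption made for the example.
    """
    by_class = defaultdict(list)
    for object_id, class_name in annotations:
        by_class[class_name].append((object_id, class_name))

    # The rarest class sets the number of objects kept per class.
    target = min(len(items) for items in by_class.values())

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for items in by_class.values():
        subset.extend(rng.sample(items, target))
    return subset


# Toy data: 50 cars, 20 pedestrians, 5 awning-tricycles.
toy = (
    [(f"car_{i}", "car") for i in range(50)]
    + [(f"ped_{i}", "pedestrian") for i in range(20)]
    + [(f"tri_{i}", "awning-tricycle") for i in range(5)]
)

subset = undersample(toy)
print(Counter(class_name for _, class_name in subset))
```

The trade-off is that every class is cut down to the size of the smallest one, which is why this only works well when that smallest class is still reasonably large.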

Face Mask (By Humans In The Loop)

Our friends at Humans In The Loop (a company specializing in image annotation) have curated this Dataset and agreed to share it on our platform.

It’s composed of 6024 images with 20 different labels and 23500 annotated objects.

Once again, it has been wonderfully annotated, with very precise bounding boxes and enough objects to train a good neural network. This Dataset is a good fit if you want to design an access system based on whether or not people are wearing a mask over their face.

The name of this dataset in the Hub is “FaceMask_Dataset”, so go check it out!

The Simpsons Characters

We don’t want to be serious all the time, and we want you to be able to access every kind of image. Who knows what your next project will be?

That’s why we made this Dataset available, which lets you detect 20 characters from “The Simpsons”. It contains 20 classes of approximately 1000 images each, which adds up to about 20,000 annotated characters.

Compared to the others, this Dataset is almost perfectly balanced.

This is only a subset of what you can find on our platform, and the content is updated daily by our users! Don’t hesitate to come by to search for some data, or to contribute by uploading your own Dataset and making it public. See you there!

Join our Community and Start building :) https://bit.ly/2U2PHOn
