Data labeling is a crucial step in any supervised machine learning task. In computer vision, labeling is the process of marking regions in an image and attaching text descriptions to them, which makes the data usable by computer vision models. Labeling also plays an important role in ensuring data quality and helps create datasets for different experiments.
Some of the most common types of image labeling in computer vision are bounding boxes, 3D cuboids, and line annotation. Polygonal segmentation, semantic segmentation, and landmark annotation are also widely used. In the following figure, we see how an image looks after labeling.
A great number of image labeling tools for computer vision are available, and these annotation platforms offer a wide range of features. Here is a review of some of the best labeling tools for computer vision: Supervise.ly, Hasty.ai, CVAT, MakeSense, V7 Darwin, Heartex, Scalabel, and Segments.ai.
Some of the Best Image Labeling Tools in Computer Vision
- Several types of tools are available, including boxes, lines, dots, polygons, and a bitmap brush for semantic segmentation. The feature set also includes drawing holes within polygons.
- Users can transform input images, videos, and 3D point clouds into high-quality training data.
- When developing Supervise.ly plugins, users can focus solely on the core of their custom logic.
- Users can read, modify, and write Supervise.ly projects on disk.
- Functions for working with labeling data, geometric objects, and tags are available.
- Users can add images and object tags, and order figures in layers.
- Output is available as a JSON file per image or as PNG masks. The platform also lets users upload data in formats such as COCO and Cityscapes.
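To make the idea of a polygon with holes more concrete, here is a minimal sketch of how such an annotation could be represented and how its area could be computed. The dictionary keys (`class`, `exterior`, `holes`) and the function names are hypothetical and chosen for illustration; they are not Supervise.ly's actual JSON schema.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def ring_area(ring: List[Point]) -> float:
    """Unsigned area of a closed ring, computed with the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(ring):
        x2, y2 = ring[(i + 1) % len(ring)]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(exterior: List[Point], holes: List[List[Point]]) -> float:
    """Area of the outer ring minus the area of every hole."""
    return ring_area(exterior) - sum(ring_area(hole) for hole in holes)


# Hypothetical annotation: a 10x10 square with a 2x2 hole cut out of it.
annotation = {
    "class": "building",
    "exterior": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "holes": [[(2, 2), (4, 2), (4, 4), (2, 4)]],
}

area = polygon_area(annotation["exterior"], annotation["holes"])
print(area)
```

Keeping holes as separate rings means a single object can exclude background regions, such as a courtyard inside a building, without being split into several polygons.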
1.2 Project Management:
This software provides many options for project management at different levels, such as workspaces, teams, and datasets. Annotator management covers labeling jobs, permissions, and statistics. A significant capability is organizing users into teams, which makes it easy to collaborate across many projects. The platform also provides a Data Transformation Language (DTL) and a Python Notebook for handling data. Intuitive roles and permissions let owners and administrators manage user access to resources, data, and developed models.
The software uses the Python SDK to import plugins for custom data formats. It can also run neural network models and various tools. It supports AI-assisted labeling, video labeling, 3D point cloud labeling, and volumetric slices. The tool can manage and track annotation workflows at scale with teams, workspaces, roles, and labeling jobs. Users can also share data and models through shareable links, as in Google Drive.
In the above figure, we see how an image looks after labeling with Supervise.ly. The interface can be tailored to a very specific task and supports customizable hotkey shortcuts. It is worth noting that performance has recently become somewhat slow, which is frustrating when the platform takes a long time to switch between images and record annotations.
- Supports vector labeling, including boxes and polygons, as well as pixel-wise labeling with a brush.
- Users can upload data and pre-generated labels. Data is exported as a JSON file or a PNG mask.
- This tool makes object detection and instance segmentation labeling up to ten times quicker. It helps users go from weeks of labeling to days or even hours, which gives them more time to focus on building important applications.
- The platform supports 2D image data in various formats, such as JPEG, PNG, BMP, TIFF, and GIF.
2.2 Project Management:
This platform offers strong workflow management. Users can sort images by status: new, in progress, to review, and done. The software assigns permissions in a granular way. There is also a manual review panel, where users can visualize each labeled instance and sort instances by labeler, status, or class. This makes the review process straightforward, something not found in other labeling tools.
Several smart tools such as GrabCut, Contour, and Dextr are available on this platform. These tools identify objects’ edges or contours, and the result can be adjusted manually with a threshold to pick the best segment of the image. Once enough data has been labeled, the tool starts offering label predictions. Users can also train their own object detection, instance segmentation, and semantic segmentation models. The platform takes information security very seriously and states that it will never share team information or image data with anyone. The manual review feature lets users quickly see and browse through all annotations.
In the above figure, we see a Hasty screenshot. The tool works well with large, predictable datasets, where the smart tools are activated after some time and then drastically reduce the time spent per object. One downside is that processing can take up to 10 or 20 seconds, time that could have been spent on actual labeling.
- CVAT covers the primary tasks of supervised machine learning: object detection, image classification, and image segmentation.
- CVAT includes annotation tools for video, semantic segmentation, and polygon annotation.
- Because this tool is a web app running in Docker, it is easy to install and scale.
- Users annotate images with four shapes: boxes, polygons for segmentation tasks, polylines for annotating road marks, and points for annotating face landmarks or pose estimation.
- This platform supports a large number of automation instruments, including automatic annotation using the TensorFlow* Object Detection API and video interpolation.
- The tool supports the interpolation of bounding boxes between keyframes and semi-automatic annotation using deep learning models. It also offers shortcuts for most critical actions, LDAP and basic access authentication, and a dashboard with a list of annotation tasks.
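Interpolating bounding boxes between keyframes can be sketched as follows. This is a minimal illustration that assumes plain linear interpolation of the box corners; the function name and the `(x_min, y_min, x_max, y_max)` box layout are assumptions for this example, and CVAT's own implementation may differ.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def interpolate_box(keyframes: Dict[int, Box], frame: int) -> Box:
    """Linearly interpolate a box for `frame` from manually labeled keyframes."""
    frames = sorted(keyframes)
    # Outside the labeled range, hold the nearest keyframe's box.
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    for start, end in zip(frames, frames[1:]):
        if start <= frame <= end:
            t = (frame - start) / (end - start)  # 0.0 at start, 1.0 at end
            a, b = keyframes[start], keyframes[end]
            x_min, y_min, x_max, y_max = (p + t * (q - p) for p, q in zip(a, b))
            return (x_min, y_min, x_max, y_max)
    raise ValueError("frame not covered by keyframes")


# The annotator draws boxes only on frames 0 and 10.
keyframes = {0: (10.0, 10.0, 50.0, 50.0), 10: (30.0, 20.0, 70.0, 60.0)}
print(interpolate_box(keyframes, 5))
```

With this approach an annotator labels only a few frames of a video and corrects the interpolated boxes where the object's motion is not linear.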
3.2 Project Management:
CVAT is used by different companies to label millions of objects with different properties. It is a very powerful and complete annotation tool. Because it supports collaborative work, different members of a team can work together on the same labeling task. It belongs to a group of similar DIY labeling tools that includes LabelImg. The UI is quite complex, so setting up an annotation task for the first time may be tricky. It may even take several days to master, but it offers many features for labeling computer vision data. Usually, users create a public task and split the work among other users.
As CVAT is web-based, users don’t need to install the app; they just open the tool’s link in a browser when they want to create a task or annotate data. Several optional tools are supported, such as the Deep Learning Deployment Toolkit (Intel® Distribution of OpenVINO™ toolkit element), the ELK (Elasticsearch* + Logstash* + Kibana*) analytic system, the TensorFlow* Object Detection API (TF OD API), and the NVIDIA* CUDA* Toolkit. Because the platform only runs in Chrome, users who prefer to avoid Google products have to find workarounds.
Here, we see two different shapes for labeling two different types of objects.
- The tool is fast, efficient, and above all very easy to use.
- It provides an excellent UX.
- Because the tool is open-source, free, and web-based, it requires no complicated installation; users access it simply by visiting the website.
- MakeSense supports multiple annotation types, such as bounding boxes, polygons, and points. Labels can be exported in different formats, such as YOLO, VOC XML, VGG JSON, and CSV.
- An SSD model pre-trained on the COCO dataset does some of the work for users by drawing boxes on photos and suggesting labels.
- The PoseNet vision model is used to estimate the pose of a person in an image or video by estimating the positions of key body joints.
- To protect data privacy, the tool neither stores the images to be labeled nor sends them anywhere.
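As an illustration of the YOLO export format mentioned in the list above, the sketch below converts one pixel-space bounding box into a YOLO label line, which stores a class index followed by the box center and size normalized by the image dimensions. The function name and the sample values are made up for this example and are not taken from MakeSense's code.

```python
def to_yolo_line(class_id: int, x_min: float, y_min: float,
                 x_max: float, y_max: float,
                 img_w: int, img_h: int) -> str:
    """Convert a pixel-space box into a 'class x_center y_center width height' line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box from (100, 200) to (300, 400) in a 640x480 image, class index 0.
line = to_yolo_line(0, 100, 200, 300, 400, img_w=640, img_h=480)
print(line)
```

Because the coordinates are normalized, the same label file stays valid when the image is resized, which is one reason the format is popular for training object detectors.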
4.2 Project Management:
Because this platform was released only recently, in June 2019, it does not yet provide any project management features or an API. Users cannot collaborate with their team on the same annotation project. However, the platform performs well for small deep learning-based computer vision projects, making dataset preparation much easier and quicker. The platform also does not allow uploading zip files; users have to upload images by selecting the ones they want to label.
The engine behind the AI functionality in MakeSense is TensorFlow.js, the JavaScript version of one of the most widely used frameworks for training neural networks. This choice not only speeds up the user’s work but also protects the privacy of their data. Unlike with other image labeling tools, images do not have to be transferred to a server; instead, the AI comes to the device. Users can choose between two options: the COCO SSD object detection model for bounding-box annotation and the PoseNet pose estimation model for key-point annotation.
This is the output image after labeling via makesense.ai. Based on the discussion above, MakeSense is a good choice when you have up to a few hundred images to label, as in the open or closed-door example.
- This tool generates a segmentation mask around any object or part of an object in under one second.
- Tags, polygons, bounding boxes, masks, directional vectors, attributes, and more are available on this platform.
- Users can sort datasets by any combination of filters across different data columns.
- Graphical icons provide greater consistency and better visualization.
- Users can upload and process very long videos. V7 Darwin is all about labeling automation at pixel-perfect quality, creating ground truth from which AI models can learn.
- Automated image annotation and neural network training are possible in Darwin. The tool also allows inverting image colors and raising brightness for image manipulation.
- The platform includes a brush tool that relies on vector graphics to draw contours, an eraser tool for correcting mistakes, and the ability to create new classes.
5.2 Project Management:
The precise and intuitive UI of this platform makes labeling challenging data easier, and it supports ML training tasks. Deep learning teams can use Darwin’s active learning approach to create pixel-perfect ground truth 10x quicker. Users can track progress and quality and also train models within a project. Annotators can only view projects they have been invited to, and they can only see images for the tasks they have been allotted or have generated from open images. The Annotator role is best suited for third parties that provide image annotation services to the team. After an image is labeled, a human reviews the output label, which contributes to the periodic learning of the automation models. The tool is thus entirely private and exclusive to a team.
This platform is good at generating polygons and pixel-wise masks automatically. It allows users to create projects, add data, annotate, and export data. Users can delete their own projects, but not other users’. Tasks are bundles of images in need of annotation or review. While a task is being annotated, its images are not added to the main dataset until the task has gone through review and all its images have been accepted. After a task is completed, users can start reviewing it from the Review tab in a dataset. The reviewer may accept an image, reject it, or store it in an archive for future tasks. Users can also approve the “Skip” action, which skips the image and moves it to the archive. If an annotator is not doing an adequate job or becomes unavailable, users may dissolve the task or re-assign it to another annotator. The platform offers a Select Tool for moving and warping annotations, a Box Tool that creates rectangular bounding boxes to accurately capture the pixels belonging to an object, a Polygon Tool that creates polygons of any shape to segment objects, and more. It also supports layering of the annotations list and copying annotation instances.
Here, we see how an object is labeled with this tool. In summary, this platform combines AI, reviewers, annotators, and experts in carefully crafted workflows that each image goes through whenever it is assigned to Darwin.
- This platform helps annotate and label data. It can find outliers, check for biases, and be configured for specific uses.
- Data can be uploaded in multiple formats, including images, audio, and text. Users can upload data through the API or as JSON, CSV, TSV, ZIP, or RAR files.
- The administrator of a project has full control over the data, the experts, and the statistics of the labeling process.
- The platform supports connecting users’ models through the Python SDK. This lets users move from an iterative to a continuous process of quality and judgments.
- After a few examples are labeled, the model starts to give predictions, drawing on transfer learning.
- Once labeling has started, users have tools to validate progress through accuracy and a few other key statistics.
- The data manager can include labels from both collaborators and models.
- API access helps users quickly integrate the results into their pipeline.
6.2 Project Management:
Every labeling and data research journey begins with building a new project. A project consists of a dataset, the UI to label and traverse it, a machine learning model to assist users with labeling, and external collaborators or team members.
Once created, each project can be extensively configured and tailored to the particular labeling scenario. The project settings include options such as General, Collaborators, Labeling, Machine Learning, and Duplicate & Delete. Automatic annotations are insights on top of the dataset, and one of the advantages of Heartex is the ability to annotate datasets automatically using machine learning. To add a member to a project, users go to the Teams page and click the Add Member button. After adding members, users can assign roles to them and add collaborators. Downloaded results include both manual labels and model-assisted labels. The tool manages training data, quality control, and privacy in one place, and it minimizes the time an entire team spends preparing and analyzing datasets for machine learning.
Heartex annotates tasks automatically using its pre-trained ML models or custom ones connected through the SDK. This allows users to automate the process of finding important insights, for example examining customer feedback in real time or recognizing the objects in images. Automatic annotation can also boost annotator productivity: tasks are pre-labeled by ML models, and annotators only correct wrong predictions. In this way, Heartex learns continuously from the feedback and provides more precise results with each iteration.
Heartex’s active learning algorithms seek to select diverse and informative data for annotation from a pool of unlabeled data. These algorithms are a foundation of the Heartex platform, which lets users apply them at minimal cost to their projects. The tool automatically labels up to 95% of a dataset using machine learning and active learning.
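One common way to pick informative samples in active learning is uncertainty sampling: the items whose predicted class probabilities are closest to uniform are sent to annotators first. The sketch below shows that idea only. It is not Heartex's actual algorithm, it does not address the diversity criterion, and the file names and probabilities are invented for illustration.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy of a class-probability vector; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k unlabeled items the model is least sure about."""
    ranked = sorted(predictions, key=lambda name: entropy(predictions[name]),
                    reverse=True)
    return ranked[:k]


# Hypothetical model outputs for three unlabeled images over three classes.
predictions = {
    "img_a.jpg": [0.98, 0.01, 0.01],  # confident prediction
    "img_b.jpg": [0.40, 0.35, 0.25],  # very uncertain
    "img_c.jpg": [0.70, 0.20, 0.10],  # somewhat uncertain
}

to_label = select_most_uncertain(predictions, k=2)
print(to_label)
```

Labeling the uncertain items first tends to improve the model faster than labeling at random, which is how a platform can leave the confident predictions to automatic labeling.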
Here, we see how Heartex labels different objects. Lastly, we can say it is designed with simplicity in mind and is adaptable for different needs.
- This tool supports annotation of both images and videos, including 2D and 3D data labeling. For example, bounding box annotation supports simple “click and drag” actions and options to add multiple attributes.
- Scalabel provides innovative features and a user-friendly interface.
- The platform speeds up labeling with semi-automated annotation.
- It supports concurrent annotation sessions and progress monitoring.
- It is accessible through a web browser without installation.
- The tool supports semantic segmentation, video tracking, drivable area, lane marking, 3D bounding boxes in a point cloud, and more. For semantic segmentation, there are functions to fit boundaries with Bezier curves and to copy shared boundaries.
7.2 Project Management:
This tool supports both video and image annotation and provides a script at scripts/prepare_data.py to help users prepare the data. For video data, the video is first split into frames at the defined frame rate (5 fps by default), and a YAML file containing the paths to the frames is generated. Labeling segmentation for video tracking requires annotating every frame. The number of labels in the current image is shown on the left of the title bar; in video tracking projects, the total number of tracks is displayed. Scalabel also currently supports labeling 3D bounding boxes on point cloud data, but the data must be supplied in PLY format. To create a project, users specify the item type (image, video, or point cloud), the label type (2D bbox, segmentation, or 3D bbox), the item list, customizable categories, and attributes. User management has two parts: user authentication, and user and project management. It is configured in the file user_management_config.yml, where users set user management to “on” to enable it. Once user management is enabled, users set up the variables related to the AWS Cognito user pool. The exported data can be imported later as an item list.
This tool predicts annotations between frames using object tracking and interpolation algorithms. For drivable-area labeling, it can annotate the area the driver is currently driving on. It can also annotate lane markings for vision-based vehicle localization and trajectory planning. The labeled data can be explored in the dashboard, and task URLs for vendors can be downloaded from the vendor dashboard. With model assistance, bounding boxes can be predicted at runtime. In the 3D bounding box labeling tool, users drag to adjust the view angle and use the arrow keys to navigate before they start labeling. Segmentation labels often share borders with each other, so to make segmentation annotation more convenient, Scalabel supports vertex and edge sharing. Because the image list is in BDD data format, it can also contain labels within each frame. For example, users can upload an image list like examples/image_list_with_auto_labels.json. The labels in this image list are generated by an object detector; they are automatically loaded into the tasks and shown to the annotator for adjustment. In this scenario, users can use an off-the-shelf object detector such as Faster R-CNN. If the detector’s results are in COCO format, a script option can convert them to BDD format. It is also possible to directly upload the exported results from a previous annotation project, so that the labels show up again in the new tasks. In the following image, we see an image labeled with this tool.
Currently, the Scalabel team is working to keep the system stable and to ensure that internal data can be reused across updates. If backward compatibility of the internal data storage has to be broken, labels can be exported from the old project and imported into the new project with the new code.
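To illustrate what splitting a video at a fixed frame rate means, the sketch below computes which source frames would be kept when a video is sampled at 5 fps. It is a simplified model of the idea, not the logic of scripts/prepare_data.py; the function name and parameters are hypothetical.

```python
from typing import List


def sampled_frame_indices(total_frames: int, source_fps: float,
                          target_fps: float = 5.0) -> List[int]:
    """Indices of the source frames kept when sampling at `target_fps`."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample more densely than the source provides.
    step = max(1.0, source_fps / target_fps)
    indices = []
    i = 0
    while int(i * step) < total_frames:
        indices.append(int(i * step))
        i += 1
    return indices


# A 2-second clip recorded at 30 fps (60 frames), sampled at the 5 fps default.
indices = sampled_frame_indices(total_frames=60, source_fps=30.0)
print(indices)
```

Sampling at a lower rate than the recording reduces the number of frames annotators must review, at the cost of coarser temporal resolution for tracking.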
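The COCO-to-BDD conversion mentioned for pre-labeled image lists comes down mostly to a change of box representation. COCO stores a box as `[x, y, width, height]` measured from the top-left corner, while the sketch below assumes the BDD-style target is a `box2d` object with `x1`, `y1`, `x2`, `y2` corner fields. Those field names, the `category` key, and the function names are assumptions for illustration and should be checked against Scalabel's format documentation; this is not the project's conversion script.

```python
from typing import Dict, List


def coco_bbox_to_box2d(bbox: List[float]) -> Dict[str, float]:
    """Convert COCO [x, y, width, height] into corner coordinates."""
    x, y, w, h = bbox
    return {"x1": x, "y1": y, "x2": x + w, "y2": y + h}


def coco_detection_to_label(detection: Dict, category_name: str) -> Dict:
    """Build one BDD-style label entry (assumed layout) from a COCO detection."""
    return {
        "category": category_name,
        "box2d": coco_bbox_to_box2d(detection["bbox"]),
    }


# Hypothetical detector output in COCO result format.
detection = {"image_id": 1, "category_id": 3, "bbox": [10, 20, 30, 40], "score": 0.9}
label = coco_detection_to_label(detection, category_name="car")
print(label)
```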
- The platform can label objects and regions with a few clicks, assisted by state-of-the-art computer vision technology.
- It is ideal for labeling large public datasets and also makes it easy to build confidential datasets for private use.
- Users can build an active learning pipeline with humans in the loop, in which the model’s predictions are used to initialize the labels.
- The API and Python SDK enable deep and seamless integration of the labeling technology into the user’s existing ML pipelines and workflows, so users need not build a perception pipeline from scratch.
- Labeling management features include unlimited support, an hourly billed workforce, video annotation technology, custom retraining of models, predictive labeling workflows, and a guaranteed uptime SLA.
- Data categories in this tool include ‘Automotive’, ‘Vegetation’, ‘Satellite’, ‘People’, ‘Medical’, and others.
8.2 Project Management:
This tool allows collaboration with a team or onboarding an external workforce. To power its labeling technology, Segments.ai brings state-of-the-art machine learning and computer vision research to industry, including active learning, few-shot learning, data distillation, generative modeling, self-supervised learning, and uncertainty estimation. From automotive data to medical images, the tool supports importing different types of data. If an image is on the user’s local computer, they have to upload it to a cloud storage service such as Amazon S3, Google Cloud Storage, Imgur, or the tool’s own asset storage service. The platform’s technology quickly learns and adapts to user-specific data. A label can be attached to a sample in relation to a task defined on the dataset, such as an image classification task or an image segmentation task.
After the first set of images is labeled, AI-powered predictions start assisting. Although deploying PyTorch models in the cloud is not simple, the team managed to include PyTorch in Lambda functions by adding it as a zipped archive to their deployment package and unzipping it on the fly. To annotate an image for a segmentation task, users first click a segment and confirm the automatically estimated extent of the object. They can scroll up and down to adjust the size of the segments, and drag to connect multiple segments. In this way, multiple objects are labeled semi-automatically in Segments.ai. To refine labels, there are several options, such as a paintbrush, an erase mode, toggling the view of images and labels, pan, and zoom. This is illustrated below:
In the following image, we see the output image of Segments.ai.
There are numerous data labeling tools available apart from the ones discussed above. Each tool suits a particular purpose, so the key is not to know many tools but to know which one will work best for your project and how to get the most out of it.
References for Figure:
Fig: https://segments.ai/
Fig: https://www.youtube.com/watch?v=Y1JOCxXQLMg (this GIF was made from the given link)