Computer vision data sets are essential for training machine learning models to detect objects, faces, and other visual features. However, it can be difficult to know what to annotate and how to do it correctly.
We would like to share our experience to promote some best practices for annotating images in computer vision data sets:
There are a variety of annotation tools available, both free and commercial, so it is important to choose one that is right for your project. Some factors to consider include the type of image data you are annotating, the number of images you need to annotate, and your budget. Some of the most popular tools include:
LabelImg: A free, open-source image annotation tool available on Windows, macOS, and Linux. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet. (ImageNet is a database of images organized according to the WordNet hierarchy of nouns, with hundreds and thousands of images for each node. It has played a significant role in the progress of computer vision and deep learning research, and researchers can access the data for free for non-commercial purposes.) LabelImg also supports the YOLO and CreateML formats.
VIA (VGG Image Annotator): Open-source, user-friendly, self-contained software for manual annotation of images, audio, and video. It runs in a web browser without any installation or configuration. The entire program is contained in a single HTML page of less than 400 kilobytes, and it can be used offline in most modern web browsers. VIA relies solely on HTML, JavaScript, and CSS, and does not require any external libraries. Released under the BSD 2-Clause license, it is suitable for both academic research and commercial applications, which makes it the go-to choice of many annotation services. Available on Windows, macOS, and Linux.
LabelMe: An online annotation tool developed by the MIT CSAIL team to build image databases for computer vision research. It is also freely available on Windows, macOS, and Linux.
Note: a version for polygonal annotation can be found on GitHub.
Zight: This is a commercial image annotation tool.
V7: Another commercial image annotation tool.
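Because tools export different formats, it often helps to convert between them. The sketch below reads one annotation in the PASCAL VOC XML layout mentioned under LabelImg and converts each box to a YOLO-style row (class id, then box center and size normalized by the image dimensions). The sample XML, the class list, and the function name `voc_to_yolo` are made up for illustration; treat this as a minimal sketch rather than a complete converter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the PASCAL VOC layout (illustrative values only).
SAMPLE_VOC = """<annotation>
  <filename>street.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_yolo(xml_text, class_names):
    """Convert one VOC annotation to YOLO-style rows.

    Each row is (class_id, center_x, center_y, width, height),
    with all four box values normalized to the range [0, 1].
    """
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))

    rows = []
    for obj in root.iter("object"):
        # The class id is the label's position in the project's class list.
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        rows.append((
            class_id,
            (xmin + xmax) / 2 / img_w,  # normalized box center, x
            (ymin + ymax) / 2 / img_h,  # normalized box center, y
            (xmax - xmin) / img_w,      # normalized box width
            (ymax - ymin) / img_h,      # normalized box height
        ))
    return rows


rows = voc_to_yolo(SAMPLE_VOC, ["person", "car"])
```

For the sample above, the single row has class id 1, a center of (0.3125, 0.5), and a size of (0.3125, 0.5). A label missing from the class list raises a `ValueError`, which is a useful early warning that the label sets of two tools have drifted apart.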
It is important to train your annotators on how to annotate images correctly. This will help to ensure that the data is consistent and accurate.
Now that we know there are plenty of free and commercial (proprietary) tools available, the key point is that mixing tools adds variety to the annotations, which helps the ML model recognize a wider range of cases and thus become more accurate. It is therefore important to choose not just a tool that is right for your project, but also vendors that can deliver results from several tools.
When annotating images, it is important to annotate the entire image, not just the objects of interest. This will help the model to learn about the context of the image.
It is essential to use consistent labels when annotating images. This will help the model to learn how to identify different objects. The annotations should be easy to understand for both humans and machines.
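One way to enforce consistent labels is to check every label against an agreed list before the data reaches training. The sketch below assumes a hypothetical canonical label set and alias table (`CANONICAL` and `ALIASES` are examples, not a standard); it normalizes spelling variants and reports anything it cannot map.

```python
from collections import Counter

# Hypothetical canonical labels and known aliases for one project.
CANONICAL = {"car", "person", "traffic_light"}
ALIASES = {
    "automobile": "car",
    "pedestrian": "person",
    "traffic light": "traffic_light",
}


def normalize_label(raw):
    """Map a raw annotator label to its canonical form, or None if unknown."""
    key = raw.strip().lower()
    key = ALIASES.get(key, key)
    return key if key in CANONICAL else None


def audit_labels(raw_labels):
    """Count canonical labels and collect raw labels that could not be mapped."""
    counts = Counter()
    unknown = []
    for raw in raw_labels:
        label = normalize_label(raw)
        if label is None:
            unknown.append(raw)
        else:
            counts[label] += 1
    return counts, unknown


counts, unknown = audit_labels(["Car", " automobile", "Pedestrian", "trafic light"])
```

Here the misspelled "trafic light" ends up in `unknown`, so it can be sent back for correction instead of silently becoming a new class.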
Once the images have been annotated, it is important to review them to ensure that the annotations are accurate and consistent. Regular review also helps keep the data set up to date.
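Part of a review can be automated by comparing two annotators' boxes for the same image. The sketch below uses intersection over union (IoU), a common overlap measure, and flags boxes from one annotator that have no sufficiently overlapping box from the other. The 0.5 threshold and the function names are assumptions chosen for illustration; the right threshold depends on the project.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def flag_disagreements(boxes_a, boxes_b, threshold=0.5):
    """Return indices of boxes in boxes_a with no box in boxes_b at IoU >= threshold."""
    return [
        i
        for i, a in enumerate(boxes_a)
        if all(iou(a, b) < threshold for b in boxes_b)
    ]


annotator_1 = [(0, 0, 10, 10), (50, 50, 60, 60)]
annotator_2 = [(1, 1, 10, 10)]
to_review = flag_disagreements(annotator_1, annotator_2)
```

In this example only the second box of `annotator_1` is flagged, because the other annotator drew nothing near it. Flagged boxes are candidates for a human reviewer, not automatic errors.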
7. Use a consistent color scheme for different objects
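A fixed color per class keeps masks and overlays unambiguous for annotators and reviewers. In color-encoded segmentation masks it also ensures that each class always maps to the same value, which helps the model learn to identify different objects.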
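A simple way to keep colors consistent is to derive the palette from the class list instead of picking colors by hand. The sketch below is one possible approach, not a standard: the hypothetical `build_palette` helper sorts the class names and spaces their hues evenly, so the same class list always gives the same colors regardless of input order.

```python
import colorsys


def build_palette(class_names):
    """Assign each class a distinct, reproducible RGB color (0-255 per channel)."""
    names = sorted(set(class_names))  # sorting makes the result order-independent
    palette = {}
    for i, name in enumerate(names):
        hue = i / len(names)          # evenly spaced hues around the color wheel
        r, g, b = colorsys.hsv_to_rgb(hue, 0.85, 0.95)
        palette[name] = (round(r * 255), round(g * 255), round(b * 255))
    return palette


palette = build_palette(["person", "car", "dog"])
```

Note that adding a class later changes the spacing and therefore the colors, so in practice the generated palette is best saved alongside the data set and reused.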
The model must learn the relationships between different objects, and annotating them in a logical, structured way helps it do so.
Annotating at different levels of detail helps the model learn to identify objects at different scales.
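To see whether a data set actually covers different scales, the annotated boxes can be grouped by size. The sketch below borrows the area thresholds used in COCO evaluation (small below 32×32 pixels, large from 96×96 pixels upward) as an assumption; other projects may need different cut-offs, and the function names are hypothetical.

```python
from collections import Counter


def size_bucket(box, small_max=32 ** 2, medium_max=96 ** 2):
    """Classify a box (xmin, ymin, xmax, ymax) by pixel area."""
    xmin, ymin, xmax, ymax = box
    area = max(0, xmax - xmin) * max(0, ymax - ymin)
    if area < small_max:
        return "small"
    if area < medium_max:
        return "medium"
    return "large"


def scale_histogram(boxes):
    """Count how many boxes fall into each size bucket."""
    return Counter(size_bucket(box) for box in boxes)


histogram = scale_histogram([(0, 0, 10, 10), (0, 0, 50, 50), (0, 0, 200, 200)])
```

A histogram dominated by one bucket suggests that more examples at the other scales should be collected or annotated.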
By following these tips, you can improve the quality and accuracy of your computer vision data sets and make them more useful for training machine learning models.