Computer vision data sets are essential for training machine learning models to detect objects, faces, and other visual features. However, it can be difficult to know what to annotate and how to do it correctly.
We would like to share our experience to promote some best practices for annotating images in computer vision data sets:
1. Choose the right annotation tool
There are a variety of annotation tools available, both free and commercial, so it is important to choose one that is right for your project. Some factors to consider include the type of image data you are annotating, the number of images you need to annotate, and your budget. Some of the most popular tools include:
Image Annotation Tools
LabelImg: A free, open-source image annotation tool available on Windows, macOS, and Linux. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet; LabelImg also supports the YOLO and CreateML formats. (ImageNet is a database of images organized according to the WordNet hierarchy of nouns, with hundreds to thousands of images for each node. It has played a significant role in the progress of computer vision and deep learning research, and researchers can access the data for free for non-commercial purposes.)
LabelMe: An online annotation tool created by the MIT CSAIL team to build image databases for computer vision research. It is also freely available on Windows, macOS, and Linux.
Note: a version for polygonal annotation can be found on GitHub.
Zight: This is a commercial image annotation tool.
V7: Another commercial image annotation tool.
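To make the format discussion concrete, here is a minimal sketch of reading a PASCAL VOC style XML annotation, such as the files LabelImg saves, and converting one box to YOLO's normalized center/width/height form. The sample XML, the file name in it, and the function names `parse_voc` and `voc_to_yolo` are our own illustrations, not part of any tool's API, and the sketch ignores optional VOC fields such as `difficult` and `truncated`.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, trimmed to the fields this sketch reads.
SAMPLE_VOC = """<annotation>
  <filename>street_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>64</xmin><ymin>120</ymin><xmax>320</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (width, height, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Some tools write coordinates as floats, so parse via float first.
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return width, height, boxes


def voc_to_yolo(box, width, height):
    """Convert corner coordinates to normalized (cx, cy, w, h)."""
    _, xmin, ymin, xmax, ymax = box
    return (
        (xmin + xmax) / 2 / width,
        (ymin + ymax) / 2 / height,
        (xmax - xmin) / width,
        (ymax - ymin) / height,
    )


width, height, boxes = parse_voc(SAMPLE_VOC)
for box in boxes:
    print(box[0], voc_to_yolo(box, width, height))
```

A YOLO label file stores a class index rather than the class name, so a real converter would also need a fixed name-to-index mapping, which is left out here.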
2. Train your annotators
It is important to train your annotators on how to annotate images correctly, for example how tightly to draw a box and what to do with partly hidden objects. This will help to ensure that the data is consistent and accurate.
3. Use a variety of annotation tools
Now that we know plenty of free and commercial (proprietary) tools are available, keep in mind that combining them can add variety to the annotations, which in turn can help the ML model recognize a wider range of cases and become more accurate. It is therefore worth choosing not just one tool that is right for your project, but also vendors that can deliver results from several tools.
4. Annotate the entire image
When annotating images, it is important to annotate the entire image, not just the objects of interest. This will help the model to learn about the context of the image.
5. Use consistent labels (clear and concise annotation style)
It is essential to use consistent labels when annotating images. This will help the model to learn how to identify different objects. The annotations should be easy to understand for both humans and machines.
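One way to enforce consistent labels is to pass every annotator-supplied label through a single normalization step before it enters the data set. The sketch below is only an illustration: the label set, the alias table, and the function name `normalize_label` are hypothetical and would be replaced by your project's own vocabulary.

```python
# Hypothetical project vocabulary; replace with your own classes.
ALLOWED_LABELS = {"car", "pedestrian", "traffic_light"}

# Hypothetical spellings that annotators tend to use for the same class.
ALIASES = {
    "cars": "car",
    "person": "pedestrian",
    "traffic light": "traffic_light",
}


def normalize_label(raw):
    """Map a free-form label to its canonical name, or raise ValueError."""
    # Lowercase, trim, and collapse repeated whitespace.
    key = " ".join(raw.strip().lower().split())
    key = ALIASES.get(key, key).replace(" ", "_")
    if key not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {raw!r}")
    return key


print(normalize_label("  Person "))
print(normalize_label("Traffic  Light"))
```

Raising an error on unknown labels, rather than silently accepting them, means typos surface during annotation instead of during training.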
6. Review the annotations regularly
Once the images have been annotated, it is important to review them to confirm that the annotations are accurate and consistent. Repeating the review regularly also keeps the data set up to date as new images and labels are added.
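Part of a review can be automated by comparing two annotators' boxes for the same image. The sketch below uses intersection over union (IoU), a standard overlap measure, and flags boxes from one annotator that have no sufficiently overlapping box from the other. The function names and the 0.5 threshold are our assumptions, and boxes are taken to be `(xmin, ymin, xmax, ymax)` tuples.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def flag_disagreements(boxes_a, boxes_b, threshold=0.5):
    """Indices of boxes in boxes_a with no box in boxes_b at IoU >= threshold."""
    return [
        i for i, a in enumerate(boxes_a)
        if all(iou(a, b) < threshold for b in boxes_b)
    ]


annotator_a = [(0, 0, 10, 10), (100, 100, 120, 120)]
annotator_b = [(1, 1, 10, 10)]
print(flag_disagreements(annotator_a, annotator_b))
```

Flagged boxes are candidates for a human second look. The check ignores class labels, so a fuller review would also compare those.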
7. Use a consistent color scheme for different objects
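Giving each object class the same color in every image keeps the annotations unambiguous for reviewers and tools. This will help the model to learn how to identify different objects.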
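A consistent color scheme is easiest to keep when it is generated once and stored with the data set instead of being picked by hand per image. The following sketch, with a hypothetical function name `build_palette`, assigns evenly spaced hues to a sorted list of class names using Python's standard `colorsys` module.

```python
import colorsys


def build_palette(class_names):
    """Map each class name to an (R, G, B) tuple with evenly spaced hues."""
    # Sorting and de-duplicating makes the result independent of input order.
    names = sorted(set(class_names))
    palette = {}
    for i, name in enumerate(names):
        r, g, b = colorsys.hsv_to_rgb(i / len(names), 0.8, 1.0)
        palette[name] = (round(r * 255), round(g * 255), round(b * 255))
    return palette


palette = build_palette(["pedestrian", "car", "traffic_light", "car"])
for name, color in palette.items():
    print(name, color)
```

Because the hues depend on how many classes there are, adding a class later changes the colors. Saving the palette to a file and reusing it avoids that drift.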
8. Annotate images in a logical order
The model must learn the relationships between different objects, and annotating images in a logical order helps it do so.
9. Annotate images at different levels of detail
Annotating at different levels of detail helps the model learn to identify objects at different scales.
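To check whether a data set actually covers objects at different scales, one option is to count boxes by pixel area. The sketch below borrows the small/medium/large area thresholds from the COCO evaluation convention (32×32 and 96×96 pixels). Using those thresholds here is our assumption, as are the function names; boxes are `(xmin, ymin, xmax, ymax)` tuples.

```python
from collections import Counter


def size_bucket(box, small_max=32 ** 2, medium_max=96 ** 2):
    """Classify a box as 'small', 'medium', or 'large' by pixel area."""
    xmin, ymin, xmax, ymax = box
    area = max(0, xmax - xmin) * max(0, ymax - ymin)
    if area < small_max:
        return "small"
    if area < medium_max:
        return "medium"
    return "large"


def scale_histogram(boxes):
    """Count how many boxes fall into each size bucket."""
    return Counter(size_bucket(box) for box in boxes)


boxes = [(0, 0, 10, 10), (0, 0, 50, 50), (0, 0, 200, 200), (5, 5, 20, 20)]
print(scale_histogram(boxes))
```

A histogram dominated by one bucket suggests that the annotations, or the images themselves, under-represent the other scales.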
By following these tips, you can improve the quality and accuracy of your computer vision data sets and make them more useful for training machine learning models.