Semantic

These metrics operate on the semantic information of images or individual video frames.

| Title | Metric Type | Data Type |
| --- | --- | --- |
| Image Singularity - Finds duplicate and near-duplicate images | image | |
| Image-level Annotation Quality - Compares image classifications against similar images | image | radio |
| Object Annotation Quality - Compares object annotations against similar image crops | image | bounding box, polygon, rotatable bounding box |

Image Singularity

This metric assigns each image a score that reflects how unique it is within the dataset.

• A score of zero means that the image has duplicates in the dataset, while a score close to one means that the image is highly unique. Within each group of duplicates, only one image receives a non-zero score and the rest receive a score of zero (for example, of five identical images, four will have a score of zero). This makes it easy to tag the duplicate samples and remove them from the project.
• Images that are near duplicates of each other will be shown side by side.
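
The scoring rule described above can be illustrated with a small sketch. It is not the implementation used by this metric: the function name `singularity_scores`, the use of cosine distance, and the `dup_tol` threshold are all assumptions made for the example. It only shows how one image per duplicate group can keep a non-zero score while the remaining copies get zero.

```python
import numpy as np

def singularity_scores(embeddings, dup_tol=1e-6):
    """Hypothetical uniqueness score per image embedding.

    Assumption: the score is the cosine distance to the nearest
    non-duplicate image, and every copy after the first one in a
    duplicate group is forced to zero.
    """
    emb = np.asarray(embeddings, dtype=float)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)

    # Pairwise cosine distance, clipped to the range [0, 1].
    dist = np.clip(1.0 - emb @ emb.T, 0.0, 1.0)
    np.fill_diagonal(dist, np.inf)  # an image is not its own neighbor

    n = len(emb)
    scores = np.zeros(n)
    is_extra_copy = np.zeros(n, dtype=bool)
    for i in range(n):
        if is_extra_copy[i]:
            continue  # copy of an earlier image: score stays at zero
        dups = dist[i] <= dup_tol
        is_extra_copy |= dups
        # Score the kept image against its nearest non-duplicate neighbor.
        others = dist[i][~dups & np.isfinite(dist[i])]
        scores[i] = others.min() if others.size else 1.0
    return scores

# Five identical embeddings plus two distinct ones.
example = [[1, 0]] * 5 + [[0, 1], [1, 1]]
example_scores = singularity_scores(example)
```

In this example, filtering `example_scores` for zero values selects four of the five identical entries, which matches the behavior described in the first bullet.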

Possible actions

• To delete duplicate images: Set the quality filter to include only zero values, which selects every redundant copy while leaving one image per duplicate group, then use bulk tagging (e.g., with a tag like Duplicate) to tag all of the selected images.
• To mark duplicate images: Near-duplicate images are shown side by side. Navigate through these images and mark whichever ones are of interest to you.

Implementation on GitHub

Image-level Annotation Quality

This metric creates an embedding for each image and uses these embeddings to build a nearest-neighbor graph. Each image's classification is then compared against the classifications of its nearest neighbors.
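
As a rough illustration of the idea, the sketch below scores each image by the fraction of its nearest neighbors that share its classification. The function name `classification_agreement`, the choice of `k`, the Euclidean distance, and the "fraction of agreeing neighbors" score are assumptions for the example, and a brute-force pairwise search stands in for a real nearest-neighbor graph.

```python
import numpy as np

def classification_agreement(embeddings, labels, k=3):
    """Hypothetical score: share of the k nearest neighbors with the same label."""
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Squared Euclidean distance between every pair of embeddings.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each image from its own neighbors

    # Indices of the k closest embeddings for each image.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Fraction of neighbors whose label matches the image's own label.
    return (labels[neighbors] == labels[:, None]).mean(axis=1)

# Toy data: the image at (1, 1) is labeled "dog" but sits among "cat" images.
toy_embeddings = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10]]
toy_labels = ["cat", "cat", "cat", "dog", "dog", "dog", "dog"]
agreement = classification_agreement(toy_embeddings, toy_labels, k=2)
```

Under these assumptions, an image whose neighbors all carry a different label gets the lowest score, which flags it as a candidate for review.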

Implementation on GitHub

Object Annotation Quality

This metric transforms polygons into bounding boxes and extracts an embedding for each bounding box. Each embedding is then compared with its neighbors. If the neighbors are annotated differently, the object receives a low score.
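
The first step, turning a polygon into a bounding box, can be sketched as follows. The function name `polygon_to_bbox`, the `(x, y)` vertex format, and the `(x_min, y_min, width, height)` output layout are assumptions for illustration and may differ from the format the metric uses internally.

```python
def polygon_to_bbox(points):
    """Return the axis-aligned box (x_min, y_min, width, height) enclosing a polygon.

    `points` is assumed to be a non-empty sequence of (x, y) vertices.
    """
    if not points:
        raise ValueError("polygon must contain at least one vertex")
    xs, ys = zip(*points)
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min

# A triangle given in pixel coordinates.
triangle = [(20, 10), (60, 30), (40, 70)]
bbox = polygon_to_bbox(triangle)
```

The resulting box defines the image crop from which an embedding would be extracted, so that polygons and bounding boxes can be compared in the same way.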

Implementation on GitHub