Semantic
Semantic metrics operate on the semantic content of images or individual video frames.
Title | Metric Type | Data Type |
---|---|---|
Image Singularity - Finds duplicate and near-duplicate images | image | |
Image-level Annotation Quality - Compares image classifications against similar images | image | radio |
Object Annotation Quality - Compares object annotations against similar image crops | image | bounding box, polygon, rotatable bounding box |
Image Singularity
This metric assigns each image a score that reflects how unique it is within the dataset.
- A score of zero means the image has duplicates in the dataset, while a score close to one means the image is quite unique. Within a group of duplicates, only one image receives a non-zero score and the rest receive zero (for example, if there are five identical images, four of them will have a score of zero). This makes it easy to tag the duplicate samples and remove them from the project.
- Images that are near duplicates of each other will be shown side by side.
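The scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the metric's actual implementation: it assumes each image already has a non-zero embedding vector, uses cosine distance as the similarity measure, and treats two images as duplicates when their distance is at most `dup_eps`. The names `cosine_distance`, `singularity_scores`, and `dup_eps` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def singularity_scores(embeddings, dup_eps=1e-6):
    """Give each image a uniqueness score in [0, 1].

    Assumption: within a group of duplicates, the first image keeps a
    non-zero score and every later copy is scored zero.
    """
    n = len(embeddings)
    scores = [0.0] * n
    for i in range(n):
        # A duplicate of an earlier image keeps its score of zero.
        is_later_copy = any(
            cosine_distance(embeddings[i], embeddings[j]) <= dup_eps
            for j in range(i)
        )
        if is_later_copy:
            continue
        # Otherwise, score by the distance to the closest image that is
        # not a duplicate of this one, clipped to 1.
        distances = []
        for j in range(n):
            if j == i:
                continue
            d = cosine_distance(embeddings[i], embeddings[j])
            if d > dup_eps:
                distances.append(d)
        scores[i] = min(1.0, min(distances)) if distances else 1.0
    return scores


# Five identical images plus one different image.
embeddings = [[1.0, 0.0]] * 5 + [[0.0, 1.0]]
print(singularity_scores(embeddings))  # [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
```

In this toy run, four of the five identical images end up with a score of zero, which matches the example in the text. Filtering for zero scores would then select exactly the redundant copies.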
Possible actions
- To delete duplicate images: Set the quality filter to cover only zero values (this selects all the duplicate images), then use bulk tagging (e.g., with a tag like `Duplicate`) to tag all of them.
- To mark duplicate images: Near-duplicate images are shown side by side. Navigate through these images and mark whichever are of interest to you.
Implementation on GitHub
Image-level Annotation Quality
This metric creates an embedding for each image and uses these embeddings to build a nearest neighbor graph. The classification of each image is then compared against the classifications of its nearest neighbors.
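As a rough illustration of the neighbor comparison, the sketch below scores each image by the fraction of its `k` nearest neighbors that share its classification. It is a minimal sketch under assumptions, not the metric's actual code: it uses a brute-force search with Euclidean distance instead of a real nearest neighbor graph, and the names `euclidean`, `classification_agreement`, and `k` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classification_agreement(embeddings, labels, k=3):
    """Score each image by how many of its k nearest neighbors share its label.

    Returns one value per image in [0, 1]; a low value suggests the
    classification disagrees with visually similar images.
    """
    scores = []
    for i, emb in enumerate(embeddings):
        # Brute-force neighbor search (assumption: small dataset).
        others = [j for j in range(len(embeddings)) if j != i]
        others.sort(key=lambda j: euclidean(emb, embeddings[j]))
        neighbors = others[:k]
        if not neighbors:
            scores.append(1.0)  # a lone image has nothing to disagree with
            continue
        agree = sum(1 for j in neighbors if labels[j] == labels[i])
        scores.append(agree / len(neighbors))
    return scores


# The last image sits among the "cat" embeddings but is labeled "dog".
embeddings = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [0.5, 0.5]]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
scores = classification_agreement(embeddings, labels, k=2)
print(scores[5])  # 0.0
```

The image whose label disagrees with all of its neighbors receives the lowest score, so sorting by this value brings likely classification errors to the top.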
Implementation on GitHub
Object Annotation Quality
This metric transforms polygons into bounding boxes and extracts an embedding for each bounding box. Each embedding is then compared with those of its nearest neighbors. If the neighbors are annotated differently, the object receives a low score.
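The first step, turning a polygon into a bounding box and cropping the image to it, can be sketched as follows. This is an illustration only: it assumes polygon vertices are given as integer pixel coordinates, that a box is represented as `(x, y, width, height)`, and that the image is a plain list of pixel rows. The helper names `polygon_to_bbox` and `crop` are hypothetical.

```python
def polygon_to_bbox(points):
    """Return the axis-aligned box (x, y, width, height) enclosing a polygon.

    `points` is a sequence of (x, y) vertices.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


def crop(image_rows, bbox):
    """Cut the box out of an image stored as a list of pixel rows."""
    x, y, w, h = bbox
    return [row[x:x + w] for row in image_rows[y:y + h]]


polygon = [(20, 10), (60, 30), (40, 70)]
print(polygon_to_bbox(polygon))  # (20, 10, 40, 60)

# A tiny 4x4 "image" whose pixel values encode their position.
image = [[10 * r + c for c in range(4)] for r in range(4)]
print(crop(image, (1, 1, 2, 2)))  # [[11, 12], [21, 22]]
```

Each crop produced this way would then be passed to an embedding model, and the resulting vectors compared with their neighbors as the paragraph describes.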
Implementation on GitHub