Label quality metrics

Label quality metrics operate on the geometries of objects like bounding boxes, polygons and polylines.

Access Label Quality Metrics

Label Quality Metrics are used for sorting data, filtering data, and data analytics.

Title | Metric Type | Ontology Type
Absolute Area - Computes object size as the number of pixels it covers. | image | bounding box, polygon, rotatable bounding box
Aspect Ratio - Computes aspect ratios (width/height) of objects. | image | bounding box, polygon, rotatable bounding box
Blue Value - Ranks annotated objects by how blue the average value of the object is. | image | bounding box, polygon, rotatable bounding box
Brightness - Ranks annotated objects by their brightness. | image | bounding box, polygon, rotatable bounding box
Border Proximity - Ranks annotations by how close they are to image borders. | image | bounding box, point, polygon, polyline, rotatable bounding box, skeleton
Broken Object Tracks - Identifies broken object tracks based on object overlaps. | sequence, video | bounding box, polygon, rotatable bounding box
Confidence - The confidence that an object was annotated correctly. | image | bounding box, polygon, rotatable bounding box
Contrast - Ranks annotated objects by their contrast. | image | bounding box, polygon, rotatable bounding box
Classification Quality - Compares image classifications against similar images. | image | radio
Green Value - Ranks annotated objects by how green the average value of the object is. | image | bounding box, polygon, rotatable bounding box
Height - Ranks annotated objects by the height of the object. | image | bounding box, polygon, rotatable bounding box
Inconsistent Object Class - Looks for overlapping objects with different classes (across frames). | sequence, video | bounding box, polygon, rotatable bounding box
Inconsistent Track ID - Looks for overlapping objects with different track IDs (across frames). | sequence, video | bounding box, polygon, rotatable bounding box
Label Duplicates - Ranks labels by how likely they are to represent the same object. | image | bounding box, polygon, rotatable bounding box
Missing Objects - Identifies missing objects based on object overlaps. | sequence, video | bounding box, polygon, rotatable bounding box
Object Classification Quality - Compares object annotations against similar image crops. | image | bounding box, polygon, rotatable bounding box
Occlusion Risk - Tracks objects and detects outliers in videos. | sequence, video | bounding box, rotatable bounding box
Polygon Shape Anomaly - Calculates potential outliers by polygon shape. | image | polygon
Randomize Objects - Assigns a random value between 0 and 1 to objects. | image | bounding box, polygon, rotatable bounding box
Red Value - Ranks annotated objects by how red the average value of the object is. | image | bounding box, polygon, rotatable bounding box
Relative Area - Computes object size as a percentage of total image size. | image | bounding box, polygon, rotatable bounding box
Sharpness - Ranks annotated objects by their sharpness. | image | bounding box, polygon, rotatable bounding box
Width - Ranks annotated objects by the width of the object. | image | bounding box, polygon, rotatable bounding box

To access Label Quality Metrics for Explorer:

  1. Click a Project from the Active home page.

  2. Click Explorer.

  3. Click Labels.

  4. Sort and filter the tabular data.

  5. Click the plot diagram icon.

  6. Sort and filter the embedding plot data.

To access Label Quality Metrics for analytics:

  1. Click a Project from the Active home page.

  2. Click Analytics.

  3. Click Annotations.

  4. Select the quality metric you want to view from the 2D Metrics view or Metrics Distribution graphs.

Absolute Area

Computes object size as the number of pixels it covers.

Implementation on GitHub.
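
As a rough illustration (not the linked implementation), the sketch below computes a pixel area for a bounding box and for a polygon; the function names and the (x, y) vertex format are assumptions.

import numpy as np
import cv2

# Illustrative sketch only: pixel area of a bounding box and of a polygon.
def bbox_area_px(width: float, height: float) -> float:
    return width * height  # bounding-box area is simply width * height in pixels

def polygon_area_px(points: np.ndarray) -> float:
    # `points` is an (N, 2) array of vertex coordinates in pixels;
    # cv2.contourArea applies the shoelace formula to the polygon.
    return cv2.contourArea(points.astype(np.float32))

print(bbox_area_px(120, 80))  # 9600.0
print(polygon_area_px(np.array([[0, 0], [100, 0], [100, 50], [0, 50]])))  # 5000.0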

Aspect Ratio

Computes aspect ratios (width/height) of objects.

Implementation on GitHub.

Blue Value

Ranks annotated objects by how blue the average value of the object is.

Implementation on GitHub.

Brightness

Ranks annotated objects by their brightness. Brightness is computed as the average (normalized) pixel value across each object.

Implementation on GitHub.
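
A minimal sketch of this computation (illustrative only, not the linked implementation), assuming an RGB image array and a boolean mask for the object:

import numpy as np

def brightness_score(image: np.ndarray, mask: np.ndarray) -> float:
    # `image` is an HxWx3 uint8 array; `mask` is a boolean HxW array marking
    # the object's pixels. Normalize to [0, 1] and average over the object.
    pixels = image[mask].astype(np.float32) / 255.0
    return float(pixels.mean())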

Broken Object Tracks

Identifies broken object tracks by comparing object overlaps based on a running window.

Example:

If objects of the same class overlap in three consecutive frames (i-1, i, and i+1) but do not share an object hash, the frames are flagged as a potentially broken track.

Broken Object Tracks example

CAT:2 is marked as potentially having a wrong track ID.

Implementation on GitHub.
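
The sketch below is a simplified illustration of the idea, not the linked implementation: it assumes axis-aligned boxes in (x1, y1, x2, y2) format and uses IoU to decide whether two objects overlap.

def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def flag_broken_tracks(frames, iou_threshold=0.5):
    # `frames` is a list of dicts mapping object_hash -> (class_name, box).
    # Flag objects that overlap a same-class object with a different hash
    # in an adjacent frame, i.e. a track that may have been split.
    flagged = []
    for i in range(1, len(frames) - 1):
        for h, (cls, box) in frames[i].items():
            for neighbor in (frames[i - 1], frames[i + 1]):
                for h2, (cls2, box2) in neighbor.items():
                    if h2 != h and cls2 == cls and iou(box, box2) >= iou_threshold:
                        flagged.append((i, h, h2))
    return flagged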

Border Proximity

This metric ranks annotations by how close they are to image borders.

Implementation on GitHub.
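
A minimal sketch of one way to score this (illustrative only), assuming bounding-box coordinates normalized to [0, 1]: the score is the smallest distance from the box to any image border.

def border_proximity(x1, y1, x2, y2):
    # Coordinates are relative to image size, so 0.0 means the box touches a border.
    return min(x1, y1, 1.0 - x2, 1.0 - y2)

print(border_proximity(0.0, 0.4, 0.2, 0.6))  # 0.0 -> touches the left border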

Confidence

The confidence score (α) is a measure of a machine learning model's certainty that a given prediction is accurate. The higher the confidence score, the more certain a model is about its prediction.

Manual labels are always assigned α = 100%, while label predictions created using models and automated methods such as interpolation have a confidence score below 100% (α < 100%).

Values for this metric are calculated as labels are fetched from Annotate.

ℹ️ Note

While this value is of limited use for manually created labels, it is very important for objects that were labeled automatically.

Contrast

Ranks annotated objects by their contrast. Contrast is computed as the standard deviation of the pixel values.

Implementation on GitHub.
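
A minimal sketch (illustrative only), mirroring the brightness example but taking the standard deviation instead of the mean:

import numpy as np

def contrast_score(image: np.ndarray, mask: np.ndarray) -> float:
    # Standard deviation of the normalized pixel values inside the object's mask.
    pixels = image[mask].astype(np.float32) / 255.0
    return float(pixels.std())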

Classification Quality

This metric creates embeddings from images. These embeddings are then used to build a nearest-neighbor graph, and the classifications of similar embeddings are compared against each other.

We calculate an embedding for each image (for example, mapping a 3xNxM image to a 1xD vector using a neural network). Then, for each embedding (or image), we look at its 50 nearest neighbors and compare its annotation with the neighboring annotations.

For example, say the current image is annotated as A, but only 20 of its 50 neighbors are also annotated as A; the rest are annotated differently. That gives a score of 20/50 = 0.4. A score of 1 means the annotation is very reliable, because very similar images are annotated the same. The closer the score gets to zero, the less reliable the annotation.

Implementation on GitHub.
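
The following sketch shows the scoring idea under simplifying assumptions (it is not the linked implementation): embeddings and labels are taken as given, and scikit-learn's NearestNeighbors stands in for whatever nearest-neighbor index is actually used.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def classification_quality(embeddings: np.ndarray, labels: list, k: int = 50):
    # embeddings: (N, D) array; labels: length-N list of classifications.
    nn = NearestNeighbors(n_neighbors=min(k + 1, len(labels))).fit(embeddings)
    _, indices = nn.kneighbors(embeddings)
    scores = []
    for i, row in enumerate(indices):
        neighbors = [j for j in row if j != i][:k]  # drop the query point itself
        agree = sum(labels[j] == labels[i] for j in neighbors)
        scores.append(agree / len(neighbors) if neighbors else 0.0)
    return scores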

Green Value

Ranks annotated objects by how green the average value of the object is.

Implementation on GitHub.

Height

Ranks annotated objects by the height of the object.

Implementation on GitHub.

Inconsistent Object Class

This algorithm looks for overlapping objects in consecutive frames that have different classes.

Example:

Inconsistent Object Class example

Dog:1 is flagged as potentially the wrong class, because Dog:1 overlaps with CAT:1.

Implementation on GitHub.

Inconsistent Track ID

This algorithm looks for overlapping objects with different track IDs. Overlapping objects with different track IDs are flagged as potential inconsistencies in tracks.

Example:

Inconsistent Track ID example

Cat:2 is flagged as potentially having a broken track, because track IDs 1 and 2 do not match.

Implementation on GitHub.

Label Duplicates

Ranks labels by how likely they are to represent the same object.

The Jaccard similarity coefficient is used to measure the closeness of two annotations.

Example 1:

An annotator accidentally labels the same thing in a frame twice.

An annotator labeled the same orange twice in a frame. Look carefully at both images and you can see that there are two slightly different labels around the orange.

Duplicate labels example 1

Example 2:

Sometimes objects of the same type in a frame are very close to each other, and the annotator is unsure whether they should be annotated separately or as a group, so they do both. Or the annotator labels the whole group in some frames and each individual object in others, or labels both the group and each individual object in it.

An annotator labeled a group of oranges and then labeled individual oranges in the group.

Duplicate labels example 2

Implementation on GitHub.
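
A minimal sketch of the similarity measure (illustrative only), assuming the two annotations have been rasterized into boolean masks over the same image:

import numpy as np

def jaccard(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    # Intersection over union of two boolean masks; values near 1 suggest
    # the two labels cover the same object.
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection / union) if union else 0.0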

Missing Objects

Identifies missing objects by comparing object overlaps based on a running window.

Example:

If an intermediate frame (frame i) does not include an object in the same region as the two surrounding frames (i-1 and i+1), the frame is flagged.

Missing Objects example

Frame i is flagged as potentially missing an object.

Implementation on GitHub.
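
A simplified sketch of the running-window check (illustrative only, not the linked implementation), reusing the iou helper from the Broken Object Tracks sketch above and assuming boxes in (x1, y1, x2, y2) format:

def flag_missing_objects(frames, iou_threshold=0.5):
    # `frames` is a list of lists of (class_name, box) tuples, one list per frame.
    # If frames i-1 and i+1 contain a same-class object in overlapping regions
    # but frame i has nothing overlapping that region, flag frame i.
    flagged = []
    for i in range(1, len(frames) - 1):
        for cls, prev_box in frames[i - 1]:
            in_next = any(c == cls and iou(prev_box, b) >= iou_threshold
                          for c, b in frames[i + 1])
            in_current = any(c == cls and iou(prev_box, b) >= iou_threshold
                             for c, b in frames[i])
            if in_next and not in_current:
                flagged.append((i, cls))
    return flagged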

Object Classification Quality

This metric transforms polygons into bounding boxes, and an embedding is extracted for each bounding box. These embeddings are then compared with their neighbors. If the neighbors are annotated or classified differently, a low score is given to the classification.

Implementation on GitHub.

Occlusion Risk

This metric collects information related to object size and aspect ratio for each video and finds outliers among them.

Implementation on GitHub.
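
A rough sketch of one way to flag such outliers (illustrative only, not the linked implementation), assuming per-frame box widths and heights for a single object track:

import numpy as np

def occlusion_outliers(widths, heights, z_threshold=2.0):
    # Flag frames where box area or aspect ratio deviates strongly (by z-score)
    # from the track's average, which can indicate partial occlusion.
    widths, heights = np.asarray(widths, float), np.asarray(heights, float)
    areas, ratios = widths * heights, widths / heights
    flags = []
    for values in (areas, ratios):
        z = (values - values.mean()) / (values.std() + 1e-8)
        flags.append(np.abs(z) > z_threshold)
    return np.logical_or(*flags)  # one boolean flag per frame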

Polygon Shape Anomaly

Computes the Euclidean distance between the polygons' Hu moments for each class and the prototypical class moments.

Implementation on GitHub.
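
A minimal sketch of the computation (illustrative only), assuming all polygons belong to one class and using the class mean as the prototypical moments:

import numpy as np
import cv2

def hu_moments(points: np.ndarray) -> np.ndarray:
    # `points` is an (N, 2) array of polygon vertices in pixel coordinates.
    return cv2.HuMoments(cv2.moments(points.astype(np.float32))).flatten()

def shape_anomaly_scores(polygons):
    # Euclidean distance from each polygon's Hu moments to the class mean;
    # larger distances indicate more unusual shapes.
    moments = np.stack([hu_moments(p) for p in polygons])
    prototype = moments.mean(axis=0)
    return np.linalg.norm(moments - prototype, axis=1)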

Red Value

Ranks annotated objects by how red the average value of the object is.

Implementation on GitHub.

Relative Area

Computes object size as a percentage of total image size.

Implementation on GitHub.
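
For example, with a bounding box (illustrative sketch only):

def relative_area(box_w, box_h, img_w, img_h):
    # Fraction of the image covered by the box, e.g. 0.05 means 5% of the image.
    return (box_w * box_h) / (img_w * img_h)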

Randomize Objects

Uses a uniform distribution to assign a random value between 0 and 1 to each object.

Implementation on GitHub.

Sharpness

Ranks annotated objects by their sharpness.

Sharpness is computed by applying a Laplacian filter to each annotated object and computing the variance of the output. In short, the score computes "the amount of edges" in each annotated object.

import cv2  # OpenCV; `image` is the cropped annotated object
score = cv2.Laplacian(image, cv2.CV_64F).var()

Implementation on GitHub.

Width

Ranks annotated objects by the width of the object.

Implementation on GitHub.

