Touring the COCO Sandbox dataset

In this tutorial, you will explore some of the features of Encord Active using the COCO Sandbox dataset.
You will go through the following steps:

  1. Downloading the dataset
  2. Opening Encord Active in your browser
  3. Finding and flagging label errors
  4. Figuring out what metrics influence model performance



This tutorial assumes that you have installed encord-active.

1. Downloading the dataset

Download the data by running this command:

encord-active download

The script asks you to choose a project. Navigate the options with the up and down arrow keys and press Enter.

encord-active then downloads your data.

2. Opening Encord Active in your browser

When the download process is done, follow the printed instructions to launch the app with the start CLI command:

cd /path/to/downloaded/project
encord-active start



If the terminal seems stuck and nothing happens, try visiting http://localhost:8000 in your browser.

3. Finding and flagging label errors

You will carry out this process in two steps:

  1. Identifying metrics with label errors
  2. Tagging label errors

Identifying metrics with label errors

  1. Open Encord Active (from the web-app or from a local installation):

    Encord Active Landing page

    Encord Active Landing (Quickstart) page

  2. Select a project.



    If a project does not exist in Encord Active, create one (in the web-app) or import one.

Go to the Summary > Annotation page.
The page should look like this:

Annotation Quality Summary Page

On the Summary page, you will find all the outliers that Encord Active automatically found based on all the metrics that were computed for the labels.

Go to the Explorer page.

Select "Annotation Duplicates" and scroll down the page.

The page should look similar to this:

Annotation Duplicates

The page shows how this metric was computed, how many outliers were found and some of the most severe outliers.

If you hover the mouse over the image with the orange, you can click the expand button as indicated here:

Expand the image

Clicking the button provides a larger view of the images and detailed information about the image.

Duplicated annotations

Notice the duplicated annotations.

Hit Esc to exit the full screen view.

If you take a closer look at the annotations in the other displayed images, you will notice the same issue.

Duplicated annotations
Duplicated annotations
Duplicated annotations
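
To make the idea of a duplicated annotation concrete, here is a minimal Python sketch that flags pairs of same-class bounding boxes that overlap almost completely. It is an illustration under stated assumptions, not how Encord Active computes the "Annotation Duplicates" metric: the `(x_min, y_min, x_max, y_max)` box format, the function names, and the 0.9 IoU threshold are all hypothetical choices.

```python
from itertools import combinations

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def find_duplicates(labels: list[tuple[str, Box]], threshold: float = 0.9) -> list[tuple[int, int]]:
    """Return index pairs of same-class labels whose IoU is at least `threshold`."""
    pairs = []
    for (i, (cls_a, box_a)), (j, (cls_b, box_b)) in combinations(enumerate(labels), 2):
        if cls_a == cls_b and iou(box_a, box_b) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical labels for one image.
labels = [
    ("person", (10, 10, 110, 210)),
    ("person", (12, 11, 111, 209)),  # near copy of label 0
    ("dog", (300, 200, 380, 260)),
]
print(find_duplicates(labels))  # prints [(0, 1)]
```

The pairwise comparison is quadratic in the number of labels per image, which is usually acceptable for a single image but would need a smarter approach for very dense annotations.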



You can find other sources of label errors by inspecting the other tabs. Good places to start could be the "Object Area" and "Object Aspect Ratio" annotation metrics.
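
As a rough intuition for what these two metrics capture, the sketch below computes the share of the image that a bounding box covers and the box's width-to-height ratio. The exact definitions used by Encord Active may differ; the function names and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration.

```python
def box_area_fraction(box, image_width, image_height):
    """Fraction of the image covered by the box (0.0 to 1.0)."""
    x_min, y_min, x_max, y_max = box
    return ((x_max - x_min) * (y_max - y_min)) / (image_width * image_height)


def box_aspect_ratio(box):
    """Width divided by height; very large or very small values suggest slivers."""
    x_min, y_min, x_max, y_max = box
    height = y_max - y_min
    if height <= 0:
        raise ValueError("box has no height")
    return (x_max - x_min) / height


# Hypothetical boxes in a 640 x 480 image.
regular_box = (100, 50, 420, 290)
sliver_box = (0, 0, 640, 4)  # a thin strip, often a labeling slip

print(box_area_fraction(regular_box, 640, 480))  # prints 0.25
print(box_aspect_ratio(sliver_box))              # prints 160.0
```

Labels at the extreme ends of either value (tiny areas, extreme ratios) are the ones worth inspecting first.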

Tagging label errors

To tag the images with the identified label errors, select the images, click the TAG button, and provide a name for the new tag.

Add new tag

4. Figuring out what metrics influence model performance

Encord Active also allows you to figure out which metrics influence your model performance the most.
In this section, we'll go through some of the views that help you do that:

The high level view of model performance

mAP and mAR scores

The first section displays the mean Average Precision (mAP) and mean Average Recall (mAR) of your model, along with the number of true positives (TP), false positives (FP), and false negatives (FN), based on the IOU threshold set at the top of the page.

mAP and mAR scores

Dragging the IOU slider changes the scores.
You can also choose to see the aggregate score for certain classes by selecting them in the drop-down to the left.
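
To see why the scores move with the slider, consider this minimal Python sketch of how TP, FP, and FN counts can depend on the IOU threshold. It assumes a single class, boxes in `(x_min, y_min, x_max, y_max)` format, and a simple greedy matching by confidence; the matching rule in Encord Active may differ, and all names here are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_outcomes(predictions, ground_truths, iou_threshold=0.5):
    """Return (tp, fp, fn) for one class.

    predictions: list of (confidence, box); ground_truths: list of box.
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = fp = 0
    # Most confident predictions get to claim a ground-truth box first.
    for _, pred_box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truths):
            if idx in matched:
                continue
            score = iou(pred_box, gt_box)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1
    fn = len(ground_truths) - len(matched)
    return tp, fp, fn


# Hypothetical data: two labeled objects, three predictions.
ground_truths = [(0, 0, 100, 100), (200, 200, 300, 300)]
predictions = [
    (0.9, (0, 0, 100, 90)),        # IoU 0.90 with the first label
    (0.6, (210, 210, 310, 310)),   # IoU of about 0.68 with the second label
    (0.3, (500, 500, 520, 520)),   # overlaps nothing
]
print(count_outcomes(predictions, ground_truths, iou_threshold=0.5))   # prints (2, 1, 0)
print(count_outcomes(predictions, ground_truths, iou_threshold=0.75))  # prints (1, 2, 1)
```

Raising the threshold from 0.5 to 0.75 turns the second prediction from a true positive into a false positive and leaves its label unmatched, which is the kind of shift you see when moving the slider.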

Metric importance and correlation

Further down the Summary page, you will find the importance of each metric for your model performance and the correlation between each metric and model performance.

Metric Importance
Metric Correlation

From this overview, you can see that, for example, "Confidence" has a high importance for the model performance.
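
One simple way to think about a metric's correlation with performance is the Pearson correlation between the metric value of each prediction and whether that prediction was correct. The Python sketch below shows this proxy on made-up numbers; it is an assumption for illustration and not necessarily the statistic that Encord Active reports.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical predictions: metric value and outcome (1 = true positive, 0 = false positive).
confidence = [0.95, 0.90, 0.80, 0.40, 0.30, 0.20]
is_true_positive = [1, 1, 1, 0, 0, 1]

score = pearson(confidence, is_true_positive)
print(f"correlation between confidence and correctness: {score:.2f}")
```

A value close to 1 means predictions with a higher metric value tend to be correct more often, a value close to -1 means the opposite, and a value near 0 suggests no linear relationship.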

Next, we can jump to the Metric Performance page and take a closer look at exactly how the model performance is affected by this metric. However, we want to show you the rest of this page prior to doing this.

You can skip straight ahead to the Inspecting model performance for a specific metric section if you are too curious to wait.

Before jumping into specific metrics, we want to show you the decomposition of the model performance based on individual classes. Further down the Summary page, the Per Class section shows the average precision, average recall, and precision-recall curve for each individual class.
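
The per-class breakdown can be sketched in a few lines of Python. The example below computes a non-interpolated average precision per class from predictions that are already marked as true or false positives, and then averages over classes to get a mAP. This is a simplification under stated assumptions: COCO-style evaluation uses interpolated precision and averages over several IOU thresholds, Encord Active may compute its scores differently, and the function names and data are hypothetical.

```python
from collections import defaultdict


def average_precision(ranked_is_tp, n_ground_truth):
    """Non-interpolated AP: mean of the precision at each true positive's rank."""
    if n_ground_truth == 0:
        return 0.0
    tp_so_far = 0
    precision_sum = 0.0
    for rank, is_tp in enumerate(ranked_is_tp, start=1):
        if is_tp:
            tp_so_far += 1
            precision_sum += tp_so_far / rank
    return precision_sum / n_ground_truth


def per_class_ap(predictions, gt_counts):
    """predictions: list of (class_name, confidence, is_tp); gt_counts: {class_name: count}."""
    by_class = defaultdict(list)
    for class_name, confidence, is_tp in predictions:
        by_class[class_name].append((confidence, is_tp))
    result = {}
    for class_name, n_gt in gt_counts.items():
        # Rank each class's predictions from most to least confident.
        ranked = sorted(by_class[class_name], key=lambda item: item[0], reverse=True)
        result[class_name] = average_precision([is_tp for _, is_tp in ranked], n_gt)
    return result


# Hypothetical outcomes at one fixed IOU threshold.
predictions = [
    ("person", 0.9, True),
    ("person", 0.8, False),
    ("person", 0.7, True),
    ("dog", 0.6, True),
]
gt_counts = {"person": 2, "dog": 2}

ap = per_class_ap(predictions, gt_counts)
mean_ap = sum(ap.values()) / len(ap)
print({name: round(value, 3) for name, value in ap.items()})  # prints {'person': 0.833, 'dog': 0.5}
print(round(mean_ap, 3))                                       # prints 0.667
```

In this toy data, "dog" scores lower because one of its two labeled objects is never detected, which is exactly the kind of class-level gap the per-class view is meant to expose.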

Inspecting model performance for a specific metric

Using the Metric Performance and Explorer pages you can see how specific metrics affect the model performance:

  1. Go to Predictions > Metric Performance.
  2. Select the "Confidence" metric from the Metric drop-down list.

Performance by Metric page

The plot shows the precision and the false negative rate as a function of the selected metric; "Confidence" in this case.
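
A curve of this kind can be approximated by binning. The Python sketch below groups items into equal-width bins of a metric value and reports the fraction of flagged items in each bin: applied to predictions flagged as true positives it gives precision per bin, and applied to labels flagged as missed it gives the false negative rate per bin. The binning scheme, function name, and data are assumptions for illustration; the app may smooth or bin the data differently.

```python
def rate_by_bin(values, flags, n_bins=5, low=0.0, high=1.0):
    """Fraction of True flags in each equal-width bin of `values` over [low, high].

    Returns one entry per bin; empty bins yield None.
    """
    width = (high - low) / n_bins
    totals = [0] * n_bins
    hits = [0] * n_bins
    for value, flag in zip(values, flags):
        # Clamp so that out-of-range values land in the first or last bin.
        index = min(max(int((value - low) / width), 0), n_bins - 1)
        totals[index] += 1
        hits[index] += int(flag)
    return [hits[i] / totals[i] if totals[i] else None for i in range(n_bins)]


# Precision per bin: hypothetical predictions with their confidence and outcome.
confidences = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
is_true_positive = [False, False, True, True, False, True]
print(rate_by_bin(confidences, is_true_positive, n_bins=2))

# False negative rate per bin: hypothetical labels with a label metric
# (here, the fraction of the image they cover) and whether the model missed them.
label_area_fraction = [0.01, 0.02, 0.30, 0.60]
was_missed = [True, True, False, False]
print(rate_by_bin(label_area_fraction, was_missed, n_bins=2))
```

With two bins, the first call reports one correct prediction out of three in the low-confidence bin and two out of three in the high-confidence bin, which mirrors the upward trend you would expect in the plot.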

  3. Go to Predictions > Explorer.
  4. Filter the data based on a data or prediction metric and the prediction outcome.
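
Conceptually, this filter combines a range on one metric with a prediction outcome. The Python sketch below shows the same idea on an in-memory list; the `Prediction` class, its fields, and `filter_predictions` are hypothetical names and do not correspond to any Encord Active API.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    image_id: str
    confidence: float      # a prediction metric
    area_fraction: float   # a hypothetical second metric
    outcome: str           # "TP" or "FP"


def filter_predictions(predictions, metric, low, high, outcome):
    """Keep predictions with the given outcome whose `metric` lies in [low, high]."""
    return [
        p for p in predictions
        if p.outcome == outcome and low <= getattr(p, metric) <= high
    ]


# Hypothetical predictions.
predictions = [
    Prediction("img_001", 0.92, 0.20, "TP"),
    Prediction("img_002", 0.88, 0.01, "FP"),
    Prediction("img_003", 0.35, 0.05, "FP"),
]

# High-confidence false positives are often the most informative mistakes to inspect.
confident_mistakes = filter_predictions(predictions, "confidence", 0.8, 1.0, "FP")
print([p.image_id for p in confident_mistakes])  # prints ['img_002']
```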



Queries are only available in the web-app version of Active.


This concludes the tour of Encord Active with the COCO Sandbox dataset. By now, you should have a good idea of how you can improve your data, labels, and models using the insights you get from Encord Active.

Next steps

  • We've only covered each page in the app briefly in this tutorial.
  • To learn more about concrete actionable steps you can take to improve your model performance, we suggest that you have a look at the Workflow section.
  • If you want to learn more about the existing metrics or want to build your own metric function, the Quality Metrics section is where you should continue reading.
  • Finally, we have also included some in-depth descriptions of the Command Line Interface.