
Understanding Data Distribution

Get insights into the distribution of your visual data with Encord Active

Encord Active lets you visually explore your data distribution by pre-defined metrics, custom metrics, and label classes. Seeing how your data is distributed across different metrics helps you uncover areas where data is missing, such as outliers or edge cases, and where adding data could improve your model's performance.

Prerequisites: Dataset

Tip: If you have uploaded your model predictions, you can combine this workflow with Find Important Metrics to better prioritise which metrics to look at.

Setup

If you haven't installed Encord Active, see the installation page. This workflow uses the COCO validation dataset.

Steps

Navigate to the Data Quality > Explorer tab.

In the dropdown menu at the top left of the page, select a quality metric to order your data by (e.g., Brightness or Aspect Ratio).

data-quality-similar-images.png
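To make "ordering your data by a metric" concrete, here is a minimal sketch in plain Python. It is not Encord Active's implementation: the toy images, the `brightness` and `aspect_ratio` functions, and the `order_by` helper are all hypothetical and only illustrate the idea of scoring each image and sorting by that score.

```python
from statistics import mean

# Hypothetical toy images: each is a 2D grid of grayscale pixels in [0, 255].
images = {
    "dark_wide.png": [[10, 20, 30, 40], [15, 25, 35, 45]],
    "bright_square.png": [[200, 220], [240, 250]],
    "mid_tall.png": [[120], [130], [140]],
}


def brightness(pixels):
    """Mean pixel intensity scaled to [0, 1] (an assumed definition)."""
    return mean(value for row in pixels for value in row) / 255


def aspect_ratio(pixels):
    """Image width divided by image height (an assumed definition)."""
    return len(pixels[0]) / len(pixels)


def order_by(metric, data):
    """Return image names sorted by ascending metric score."""
    return sorted(data, key=lambda name: metric(data[name]))


by_brightness = order_by(brightness, images)
by_aspect_ratio = order_by(aspect_ratio, images)
```

Switching the metric only changes the scoring function; the images themselves stay the same, which is why the same dataset can look very different depending on the metric you pick in the dropdown.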

In the dashboard, you can see how your data is distributed according to the chosen metric.

Use the slider to navigate the dataset ordered by the chosen metric.

data-quality-similar-images-quality.png
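The distribution view can be thought of as a histogram of metric scores, and sparsely populated bins hint at regions where data may be missing. The following is a rough sketch of that idea under stated assumptions: the scores are made up, they are assumed to lie in [0, 1], and `histogram` and `sparse_bins` are hypothetical helpers rather than part of Encord Active.

```python
def histogram(scores, bins, low=0.0, high=1.0):
    """Count scores per equal-width bin; assumes low <= score <= high."""
    counts = [0] * bins
    width = (high - low) / bins
    for score in scores:
        # Clamp the index so that score == high falls into the last bin.
        index = min(int((score - low) / width), bins - 1)
        counts[index] += 1
    return counts


def sparse_bins(counts, min_count):
    """Return the indices of bins holding fewer than min_count samples."""
    return [i for i, count in enumerate(counts) if count < min_count]


# Hypothetical brightness scores for nine images.
brightness_scores = [0.05, 0.1, 0.12, 0.45, 0.5, 0.52, 0.55, 0.58, 0.95]

counts = histogram(brightness_scores, bins=5)
underrepresented = sparse_bins(counts, min_count=2)
```

In this made-up example, most images are of medium brightness, while the second and fourth bins are empty and the brightest bin holds a single image. Those are the ranges you would inspect with the slider to decide whether more data is needed.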