
Filtering Data

To filter your data and labels based on metrics, load the MergedMetrics dataframe:

from pathlib import Path

import pandas as pd
from encord_active.lib.db.connection import DBConnection
from encord_active.lib.db.merged_metrics import MergedMetrics
from encord_active.lib.project.project_file_structure import ProjectFileStructure

project_path = Path("/path/to/your/project/root")
with DBConnection(ProjectFileStructure(project_path)) as conn:
    metrics: pd.DataFrame = MergedMetrics(conn).all().reset_index()


This dataframe lists all your data and labels, along with the metrics computed on them. Its columns look like this:

Index(['identifier', 'url', 'Green Values', 'Sharpness', 'Image Singularity',
       'Blur', 'Random Values on Images', 'Red Values', 'Area', 'Aspect Ratio',
       'Brightness', 'Blue Values', 'Contrast',
       'Image-level Annotation Quality', 'description', 'object_class',
       'annotator', 'frame', 'tags'],
      dtype='object')

With this dataframe, you can apply any filter you like using pandas.
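
As a rough illustration of such filters, here is a minimal sketch using boolean masks. It runs on a small stand-in dataframe so it can be tried without a project; the column names come from the listing above, but every value and threshold is made up, and the use of None for rows without a label class is an assumption about how the real data looks.

```python
import pandas as pd

# Stand-in for the `metrics` dataframe loaded above.
# All values are invented for illustration; real metric ranges may differ.
metrics = pd.DataFrame(
    {
        "identifier": ["id_0", "id_1", "id_2", "id_3"],
        "Brightness": [0.10, 0.55, 0.80, 0.35],
        "Sharpness": [0.90, 0.20, 0.70, 0.60],
        # None stands in for rows without a label class (an assumption).
        "object_class": ["car", None, "person", "car"],
    }
)

# Keep rows that are both dark and sharp (thresholds are arbitrary).
dark_and_sharp = metrics[(metrics["Brightness"] < 0.4) & (metrics["Sharpness"] > 0.5)]

# Keep rows of a single class, ordered from least to most sharp.
cars = metrics[metrics["object_class"] == "car"].sort_values("Sharpness")

print(dark_and_sharp["identifier"].tolist())
print(cars["identifier"].tolist())
```

The same masks should carry over to the real dataframe once the column names and thresholds are adjusted to your project.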

To get the path to the data item (image) that a specific row corresponds to, you can use this utility function:

from encord_active.lib.project import ProjectFileStructure
from encord_active.lib.common.image_utils import key_to_data_unit

fs = ProjectFileStructure(project_path)

metric_row = metrics.iloc[0]
image_url = key_to_data_unit(metric_row["identifier"], fs).path