Data Groups & Label Spaces

Working with Data Group labels requires the use of Label Spaces. Each Data Group’s Label Row contains multiple Label Spaces, with each Space corresponding to a specific Data Group item.

Get Label Spaces

The following script gets all Label Spaces in a Label Row and prints their IDs, file names, and layout keys.

Get All Label Space IDs

from encord import EncordUserClient, Project

# User input
SSH_PATH = "<private_key_path>"  # Replace with the file path to your SSH private key
PROJECT_ID = "<project_id>"  # Replace with the unique Project ID
DATA_TITLE = "<data_unit_title>" # Replace with the title of the data unit 

# --- Connect to Encord ---
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US platform users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)

# Get Project
project: Project = user_client.get_project(PROJECT_ID)

# Get all Label Rows for the data unit
rows = project.list_label_rows_v2(data_title_eq=DATA_TITLE)
# Assume we want the first Label Row
lr = rows[0]
lr.initialise_labels()

# Get all Label Spaces for the Label Row
label_spaces = lr.get_spaces()

# Print IDs
for space in label_spaces:
    print(
        f"spaceId: {space.space_id}, "
        f"file_name: {space.metadata.file_name}, "
        f"layout_key: {space.metadata.layout_key}"
    )
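
Once the IDs are printed, it can be convenient to look them up by layout key. The following is a minimal sketch, not part of the SDK: `index_spaces_by_layout_key` is a hypothetical helper, and the stand-in objects only mimic the `space_id` and `metadata.layout_key` attributes used in the script above. In a real script you would pass `lr.get_spaces()` instead of the stand-ins.

```python
from types import SimpleNamespace


def index_spaces_by_layout_key(spaces):
    """Return {layout_key: space_id} for an iterable of space-like objects."""
    index = {}
    for space in spaces:
        key = space.metadata.layout_key
        if key in index:
            # Two spaces sharing a key would make lookups ambiguous.
            raise ValueError(f"Duplicate layout key: {key!r}")
        index[key] = space.space_id
    return index


# Stand-in objects with made-up values; replace with lr.get_spaces() in practice.
fake_spaces = [
    SimpleNamespace(
        space_id="id-a",
        metadata=SimpleNamespace(layout_key="left-video", file_name="a.mp4"),
    ),
    SimpleNamespace(
        space_id="id-b",
        metadata=SimpleNamespace(layout_key="right-video", file_name="b.mp4"),
    ),
]

space_index = index_spaces_by_layout_key(fake_spaces)
print(space_index)
```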

Get & Update Labels

Use the following script to get all annotations in a Label Space. The script also shows how annotations can be updated, by changing last_edited_by and the annotation coordinates for each annotation in the Label Space. Ensure that you replace:

All variables in the User input section at the top of the script.
The names of objects in your Ontology.
The type and position of the label(s).

Label Spaces can be specified using:

A storage item’s unique ID: label_space = lr.get_space(id="7E3KERd9arYTiPicaijP6c1LfI73", type_="image")
Its Data Group layout key: label_space = lr.get_space(layout_key="2", type_="image")

Using layout_keys lets you reuse the same code across multiple label rows and Data Groups (for example, “left-video” and “right-video”), without needing to look up the underlying storage item ID each time.

from encord import EncordUserClient, Project
from encord.objects.coordinates import BoundingBoxCoordinates

# User input
SSH_PATH = "<private_key_path>"  # Replace with the file path to your SSH private key
PROJECT_ID = "<project_id>"  # Replace with the unique Project ID
DATA_TITLE = "<data_unit_title>" # Replace with the title of the data unit 
LABEL_SPACE = "<label_space_id>" # Replace with the unique ID of the label space

# --- Connect to Encord ---
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US platform users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)

project: Project = user_client.get_project(PROJECT_ID)

# Get all Label Rows for the data unit
rows = project.list_label_rows_v2(data_title_eq=DATA_TITLE)
# Assume we want the first Label Row
lr = rows[0]
lr.initialise_labels()

# Replace ID with the unique ID of the Label Space 
label_space = lr.get_space(id=LABEL_SPACE, type_="image")

# Get object instances
object_instances = label_space.get_object_instances()

# Get annotations
for annotation in label_space.get_annotations(type_="object"):
    print(annotation.coordinates)
    print(annotation.last_edited_by)
    print(annotation.last_edited_at)
    print(annotation.confidence)
    print(annotation.object_hash)
    print(annotation.space)

    # Update the annotations
    annotation.last_edited_by = "user@encord.com"
    annotation.coordinates = BoundingBoxCoordinates(
        top_left_x=0.6,
        top_left_y=0.4,
        width=0.3,
        height=0.1,
    )

# Save Label Row
lr.save()
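
The coordinate values in the script above lie between 0 and 1, which suggests they are fractions of the image width and height rather than pixels. Assuming that convention, the following sketch converts a pixel box into the four keyword arguments used above. `pixel_box_to_fractions` is a hypothetical helper, and the image size in the example is made up.

```python
def pixel_box_to_fractions(x, y, box_width, box_height, image_width, image_height):
    """Convert a pixel bounding box to fractions of the image size."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("Image dimensions must be positive")
    if x < 0 or y < 0 or box_width <= 0 or box_height <= 0:
        raise ValueError("Box must have a non-negative origin and a positive size")
    if x + box_width > image_width or y + box_height > image_height:
        raise ValueError("Box extends beyond the image")
    return {
        "top_left_x": x / image_width,
        "top_left_y": y / image_height,
        "width": box_width / image_width,
        "height": box_height / image_height,
    }


# Example: a 480x108 pixel box at (960, 432) in a 1600x1080 image.
box = pixel_box_to_fractions(
    x=960, y=432, box_width=480, box_height=108, image_width=1600, image_height=1080
)
print(box)
```

Because the dictionary keys match the keyword arguments shown above, the result could be passed on as BoundingBoxCoordinates(**box).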

Add Labels

The following script adds an object label and a classification label to a Label Space. Ensure that you replace:

All variables in the User input section at the top of the script.
The names of objects in your Ontology.
The type and position of the label(s).

Use the on_overlap="replace" parameter in the put_classification_instance and put_object_instance methods if you want existing labels to be replaced by the new instance.

from encord import EncordUserClient, Project
from encord.objects import Object, Classification, ClassificationInstance, ObjectInstance, LabelRowV2
from encord.objects.coordinates import BoundingBoxCoordinates
from encord.objects.options import Option

# User input
SSH_PATH = "<private_key_path>"  # Replace with the file path to your SSH private key
PROJECT_ID = "<project_id>"  # Replace with the unique Project ID
DATA_TITLE = "<data_unit_title>" # Replace with the title of the data unit 
LABEL_SPACE = "<label_space_id>" # Replace with the unique ID of the label space

# --- Connect to Encord ---
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US platform users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)

# Get Project
project: Project = user_client.get_project(PROJECT_ID)

# Get all Label Rows for the data unit
rows = project.list_label_rows_v2(data_title_eq=DATA_TITLE)
# Assume we want the first Label Row
lr = rows[0]
lr.initialise_labels()

# Find bounding box object "Cherry" in the Ontology.
ontology_structure = project.ontology_structure
box_ontology_object: Object = ontology_structure.get_child_by_title(title="Cherry", type_=Object)

# Find classification "Day or Night" in the Ontology
classification: Classification = ontology_structure.get_child_by_title(title="Day or Night", type_=Classification)

# Find classification answer "Day" in the Ontology
classification_answer = classification.get_child_by_title(
    title="Day", type_=Option
)

# Get Label Space
label_space = lr.get_space(layout_key="Fruit", type_="image")

# Create bounding box instance
bb_inst: ObjectInstance = box_ontology_object.create_instance()

# Create classification instance
classification_inst: ClassificationInstance = classification.create_instance()

classification_inst.set_answer(
    answer=classification_answer
)

# Add the bounding box instance 
label_space.put_object_instance(
    object_instance=bb_inst,
    on_overlap="replace",
    coordinates=BoundingBoxCoordinates(
        top_left_x=0.6,
        top_left_y=0.4,
        width=0.3,
        height=0.2
    ),
    frames=[0, 1, 2],
)

# Add the classification instance 
label_space.put_classification_instance(
    classification_instance=classification_inst,
    frames=[0, 1, 2]
)

lr.save()
print(f"Saved label row for {lr.data_title}")

Remove Labels

The following script removes an object and a classification from a Label Space. Ensure that you replace:

All variables in the User input section at the top of the script.
The names of objects in your Ontology.
The type and position of the label(s).

from encord import EncordUserClient, Project
from encord.objects import Object, Classification, ClassificationInstance, ObjectInstance, LabelRowV2
from encord.objects.options import Option

# User input
SSH_PATH = "<private_key_path>"  # Replace with the file path to your SSH private key
PROJECT_ID = "<project_id>"  # Replace with the unique Project ID
DATA_TITLE = "<data_unit_title>" # Replace with the title of the data unit 
LABEL_SPACE = "<label_space_id>" # Replace with the unique ID of the label space
OBJECT_INSTANCE = "<object_hash>" # Replace with the object hash of the object you want to remove
CLASSIFICATION_INSTANCE = "<classification_instance>" # Replace with the classification hash of the classification you want to remove

# --- Connect to Encord ---
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US platform users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)

# Get Project
project: Project = user_client.get_project(PROJECT_ID)

# Get all Label Rows for the data unit
rows = project.list_label_rows_v2(data_title_eq=DATA_TITLE)
# Assume we want the first Label Row
lr = rows[0]
lr.initialise_labels()

# Get Label Space
label_space = lr.get_space(layout_key="Fruit", type_="image")

# Print the first object hash in each Label Space (skip spaces without objects)
label_spaces = lr.get_spaces()

for space in label_spaces:
    instances = space.get_object_instances()
    if instances:
        print(instances[0].object_hash)


# Remove object
label_space.remove_object_instance(
    object_hash=OBJECT_INSTANCE
)

# Remove classification
label_space.remove_classification_instance(
    classification_hash=CLASSIFICATION_INSTANCE
)

# Save label row
lr.save()
print(f"Saved label row for {lr.data_title}")
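
To pick the hash to remove, it can help to list every object hash per Label Space rather than only the first one. The following is a minimal sketch using stand-in classes: `FakeSpace` and `FakeInstance` are hypothetical and only mimic `get_object_instances()`, `object_hash`, and `metadata.layout_key` as used in the scripts above. In a real script you would pass `lr.get_spaces()`.

```python
from types import SimpleNamespace


class FakeInstance:
    """Stand-in for an object instance; only carries an object_hash."""

    def __init__(self, object_hash):
        self.object_hash = object_hash


class FakeSpace:
    """Stand-in for a Label Space with a layout key and object instances."""

    def __init__(self, layout_key, hashes):
        self.metadata = SimpleNamespace(layout_key=layout_key)
        self._instances = [FakeInstance(h) for h in hashes]

    def get_object_instances(self):
        return list(self._instances)


def object_hashes_by_layout_key(spaces):
    """Return {layout_key: [object_hash, ...]}; empty spaces map to []."""
    return {
        space.metadata.layout_key: [
            inst.object_hash for inst in space.get_object_instances()
        ]
        for space in spaces
    }


# Made-up layout keys and hashes, for illustration only.
hashes = object_hashes_by_layout_key(
    [FakeSpace("Fruit", ["hashA", "hashB"]), FakeSpace("Leaves", [])]
)
print(hashes)
```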

Put it all together

The following example shows how Label Spaces might be used in practice. It uses Data Groups with five data units each, arranged in the following layout:

+-------------------------------------------+
|              text file                    |
+------------------+------------------------+
|     video 1      |        video 2         |
+------------------+------------------------+
|     video 3      |        video 4         |
+------------------+------------------------+

Create Example Data Groups

The following script creates the Data Groups and specifies the layout.

Create Data Group

from uuid import UUID

from encord.constants.enums import DataType
from encord.objects.metadata import DataGroupMetadata
from encord.orm.storage import DataGroupCustom, StorageItemType
from encord.user_client import EncordUserClient

# --- Configuration ---
SSH_PATH = "/Users/chris-encord/ssh-private-key.txt"  # Replace with the file path to your SSH private key
FOLDER_ID = "00000000-0000-0000-0000-000000000000"  # Replace with the Folder ID
DATASET_ID = "00000000-0000-0000-0000-000000000000"  # Replace with the Dataset ID
PROJECT_ID = "00000000-0000-0000-0000-000000000000"  # Replace with the Project ID

# --- Connect to Encord ---
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US platform users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)

folder = user_client.get_storage_folder(FOLDER_ID)

# --- Reusable layout and settings ---
layout = {
    "direction": "column",
    "first": {"type": "data_unit", "key": "instructions"},
    "second": {
        "direction": "column",
        "first": {
            "direction": "row",
            "first": {"type": "data_unit", "key": "top-left"},
            "second": {"type": "data_unit", "key": "top-right"},
            "splitPercentage": 50,
        },
        "second": {
            "direction": "row",
            "first": {"type": "data_unit", "key": "bottom-left"},
            "second": {"type": "data_unit", "key": "bottom-right"},
            "splitPercentage": 50,
        },
        "splitPercentage": 50,
    },
    "splitPercentage": 20,
}
settings = {"tile_settings": {"instructions": {"is_read_only": True}}}

# --- Group definitions (name + UUIDs) ---
groups = [
    {
        "name": "group-001",
        "uuids": {
            "instructions": UUID("00000000-0000-0000-0000-000000000000"), # Replace with File ID of clustered_event_log_01.txt
            "top-left": UUID("11111111-1111-1111-1111-111111111111"), # Replace with File ID of 00001_normalized.mp4
            "top-right": UUID("22222222-2222-2222-2222-222222222222"), # Replace with File ID of 00002_normalized.mp4
            "bottom-left": UUID("33333333-3333-3333-3333-333333333333"), # Replace with File ID of 00009.mp4
            "bottom-right": UUID("44444444-4444-4444-4444-444444444444"), # Replace with File ID of 00011_normalized.mp4
        },
    },
    {
        "name": "group-002",
        "uuids": {
            "instructions": UUID("55555555-5555-5555-5555-555555555555"), # Replace with File ID of clustered_event_log_02.txt
            "top-left": UUID("66666666-6666-6666-6666-666666666666"), # Replace with File ID of 00012.mp4
            "top-right": UUID("77777777-7777-7777-7777-777777777777"), # Replace with File ID of 00020.mp4
            "bottom-left": UUID("88888888-8888-8888-8888-888888888888"), # Replace with File ID of 00030.mp4
            "bottom-right": UUID("99999999-9999-9999-9999-999999999999"), # Replace with File ID of 00033.mp4
        },
    },
    {
        "name": "group-003",
        "uuids": {
            "instructions": UUID("12312312-3123-1231-2312-312312312312"), # Replace with File ID of clustered_event_log_03.txt
            "top-left": UUID("23232323-2323-2323-2323-232323232323"), # Replace with File ID of 00034.mp4
            "top-right": UUID("31313131-3131-3131-3131-313131313131"), # Replace with File ID of 00035_normalized.mp4
            "bottom-left": UUID("45645645-6456-4564-5645-645645645645"), # Replace with File ID of 00038_normalized.mp4
            "bottom-right": UUID("56565656-6565-5656-6565-656565656565"), # Replace with File ID of 00045.mp4
        },
    },
    # More groups...
]

# Create the data groups

for g in groups:
    group = folder.create_data_group(
        DataGroupCustom(
            name=g["name"],
            layout=layout,
            layout_contents=g["uuids"],
            settings=settings,
        )
    )
    print(f"✅ Created group '{g['name']}' with UUID {group}")

# Add all the data groups in a folder to a Dataset
group_items = folder.list_items(item_types=[StorageItemType.GROUP])
d = user_client.get_dataset(DATASET_ID)
d.link_items([item.uuid for item in group_items])

# Get the Project and list its Label Rows (the Dataset with the Data Groups must be attached to this Project)

p = user_client.get_project(PROJECT_ID)
rows = p.list_label_rows_v2(include_children=True)

# Label Rows of Data Groups use DataGroupMetadata for the layout to Annotate and Review
for row in rows:
    if row.data_type == DataType.GROUP:
        row.initialise_labels()
        assert isinstance(row.metadata, DataGroupMetadata)
        print(row.metadata.children)

Pre-label Data Groups

We want to pre-label the classifications for all videos with the layout_key top-left in the Data Groups with the answer YES!. This way, annotators only need to update the classification for top-left videos where the model predictions and summaries are incorrect. Our example Project uses a Dataset that contains our Data Groups. The Ontology contains the following Classifications:

Prediction correct?
- YES! (Radio button)
- No (Radio button)
  - What's wrong? (Text)
Summary correct?
- YES! (Radio button)
- No (Radio button)
  - What's wrong? (Text)

Import Classification values on top-left videos

from __future__ import annotations

from encord import EncordUserClient, Project
from encord.objects import Classification, ClassificationInstance, LabelRowV2
from encord.objects.options import Option


# User input: Edit these values as you require

SSH_PATH = "/Users/chris-encord/ssh-private-key.txt"
PROJECT_HASH = "00000000-0000-0000-0000-000000000000"


# Only apply to THIS label space
LABEL_SPACE = "top-left"

CLASSIFICATIONS = [
    "Prediction correct?",
    "Summary correct?",
]

ANSWER_TITLE = "YES!"

# Frames to apply the classification to
FRAMES = [0]  # change as needed


# Connect to Encord
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    # For US users use "https://api.us.encord.com"
    domain="https://api.encord.com",
)


# Get project for which predictions are to be added.
project: Project = user_client.get_project(PROJECT_HASH)

ontology = project.ontology_structure

# Fetch ALL label rows

all_rows: list[LabelRowV2] = project.list_label_rows_v2(include_children=True)

# Identify Data Group parent rows (rows whose group_hash is None)
data_group_parents = [row for row in all_rows if row.group_hash is None]
print(f"Found {len(data_group_parents)} data groups")


# Process each Data Group
for parent in data_group_parents:
    parent.initialise_labels()

    # Helpful: print what spaces actually exist on this group label row
    spaces = parent.get_spaces()
    print(f"Data group '{parent.data_title}' → {len(spaces)} spaces")
    print("  Available layout keys:", [s.metadata.layout_key for s in spaces])

    # Get the target label space from the PARENT row
    try:
        label_space = parent.get_space(layout_key=LABEL_SPACE, type_="video")
    except Exception as e:
        print(f"  ⚠ Could not find space '{LABEL_SPACE}' on group '{parent.data_title}': {e}")
        continue

    for classification_title in CLASSIFICATIONS:
        classification: Classification = ontology.get_child_by_title(
            title=classification_title,
            type_=Classification,
        )

        yes_option: Option = classification.get_child_by_title(
            title=ANSWER_TITLE,
            type_=Option,
        )

        classification_inst: ClassificationInstance = classification.create_instance()
        classification_inst.set_answer(yes_option)

        label_space.put_classification_instance(
            classification_instance=classification_inst,
            frames=FRAMES,
        )

    parent.save()
    print(f"  ✔ Labeled group row: {parent.data_title} (space: {LABEL_SPACE})")
