Tabular Data Projects

We recommend reviewing our end-to-end example to get a practical grasp of using Tabular Data Projects.

Tabular Data currently supports CONSENSUS Projects only.

Tabular Data Projects work a little differently from typical Projects in Encord. For each row, Annotators and Reviewers select answers from drop-down options in the annotation columns. You can use multiple columns for selection.

Create Ontology for Tabular Data

Modify the following script example to create your Ontology.

Items of Interest	Notes
`READ_ONLY_COLUMNS`	Specifies the columns you want your Annotators and Reviewers to see in the Label Editor. Column indices start at 0. Omit the index of any CSV column you do not want your Annotators and Reviewers to see.
`ANNOTATION_COLUMNS`	Specifies the columns your Annotators and Reviewers label. They select an answer from a drop-down in each of these columns. Specify the options available in each drop-down using the files in `MAPPING_FIELD_OPTION_PATHS`. Each of these files is a single-column file with one option per row.
`ONTOLOGY_NAME`	Specifies the name for your Ontology.
`OBJECT_NAME`	Specifies the name of the text region for each row in your CSV file. The script applies a label to each row in your CSV file using this text region.
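
As a rough illustration of how the zero-based indices in `READ_ONLY_COLUMNS` and `ANNOTATION_COLUMNS` resolve to column names, and of what a single-column options file looks like, here is a minimal sketch that uses only the Python standard library. The sample CSV contents and the `title`, `year`, and `publisher` column names are hypothetical; only `genre` and `platform` come from the script's configuration. The helpers `column_names` and `read_options` are illustrative and are not part of the Encord SDK. The sketch assumes the first line of an options file is a header, which mirrors how the script's `pd.read_csv` call treats the first line by default.

```python
import csv
import io

# Hypothetical sample data standing in for a task CSV and an options file.
TASK_CSV = """title,year,publisher,genre,platform
Game A,2001,Studio X,,
Game B,2015,Studio Y,,
"""
GENRE_OPTIONS_CSV = """genre
Action
Puzzle
Strategy
"""

READ_ONLY_COLUMNS = [0, 1, 2]   # shown to Annotators and Reviewers
ANNOTATION_COLUMNS = [3, 4]     # answered through drop-downs


def column_names(csv_text, indices):
    """Return the header names at the given zero-based column indices."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [header[i] for i in indices]


def read_options(csv_text):
    """Read a single-column options file, skipping the header line and blank rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [row[0] for row in rows[1:] if row and row[0].strip()]


read_only = column_names(TASK_CSV, READ_ONLY_COLUMNS)
annotation = column_names(TASK_CSV, ANNOTATION_COLUMNS)
genre_options = read_options(GENRE_OPTIONS_CSV)

print(read_only)
print(annotation)
print(genre_options)
```

The names resolved from `ANNOTATION_COLUMNS` are the keys the script looks up in `MAPPING_FIELD_OPTION_PATHS`, so each annotation column header needs a matching key there. Otherwise the script raises a `ValueError`.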

tabular_create_ontology script


import pandas as pd
from encord.objects import OntologyStructure, Shape, TextAttribute
from encord.objects.attributes import RadioAttribute
from encord.user_client import EncordUserClient

# --- Configuration ---
ENCORD_SSH_KEY = "/Users/chris-encord/ssh-private-key.txt" # Replace with the file path to your SSH private key
TASK_CSV_PATH = "/file/path/to/video_game_annotation_1.csv" # Replace with the file path to any of the video_game_annotation_X.csv files

READ_ONLY_COLUMNS = [0, 1, 2]
ANNOTATION_COLUMNS = [3, 4]

# Replace these paths with actual mapping column name > options file
MAPPING_FIELD_OPTION_PATHS = {
    "genre": "/file/path/to/genre-options.csv",
    "platform": "/file/path/to/platform-options.csv",
}

ONTOLOGY_NAME = "E2E - Tabular Data - Ontology"
OBJECT_NAME = "Game Row"


def parse_csv():
    csv_df = pd.read_csv(TASK_CSV_PATH)
    readonly_columns = csv_df.columns[READ_ONLY_COLUMNS].tolist()
    mapping_columns = csv_df.columns[ANNOTATION_COLUMNS].tolist()

    return mapping_columns, readonly_columns


def create_ontology(text_attribute_names, radio_option_names):
    ontology_structure = OntologyStructure()
    text_object = ontology_structure.add_object(name=OBJECT_NAME, shape=Shape.TEXT)

    for attribute in text_attribute_names:
        text_object.add_attribute(TextAttribute, attribute)

    for column_name in radio_option_names:
        options_path = MAPPING_FIELD_OPTION_PATHS.get(column_name)
        if options_path is None:
            raise ValueError(f"No options file defined for column '{column_name}'")

        options = pd.read_csv(options_path).iloc[:, 0].dropna().astype(str).tolist()

        radio_attribute = text_object.add_attribute(RadioAttribute, column_name, required=True)
        for option in options:
            radio_attribute.add_option(option)

    user_client = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=ENCORD_SSH_KEY,
        domain="https://api.encord.com",
    )
    return user_client.create_ontology(ONTOLOGY_NAME, structure=ontology_structure)


if __name__ == "__main__":
    mapping_columns, readonly_columns = parse_csv()
    ontology = create_ontology(readonly_columns, mapping_columns)
    print(f"Created ontology {ontology.title}, id: {ontology.ontology_hash}")

Create Tabular Data Project

Create a Project and attach your Ontology and the Dataset that contains your tabular data.

Tabular data currently supports CONSENSUS Projects only.
An AGENT block must be the first block in the Workflow for Tabular Data Projects.
The names of the AGENT block and its pathway must exactly match the `AGENT_STAGE` and `AGENT_PATHWAY` values specified in the Agent script below.

Run the Agent script

The tabular_run_agent.py script populates the tasks in the AGENT block of your Workflow. Create the following Python scripts. Both scripts must be in the same directory.

tabular_run_agent.py
tabular_utils.py

After creating the scripts, run tabular_run_agent.py. Once the script finishes, the tasks that were in the AGENT stage move to the CONSENSUS - ANNOTATE stage.

Items of Interest	Notes
`AGENT_STAGE`	Specifies the name of the AGENT block in your Tabular Data Project. This name must exactly match the name of the AGENT block in your Project.
`AGENT_PATHWAY`	Specifies the name of the Pathway in your AGENT block. This name must exactly match the name of the pathway in the AGENT block in your Project.


from typing import Annotated
from pathlib import Path
import os

from encord_agents.tasks import Runner
from encord.objects.ontology_labels_impl import LabelRowV2
from encord.project import Project
from encord_agents.tasks.dependencies import dep_asset
from encord_agents.core.dependencies import Depends
from encord.objects.common import Shape

from tabular_utils import parse_csv_and_add_objects

# --- Configuration ---
ENCORD_SSH_KEY = "/Users/chris-encord/ssh-private-key.txt" # Replace with the file path to your SSH private key
PROJECT_HASH = "00000000-0000-0000-0000-000000000000" # Replace with unique Project ID of the tabular data Project
AGENT_STAGE = "Pre-label"
AGENT_PATHWAY = "Labelled"

# Inject into environment so Encord Agents can pick it up
os.environ["ENCORD_SSH_KEY_FILE"] = ENCORD_SSH_KEY

runner = Runner(project_hash=PROJECT_HASH)

@runner.stage(stage=AGENT_STAGE)
def agent_logic(
    lr: LabelRowV2, project: Project, asset: Annotated[Path, Depends(dep_asset)]
):
    ontology = project.ontology_structure
    # Check for an empty Ontology first; indexing an empty list raises IndexError
    if not ontology.objects:
        raise Exception("No objects found")
    text_object = ontology.objects[0]
    if text_object.shape is not Shape.TEXT:
        raise Exception("Text object required")

    parse_csv_and_add_objects(text_object, lr, asset)

    return AGENT_PATHWAY

if __name__ == "__main__":
    runner.run()

Label and Review Tabular Data

Annotators: Use the drop-downs to select the genre and platform for each row.
Reviewers: Verify that the labels are correct.

Use any column in a row to select correct answers.

When there is an issue with labels/classifications, Reviewers can:

Reject the task and add a comment about why a task was rejected. Rejected tasks go back to the person who added the labels/classifications.
Edit labels directly using the Edit labels button and then approve the task.
