A simple example showing how to use objectHashes.
Use Case: Selective OCR on Selected Objects
This functionality allows you to apply your own OCR model to specific objects selected directly within the Encord platform.
When you trigger your agent from the Encord app after selecting objects, the platform automatically sends a list of objectHashes
to your agent. Your agent can then use the dep_objects
method to gain immediate access to these specific object instances, which greatly simplifies integrating your OCR model for targeted processing.
Test the Agent
agent.py
.Copy URL
as shown:The url should have roughly this format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}/0?other_query_params&objectHash={objectHash}"
.
The goals of this example are:
OntologyDataModel
for classifications.Prerequisites
Before you begin, ensure you have:
Run the following commands to set up your environment:
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology containing classifications can be used.
Ontology JSON and Script
To construct the same Ontology as used in this example, run the following script.
The aim is to trigger an agent that transforms a labeling task from Figure A to Figure B.
Figure A: No classification labels.
Figure B: Multiple nested classification labels generated by an LLM.
Create the Agent
This section provides the complete code for creating your editor agent, along with an explanation of its internal workings.
Agent Setup Steps
Import dependencies, authenticate with Encord, and set up the Project. Ensure you insert your Project’s unique identifier.
Create a data model and a system prompt based on the Project Ontology to tell Claude how to structure its response.
Set up an Anthropic API client to establish communication with the Claude model.
Define the Editor Agent. This includes
dep_single_frame
dependency.See the contents of data_model.model_json_schema_str here
Test the Agent
The url should have the following format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}"
.
The goals of this example are:
Create an editor agent that can convert generic object annotations (class-less coordinates) into class specific annotations with nested attributes like descriptions, radio buttons, and checklists.
Demonstrate how to use both the OntologyDataModel
and the dep_object_crops
dependency.
Prerequisites
Before you begin, ensure you have:
Run the following commands to set up your environment:
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology containing classifications can be used provided the object types are the same and there is one entry called "generic"
.
Ontology JSON and Script
To construct the Ontology used in this example, run the following script:
The goal is create an agent that takes a labeling task from Figure A to Figure B
Figure A: No classification labels.
Figure B: Multiple nested classification labels generated by an LLM.
Create the Agent
This section provides the complete code for creating your editor agent, along with an explanation of its internal workings.
Agent Setup Steps
Import dependencies, authenticate with Encord, and set up the Project. Ensure you insert your Project’s unique identifier.
Extract the generic Ontology object and the specific objects of interest. This example sorts Ontology objects based on whether their title is "generic"
. The generic object is used to query image crops within the agent. Before that, other_objects
is used to pass in the specific context we want Claude to focus on. The OntologyDataModel
class helps convert Encord Ontology Objects into a Pydantic model and parse JSON into Encord ObjectInstances.
Prepare the system prompt for each object crop using the data_model
to generate the JSON schema. Only other_objects
is passed to ensure the model can choose only from non-generic object types.
Set up an Anthropic API client to establish communication with the Claude model. You must include your Anthropic API key.
Define the Editor Agent.
dep_object_crops
dependency allows filtering. In this case, it includes only “generic” object crops, excluding those already converted to actual labels.Query Claude using the image crops. The crop
variable has a convenient b64_encoding
method to produce an input that Claude understands.
Parse Claude’s message using the data_model
. When called with a JSON string, it attempts to parse it with respect to the JSON schema we saw above to create an Encord object instance. If successful, the old generic object can be removed and the newly classified object added.
Save the labels with Encord.
See the result of `data_model.model_json_schema_str` for the given example
Test the Agent
The url has following format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}"
.
The goals of this example are:
Prerequisites
Before you begin, ensure you have:
Run the following commands to set up your environment:
Project Setup
Create a Project containing videos in Encord.
This example requires an Ontology with four text classifications:
Ontology
Ontology JSON and Script
To construct the Ontology used in this example, run the following script:
The workflow for this agent is:
A human watches the video and enters a caption in the first text field.
The agent is then triggered and generates three additional caption variations for review.
If no human caption is present when the agent is triggered, the task is sent back for annotation. If the review stage results in rejection, the task is also returned for re-annotation.
Workflow
Create the Agent
This section provides the complete code for creating your editor agent, along with an explanation of its internal workings.
Agent Setup Steps
Click here for a concrete Vision Language Action model use-case.
This example requires the following dependencies:
To set up and test the agent locally:
Save the dependencies above into a requirements.txt
file.
Set up your Python environment and run the agent:
(Replace /path/to/your_private_key
and <your-api-key>
with your actual credentials.)
In a separate terminal, test the agent:
(Replace <url_from_the_label_editor>
with the URL from your Encord Label Editor session.)