Data Curation, Management, and Annotation
STEP 1: Import Your Files to Encord
Create a Cloud Integration
Select your cloud provider.
Create Metadata Schema
Based on your Data Discoverability Strategy, you need to create a metadata schema. The schema provides a method of organization for your custom metadata. Encord supports:
- Scalars: Methods for filtering.
- Enums: Methods with options for filtering.
- Embeddings: Methods for embedding plot visualization, similarity search, and natural language search.
Custom metadata
Custom metadata refers to any additional information you attach to files, allowing for better data curation and management based on your specific needs. It can include any details relevant to your workflow, helping you organize, filter, and retrieve data more efficiently. For example, for a video of a construction site, custom metadata could include fields like "site_location": "Algiers", "project_phase": "foundation", or "weather_conditions": "sunny". This enables more precise tracking and management of your data.
Before importing any files with custom metadata to Encord, we recommend that you import a metadata schema. Encord uses metadata schemas to validate custom metadata uploaded to Encord and to instruct Index and Active how to display your metadata.
For example, team A could use video.description, while team B could use audio.description. Another example could be TeamName.MetadataKey. This approach maintains clarity and avoids key collisions across departments.
Metadata schema table
Use add_scalar to add a scalar key to your metadata schema.
Scalar Key | Description | Display Benefits |
---|---|---|
boolean | Binary data type with values “true” or “false”. | Filtering by binary values |
datetime | ISO 8601 formatted date and time. | Filtering by time and date |
number | Numeric data type supporting float values. | Filtering by numeric values |
uuid | Customer specified unique identifier for a data unit. | Filtering by customer specified unique identifier |
varchar | Textual data type. Formerly string. string can be used as an alias for varchar, but we STRONGLY RECOMMEND that you use varchar. | Filtering by string. |
text | Text data with unlimited length (example: transcripts for audio). Formerly long_string. long_string can be used as an alias for text, but we STRONGLY RECOMMEND that you use text. | Storing and filtering large amounts of text. |
Use add_enum and add_enum_options to add an enum and enum options to your metadata schema.
Key | Description | Display Benefits |
---|---|---|
enum | Enumerated type with predefined set of values. | Facilitates categorical filtering and data validation |
Use add_embedding to add an embedding to your metadata schema.
Key | Description | Display Benefits |
---|---|---|
embedding | 512-dimension embeddings for Active; 1 to 4096 dimensions for Index. | Filtering by embeddings, similarity search, 2D scatter plot visualization (Coming Soon) |
Incorrectly specifying a data type in the schema can cause errors when filtering your data in Index or Active. If you encounter errors while filtering, verify your schema is correct. If your schema has errors, correct the errors, re-import the schema, and then re-sync your Active Project.
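Because a mismatched data type only surfaces later as a filtering error, it can help to check metadata values against the intended schema before uploading anything. The following is a minimal local sketch in plain Python, not part of the Encord SDK. The SCHEMA dictionary layout, the helper names, and the captured_at and is_night keys are hypothetical and exist only for illustration; the type names follow the tables above.

```python
from datetime import datetime
from uuid import UUID

# Hypothetical local description of a schema: key -> type spec.
SCHEMA = {
    "site_location": {"type": "varchar"},
    "captured_at": {"type": "datetime"},
    "is_night": {"type": "boolean"},
    "project_phase": {"type": "enum", "values": ["foundation", "framing"]},
}


def _parses(parser, value):
    """Return True if value is a string that the given parser accepts."""
    if not isinstance(value, str):
        return False
    try:
        parser(value)
        return True
    except ValueError:
        return False


def check_value(spec, value):
    """Return True if value is compatible with the type named in spec."""
    kind = spec["type"]
    if kind == "boolean":
        return isinstance(value, bool)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "datetime":
        return _parses(datetime.fromisoformat, value)
    if kind == "uuid":
        return _parses(UUID, value)
    if kind in ("varchar", "text"):
        return isinstance(value, str)
    if kind == "enum":
        return value in spec["values"]
    return False


def find_schema_errors(schema, metadata):
    """List every metadata key that is unknown or has the wrong type."""
    errors = []
    for key, value in metadata.items():
        if key not in schema:
            errors.append(f"{key}: not in schema")
        elif not check_value(schema[key], value):
            errors.append(f"{key}: expected {schema[key]['type']}")
    return errors
```

Calling find_schema_errors(SCHEMA, {"is_night": "false"}) reports a problem because the value is a string rather than a boolean, which is the kind of mismatch that would otherwise show up when filtering.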
Import your metadata schema to Encord
Verify your schema
After importing your schema to Encord, we recommend that you verify that the import was successful. Run the following code to verify that your metadata schema was imported and that the schema is correct.
Create a Folder
You must create a folder in Index to store your files.
- Navigate to Files under the Index heading in the Encord platform.
- Click the + New folder button to create a new folder. A dialog to create a new folder appears.
- Give the folder a meaningful name and description.
- Click Create to create the folder. The folder is listed in Files.
Create JSON or CSV for import
To import files from cloud storage into Encord, you must create a JSON or CSV file specifying the files you want to upload.
Find helpful scripts for creating JSON and CSV files for the data upload process here.
All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the same way, by using a JSON or CSV file. The file includes links to all images, image groups, videos and DICOM files in your cloud storage.
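As a rough illustration of building such a file programmatically, the sketch below assembles a JSON document that lists video URLs together with optional custom metadata. The field names videos, objectUrl, and clientMetadata are assumptions about the import format and should be checked against the Encord import documentation; the bucket URL and the build_import_spec helper are hypothetical.

```python
import json

# Hypothetical cloud storage URL used only for illustration.
VIDEO_URL = "https://example-bucket.s3.amazonaws.com/site/video_01.mp4"


def build_import_spec(video_urls, metadata_by_url=None):
    """Build a dict listing videos, attaching custom metadata where given."""
    metadata_by_url = metadata_by_url or {}
    videos = []
    for url in video_urls:
        entry = {"objectUrl": url}  # assumed field name
        if url in metadata_by_url:
            entry["clientMetadata"] = metadata_by_url[url]  # assumed field name
        videos.append(entry)
    return {"videos": videos}


spec = build_import_spec(
    [VIDEO_URL],
    {VIDEO_URL: {"site_location": "Algiers", "project_phase": "foundation"}},
)

# Serialize to the text you would save as the import file.
import_json = json.dumps(spec, indent=2)
```

Writing import_json to a file gives a starting point that you can adapt to the other data types listed above.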
Import your data
STEP 2: Curate your Data
Create a collection
A Collection is a container for data units (images or videos) that you can use to group your data units together.
Creation of a Collection involves filtering and sorting your data. Once you have selected a smaller group of images, videos or audio files, create a Collection.
- Log in to the Encord platform. The landing page for the Encord platform appears.
- Go to Index > Files. The All folders page appears with a list of all folders in Encord.
- Click into a Folder. The landing page for the Folder appears and the Explorer button is enabled.
- Click the Explorer button. The Index Explorer page appears.
- Search, sort, and filter your data until you have the subset of the data you need.
- Select one or more of the images/frames in the Explorer workspace. A ribbon appears at the top of the Explorer workspace.
  Selecting a video frame selects the entire video. Specific frames from a video cannot be selected.
- Click Select all to select all the images in the subset.
- Click Add to a Collection.
- Click New Collection.
- Specify a meaningful title and description for the Collection.
  The title specified here is applied as a tag/label to every selected image.
- Click Collections to verify the Collection appears in the Collections list.
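Conceptually, the filter-then-collect flow above amounts to narrowing a set of files by their metadata and then naming the result. The sketch below models that idea locally in plain Python; it does not call Encord, and the record layout, the select_subset helper, and the sample file names are hypothetical.

```python
# Hypothetical local records standing in for files and their custom metadata.
records = [
    {"title": "site_b.mp4", "metadata": {"site_location": "Algiers", "weather_conditions": "sunny"}},
    {"title": "site_a.mp4", "metadata": {"site_location": "Algiers", "weather_conditions": "sunny"}},
    {"title": "site_c.mp4", "metadata": {"site_location": "Oran", "weather_conditions": "sunny"}},
]


def select_subset(items, **filters):
    """Keep items whose metadata matches every filter, sorted by title."""
    matches = [
        item for item in items
        if all(item["metadata"].get(key) == value for key, value in filters.items())
    ]
    return sorted(matches, key=lambda item: item["title"])


# A "collection" here is just a titled list of the selected file titles.
collection = {
    "title": "Algiers, sunny",
    "items": [item["title"] for item in select_subset(records, site_location="Algiers")],
}
```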
STEP 3: Set Up Your Project
Create a Dataset from a Collection
Once you have a Collection, you can create a Dataset from your Collection.
Create an Ontology
An Ontology is a structured framework that defines the categories, labels, and relationships used to annotate data consistently and accurately. Ontologies define what you want labelled.
- Click the New ontology button in the Ontologies section to create a new Ontology.
- Give your Ontology a meaningful title and description. A clear title and description keeps your Ontologies organized. Click Next to continue.
- Define your Ontology structure. See our documentation on Ontology structure for more information on the various types of objects, classifications, and attributes.
To add objects:
- Click Add object to create a new object.
- Give the object a name. For example “Apple”.
- Select a shape for the object. For example polygon.
- Optionally, enable the Required toggle to mark the object as Required.
- Optionally, add attributes to the object.
- Repeat these steps for as many objects as necessary.
To add attributes to an object:
You can add attributes to objects that define the object’s characteristics. For example the object “Apple” can have an attribute “Color”.
- Click the arrow icon next to an object to add attributes to the object.
- Give the attribute a name. For example “Color”.
- Click the attribute type to change the attribute type. The default attribute type is a text field.
- Click Add option to add an option, if you have chosen a radio button or checklist attribute.
- Enter a name for the attribute option. For example, the attribute “Color” can have the options “Red”, “Green”, and “Yellow”.
- Click the Back to parent button to return to the Ontology creation view.
To add a classification:
- Click Add classification to create a new classification.
- Give the classification a name. For example, “Time of day”.
- Optionally, configure the classification. The default classification type is a text field.
- Optionally, enable the Required toggle to mark the classification as Required.
- Repeat these steps for as many classifications as necessary.
Configure classifications:
You can configure classifications to change the classification type, and to add classification options to radio buttons and check lists.
- Click the arrow icon next to a classification to configure the classification.
- Click the classification type to change the classification type. The default classification type is a text field.
- Click Add option to add an option if you have selected a radio button or check list classification.
- Enter a name for the classification option. For example, the classification “Time of day” can have the options “Night” and “Day”.
- Click the Back to parent button to return to the Ontology creation view.
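To make the structure described above concrete, here is a small data-model sketch of an Ontology with objects, attributes, and classifications. It is a plain-Python illustration only, not the Encord SDK's Ontology classes; all class names, field names, and the missing_options helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    kind: str = "text"  # "text", "radio", or "checklist"; text is the default
    options: List[str] = field(default_factory=list)


@dataclass
class OntologyObject:
    name: str
    shape: str  # for example "polygon"
    required: bool = False
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Classification:
    name: str
    kind: str = "text"
    required: bool = False
    options: List[str] = field(default_factory=list)


@dataclass
class Ontology:
    title: str
    description: str = ""
    objects: List[OntologyObject] = field(default_factory=list)
    classifications: List[Classification] = field(default_factory=list)


def missing_options(ontology):
    """Names of radio/checklist attributes or classifications with no options."""
    candidates = [a for o in ontology.objects for a in o.attributes]
    candidates += ontology.classifications
    return [c.name for c in candidates if c.kind in ("radio", "checklist") and not c.options]


ontology = Ontology(
    title="Fruit",
    objects=[OntologyObject(
        name="Apple",
        shape="polygon",
        attributes=[Attribute("Color", "radio", ["Red", "Green", "Yellow"])],
    )],
    classifications=[Classification("Time of day", "radio", options=["Night", "Day"])],
)
```

The missing_options check mirrors the rule that radio button and checklist types need at least one option to be useful.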
Create a Task Agent
Task agents enable you to set up custom actions like pre-labeling, leveraging foundation models such as GPT-4, automated quality assurance, or other tailored actions to suit your workflow.
Use the Encord SDK to configure your Task Agent. The Task Agent executes the configured SDK script for all tasks that are routed through the Task Agent stage in your Workflow.
Triggering the Task Agent
Task Agents aggregate all tasks that reach the Agent stage in the workflow. Your custom script must be triggered at this stage before the tasks proceed further in the workflow.
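The behavior described here can be pictured as a queue of waiting tasks that a script drains, deciding where each task goes next. The sketch below models only that control flow in plain Python; it is not the Encord SDK. The task dictionaries, the route_tasks and prelabel_router functions, and the pathway names are all hypothetical.

```python
def route_tasks(pending_tasks, handler):
    """Run handler on every waiting task and group tasks by the returned pathway."""
    routed = {}
    for task in pending_tasks:
        pathway = handler(task)  # the custom action decides where the task goes
        routed.setdefault(pathway, []).append(task)
    return routed


def prelabel_router(task):
    """Hypothetical custom action: send tasks without pre-labels to annotation."""
    return "to_review" if task.get("prelabels") else "to_annotate"


# Hypothetical tasks waiting at the Agent stage.
waiting = [
    {"title": "a.mp4", "prelabels": ["Apple"]},
    {"title": "b.mp4", "prelabels": []},
]

routed = route_tasks(waiting, prelabel_router)
```

Until a script like this runs, the waiting list simply accumulates, which matches the requirement that the custom script be triggered before tasks can proceed.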
Create a Workflow template
Workflows allow you to design and manage the flow of tasks through various stages of a Project. You have control over how tasks progress and how different stages interact.
To set up a Workflow template, navigate to the Annotate section of the Encord platform, and click the + New workflow button.
1. Add users to the Workflow:
Add users from your Organization to the Workflow by clicking Invite collaborators.
- Collaborators are added based on their role within the project - select the role you would like the collaborator(s) to have.
- Start typing the email of a user you would like to add, and select the user from the list that appears. Repeat this for every user that will have the same role.
- When you are done selecting users for this role, click Add.
2. Configure your Workflow:
The canvas is populated with a simple Workflow by default.
Click the Add stage button to show all Workflow components.
Customize your Workflow by pulling components onto the canvas.
Add the stages and other components you require for your Workflow onto the canvas:
- All workflows must begin with the Start stage.
- All workflows must contain an Annotate stage.
- Add routers to determine the different pathways a task can take through your Workflow.
- Add as many Review stages as necessary.
- All workflows must end at a Complete stage.
- Link all components on the canvas by clicking and dragging from one connection point to another.
All Workflows can be saved as a template by clicking the Save as a new template button.
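The structural rules for a Workflow (begin at Start, contain an Annotate stage, end at Complete, with all components linked) can be expressed as a simple graph check. The following sketch is a local illustration in plain Python, not an Encord feature. The stage and link representation and the validate_workflow helper are hypothetical, and requiring exactly one Start stage is an assumption.

```python
from collections import deque


def validate_workflow(stages, links):
    """stages maps stage name -> type; links is a list of (source, target) names."""
    problems = []
    starts = [name for name, kind in stages.items() if kind == "start"]
    if len(starts) != 1:
        problems.append("expected exactly one Start stage")  # assumption
    if "annotate" not in stages.values():
        problems.append("missing an Annotate stage")
    if "complete" not in stages.values():
        problems.append("missing a Complete stage")

    outgoing = {name: [] for name in stages}
    for source, target in links:
        outgoing[source].append(target)

    # Every stage with no outgoing link must be a Complete stage.
    for name, targets in outgoing.items():
        if not targets and stages[name] != "complete":
            problems.append(f"stage '{name}' is a dead end that is not Complete")

    # Every stage must be reachable from Start (breadth-first search).
    if len(starts) == 1:
        seen = {starts[0]}
        queue = deque(starts)
        while queue:
            for nxt in outgoing[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        for name in stages:
            if name not in seen:
                problems.append(f"stage '{name}' is not linked to Start")
    return problems


stages = {"Start": "start", "Annotate 1": "annotate", "Review 1": "review", "Complete": "complete"}
links = [
    ("Start", "Annotate 1"),
    ("Annotate 1", "Review 1"),
    ("Review 1", "Complete"),
    ("Review 1", "Annotate 1"),  # rejected tasks go back to annotation
]
```

With these sample stages and links, validate_workflow(stages, links) returns an empty list; removing the link into Complete makes it report that Complete is not linked to Start.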
3. Configure the stages of your Workflow:
After you have arranged the stages in the composer, configure the details of each stage.
- Click an Annotate card on the canvas to start editing the annotation stage.
- Give the stage a descriptive name.
- Add annotators. If you’d like to specify annotators for this stage, add them as collaborators. For full details on how collaborators can work on tasks at each stage, see our section on managing collaborators.
- Optionally, add a Webhook to receive notifications when labels are submitted at this stage.
- Click a Review card on the canvas to start editing the review stage.
- Give the stage a descriptive name.
- Add reviewers. If you want to specify reviewers for this stage, add them as collaborators. For full details on how collaborators can work on tasks at each stage, see our section on managing collaborators.
- Click a Router card on your canvas. See the router section for more details on how different types of routers can be configured.
- Optionally, add a Webhook to the Complete stage to receive a notification when a task has been completed.
- Optionally, add User assignment restrictions if users in this node should be prevented from being assigned to tasks they completed in the nodes listed.
Create the Project
Projects in Encord bring together Datasets, Ontologies, and Workflows. Datasets are labeled according to the Ontology, while the Workflow defines how tasks progress through the Project from start to finish.
- In the Encord platform, select Projects under Annotate.
- Click the + New annotation project button to create a new Project.
- Give the Project a meaningful title and description.
If you are part of an Organization, an optional Project tags drop-down is visible. Project tags are useful for categorizing and finding your Projects. Select as many tags as are relevant for your Project.
- Click the Attach ontology button.
- Select an Ontology from the list using the Select button, or create a new Ontology by clicking the New ontology button.
- Click OK to attach the Ontology to the Project.
- Click the Attach datasets button.
- Select a Dataset from the list using the Attach button, or create a new Dataset by clicking the New Dataset button.
- Click OK to attach the Dataset(s) to the Project.
- Click the Load from template button to use a Workflow template.
- Select the template you want to use and click Load template.
- Click Create project to finish creating the Project.
Add users to the Project
After creating a Project you must invite users to act as annotators, reviewers, team managers, and admins. Collaborators can be added as individuals, or as part of user groups.
STEP 4: Label your Data
Now you are ready to label your data. We recommend you and your team watch these introductory videos.
STEP 5: Export your Labels
To train your models you must export your labels.
- Navigate to Labels and export on the Project Dashboard.
- Select the Data tab.
- Select the data units you want to export labels for.
- Click the Export and save button. A pop-up appears.
- Give this label version a name.
- Select whether you want your labels to be exported in JSON or COCO format.
- Enable the toggle if you want to include signed URLs in your export.
- Select which label status(es) to include in the export.
- Select what objects to include in the export.
- Click Export and save to export your labels.
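Once labels are exported, a quick sanity check before training is to count annotations per class. The sketch below assumes a COCO-format export, which by the COCO convention contains categories and annotations lists linked by category_id. The sample data and the count_annotations_per_category helper are hypothetical; a real export would be loaded from the downloaded file with json.load.

```python
from collections import Counter


def count_annotations_per_category(coco):
    """Return a Counter mapping category name -> number of annotations."""
    names = {category["id"]: category["name"] for category in coco["categories"]}
    return Counter(names[ann["category_id"]] for ann in coco["annotations"])


# Tiny hand-written stand-in for an exported COCO file.
sample_export = {
    "images": [{"id": 1, "file_name": "orchard_01.jpg"}],
    "categories": [{"id": 1, "name": "Apple"}, {"id": 2, "name": "Pear"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 1},
        {"id": 12, "image_id": 1, "category_id": 2},
    ],
}

counts = count_annotations_per_category(sample_export)
```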