This walkthrough takes you through the complete applied AI workflow in Encord: ingesting data, curating a training set, setting up an annotation Project, labeling with quality control, and exporting labels for training.
By the end, you’ll have a working pattern you can adapt for your own use case.
What you’ll need
- An Encord account with Admin or Member access
- Data files in cloud storage (AWS S3, GCP, or Azure) or available for local upload
- A clear labeling schema (the classes and attributes you want to annotate)
If you don’t have data ready, you can use local uploads to get started quickly.
Step 1: Register your data
In: Index
- Navigate to Index in the Encord platform.
- Create a Folder to organize your files (e.g. training-data/v1).
- Register your data using one of the following methods:
- Cloud storage — provide a JSON file of cloud URIs pointing to your files in AWS, GCP, or Azure
- Cloud sync — connect a cloud folder and sync automatically as new files are added
- Local upload — drag and drop files directly into the platform
Once registered, your files are indexed and available for curation. For cloud-registered data, Encord does not copy your files — they stay in your own storage.
See Work with Data for detailed instructions.
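For the cloud-storage method, registration is driven by a JSON file that lists the URIs of your files. The exact schema (the top-level `images` key and the `objectUrl` field below) is shown as an illustration only — confirm the current spec in Work with Data before uploading. A minimal sketch of generating that file:

```python
import json

# Hypothetical helper: build a registration JSON for a list of image URIs.
# The "images"/"objectUrl" shape is an assumption for illustration;
# verify the exact schema against the "Work with Data" documentation.
def build_registration_json(uris):
    return {"images": [{"objectUrl": uri} for uri in uris]}

uris = [
    "s3://my-bucket/training-data/v1/frame_0001.jpg",
    "s3://my-bucket/training-data/v1/frame_0002.jpg",
]
payload = build_registration_json(uris)
print(json.dumps(payload, indent=2))
```

The same pattern extends to videos or other modalities by listing their URIs under the appropriate key.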
Step 2: Curate your dataset
In: Index
Before annotating, invest time in curation. This ensures you annotate the right data, not just all data.
- Open your Folder in Index and explore your data visually.
- Use Embedding plots to visualize the distribution of your data. Look for:
- Dense clusters — likely duplicates or over-represented conditions
- Sparse regions — edge cases or rare conditions worth prioritizing
- Run duplicate detection to identify and remove near-identical samples.
- Use quality metric filters to remove corrupt, blurry, or low-quality files.
- Use natural language search or metadata filters to find specific conditions (e.g. “night driving”, “small objects”, or specific sensor IDs).
- Select the samples you want to annotate and save them as a Collection.
Aim for a collection that is diverse and representative of the conditions your model will encounter in production — not just whatever data was easiest to collect.
See Collections for instructions.
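To build intuition for what duplicate detection does under the hood: near-identical samples sit extremely close together in embedding space, so pairs whose embedding cosine similarity exceeds a threshold are flagged as candidates for removal. This is a simplified, dependency-free sketch, not Encord's implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def near_duplicates(embeddings, threshold=0.98):
    """Return index pairs whose embeddings are nearly identical.
    O(n^2) brute force — fine for a sketch, not for large datasets."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                pairs.append((i, j))
    return pairs

embeddings = [
    [1.0, 0.0, 0.0],
    [0.999, 0.01, 0.0],  # near-duplicate of the first vector
    [0.0, 1.0, 0.0],
]
print(near_duplicates(embeddings))  # → [(0, 1)]
```

In practice, keeping one representative per duplicate cluster frees annotation budget for the sparse, under-represented regions of your data.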
Step 3: Create a Dataset
In: Annotate
- Navigate to Datasets under Annotate.
- Click + New dataset.
- Give the Dataset a meaningful name and description.
- Add your curated Collection from Index to the Dataset.
The Dataset is now available to attach to annotation Projects.
Step 4: Create an Ontology
In: Annotate
An Ontology defines your labeling schema — the classes, attributes, and relationships that annotators will use.
- Navigate to Ontologies under Annotate.
- Click + New ontology.
- Add the object classes and classification types your model needs.
- For each class, add any nested attributes (e.g. for a vehicle class: type, color, occlusion level).
Keep your ontology focused. Every attribute you add increases annotation time and complexity. Only include what your model actually needs.
See Create Ontologies for guidance.
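One way to sanity-check ontology scope before building it in the UI is to sketch it as plain data and count the attributes you are about to ask annotators to fill in. The dict below is a hypothetical representation for planning purposes only — Encord stores Ontologies server-side and this is not its schema:

```python
# Hypothetical planning sketch of the ontology from this walkthrough.
# Not Encord's internal format — just a way to reason about scope.
ontology = {
    "objects": [
        {
            "name": "vehicle",
            "shape": "bounding_box",
            "attributes": [
                {"name": "type", "options": ["car", "truck", "bus"]},
                {"name": "color", "options": ["red", "blue", "other"]},
                {"name": "occlusion", "options": ["none", "partial", "heavy"]},
            ],
        },
        {"name": "pedestrian", "shape": "bounding_box", "attributes": []},
    ],
    "classifications": [
        {"name": "time_of_day", "options": ["day", "night"]},
    ],
}

def attribute_count(ontology):
    """Rough proxy for per-object annotation effort."""
    return sum(len(o["attributes"]) for o in ontology["objects"])

print(attribute_count(ontology))  # → 3
```

If the count surprises you, that is a signal to trim attributes your model does not actually consume.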
Step 5: Create an annotation Project
In: Annotate
- Navigate to Projects under Annotate.
- Click + New annotation project.
- Configure the Project:
- Title and description — give the Project a clear name
- Ontology — attach the Ontology you created in Step 4
- Dataset — attach the Dataset from Step 3
- Workflow — select or build a Workflow (e.g. Annotate → Review → Complete)
- Collaborators — invite annotators, reviewers, and team managers
- Click Create project.
See Create a Project for the full setup guide.
Step 6: Label your data
In: Annotate — Label Editor
With the Project created, annotators can start labeling:
- Open the Project and navigate to the Queue tab.
- Click Start task to open the Label Editor on the highest-priority task, or Initiate to open a specific task.
- Use the Label Editor tools to annotate each asset:
- Use SAM 2 for fast, accurate segmentation with a single click
- Use Interpolation to propagate bounding boxes or polygons across video frames
- Use classification tools to assign frame-level or object-level attributes
- Submit the task when complete.
If Task Agents are configured, pre-labels from AI models will already be present when the annotator opens the task — reducing labeling time significantly.
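Interpolation is conceptually simple: given a box drawn on two keyframes, intermediate frames get a linear blend of the two. The sketch below mirrors that idea for an (x, y, w, h) bounding box — it is an illustration of the concept, not the Label Editor's actual algorithm:

```python
def interpolate_box(box_a, box_b, frame_a, frame_b, frame):
    """Linearly interpolate an (x, y, w, h) box between two keyframes.
    Conceptual sketch of what an interpolation tool does for you."""
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))

# Keyframes drawn at frames 0 and 10; fill in frame 5.
mid = interpolate_box((100, 50, 40, 30), (120, 60, 40, 30), 0, 10, 5)
print(mid)  # → (110.0, 55.0, 40.0, 30.0)
```

The annotator only corrects frames where motion is non-linear, which is where the time savings come from.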
Step 7: Review and QA
In: Annotate — Review stage
Tasks submitted by annotators move to the Review stage in your Workflow.
- Reviewers open tasks from the Queue and inspect the labels.
- For each task, the reviewer either:
- Approves — the task moves to the next stage or completion
- Rejects — the task is returned to the annotator with issue notes
- Reviewers can raise Issues on specific labels or frames to flag problems for discussion.
Monitor overall quality using the Analytics tab: track approval rates, rejection rates, time per task, and open issues by annotator or reviewer.
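If you pull task outcomes out of the platform (e.g. via the SDK or an export), per-annotator approval rates are straightforward to compute yourself. A minimal sketch over hypothetical (annotator, outcome) records — the field names are placeholders, not Encord's data model:

```python
from collections import Counter

def approval_rates(tasks):
    """Per-annotator approval rate from (annotator, outcome) pairs,
    where outcome is 'approved' or 'rejected'. Hypothetical record
    shape for illustration, not Encord's data model."""
    stats = {}
    for annotator, outcome in tasks:
        stats.setdefault(annotator, Counter())[outcome] += 1
    return {
        a: round(c["approved"] / (c["approved"] + c["rejected"]), 2)
        for a, c in stats.items()
    }

tasks = [
    ("alice", "approved"), ("alice", "approved"), ("alice", "rejected"),
    ("bob", "approved"), ("bob", "rejected"),
]
print(approval_rates(tasks))  # → {'alice': 0.67, 'bob': 0.5}
```

A persistently low approval rate for one annotator usually points at unclear ontology instructions rather than carelessness — fix the instructions first.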
Step 8: Export labels
In: Annotate
Once tasks are complete and approved:
- Click Export in the Project.
- Choose your export format:
- JSON — Encord’s native format, full fidelity
- COCO — standard format for object detection and segmentation
- Specify which workflow stage to export from (typically Complete).
- Optionally save a label version for reproducibility.
- Click Export.
Your labels are now ready for your training pipeline.
See Export Labels for the full export workflow.
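Downstream of export, you typically reshape labels into whatever your training pipeline expects. The sketch below converts a simplified label structure into COCO-style annotation records; the input field names (`frame`, `class`, `bbox`) are placeholders, not Encord's export schema — adapt the mapping to the JSON you actually export:

```python
def to_coco_annotations(labels, category_ids):
    """Convert simplified label records into COCO-style annotations.
    Input field names are hypothetical; adapt to your export format."""
    annotations = []
    for ann_id, lab in enumerate(labels, start=1):
        x, y, w, h = lab["bbox"]
        annotations.append({
            "id": ann_id,
            "image_id": lab["frame"],
            "category_id": category_ids[lab["class"]],
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 0,
        })
    return annotations

labels = [{"frame": 0, "class": "vehicle", "bbox": [10, 20, 30, 40]}]
coco = to_coco_annotations(labels, {"vehicle": 1})
print(coco[0]["area"])  # → 1200
```

If you export directly in COCO format from the platform, this step is unnecessary — the conversion only matters for custom pipelines consuming the native JSON.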
Step 9: Train and evaluate
Take your exported labels into your training pipeline. After training and deploying your model, import its predictions back into Active to evaluate performance:
- Import model predictions into Active.
- Review automatic metrics (mAP, mAR, F1 Score).
- Use embedding plots and filters to find where the model underperforms.
- Create a new Collection of high-value samples — data where the model fails or is uncertain.
- Send the Collection back to Annotate (Step 3) and repeat.
This loop — curate, annotate, train, evaluate, repeat — is the core of applied AI development.
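Closing the loop often means programmatically selecting the next batch to annotate. One common heuristic is to target samples where the model is least certain — predictions with confidence near the decision boundary. A hedged sketch with hypothetical record fields:

```python
def high_value_samples(predictions, low=0.3, high=0.7):
    """Select sample IDs where model confidence falls in an 'uncertain'
    band — candidates for the next annotation Collection. The record
    shape here is hypothetical, for illustration."""
    return [p["id"] for p in predictions if low <= p["confidence"] <= high]

predictions = [
    {"id": "img_001", "confidence": 0.95},
    {"id": "img_002", "confidence": 0.55},  # uncertain → re-annotate
    {"id": "img_003", "confidence": 0.12},
    {"id": "img_004", "confidence": 0.40},  # uncertain → re-annotate
]
print(high_value_samples(predictions))  # → ['img_002', 'img_004']
```

Other useful selection signals include false positives/negatives against existing labels and samples from sparse embedding regions — combine them rather than relying on confidence alone.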
What’s next