Training projects

Encord's Annotation Training feature - called 'Training' in the application - provides a novel way to train your annotation and operation teams, thereby improving the quality of your training data.

Annotator teams are trained against a benchmark project, which serves as the 'gold standard' to which your team's annotations are compared. The feature scores each trainee's performance on various metrics, helping trainees improve the quality of their annotations and giving you clear insight into the quality of their work.

Creating training projects

See our training video below to learn the basics of the Annotation Training feature, including:

  • How to set up a benchmark project.
  • How to set up a training project, based on the benchmark.
  • Tracking a team's progress.

For a more detailed guide, follow the steps below to create a training project, or head over to the working with training projects section to learn how to administer an existing training project.

1. Create the source project(s)

In Encord, labels are stored at the project level. Recall that projects represent the union of an ontology, dataset(s), and team member(s) that come together to produce a set of labels. In this case, we're interested in first creating our ground-truth labels. Because ground-truth labels are the source of truth and are stored in a project, we call the project that stores them the ground-truth source project, or simply the source project.

Training source projects are currently stored as production labeling projects in Encord. Follow the Project creation flow to create your eventual source project. Pay special attention to the ontology you select, as you will need to select the exact same ontology when creating the training project.

2. Create the ground-truth labels

After you've created the source project, you need to add the ground-truth labels. You can add ground-truth labels by annotating data inside our label editor, or by uploading labels using the SDK.

The Encord system must know that a labeling task has been annotated before it can be used as a ground-truth source. In order to be used as a ground-truth task, its status must be either 'In review' or 'Completed'. A good rule to follow is that the task should appear in the source project's Labels Activity tab with a status of 'In review' or 'Completed'.
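The status rule above can be sketched as a simple filter. This is a minimal illustration rather than Encord's implementation: the task dictionaries, their field names, and the 'Queued' status are assumptions made for the example; only 'In review' and 'Completed' come from the text.

```python
# The two qualifying statuses come from the text above.
GROUND_TRUTH_STATUSES = {"In review", "Completed"}


def ground_truth_tasks(tasks):
    """Return only the tasks whose status qualifies them as ground truth."""
    return [task for task in tasks if task["status"] in GROUND_TRUTH_STATUSES]


# Hypothetical task records; the dict layout and the 'Queued' status are
# illustrative assumptions, not an Encord data format.
tasks = [
    {"title": "video_01.mp4", "status": "Completed"},
    {"title": "video_02.mp4", "status": "In review"},
    {"title": "video_03.mp4", "status": "Queued"},
]

eligible_titles = [task["title"] for task in ground_truth_tasks(tasks)]
print(eligible_titles)
```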

If you're using the SDK, you can use the method submit_label_row_for_review to programmatically put labels into the ground-truth label set.
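As a rough sketch of how this might look in a script, the helper below calls submit_label_row_for_review once per label row. The method name comes from the text above; the assumption that it takes a single label row identifier string should be checked against the SDK reference. FakeProject is a hypothetical stand-in so the example runs without credentials; in practice you would pass a project object obtained from the SDK.

```python
from typing import Iterable, List, Protocol


class SupportsSubmit(Protocol):
    """Any object exposing the method named in the text above."""

    def submit_label_row_for_review(self, uid: str) -> object: ...


def submit_all_for_review(project: SupportsSubmit, label_hashes: Iterable[str]) -> List[str]:
    """Submit each label row for review and return the identifiers submitted."""
    submitted = []
    for label_hash in label_hashes:
        # Assumption: the method accepts one label row identifier.
        project.submit_label_row_for_review(label_hash)
        submitted.append(label_hash)
    return submitted


class FakeProject:
    """Hypothetical stand-in that records calls instead of contacting Encord."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def submit_label_row_for_review(self, uid: str) -> bool:
        self.calls.append(uid)
        return True


fake_project = FakeProject()
submitted = submit_all_for_review(fake_project, ["label-hash-1", "label-hash-2"])
print(submitted)
```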



If you don't need to manually review ground-truth labels, for example, when importing them from known sources of truth, you can set a Manual QA Project's "sampling rate" to 0 -- which will send all labeling tasks straight to 'Completed' without entering the review phase.

Now that you've created the source project(s) and prepared the ground-truth labels, you're ready to create the
training project itself.

3. Create the training project

We'll walk through assuming just one source project, but the process is extensible for as many source projects
as you may need.

Name the training project

This step is analogous to naming an annotation project. Choose an easy-to-recognize name, and set an optional
description if you wish.

Select the ontology

The most important point to keep in mind when choosing an ontology for the annotator training project is that you must choose the same ontology as is used by your intended ground-truth source projects. The annotator training evaluation function works by comparing labels in benchmark tasks vs those in the ground-truth project. Even if the underlying dataset is the same, we are unable to match labels unless they originate from the same ontology, so this is an important step!
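Before creating the training project, it can help to confirm that every intended source project uses the same ontology. The check below is a minimal sketch: the dictionary layout and the `ontology_hash` and `title` field names are assumptions for illustration, not an Encord data format.

```python
def mismatched_sources(training_ontology_hash, source_projects):
    """Return titles of source projects whose ontology differs from the training ontology."""
    return [
        project["title"]
        for project in source_projects
        if project["ontology_hash"] != training_ontology_hash
    ]


# Hypothetical project records used only to exercise the check.
sources = [
    {"title": "Ground truth A", "ontology_hash": "onto-123"},
    {"title": "Ground truth B", "ontology_hash": "onto-999"},
]

problems = mismatched_sources("onto-123", sources)
print(problems)
```

An empty result means all listed source projects share the training project's ontology.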

Other than the need to match the ontology of your source projects, choosing an ontology works the same way as it does for an annotation project. Click 'Next' after you've confirmed your selection. You can return to this step if you need to choose a different ontology in order to match your desired ground-truth source project(s).

Set up training data

The training data step is where you configure two important settings for a training project.

  1. Choose the project(s) which contain the desired ground-truth labels. When getting started, we recommend choosing source project(s) with 100% annotation task progress. We can only use annotated tasks as benchmark evaluation tasks, so using a project with 100% annotation task progress ensures there are no surprises in relation to which tasks appear in the evaluation task set.

  2. Set up the initial configuration of the benchmark function. We refer to it as a benchmark evaluation function because trainees are benchmarked against the ground-truth, and their performance is calculated according to weights you define over the different label categories and attributes. By default, each category and nested attribute carries equal weight -- the default is represented as 100.
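To build intuition for how weights can shift a score, here is a minimal sketch that treats the overall score as a weighted average of per-class scores. This is an assumption made for illustration and is not Encord's actual benchmark function, whose details are not described here. The only value taken from the text is the default weight of 100; the class names and scores are hypothetical.

```python
DEFAULT_WEIGHT = 100  # default weight stated in the text above


def weighted_score(per_class_scores, weights=None):
    """Weighted average of per-class scores; classes without a weight use the default."""
    weights = weights or {}
    weighted_sum = 0.0
    total_weight = 0.0
    for class_name, score in per_class_scores.items():
        weight = weights.get(class_name, DEFAULT_WEIGHT)
        weighted_sum += weight * score
        total_weight += weight
    # Guard against an empty input or all-zero weights.
    return weighted_sum / total_weight if total_weight else 0.0


# Hypothetical per-class scores in the range 0 to 1.
scores = {"car": 0.9, "pedestrian": 0.5}

equal_weights = weighted_score(scores)
pedestrian_heavy = weighted_score(scores, {"pedestrian": 300})
print(equal_weights, pedestrian_heavy)
```

Under this assumed scheme, raising the weight of a class the trainee handled poorly pulls the overall score down, which is the kind of effect the weight editor lets you tune.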

Here, we've selected a single source project with 100% annotation progress, and customized the benchmark function for several ontology classes, reflecting which classes and attributes have greater or lesser importance when evaluating annotator performance. Once you're satisfied with your configuration, press 'Next' to continue.



Selection of the source project(s) is final after training project creation, but you can always adjust the benchmark function later, even after project creation. Do not spend too long optimizing your scoring function at this stage. It's best to make an initial guess at your desired configuration, then edit and re-calculate after observing trainee performance.



Some teams may need further insight into the details of the benchmark function in order to devise an accurate system. However, detailed knowledge of the benchmark function may unduly influence trainees' behavior. Please contact Encord directly at [email protected] and we'll be more than happy to provide further material on the benchmark process to your administration team. This allows us to empower our customers while protecting the integrity of the benchmarking process.

Assign trainees and create the project

The final step is to add the initial set of annotator trainees to the project. Use this opportunity to add training
project participants, either from a group or as individuals. Note that this does not have to be the final set
of project participants. If you're unsure, you can always add annotators later.

Press Create training program to create the training project, which will return you to the projects list
with your newly created training project. Proceed to working with training projects to learn how to work with your newly created training project!

Working with training projects

Start here if you want to learn how to run a successful annotator training project. If you don't have a training
project yet, head over to creating a training project to get started.

Roles and permissions

| Permission | Admin | Team Manager | Annotator |
|---|---|---|---|
| View benchmark project source | ✅ | ❌ | ❌ |
| Edit benchmark scoring function | ✅ | ❌ | ❌ |
| Add annotation instructions | ✅ | ❌ | ❌ |
| Invite team members | ✅ | ✅ | ❌ |
| Manage team permissions | ✅ | ❌ | ❌ |
| Manage admins | ✅ | ❌ | ❌ |
| Annotate tasks in the task management system | ❌ | ❌ | ✅ |
| Control assignments & status in the task management system | ✅ | ✅ | ❌ |

How to run a training project

After you've created a training project, training normally proceeds through the following steps.
We're actively expanding the documentation surrounding our annotator training module,
so please contact us at [email protected] with any unanswered questions.

OK, let's get started!

1. Onboard your annotators

You can add annotators during the creation phase, or by going to 'Settings > Team' and inviting new
annotators. Remember that unlike in annotation projects where each piece of data can only be seen
by one annotator at a time, training projects score each annotator against the same set of benchmark tasks.
Therefore, a copy of each benchmark task will be added to the project for each annotator added.

You can confirm annotators and tasks are ready to go by checking the summary screen. In this case, our source project had 4 tasks and we have 4 annotators assigned. We should expect a total of 16 tasks.
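The arithmetic behind the expected total can be written out as a one-line check. The function name is hypothetical; the rule it encodes (one copy of each benchmark task per annotator) comes from the text above.

```python
def expected_task_count(benchmark_tasks: int, annotators: int) -> int:
    """Each annotator receives a copy of every benchmark task."""
    return benchmark_tasks * annotators


# The example from the text: 4 source tasks and 4 annotators.
total_tasks = expected_task_count(4, 4)
print(total_tasks)  # 16
```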



Training projects exist to train annotators. Therefore, tasks are not created for admins assigned to the project, and administrators cannot access annotator tasks via the 'Labels > Queue' tab. This prevents administrators from accidentally completing training tasks meant for annotators. Administrators can still confirm annotator submissions using the 'Activity' and 'Data' tabs in the labels page as needed.

Once you've prepared the project with your intended annotator trainee team, send the project URL to each of your team members so they can join and start the training.

2. Annotators proceed through the benchmark tasks

Annotators can access the training project at the URL you share with them. Annotators see a simplified interface which shows only their tasks in both the summary and labels queue pages. Annotators can start their evaluation tasks by clicking the 'Start labelling' button in the upper right or clicking 'Initiate' next to any given labeling task.

Annotation in a training project is the same as it is for an annotation project. Guide your team to the
label editor documentation to get them started. Once an annotator has submitted a task, it
cannot be reopened. We're working on adding greater flexibility to the benchmark task lifecycle. Please let us know at [email protected] if you have any related requests or interests!

3. Evaluate annotator performance

Submitted tasks are automatically run through the benchmark function, and the annotator's performance on the task is computed. Project administrators can confirm annotator progress and performance in the summary page. Use the overview tab for quick insights into overall annotator performance. Use the 'Annotator submissions' tab to confirm individual task submissions on a per-label basis.

At this stage, you can communicate with your annotators in whichever manner is easiest for you and your team.
Download the CSV to get the entire set of results and share it with relevant team members. Alternatively, schedule a live review, using the 'View' functionality in the 'Annotator submissions' tab to compare the benchmark labels with a given annotator's submission in the label editor.

For projects with hundreds of evaluation labels per annotator, where an 'evaluation label' is defined as an annotation per frame, we limit the number of evaluation labels displayed in the dashboard for performance reasons. The labels displayed are a random sample of the submitted labels. You can always access the full set of evaluation labels by downloading the CSV. Larger downloads may take significant time, and you may be prompted to run the download in a separate tab so it can proceed while you continue working in the current tab.
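The display behavior described above resembles capping a list with a random sample. The sketch below illustrates that idea only; the function name, the cap value, and the optional seed are assumptions, and the dashboard's real limit and sampling method are not documented here.

```python
import random


def sample_for_display(labels, limit, seed=None):
    """Return all labels if they fit under the limit, otherwise a random sample of size limit."""
    labels = list(labels)
    if len(labels) <= limit:
        return labels
    # A seeded generator makes the sample reproducible in tests.
    return random.Random(seed).sample(labels, limit)


# Hypothetical evaluation label identifiers.
all_labels = [f"label-{i}" for i in range(500)]
shown = sample_for_display(all_labels, limit=100, seed=0)
print(len(shown))
```

The full list (here `all_labels`) plays the role of the CSV download, which always contains every evaluation label.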



Some teams may need further insight into the details of the benchmark function in order to devise an accurate system. However, detailed knowledge of the benchmark function may unduly influence trainees' behavior. Please contact Encord directly at [email protected] and we'll be more than happy to provide further material on the benchmark process to your administration team. This allows us to empower our customers while protecting the integrity of the benchmarking process.

4. Adjust the benchmark function and re-calculate scores

If, after evaluating annotator performance, you feel that annotator score distributions don't correctly reflect
the skill displayed -- or don't properly reward and penalize annotators for the types of annotations they
made correctly or incorrectly -- it's always possible to adjust the benchmark function and recalculate.

Go to the Settings page, and find the section marked 'Benchmark scoring function'. Press the Edit button
to enable the function's weight editor and change the values to match your new plan. Finally, press Save in the upper right to persist the new function configuration.

To see the changes applied against previous submissions, return to the 'Summary' page and press the Re-calculate scores button. If a given annotator's annotations were affected by the weighting change, the 'Benchmark results' column will change to reflect their new score with the new weights. In this example, we compare one annotator's score before (left) and after (right) we changed the scoring function and pressed the Re-calculate scores button. The change in score is noticeable, but not enough to move the annotator's performance from unskilled to skilled. This annotator should likely undergo another round of training.

5. Repeat until finished

You can continue to adjust scores even after all the annotators have finished all their tasks, until you feel
the score distribution matches your intent.

You can also add new annotators to existing projects, as you did in step 1. However, when adding a new group or a significant number of annotators, we recommend creating a new training project. This way, you can manage the new cohort of annotators all at once.