Learn how to set up the components of your active learning process in Encord
Active learning workflows in the Encord platform are designed specifically for workflow projects. This requirement lets tasks move seamlessly between essential stages, such as Complete, when using the SDK.
Active learning workflows in Encord Active share the following key stages:
To see an active learning workflow in action, take a look at the end-to-end tutorial for MNIST.
To start an active learning workflow, you need an initial labeled dataset for training the machine learning model. In the Encord platform, this corresponds to having a project with annotations.
If you don't have any projects yet, you can watch the tutorial video on setting up a workflow project to get started quickly.
Next, import the project into Encord Active. Run the following CLI command and confirm that you want to include uninitialized label rows, since they represent unannotated data.
encord-active import project
If you require detailed information on the options available during the import process, you can refer to the Import from Encord platform guide.
If your workflow project already contains annotations, you can proceed directly to Model training and update.
If your project does not have any annotations or you are seeking the most appropriate data for labeling, it's essential to score and rank your data.
While random selection is a possibility, Encord Active provides metrics such as Image Diversity to enhance and optimize annotation impact.
This metric ranks images based on their ease of annotation, enabling prioritization of suitable and manageable data.
For example, you can follow these steps to prioritize labeling for data with the lowest Image Diversity score using the UI:
- In the Data Quality explorer page, navigate to the toolbox and click on the Filter tab.
- Select the option that corresponds to the first labeling stage (usually named Annotate 1) under the Workflow Stage metadata filter to pick the unannotated data.
- Add the Image Diversity filter and adjust the slider to select a subset of data with the lowest score.
- Access the Action tab in the toolbox.
- Click on the 🖋 Relabel button and follow the instructions to prioritize labeling for the selected data.
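The selection logic behind these UI steps can be sketched in a few lines of Python. This is a minimal illustration on in-memory records, not Encord Active's API: the `DataUnit` class, its field names, and the `fraction` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUnit:
    # Hypothetical record; the field names are illustrative, not Encord's schema.
    identifier: str
    workflow_stage: str
    image_diversity: float


def select_for_labeling(units, stage="Annotate 1", fraction=0.25):
    """Return the lowest-scoring fraction of the units waiting in `stage`."""
    # Equivalent of the Workflow Stage filter: keep only unannotated data.
    candidates = [u for u in units if u.workflow_stage == stage]
    if not candidates:
        return []
    # Equivalent of the Image Diversity slider: lowest scores first.
    candidates.sort(key=lambda u: u.image_diversity)
    count = max(1, round(len(candidates) * fraction))
    return candidates[:count]


units = [
    DataUnit("img-a", "Annotate 1", 0.82),
    DataUnit("img-b", "Annotate 1", 0.10),
    DataUnit("img-c", "Complete", 0.05),
    DataUnit("img-d", "Annotate 1", 0.47),
    DataUnit("img-e", "Annotate 1", 0.31),
]
selected = select_for_labeling(units, fraction=0.5)
print([u.identifier for u in selected])
```

Note that `img-c` is skipped despite its low score, because it has already left the first labeling stage.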
To mimic the behavior of task prioritization in projects that contain only single images, you can follow these steps:
- In the Filter tab, select the option that corresponds to the first labeling stage (usually named Annotate 1) under the Workflow Stage metadata filter to pick the data ready to be labeled.
- Use the bulk tagging feature to mark them with a data tag, such as "unlabeled".
- Add the Image Diversity filter and adjust the slider to select a subset of the filtered data with the lowest score.
- Use the bulk tagging feature to mark this further selection with a data tag, such as "to label next".
- Reset the filters and choose the "unlabeled" tag option.
- Access the Action tab in the toolbox, click on the ✅ Mark as Complete button, and follow the instructions to temporarily move all the selected data to the workflow's Complete stage.
- Return to the Filter tab, reset the filters, and choose the "to label next" tag option.
- Access the Action tab in the toolbox again, click on the 🖋 Relabel button and follow the instructions to move the selected data to the workflow's first annotation stage.
- Once the selected data has been labeled, use the following filter combination to bring the remaining data back from the Complete stage to the first labeling stage, as in the previous step: choose the No class option under Object Class and the proper tag option (e.g. "unlabeled").
By following these steps, you can ensure that the first labeling stage contains only the prioritized data for labeling, and the task states align at the end with the flow that utilizes the task prioritization feature.
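To make the intermediate task states easier to follow, the workaround can be simulated on plain dictionaries. This is only a sketch of the bookkeeping: the function name, the `keep` parameter, and the data layout are hypothetical, and the real stage changes happen through the UI actions described above.

```python
def simulate_prioritization(scores, stages, keep, first_stage="Annotate 1"):
    """Simulate the tag-based workaround; returns (new_stages, tags)."""
    stages = dict(stages)  # work on a copy so the input stays untouched

    # Tag everything waiting in the first labeling stage as "unlabeled".
    waiting = [item for item, stage in stages.items() if stage == first_stage]
    tags = {item: {"unlabeled"} for item in waiting}

    # Tag the lowest-scoring subset as "to label next".
    for item in sorted(waiting, key=lambda i: scores[i])[:keep]:
        tags[item].add("to label next")

    # Mark as Complete: temporarily park every tagged item.
    for item in waiting:
        stages[item] = "Complete"

    # Relabel: send only the prioritized items back to annotation.
    for item in waiting:
        if "to label next" in tags[item]:
            stages[item] = first_stage

    return stages, tags


scores = {"img-a": 0.9, "img-b": 0.1, "img-c": 0.5}
stages = {item: "Annotate 1" for item in scores}
new_stages, tags = simulate_prioritization(scores, stages, keep=1)
print(new_stages)
```

After the simulated run, only the prioritized item sits in the first labeling stage, while the rest wait in Complete and keep their "unlabeled" tag so they can be brought back later.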
In the active learning workflow, model training plays a crucial role. It involves training a machine learning model using the initial labeled dataset and iteratively updating it with newly labeled data. Encord Active provides support for a wide range of models by allowing you to plug in your own model and interface with it using convenient wrappers.
More information can be found here.
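As a rough picture of the train-and-update cycle, the sketch below retrains a toy model each time a new batch is labeled. Everything here is an assumption made for illustration: the `ModelWrapper` protocol is not Encord Active's wrapper interface, `ThresholdModel` stands in for a real model, and `oracle` stands in for human annotators.

```python
from typing import Protocol, Sequence


class ModelWrapper(Protocol):
    # Hypothetical interface for illustration; not Encord Active's wrapper API.
    def fit(self, samples: Sequence[float], labels: Sequence[int]) -> None: ...
    def predict(self, samples: Sequence[float]) -> Sequence[int]: ...


class ThresholdModel:
    """Toy 1-D classifier: predicts 1 when a value exceeds the learned threshold."""

    def __init__(self) -> None:
        self.threshold = 0.0

    def fit(self, samples, labels):
        positives = [s for s, y in zip(samples, labels) if y == 1]
        negatives = [s for s, y in zip(samples, labels) if y == 0]
        # Place the threshold midway between the two class means.
        mean_pos = sum(positives) / len(positives)
        mean_neg = sum(negatives) / len(negatives)
        self.threshold = (mean_pos + mean_neg) / 2

    def predict(self, samples):
        return [int(s > self.threshold) for s in samples]


def run_rounds(model, labeled, pool, oracle, rounds, batch_size):
    """Train, label the next batch from the pool, and retrain."""
    labeled = dict(labeled)
    pool = list(pool)
    for _ in range(rounds):
        model.fit(list(labeled), list(labeled.values()))
        # A real workflow would rank the pool before taking a batch.
        batch, pool = pool[:batch_size], pool[batch_size:]
        for sample in batch:
            labeled[sample] = oracle(sample)
    # Final update so the model reflects the last labeled batch.
    model.fit(list(labeled), list(labeled.values()))
    return labeled


model = ThresholdModel()
initial = {0.1: 0, 0.9: 1}  # initial labeled dataset: sample -> label
pool = [0.2, 0.8, 0.4, 0.6]  # unlabeled samples
labeled = run_rounds(model, initial, pool, oracle=lambda x: int(x > 0.5),
                     rounds=2, batch_size=2)
print(len(labeled), model.predict([0.3, 0.7]))
```

Keeping the model behind a small `fit`/`predict` interface is one way to swap in your own model without changing the loop itself.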