You can. Encord Active is an open-source project that aims to support all computer-vision-based active learning projects. For example, you can initialize a project using the `init` command with a local image directory.
Please see our import documentation for more details and the options available to you.
Everything you do with this library stays within your local machine. No statistics, data, or other information will be collected or sent elsewhere.
The only communication that occurs with the outside world is with Encord's main platform - if you have a project linked to Encord.
If you encounter any issues during the installation process, we recommend checking that you have followed the steps outlined in the installation guide carefully.
If the problem persists or if you have any further questions, please don't hesitate to get in touch with us via Slack or email. We'll be happy to assist you with any installation-related issues you may have.
A quality metric is a function that can be applied to your data, labels, and model predictions to assess their quality and rank them accordingly.
Encord Active uses these metrics to analyze and decompose your data, labels, and predictions.
Here is a blog post on how we like to think about quality metrics.
Quality metrics are not only limited to those that ship with Encord Active. In fact, the power lies in defining your own quality metrics for indexing your data just right. Here is the documentation for writing your own metrics.
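To make the idea concrete, here is a minimal sketch of a quality metric as a plain scoring function followed by a ranking step. It does not use Encord Active's metric API (the documentation on writing your own metrics covers that); the `brightness` function, the `rank_by_metric` helper, and the toy images are all assumptions made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A toy "image": rows of grayscale pixel values in [0, 255].
Image = List[List[int]]


def brightness(image: Image) -> float:
    """Hypothetical metric: mean pixel intensity scaled to [0, 1]."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / (255 * len(pixels))


def rank_by_metric(
    images: Dict[str, Image], metric: Callable[[Image], float]
) -> List[Tuple[str, float]]:
    """Score every image with the metric and sort from lowest to highest."""
    scores = [(name, metric(img)) for name, img in images.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Made-up data, only here to exercise the two functions above.
images = {
    "dark.png": [[0, 10], [5, 15]],
    "mid.png": [[100, 120], [130, 110]],
    "bright.png": [[250, 255], [240, 245]],
}
ranking = rank_by_metric(images, brightness)
```

Sorting by score is what turns a metric into an index: the items at either end of `ranking` are usually the ones worth inspecting first.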
To import your model predictions into Encord Active, you need to follow these steps:
- Build a list of `encord_active.lib.db.predictions.Prediction` objects that represent your model predictions.
- Store the list of predictions in a pickle file.
- Run the command `encord-active import predictions /path/to/your/file.pkl`, where `/path/to/your/file.pkl` is the path to the pickle file containing your predictions.
Executing this command imports your model predictions into the project. For more detailed instructions, refer to the workflow description for importing model predictions.
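The pickling and import steps above can be sketched as follows. This is a sketch under stated assumptions: the helper names `save_predictions` and `import_command` are hypothetical, and the list uses placeholder dictionaries because the `Prediction` constructor arguments are not described here. In a real project the list would hold `encord_active.lib.db.predictions.Prediction` objects.

```python
import pickle
from pathlib import Path
from typing import Any, List, Sequence


def save_predictions(predictions: Sequence[Any], path: Path) -> Path:
    """Store the list of predictions in a pickle file (second step)."""
    with open(path, "wb") as f:
        pickle.dump(list(predictions), f)
    return path


def import_command(path: Path) -> List[str]:
    """Build the CLI call from the third step as an argument list."""
    return ["encord-active", "import", "predictions", str(path)]


# Placeholder items only: replace these with real
# encord_active.lib.db.predictions.Prediction objects (first step).
predictions = [{"placeholder": 1}, {"placeholder": 2}]
```

After calling `save_predictions(predictions, Path("/path/to/your/file.pkl"))`, the list returned by `import_command` can be passed to `subprocess.run(..., check=True)` or typed into a terminal by hand.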
See our documentation on writing your own metrics.
For larger projects, initialization can take a while.
While we're working on improving the efficiency, there are a couple of tricks you can use.
As soon as the metric computations have started (indicated by Encord Active printing a line containing `Running metric`), you can open a new terminal and run `encord-active start`. This lets you see what has been computed so far. Refresh the browser once in a while as new metrics finish computing in your first terminal.
You can also kill the import process as soon as the metrics have started to compute. This leaves you with a project containing fewer quality metrics, so you will not see as many insights as if the process had been allowed to finish. However, you can always use the `encord-active metric run` command to run the metrics that are missing.
Please see this notebook to learn how to add your own custom embeddings.
The code base is structured such that all data operations live in `encord_active.server`, which serves as the "backend" for the UI. As such, everything you can do with the UI can also be done in code.
Other good resources can be found in our example notebooks.
Exporting data back into the rest of your pipeline can be done via the toolbox in the application's explorer pages.
Dataset management can be done in two ways.
- You can tag your data to keep track of subsets (or versions) of your dataset.
- If you are planning to do more involved changes to your dataset and you want the ability to go back, your best option is to use the Clone button in the Action tab of the toolbox in the application's explorer pages.
The best way to version your project is to tag your data with the tags feature as you go.
Alternatively, you can use `git`. To do that, we suggest adding a `.gitignore` file with the following content:

    data/**/*.jpg
    data/**/*.jpeg
    data/**/*.png
    data/**/*.tiff
    data/**/*.mp4

After that, run the following:

    git add .; git commit -am "Initial commit"
Throughout the Data Quality, Label Quality, and Model Quality pages, you can tag your data.
There are two levels at which you can tag data: the data level, which applies to the raw images and video frames, and the label level, which applies to the classifications and objects associated with each image.
For example, you can use tags to filter your data for further processing such as relabeling, training models, or inspecting model performance on a specific subset of your data.
Here is some more documentation on using the tagging feature.
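As a rough illustration of filtering by tags in downstream code, the sketch below selects the items that carry a given set of tags. It is not Encord Active's API: it assumes you have ended up with a plain mapping from item identifiers to tag names, and the `filter_by_tags` helper, the frame names, and the tag names are all hypothetical.

```python
from typing import Dict, List, Set


def filter_by_tags(tagged: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Return the identifiers whose tag set contains every required tag."""
    return sorted(item for item, tags in tagged.items() if required <= tags)


# Made-up mapping from item identifier to the tags applied to it.
tagged_frames = {
    "frame_001": {"night", "relabel"},
    "frame_002": {"night"},
    "frame_003": {"relabel", "test-set"},
}
to_relabel = filter_by_tags(tagged_frames, {"relabel"})
```

Requiring a set of tags rather than a single tag makes it easy to combine criteria, for example `{"night", "relabel"}` to pick only night-time frames that need relabeling.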
See this blog post on finding and fixing label errors using Encord Active.
Encord Active supports the active learning process by allowing you to:
- Explore your data to select what to label next
- Employ acquisition functions to automatically select what to label next
- Find label errors that potentially harm your model performance
- Send data to Encord's Annotation module for labeling
- Automatically decompose your model performance to help you determine where to put your focus for the next model iteration
- Tag subsets of data to set aside test sets for specific edge cases for which you want to maintain your model performance between each production model
For detailed information on active learning and the role of Encord Active, you can refer to our documentation on Active Learning within Encord Active.
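To show what an acquisition function does, here is a minimal sketch of entropy-based uncertainty sampling, one common choice in active learning. Which acquisition functions Encord Active provides is not stated here, so treat the `entropy` and `select_for_labeling` functions and the example probabilities as assumptions for illustration rather than the library's implementation.

```python
import math
from typing import Dict, List, Sequence


def entropy(probabilities: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_labeling(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Pick the `budget` items whose predictions are most uncertain."""
    ranked = sorted(
        predictions, key=lambda name: entropy(predictions[name]), reverse=True
    )
    return ranked[:budget]


# Made-up class probabilities from a hypothetical three-class model.
unlabeled = {
    "img_a.jpg": [0.98, 0.01, 0.01],
    "img_b.jpg": [0.40, 0.35, 0.25],
    "img_c.jpg": [0.70, 0.20, 0.10],
}
next_batch = select_for_labeling(unlabeled, budget=2)
```

The model is nearly certain about `img_a.jpg`, so it is left out of the batch; the labeling budget goes to the two images where the prediction is least decided.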
Additionally, if you run into an issue, we would greatly appreciate it if you reported it on GitHub. Your feedback and bug reports help us improve Encord Active for everyone.