Importing model predictions
Import model predictions into Encord Active.
If your predictions are stored in the COCO results format, you can jump straight to the Importing COCO Predictions section.
To import your model predictions into Encord Active, there are a couple of steps you need to follow:
If you are already familiar with `data_hash`es and `featureNodeHash`es from the Encord platform, you can safely skip to 2. Prepare a `.pkl` File to be Imported.
Just note that when specifying the `class_id` for a prediction, Encord Active expects the associated `featureNodeHash` from the Encord ontology as the id.
In the SDK section, you will also find ways to import predictions from KITTI files and from directories containing mask files.
1. Covering the Basics
Before we dive into the details, we have a couple of things that we need to cover.
All commands going forward assume you are in the project directory. If not, either `cd` into it or use the `--target` option with each command.
Uniquely Identifying Data Units
At Encord, every data unit has a `data_hash` which uniquely defines it.
To see the mapping between the `data_hash`es in your Encord project and the filenames that you uploaded, you can use:
encord-active print data-mapping
After selecting the project you want to print the mapping for via the command line, it will display a JSON object similar to the following, with key-value pairs of (`data_hash`, data file name):
{
"c115344f-6869-4608-a4b8-644241fea10c": "image_1.jpg",
"5973f5b6-d284-4a71-9e7e-6576aa3e56cb": "image_2.jpg",
"9f4dae86-cad4-42f8-bb47-b179eb2e4886": "video_1.mp4"
...
}
To store the data mapping as `data_mapping.json` in the current working directory, you can run:
encord-active print --json data-mapping
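If you prefer to work with the mapping programmatically, here is a minimal sketch (assuming you saved the mapping as `data_mapping.json` as described above) that inverts it so you can look up a `data_hash` from a filename:
import json

# Load the mapping written by `encord-active print --json data-mapping`.
with open("data_mapping.json") as f:
    data_mapping = json.load(f)  # {data_hash: file name}

# Invert it to look up a data_hash by file name when preparing predictions.
filename_to_hash = {name: data_hash for data_hash, name in data_mapping.items()}
print(filename_to_hash["image_1.jpg"])  # -> "c115344f-6869-4608-a4b8-644241fea10c"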
Note that for `image_groups`, each image in the group has its own `data_hash`, while videos have one `data_hash` for the entire video.
As a consequence, predictions for videos will also need a `frame` to uniquely define where the prediction belongs.
When you are preparing predictions for import, you need to have the `data_hash` and potentially the `frame` available.
Uniquely Identifying Predicted Classes
The second thing you will need when preparing predictions for import is the `class_id` for each prediction.
The class id tells Encord Active which class the prediction is associated with.
The `class_id`s of an Encord project are defined by the `featureNodeHash` attribute on objects in the Encord ontology.
You can easily print the class names and `class_id`s from your project ontology:
encord-active print ontology
After selecting the project you want to print the ontology for via the command line, it will display a JSON object similar to the following, with key-value pairs of (object name, `class_id`):
{
"objects": {
"cat": "OTK8MrM3",
"dog": "Nr52O8Ex",
"horse": "MjkXn2Mx"
},
"classifications": {}
}
Or, if the ontology is made up of classifications:
{
"objects": {},
"classifications": {
"horses": {
"feature_hash": "55eab8b3",
"attribute_hash": "d446851e",
"option_hash": "376b9761"
},
"cats": {
"feature_hash": "55eab8b3",
"attribute_hash": "d446851e",
"option_hash": "d8e85460"
},
"dogs": {
"feature_hash": "55eab8b3",
"attribute_hash": "d446851e",
"option_hash": "e5264a59"
}
}
}
To store the ontology as `ontology_output.json` in the current working directory, you can run:
encord-active print --json ontology
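Similarly, you can read the saved ontology back in Python to build a lookup from class name to `class_id`. A minimal sketch, assuming an object ontology stored as `ontology_output.json`:
import json

# Load the ontology written by `encord-active print --json ontology`.
with open("ontology_output.json") as f:
    ontology = json.load(f)

# For object ontologies, the class_id is the featureNodeHash of each object.
name_to_class_id = ontology["objects"]
print(name_to_class_id["cat"])  # -> "OTK8MrM3"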
2. Prepare a `.pkl` File to be Imported
Now, you can prepare a pickle file (`.pkl`) to be imported by Encord Active.
You do this by building a list of `Prediction` objects.
A prediction object holds a unique identifier of the data unit (the `data_hash` and potentially a `frame`), the `class_id`, a model `confidence` score, the actual prediction `data`, and the `format` of that data.
Creating a Prediction Label
Below, you can find examples of how to create a label for each of the four supported types.
- Classification
- Bounding Box
- Segmentation Mask
- Polygon
Since we can have multiple classifications for the same data unit (e.g., `has a dog?` and `has a cat?`), we need to uniquely identify them by providing the 3 hashes from the ontology.
from encord_active.lib.db.predictions import FrameClassification, Prediction  # import path assumed to match the other examples

prediction = Prediction(
data_hash="<your_data_hash>",
frame = 3, # optional frame for videos
confidence = 0.8,
classification=FrameClassification(
feature_hash="<your_feature_hash>",
attribute_hash="<your_attribute_hash>",
option_hash="<your_option_hash>",
),
)
To find the three hashes, we can inspect the ontology by running:
encord-active print ontology
You should specify your `BoundingBox` with relative coordinates and dimensions.
That is:
- `x`: x-coordinate of the top-left corner of the box divided by the image width
- `y`: y-coordinate of the top-left corner of the box divided by the image height
- `w`: box pixel width / image width
- `h`: box pixel height / image height
from encord_active.lib.db.predictions import BoundingBox, Prediction, Format, ObjectDetection
prediction = Prediction(
data_hash="<your_data_hash>",
frame = 3, # optional frame for videos
confidence = 0.8,
object=ObjectDetection(
feature_hash="<the_class_id>",
format = Format.BOUNDING_BOX,
# Your bounding box coordinates in relative terms (% of image width/height).
data = BoundingBox(x=0.2, y=0.3, w=0.1, h=0.4),
),
)
If you don't have your bounding box represented in relative terms, you can convert it from pixel values like this:
img_h, img_w = 720, 1280 # the image size in pixels
BoundingBox(x=10/img_w, y=25/img_h, w=200/img_w, h=150/img_h)
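If your boxes are stored as absolute corner coordinates instead (a common (x1, y1, x2, y2) detector output, not something Encord Active itself prescribes), the conversion is similar; a sketch:
img_h, img_w = 720, 1280  # the image size in pixels
x1, y1, x2, y2 = 100, 200, 300, 350  # hypothetical absolute pixel corners
BoundingBox(x=x1/img_w, y=y1/img_h, w=(x2 - x1)/img_w, h=(y2 - y1)/img_h)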
You should specify masks as binary `numpy` arrays of size [height, width] and with dtype `np.uint8`.
from encord_active.lib.db.predictions import Prediction, Format, ObjectDetection
prediction = Prediction(
data_hash = "<your_data_hash>",
frame = 3, # optional frame for videos
confidence = 0.8,
object=ObjectDetection(
feature_hash="<the_class_id>",
format = Format.MASK,
# _binary_ np.ndarray of shape [h, w] and dtype np.uint8
data = mask
),
)
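If your model outputs per-pixel probabilities rather than a binary mask, you can threshold and cast it before passing it as `data`. A minimal sketch, assuming a hypothetical probability map `prob_map` of shape [h, w]:
import numpy as np

# prob_map: hypothetical model output of shape [h, w] with values in [0, 1]
mask = (prob_map > 0.5).astype(np.uint8)  # binary mask with dtype np.uint8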
You should specify your `Polygon` with relative coordinates as a numpy array of shape [num_points, 2].
That is, an array of relative (`x`, `y`) coordinates:
- `x`: relative x-coordinate of each point of the polygon (pixel coordinate / image width)
- `y`: relative y-coordinate of each point of the polygon (pixel coordinate / image height)
from encord_active.lib.db.predictions import Prediction, Format, ObjectDetection
import numpy as np
polygon = np.array([
# x y
[0.2, 0.1],
[0.2, 0.4],
[0.3, 0.4],
[0.3, 0.1],
])
prediction = Prediction(
data_hash = "<your_data_hash>",
frame = 3, # optional frame for videos
confidence = 0.8,
object=ObjectDetection(
feature_hash="<the_class_id>",
format = Format.POLYGON,
# np.ndarray of shape [n, 2] and dtype float in range [0,1]
data = polygon
),
)
If you have your polygon represented in absolute terms of pixel locations, you can convert it to relative terms like this:
img_h, img_w = 720, 1280 # the image size in pixels
polygon = polygon / np.array([[img_w, img_h]])
Notice the double braces `[[img_w, img_h]]` to get an array of shape [1, 2].
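Similarly, if your polygons come from OpenCV contours (arrays of shape [n, 1, 2] in absolute pixel (x, y) coordinates), you can reshape them before normalising. This is only an illustration of that conversion, assuming `contour` comes from cv2.findContours:
import numpy as np

# contour: hypothetical OpenCV contour of shape [n, 1, 2] in pixel (x, y) coordinates
polygon = contour.reshape(-1, 2).astype(float) / np.array([[img_w, img_h]])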
Preparing the Pickle File
Now you're ready to prepare the file.
You can copy the appropriate snippet based on your prediction format from above and paste it in the code below.
Note the call to `open("/path/to/predictions.pkl", "wb")`, which defines where the `.pkl` file will be stored.
import pickle

from encord_active.lib.db.predictions import Prediction, Format
predictions_to_store = []
for prediction in my_predictions: # Iterate over your predictions
predictions_to_store.append(
# PASTE appropriate prediction snippet from above
)
with open("/path/to/predictions.pkl", "wb") as f:
pickle.dump(predictions_to_store, f)
In the above code snippet, you will have to fetch the `data_hash`, `class_id`, etc., from your own predictions inside the for loop.
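To make this concrete, here is an end-to-end sketch for bounding-box predictions. It assumes a hypothetical `my_predictions` list of dictionaries holding a data hash, a class id, a confidence score, and a relative-coordinate box; adapt the field names to however your own results are stored:
import pickle

from encord_active.lib.db.predictions import BoundingBox, Format, ObjectDetection, Prediction

# Hypothetical model output; replace with your own predictions.
my_predictions = [
    {
        "data_hash": "c115344f-6869-4608-a4b8-644241fea10c",
        "class_id": "OTK8MrM3",
        "confidence": 0.87,
        "box": {"x": 0.2, "y": 0.3, "w": 0.1, "h": 0.4},  # relative coordinates
    },
]

predictions_to_store = []
for pred in my_predictions:
    predictions_to_store.append(
        Prediction(
            data_hash=pred["data_hash"],
            confidence=pred["confidence"],
            object=ObjectDetection(
                feature_hash=pred["class_id"],
                format=Format.BOUNDING_BOX,
                data=BoundingBox(**pred["box"]),
            ),
        )
    )

with open("/path/to/predictions.pkl", "wb") as f:
    pickle.dump(predictions_to_store, f)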
3. Import Your Predictions via the CLI
To import the predictions into Encord Active, you run the following command from the command line:
encord-active import predictions /path/to/predictions.pkl
This will import your predictions into Encord Active and run all the metrics on your predictions.
Importing COCO Predictions
Make sure you have installed Encord Active with the `coco` extras.
This command assumes that you have imported your project using the COCO importer and that you are inside the project's directory.
Importing COCO predictions is currently the easiest way to import predictions to Encord Active.
You need to have a results JSON file following the COCO results format and run the following command on it:
encord-active import predictions --coco results.json
After the execution is done, you should be ready to evaluate your model performance.