Import and Verify Custom Metadata

Custom metadata, also known as client metadata, is supplementary information you can add to all data imported/registered with Encord. It is provided in the form of a Python dictionary, as shown in the examples. This metadata serves several key functions:

Filtering and sorting in Index and Active.
Creating custom Label Editor layouts based on metadata.

Active and Index support filtering, creating Collections, and by extension, creating Datasets and Projects based on the custom metadata on your data.

Prerequisites

Before you can filter your data or create a Collection based on your data’s custom metadata, the custom metadata must exist in your Annotate Project.

This content applies to custom metadata (clientMetadata), which is the metadata associated with individual data units. This is distinct from videoMetadata that is used to specify video parameters when using Strict client-only access. It is also distinct from patient metadata in DICOM files.

Custom metadata (clientMetadata) is accessed by specifying the dataset using the <dataset_hash>. All Projects that have the specified Dataset attached contain custom metadata.

READ THIS FIRST

While not required, we strongly recommend importing a metadata schema before importing custom metadata into Encord. The process we recommend:

Import a metadata schema. If a metadata schema already exists, you can import metadata. You can run a small piece of code to verify that a metadata schema exists.
Import your custom metadata.

Performing multiple schema imports overwrites the current schema with the new schema.
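
Before importing, it can help to check locally that each metadata value has the type you intend to declare in your schema. The following is only a sketch of an offline pre-check in plain Python. `EXPECTED_TYPES`, its keys, and `check_metadata` are hypothetical names, not part of the Encord SDK. The type names are the ones used by the examples on this page (boolean, varchar, datetime, uuid, number), and ISO 8601 strings for datetime values are an assumption.

```python
from datetime import datetime
from uuid import UUID

# Hypothetical local description of the schema you plan to import
EXPECTED_TYPES = {
    "night_shot": "boolean",
    "camera": "varchar",
    "captured_at": "datetime",
    "session_id": "uuid",
    "speed_kmh": "number",
}


def _matches(value, type_name):
    """Return True if value is acceptable for the given type name."""
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "varchar":
        return isinstance(value, str)
    if type_name == "datetime":
        try:
            datetime.fromisoformat(value)
            return True
        except (TypeError, ValueError):
            return False
    if type_name == "uuid":
        try:
            UUID(value)
            return True
        except (TypeError, ValueError, AttributeError):
            return False
    return False


def check_metadata(metadata, expected_types=EXPECTED_TYPES):
    """Return a list of problems; an empty list means nothing was found."""
    problems = []
    for key, value in metadata.items():
        if key not in expected_types:
            problems.append(f"{key}: not declared in the local schema")
        elif not _matches(value, expected_types[key]):
            problems.append(f"{key}: expected {expected_types[key]}")
    return problems
```

Running `check_metadata` on every `clientMetadata` dictionary before an import gives a list of keys to fix while the data is still on your side.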

Import Custom Metadata

Importing custom metadata (clientMetadata) follows a similar format for every data type, but there are important differences for each data type.

Videos

BEST PRACTICE: If you want to use Index or Active with your video data, we STRONGLY RECOMMEND using key frames, custom metadata, and custom embeddings. When specifying key frames set the sampling_rate to 0. This imports only the first frame and any key frames you specify in the video. This can significantly speed up the import of your data into Active and Index and help you to focus on only data you identify as critical.

The following table provides some guidance for the examples provided after the table.

Title	Description
Template	Provides the proper JSON format to import videos into Encord. This template provides examples from the most basic to the most complex.
Data	Imports videos into Encord. Why would I do this? You ONLY want to add labels and classifications to your data. You DO NOT want to use Index or Active.
Key Frames	Imports videos with an Encord title and specifies key frames (frames of interest) for Active and Index. Why would I do this? You ONLY want to see frames that you deem critical in Active and Index. You want to significantly improve the time to import videos into Active and Index. `config` is optional when specifying key frames for Active and Index: Specifying a `sampling_rate` of `0` only imports the first frame and all key frames of your video into Active and Index. `"config": { "sampling_rate": "<samples-per-second>", "keyframe_mode": "frame" or "seconds", },` If `config` is not specified, the `sampling_rate` is 1 frame per second, and the `keyframe_mode` is `frame`.
Custom Metadata	Imports videos with an Encord title, specifies key frames (frames of interest), and custom metadata for Active and Index. Custom metadata can be applied to the entire video or individual frames in the video. Why would I do this? Importing custom metadata allows you to filter your data in Active and Index to make it easier to find the data you want to focus on. This speeds up creating Collections and by extension Datasets. Specifying key frames means you ONLY want to see frames that you deem critical in Active and Index AND you want to significantly improve the time to import videos into Active and Index. `config` is optional when specifying key frames for Active and Index: Specifying a `sampling_rate` of `0` only imports the first frame and all key frames of your video into Active and Index. `"config": { "sampling_rate": "<samples-per-second>", "keyframe_mode": "frame" or "seconds", },` If `config` is not specified, the `sampling_rate` is 1 frame per second, and the `keyframe_mode` is `frame`.
Embeddings	Imports videos with an Encord title, specifies key frames (frames of interest), custom metadata, and custom embeddings for Active and Index. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number. Why would I do this? Importing custom embeddings allows you to use scatter plots to examine your data AND allows you to use similarity search and natural language searches. Index supports embedding dimensions 1 to 4096, while Active supports embedding dimensions 1 to 2000. Importing custom metadata allows you to filter your data in Active and Index to make it easier to find the data you want to focus on. This speeds up creating Collections and by extension Datasets. Specifying key frames means you ONLY want to see frames that you deem critical in Active and Index AND you want to significantly improve the time to import videos into Active and Index. `config` is optional when specifying custom embeddings for Active and Index: Specifying a `sampling_rate` of `0` only imports the first frame and all key frames of your video into Active and Index. `"config": { "sampling_rate": "<samples-per-second>", "keyframe_mode": "frame" or "seconds", },` If `config` is not specified, the `sampling_rate` is 1 frame per second, and the `keyframe_mode` is `frame`. Refer to our documentation for more information about Index with Custom Metadata, Index with Custom Embeddings, Active with Custom Metadata and Active with Custom Embeddings.
Video Metadata	Imports videos with the videoMetadata flag. When the videoMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.

{
  "videos": [
    {
      "objectUrl": "cloud-path-to-your-video-1",
      "title": "title-for-your-video-1",
      "clientMetadata": {"metadata-1": "value", "metadata-2": "value"}

    },
    {
      "objectUrl": "cloud-path-to-your-video-2",
      "title": "title-for-your-video-2",
      "clientMetadata": {
          "metadata-1": "value", "metadata-2": "value",
          "$encord": {
              "frames": ["<frame-number-1>","<frame-number-2>","<frame-number-3>"]
            }
        }
    },
    {
        "objectUrl": "cloud-path-to-your-video-3",
        "title": "title-for-your-video-3",
        "clientMetadata": {
            "metadata-1": "value", "metadata-2": "value",
            "$encord": {
                "frames": {
                    "<frame-number-or-seconds>": {
                      "metadata-1": "value", "metadata-2": "value",
                      "<my-embedding>": [1.0, 2.0, 3.0]
                    },
                    "<frame-number-or-seconds>": {
                      "metadata-1": "value", "metadata-2": "value",
                      "<my-embedding>": [1.0, 2.0, 3.0]
                    }
                }
             }
          }
    },
    {
      "objectUrl": "cloud-path-to-your-video-4",
        "title": "title-for-your-video-4",
        "clientMetadata": {
            "metadata-1": "value", "metadata-2": "value",
            "$encord": {
                "config": {
                    "sampling_rate": "<samples-per-second>",
                    "keyframe_mode": "<frame-or-seconds>"
                },
                "frames": {
                    "<frame-number-or-seconds>": {
                      "metadata-1": "value", "metadata-2": "value",
                      "<my-embedding>": [1.0, 2.0, 3.0]
                    },
                    "<frame-number-or-seconds>": {
                      "metadata-1": "value", "metadata-2": "value",
                      "<my-embedding>": [1.0, 2.0, 3.0]
                    }
                }
            }
        }
    }
  ],
  "skip_duplicate_urls": true
}
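
When there are many videos, the JSON file is easier to generate than to write by hand. The sketch below builds one `videos` entry with the same keys as the template above (`objectUrl`, `title`, `clientMetadata`, `$encord`, `config`, `frames`). The helper name `build_video_entry` and its parameters are hypothetical, and passing `sampling_rate` as a number is an assumption.

```python
import json


def build_video_entry(object_url, title, metadata, key_frames=None,
                      sampling_rate=None, keyframe_mode=None):
    """Build one element of the "videos" array.

    key_frames maps a frame number (or a time in seconds) to the
    per-frame metadata for that key frame.
    """
    client_metadata = dict(metadata)
    encord_block = {}

    # "config" is optional; only include the fields that were given
    config = {}
    if sampling_rate is not None:
        config["sampling_rate"] = sampling_rate
    if keyframe_mode is not None:
        if keyframe_mode not in ("frame", "seconds"):
            raise ValueError("keyframe_mode must be 'frame' or 'seconds'")
        config["keyframe_mode"] = keyframe_mode
    if config:
        encord_block["config"] = config

    if key_frames:
        # JSON object keys must be strings
        encord_block["frames"] = {str(k): v for k, v in key_frames.items()}

    if encord_block:
        client_metadata["$encord"] = encord_block

    return {"objectUrl": object_url, "title": title, "clientMetadata": client_metadata}


spec = {
    "videos": [
        build_video_entry(
            "cloud-path-to-your-video-1",
            "title-for-your-video-1",
            {"metadata-1": "value"},
            key_frames={10: {"metadata-1": "value"}, 25: {"metadata-1": "value"}},
            sampling_rate=0,
            keyframe_mode="frame",
        )
    ],
    "skip_duplicate_urls": True,
}

print(json.dumps(spec, indent=2))
```

Generating the file through `json.dumps` also guarantees that the result is syntactically valid JSON, with no trailing commas or unquoted values.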

Audio files

The following is an example JSON file for uploading two audio files to Encord.

Template: Imports audio files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
Audio Metadata: Imports one audio file with the audioMetadata flag. When the audioMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.

{
  "audio": [
    {
      "objectUrl": "<object url_1>"
    },
    {
      "objectUrl": "<object url_2>",
      "title": "my-custom-audio-file-title.mp3",
      "clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
    }
  ],
  "skip_duplicate_urls": true
}

PDFs

The following is an example JSON file for uploading PDFs to Encord.

Template: Imports PDFs with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
Data: Imports two PDFs with no title or custom metadata.
Custom Metadata: Imports two PDFs with a title and custom metadata.

{
  "pdfs": [
    {
      "objectUrl": "<object url_1>"
    },
    {
      "objectUrl": "<object url_2>",
      "title": "my-file.pdf",
      "clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
    }
  ],
  "skip_duplicate_urls": true
}

Text Files

The following is an example JSON file for uploading text files to Encord.

Template: Imports text files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
Data: Imports two text files with no title or custom metadata.
Custom Metadata: Imports two text files with a title and custom metadata.

{
  "text": [
    {
      "objectUrl": "<object url_1>"
    },
    {
      "objectUrl": "<object url_2>",
      "title": "my-file.html",
      "clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
    }
  ],
  "skip_duplicate_urls": true
}

Single images

For detailed information about the JSON file format used for import go here.

The JSON structure for single images parallels that of videos.

Template: Provides the proper JSON format to import images into Encord.

Examples:

Data: Imports the images only.
Custom Metadata: Imports images with an Encord title for the images and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
Embeddings: Imports images with an Encord title, custom metadata, and custom embeddings for each image. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
Image Metadata: Imports images with image metadata. This improves the import speed for your images.

{
  "images": [
    {
      "objectUrl": "file/path/to/images/file-name-01.file-extension"
    },
    {
      "objectUrl": "file/path/to/images/file-name-02.file-extension"
    },
    {
      "objectUrl": "file/path/to/images/file-name-03.file-extension",
      "title": "image-title.file-extension",
      "clientMetadata": {
        "metadata-1": "value", 
        "metadata-2": "value"
        }
    },
    {
      "objectUrl": "file/path/to/images/file-name-04.file-extension",
      "title": "image-title.file-extension",
      "clientMetadata": {
        "metadata-1": "value", 
        "metadata-2": "value", 
        "<my-embedding>": [1.0, 2.0, 3.0]
      }
     }
  ],
  "skip_duplicate_urls": true
}
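
Embedding vectors can be checked before the file is uploaded. The sketch below applies the dimension limits stated earlier on this page (1 to 4096 for Index, 1 to 2000 for Active) to the `images` array of an import file. The function name `embedding_problems` is hypothetical, and the embedding key is whatever name you chose in place of `<my-embedding>`.

```python
INDEX_MAX_DIM = 4096   # limit stated earlier on this page for Index
ACTIVE_MAX_DIM = 2000  # limit stated earlier on this page for Active


def embedding_problems(images, embedding_key, max_dim=ACTIVE_MAX_DIM):
    """Return (objectUrl, reason) pairs for entries whose embedding looks wrong."""
    problems = []
    for entry in images:
        vector = entry.get("clientMetadata", {}).get(embedding_key)
        if vector is None:
            continue  # entries without an embedding are allowed
        if not 1 <= len(vector) <= max_dim:
            problems.append(
                (entry["objectUrl"], f"dimension {len(vector)} outside 1..{max_dim}")
            )
        elif not all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
        ):
            problems.append((entry["objectUrl"], "non-numeric value in embedding"))
    return problems
```

The default uses the smaller Active limit, so a file that passes should be within range for both products. Pass `max_dim=INDEX_MAX_DIM` if the data is only used in Index.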

Image groups

For detailed information about the JSON file format used for import go here.

Image groups are collections of images that are processed as one annotation task.
Images within image groups remain unaltered, meaning that images of different sizes and resolutions can form an image group without the loss of data.
Image groups do NOT require ‘write’ permissions to your cloud storage.
Custom metadata is defined per image group, not per image. See our documentation here to learn how to add clientMetadata to images in an image group.
If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.

The position of each image within the sequence needs to be specified in the key (objectUrl_{position_number}).

Template: Provides the proper JSON format to import image groups into Encord.

Examples:

Data: Imports the image groups only.
Custom Metadata: Imports image groups with an Encord title for the image groups and with custom metadata for each image group. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.

{
  "image_groups": [
    {
      "title": "<title 1>",
      "createVideo": false,
      "objectUrl_0": "file/path/to/images/file-name-01.file-extension",
      "objectUrl_1": "file/path/to/images/file-name-02.file-extension",
      "objectUrl_2": "file/path/to/images/file-name-03.file-extension"
    },
    {
      "title": "<title 2>",
      "createVideo": false,
      "objectUrl_0": "file/path/to/images/file-name-01.file-extension",
      "objectUrl_1": "file/path/to/images/file-name-02.file-extension",
      "objectUrl_2": "file/path/to/images/file-name-03.file-extension",
      "clientMetadata": {"optional": "metadata"}
    }
  ],
  "skip_duplicate_urls": true
}
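
The `objectUrl_{position_number}` keys can be generated from an ordered list of URLs instead of being numbered by hand. The following is a minimal sketch. `build_image_group` is a hypothetical helper, not an SDK function, and it only produces the dictionary that goes into the `image_groups` array.

```python
def build_image_group(title, image_urls, create_video=False, client_metadata=None):
    """Build one element of the "image_groups" array from an ordered list of URLs."""
    if not image_urls:
        raise ValueError("an image group needs at least one image URL")
    entry = {"title": title, "createVideo": create_video}
    # The list order defines the position encoded in each key
    for position, url in enumerate(image_urls):
        entry[f"objectUrl_{position}"] = url
    if client_metadata is not None:
        entry["clientMetadata"] = client_metadata
    return entry


group = build_image_group(
    "<title 1>",
    [
        "file/path/to/images/file-name-01.file-extension",
        "file/path/to/images/file-name-02.file-extension",
    ],
    client_metadata={"optional": "metadata"},
)
print(group)
```

Because positions come from `enumerate`, they always start at 0 and have no gaps, whatever the length of the list.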

Image sequences

For detailed information about the JSON file format used for import go here.

Image sequences are collections of images that are processed as one annotation task and represented as a video.
Images within image sequences may be altered as images of varying sizes and resolutions are made to match that of the first image in the sequence.
Creating Image sequences from cloud storage requires ‘write’ permissions, as new files have to be created in order to be read as a video.
Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
Custom client metadata is defined per image sequence, not per image.
If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.

The only difference between adding image groups and image sequences using a JSON file is that image sequences require the createVideo flag to be set to true. Both use the key image_groups.

The position of each image within the sequence needs to be specified in the key (objectUrl_{position_number}).

Encord supports up to 32,767 entries (21:50 minutes) for a single image sequence. We recommend up to 10,000 to 15,000 entries for a single image sequence for best performance. If you need a longer sequence, we recommend using video instead of an image sequence.

Template: Provides the proper JSON format to import image sequences into Encord.

Examples:

Data: Imports the image sequences only.
Custom Metadata: Imports image sequences and custom metadata. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.

{
  "image_groups": [
    {
      "title": "<title 1>",
      "createVideo": true,
      "objectUrl_0": "<object url>"
    },
    {
      "title": "<title 2>",
      "createVideo": true,
      "objectUrl_0": "<object url>",
      "objectUrl_1": "<object url>",
      "objectUrl_2": "<object url>",
      "clientMetadata": {"optional": "metadata"}
    }
  ],
  "skip_duplicate_urls": true
}
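
If a set of frames exceeds the entry limits described above, and splitting it into several image sequences is acceptable for your workflow (the text above recommends video for longer sequences), the split can be automated. This is a sketch under that assumption. `split_into_sequences` and the `-part-N` title suffix are hypothetical choices.

```python
MAX_ENTRIES = 32_767          # maximum entries per image sequence stated above
RECOMMENDED_ENTRIES = 10_000  # lower end of the recommended 10,000 to 15,000 range


def split_into_sequences(title, image_urls, chunk_size=RECOMMENDED_ENTRIES):
    """Split an ordered list of URLs into several image-sequence entries."""
    if not 1 <= chunk_size <= MAX_ENTRIES:
        raise ValueError(f"chunk_size must be between 1 and {MAX_ENTRIES}")
    entries = []
    for part, start in enumerate(range(0, len(image_urls), chunk_size), start=1):
        chunk = image_urls[start:start + chunk_size]
        entry = {"title": f"{title}-part-{part}", "createVideo": True}
        # Positions restart at 0 inside every sequence
        for position, url in enumerate(chunk):
            entry[f"objectUrl_{position}"] = url
        entries.append(entry)
    return entries
```

Each returned dictionary is one element of the `image_groups` array, and each one becomes a separate annotation task.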

DICOM

For detailed information about the JSON file format used for import go here.

Each dicom_series element can contain one or more DICOM series.
Each series requires a title and at least one object URL, as shown in the example below.
If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset will be skipped.

Custom metadata is distinct from patient metadata, which is included in the .dcm file and does not have to be specified during the upload to Encord.

The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to individual DICOM series.

The first series contains only a single object URL, as it is composed of a single file.
The second series contains 3 object URLs, as it is composed of three separate files.
The third series contains 2 object URLs, as it is composed of two separate files.

For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries are displayed in the Encord application.

Template

{
  "dicom_series": [
    {
      "title": "Series-1",
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series1-file.dcm"
    },
    {
      "title": "Series-2",
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file1.dcm",
      "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file2.dcm",
      "objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file3.dcm"
    },
    {
      "title": "Series-3",
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series3-file1.dcm",
      "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series3-file2.dcm"
    }
  ],
  "skip_duplicate_urls": true
}

NIfTI

The following is an example JSON file for uploading two NIfTI files to Encord.

Template

{
    "nifti": [
      {
        "title": "<file-1>",
        "objectUrl": "https://my-bucket/.../nifti-file1.nii"
      },
      {
        "title": "<file-2>",
        "objectUrl": "https://my-bucket/.../nifti-file2.nii.gz"
      }
    ],
    "skip_duplicate_urls": true
  }

Multiple file types

You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.

{
  "images": [
    {
      "objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/Image1.png"
    }
  ],
  "videos": [
    {
      "objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/videos/Cooking.mp4"
    },
    {
      "objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/videos/Oranges.mp4"
    }
  ],
  "image_groups": [
    {
      "title": "apple-samsung-light",
      "createVideo": true,
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(32).jpg",
      "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(33).jpg",
      "objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(34).jpg",
      "objectUrl_3": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(35).jpg"
    },
    {
      "title": "apple-samsung-dark",
      "createVideo": true,
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(32).jpg",
      "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(33).jpg",
      "objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(34).jpg",
      "objectUrl_3": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(35).jpg"
    },
    {
      "title": "apple-ios-light",
      "createVideo": false,
      "objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/3+(32).jpg",
      "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/3+(33).jpg"
    }
  ],
  "skip_duplicate_urls": true
}
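
When several file types are combined in one file, it is easy to declare the same top-level key twice, for example two `image_groups` arrays. Python's `json` module keeps only the last value for a repeated key, so entries can be lost without any error when the file is parsed that way. The sketch below rejects such files before upload. `load_rejecting_duplicate_keys` is a hypothetical helper name, and how the import service itself treats repeated keys is not covered here.

```python
import json


def load_rejecting_duplicate_keys(text):
    """Parse JSON text, raising ValueError if any object repeats a key."""

    def check_pairs(pairs):
        seen = {}
        for key, value in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key!r}")
            seen[key] = value
        return seen

    # object_pairs_hook receives every object's key/value pairs in order,
    # including repeated keys that a plain dict would silently collapse
    return json.loads(text, object_pairs_hook=check_pairs)
```

Image groups and image sequences share the `image_groups` key, so both kinds of entry belong in the same array and are told apart by their `createVideo` flag.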

Import to Files already in Index

We recommend importing custom metadata when you import your data, because doing both together can save significant time when importing at scale. However, you can also import custom metadata on data that already exists in Encord.

Importing with Custom Embeddings

You can import custom embeddings with custom metadata. When importing custom embeddings with custom metadata, keep the following in mind:

config is optional when importing your custom embeddings:

"config": {
    "sampling_rate": "<samples-per-second>",
    "keyframe_mode": "frame" or "seconds",
},

If config is not specified, the sampling_rate is 1 frame per second, and the keyframe_mode is frame.

Specifying a sampling_rate of 0 only imports the first frame and all keyframes of your video into Index.
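
The defaults described above can be written out as a small function that returns the configuration that would apply to a given `clientMetadata` dictionary. This is a sketch of the documented defaults only. `effective_config` is a hypothetical helper and does not call Encord.

```python
# Defaults described above: 1 sample per second, key frames given as frame numbers
DEFAULT_CONFIG = {"sampling_rate": 1, "keyframe_mode": "frame"}


def effective_config(client_metadata):
    """Return the config that applies, with documented defaults filled in."""
    user_config = client_metadata.get("$encord", {}).get("config", {})
    config = {**DEFAULT_CONFIG, **user_config}
    if config["keyframe_mode"] not in ("frame", "seconds"):
        raise ValueError("keyframe_mode must be 'frame' or 'seconds'")
    return config


print(effective_config({"metadata-1": "value"}))
print(effective_config({"$encord": {"config": {"sampling_rate": 0}}}))
```

The second call shows the key-frames-only case: `sampling_rate` is 0 and `keyframe_mode` falls back to `frame`.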

Examples

# Import dependencies
from encord import EncordUserClient
from encord.http.bundle import Bundle
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

# Authentication
SSH_PATH = "<file-path-to-ssh-private-key>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

updates = {

    # Imports custom metadata
    "<data-hash-1>": {"metadata-1": "value", "metadata-2": "value"},
    "<data-hash-2>": {"metadata-1": "value", "metadata-2": "value"},

    # Imports custom metadata and specifies key frames for Active and Index
    "<data-hash-3>": {
        "metadata-1": "value", "metadata-2": "value",
        "$encord": {
            # A list (not a set) so that the value can be serialized to JSON
            "frames": ["111", "113", "117", "119"]
        }
    },

    # Imports custom metadata and specifies key frames, custom metadata on frames, and the custom embeddings for those key frames
    "<data-hash-4>": {
        "metadata-1": "value", "metadata-2": "value",
        "$encord": {
            "frames": {
                "<frame-number-or-seconds-1>": {
                    "metadata-1": "value", "metadata-2": "value",
                    "<my-embedding>": [1.0, 2.0, 3.0]
                },
                "<frame-number-or-seconds-2>": {
                    "metadata-1": "value", "metadata-2": "value",
                    "<my-embedding>": [1.0, 2.0, 3.0]
                }
            }
        }
    },

    # Imports custom metadata and specifies a config, key frames, and the custom embeddings for those key frames
    "<data-hash-5>": {
        "metadata-1": "value", "metadata-2": "value",
        "$encord": {
            "config": {
                "sampling_rate": "<samples-per-second>",  # VIDEO ONLY (optional, default = 1 sample/second); replace with a number
                "keyframe_mode": "frame",  # VIDEO ONLY (optional, "frame" or "seconds", default = "frame")
            },
            "frames": {
                "<frame-number-or-seconds-1>": {
                    "metadata-1": "value", "metadata-2": "value",
                    "<my-embedding>": [1.0, 2.0, 3.0]
                },
                "<frame-number-or-seconds-2>": {
                    "metadata-1": "value", "metadata-2": "value",
                    "<my-embedding>": [1.0, 2.0, 3.0]
                }
            }
        }
    },
}

# Use the Bundle context manager
with Bundle() as bundle:
    # Update the storage items based on the dictionary
    for item_uuid, metadata_update in updates.items():
        item = user_client.get_storage_item(item_uuid=item_uuid)

        # Make a copy of the current metadata (or start empty if none is set) and update it with the new metadata
        curr_metadata = dict(item.client_metadata or {})
        curr_metadata.update(metadata_update)

        # Update the item with the new metadata and bundle
        item.update(client_metadata=curr_metadata, bundle=bundle)

Folders and Custom Metadata

List custom metadata (Folders)


from encord import EncordUserClient
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

# Authentication
SSH_PATH = "<file-path-to-ssh-private-key-file>"
FOLDER_HASH = "<unique-folder-id>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

folder = user_client.get_storage_folder(FOLDER_HASH)
items = folder.list_items()

for item in items:
    print(item.uuid, item.client_metadata)

Import Custom Metadata (Folders)

Before importing custom metadata to Encord, first import a metadata schema. We strongly recommend that you upload your custom metadata to Folders, instead of importing using Datasets. Importing custom metadata to data in folders allows you to filter your data in Index by custom metadata.

After importing or updating custom metadata, verify that it was applied correctly by listing the data units together with their custom metadata. Do not simply add a print command after importing or updating your custom metadata.
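
One way to make that verification systematic is to compare the metadata you intended to write with the metadata read back afterwards. The sketch below does the comparison on plain dictionaries. `find_mismatches` is a hypothetical helper, and both arguments map an item ID to a metadata dictionary.

```python
def find_mismatches(expected, actual):
    """Compare expected metadata per item against what was read back.

    Returns a dict of item ID -> {key: (expected_value, actual_value)}
    containing only the keys whose values differ.
    """
    mismatches = {}
    for item_id, expected_metadata in expected.items():
        # Treat a missing item or missing metadata as an empty dict
        actual_metadata = actual.get(item_id) or {}
        diff = {
            key: (value, actual_metadata.get(key))
            for key, value in expected_metadata.items()
            if actual_metadata.get(key) != value
        }
        if diff:
            mismatches[item_id] = diff
    return mismatches
```

In practice, `actual` would be built after the update from freshly listed items, for example `{str(item.uuid): item.client_metadata for item in folder.list_items()}`, using the calls shown in the listing example above. An empty result means every expected key was found with the expected value.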

Import custom metadata to specific data units in a Folder

This code allows you to import custom metadata on specific data units in Index. This code OVERWRITES all existing custom metadata on a data unit.

# Import dependencies
from encord import EncordUserClient
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy


# Authentication
SSH_PATH = "<file-path-to-ssh-private-key-file>"
FOLDER_HASH = "<unique-folder-id>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

folder = user_client.get_storage_folder(FOLDER_HASH)

# Define a dictionary with item UUIDs and their respective metadata updates
updates = {
    # "<data-unit-id-1>": {"metadata": "metadata-value"},
    # "<data-unit-id-2>": {"metadata": False},
    # "<data-unit-id-3>": {"metadata": "metadata-value"},
    # "<data-unit-id-4>": {"metadata": True}
}

# Update the storage items based on the dictionary
for item_uuid, metadata in updates.items():
    item = user_client.get_storage_item(item_uuid=item_uuid)
    item.update(client_metadata=metadata)

Import custom metadata to all data units in a Folder

This code allows you to update ALL custom metadata on ALL data units in a Folder in Index. This code OVERWRITES all existing custom metadata on a data unit.

# Import dependencies
from encord import EncordUserClient
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

# Authentication
SSH_PATH = "<file-path-to-ssh-private-key-file>"
FOLDER_HASH = "<unique-folder-id>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

folder = user_client.get_storage_folder(FOLDER_HASH)
items = folder.list_items()

for item in items:
    item.update(client_metadata={"metadata-1": "value", "metadata-2": "value"})

Update custom metadata

The Specific Data Units code enables you to update custom metadata for specific data units in Index. It does not overwrite all existing custom metadata on a data unit. Instead, it updates metadata that matches existing keys with new values and adds any new custom metadata keys to the data unit without affecting other existing metadata. The All data units in a Project code updates the custom metadata for all data units in the specified Project. Replace the client_metadata with the metadata you want to update.

# Import dependencies
from encord import EncordUserClient

# Authentication
SSH_PATH = "<private_key_path>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

# Define a dictionary with item UUIDs and their respective metadata updates
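updates = {
    # Each key must be a different data unit ID; repeated keys in a dict literal overwrite each other
    "<data-unit-id-1>": {"metadata": "metadata-value"},
    "<data-unit-id-2>": {"metadata": False},
    "<data-unit-id-3>": {"metadata": "metadata-value"},
    # "<data-unit-id-4>": {"metadata": True}
}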

# Update the storage items based on the dictionary
for item_uuid, metadata_update in updates.items():
    item = user_client.get_storage_item(item_uuid=item_uuid)

    # Make a copy of the current metadata (or start empty if none is set) and update it with the new metadata
    curr_metadata = dict(item.client_metadata or {})
    curr_metadata.update(metadata_update)

    # update the item with the new metadata
    item.update(client_metadata=curr_metadata)
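
The difference between overwriting and updating can be seen with plain dictionaries, without calling Encord at all. The keys and values below are made up for illustration.

```python
existing = {"camera": "front", "reviewed": False}
update = {"reviewed": True, "weather": "rain"}

# Overwrite: the update replaces everything, so "camera" is lost
overwritten = dict(update)

# Merge: copy the existing metadata, then apply the update on top
merged = existing.copy()
merged.update(update)

print(overwritten)  # {'reviewed': True, 'weather': 'rain'}
print(merged)       # {'camera': 'front', 'reviewed': True, 'weather': 'rain'}
```

Note that `dict.update` is shallow. If the update contains a nested dictionary such as `$encord`, that nested value replaces the existing one as a whole instead of being merged key by key.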

Bulk import custom metadata to all data units in a Folder

This code allows you to update custom metadata on all data units in a Folder in Index. This code OVERWRITES all existing custom metadata on a data unit. Using bundle allows you to update up to 1000 label rows at a time.

# Import dependencies
from encord import EncordUserClient
from encord.http.bundle import Bundle
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

# Authentication
SSH_PATH = "<ssh-private-key>"
FOLDER_HASH = "<unique-folder-id>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

folder = user_client.get_storage_folder(FOLDER_HASH)
items = folder.list_items()

# Use the Bundle context manager
with Bundle() as bundle:
    for item in items:
        # Update each item with client metadata
        item.update(client_metadata={"metadata-1": "value", "metadata-2": False}, bundle=bundle)

Bulk custom metadata import on specific data units

This code allows you to update custom metadata on specific data units in a Folder in Index. This code DOES NOT OVERWRITE all existing custom metadata on a data unit. It overwrites the values of keys that already exist and adds new custom metadata keys to the data unit. Using bundle allows you to update up to 1000 label rows at a time.

# Import dependencies
from encord import EncordUserClient
from encord.http.bundle import Bundle
from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

# Authentication
SSH_PATH = "<ssh-private-key>"
FOLDER_HASH = "<unique-folder-id>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

folder = user_client.get_storage_folder(FOLDER_HASH)
updates = {
    # "<data-unit-id-1>": {"metadata-1": "metadata-value"},
    # "<data-unit-id-2>": {"metadata-2": False},
    # "<data-unit-id-3>": {"metadata-1": "metadata-value"},
    # "<data-unit-id-4>": {"metadata-2": True}
}

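# Use the Bundle context manager
with Bundle() as bundle:
    for storage_item in folder.list_items():
        # Skip items that have no entry in the updates dictionary
        update = updates.get(str(storage_item.uuid))
        if update is None:
            continue

        # Merge the new values into the existing metadata so that other keys are kept
        merged_metadata = dict(storage_item.client_metadata or {})
        merged_metadata.update(update)
        storage_item.update(client_metadata=merged_metadata, bundle=bundle)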

Add custom metadata to images in an image group

The following script adds clientMetadata to all images / frames in a specified Image Group. Ensure that you:

Replace <private_key_path> with the file path to your private SSH key.
Replace <image-group-id> with the File ID (UUID) of the target Image Group.
Customize the _get_metadata_for_image function with the clientMetadata you want to add. To add unique metadata for each image, make the function dynamic by passing additional variables.

from uuid import UUID
from encord import EncordUserClient
from encord.http.bundle import Bundle

# Initialize the SDK client
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
    )

# Replace image-group-id with the File ID of the image group
image_group_uuid = "<image-group-id>"

# Function to define metadata for each image. Can be made dynamic by passing variables.
def _get_metadata_for_image(image):
    # Keys must be unique; a repeated key in a dict literal keeps only the last value
    return {
        "somekindofstring": "string",
        "somekindofnumber": "number"
    }

# Fetch the uploaded image group
uploaded_image_group = user_client.get_storage_item(image_group_uuid)

# Retrieve and update metadata for each image in the group
frame_items = uploaded_image_group.get_child_items()

with Bundle() as bundle:
    for frame_item in frame_items:
        # Update client metadata for each image
        frame_item.update(client_metadata=_get_metadata_for_image(frame_item), bundle=bundle)
        print(frame_item.client_metadata)

# Re-fetch and verify updates
updated_frame_items = uploaded_image_group.get_child_items()

for updated_frame_item in updated_frame_items:
    expected_metadata = _get_metadata_for_image(updated_frame_item)
    assert updated_frame_item.client_metadata == expected_metadata

print("Client metadata successfully added and verified for all images in the Image Group.")
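
To give each image its own metadata, the _get_metadata_for_image function can be generated from a lookup table. The following is a minimal sketch, assuming each image item exposes a name attribute. The ExampleImage class, the make_metadata_getter helper, and the sample file names are hypothetical stand-ins so that the example runs without the SDK.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ExampleImage:
    """Stand-in for an image item; only `name` is used in this sketch."""
    name: str


def make_metadata_getter(
    per_image: Dict[str, Dict[str, Any]],
    default: Dict[str, Any],
) -> Callable[[Any], Dict[str, Any]]:
    """Return a function that looks up metadata by image name."""

    def _get_metadata_for_image(image: Any) -> Dict[str, Any]:
        # Copy so that callers cannot mutate the lookup table by accident
        return dict(per_image.get(image.name, default))

    return _get_metadata_for_image


get_metadata = make_metadata_getter(
    per_image={"frame_001.jpg": {"camera": "front", "night": False}},
    default={"camera": "unknown"},
)

print(get_metadata(ExampleImage("frame_001.jpg")))
print(get_metadata(ExampleImage("frame_002.jpg")))
```

Because the returned function is deterministic for a given image, it can be reused for both the update step and the verification step.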

Datasets and custom metadata

Before importing custom metadata to Encord, first import a metadata schema. We strongly recommend that you import your custom metadata to Folders, instead of importing to Datasets. Importing custom metadata to data in Folders allows you to filter your data in Index by custom metadata.

List custom metadata (Datasets)

The following code lists the custom metadata of all data units in the specified Dataset. The code prints the custom metadata along with the data unit’s index within the dataset.


# Import dependencies
from encord import EncordUserClient
from encord.client import DatasetAccessSettings

# Authenticate with Encord using the path to your private key
client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify a dataset to read or write metadata to
dataset = client.get_dataset("<dataset_hash>")

# Fetch the dataset's metadata
dataset.set_access_settings(DatasetAccessSettings(fetch_client_metadata=True))

# Read the metadata of all data units in the dataset.
for data_unit, data_row in enumerate(dataset.data_rows):
    print(f"{data_row.client_metadata} - Data Unit: {data_unit}")
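
Once the metadata has been listed, it is often useful to find the data units that carry a particular value. The following is a minimal sketch that works on plain dictionaries collected from the loop above. The find_units_with helper and the sample values are hypothetical, and the sketch assumes that a data unit without metadata is represented by None.

```python
from typing import Any, Dict, List, Optional


def find_units_with(
    metadata_by_index: List[Optional[Dict[str, Any]]],
    key: str,
    value: Any,
) -> List[int]:
    """Return the indices of data units whose metadata has key == value."""
    return [
        index
        for index, metadata in enumerate(metadata_by_index)
        if metadata is not None and key in metadata and metadata[key] == value
    ]


# Hypothetical metadata, in the order the data units were listed
collected = [{"site": "A"}, None, {"site": "B"}, {"site": "A", "night": True}]
print(find_units_with(collected, "site", "A"))  # [0, 3]
```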

Import custom metadata (Datasets)

Before importing custom metadata to Encord, first import a metadata schema. We strongly recommend that you import your custom metadata to Folders, instead of importing to Datasets. Importing custom metadata to data in folders allows you to filter your data in Index by custom metadata.

Import custom metadata to a specific data unit in your Dataset

You can import custom metadata (clientMetadata) to specific data units in the Dataset.

Replace <private_key_path> with the path to your private key.
Replace <dataset_hash> with the hash of your Dataset.
Replace Image1.png and the other file names in the metadata variable with the names of the files in your Dataset to which you want to add metadata.

You can find the <data unit number> by reading all metadata in the Dataset.

Multiple data units

from encord import EncordUserClient
from encord.client import DatasetAccessSettings

# Instantiate Encord client by substituting the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>",
)


# Specify a Dataset to read or write metadata to
dataset = user_client.get_dataset("<dataset_hash>")

# Fetch the dataset's metadata
dataset.set_access_settings(DatasetAccessSettings(fetch_client_metadata=True))

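# Map each file name in the Dataset to the metadata you want to add
metadata = {
    "Image1.png": {"group-id": "A", "layout-group": "A"},
    "Image2.png": {"group-id": "B", "layout-group": "A"},
    "Image3.png": {"group-id": "C", "layout-group": "B"},
    "Image4.png": {"group-id": "D", "layout-group": "B"},
}

for data_row in dataset.data_rows:
    # Skip data units that have no entry in the metadata dictionary
    new_metadata = metadata.get(data_row["data_title"])
    if new_metadata is None:
        continue
    data_row.client_metadata = new_metadata
    data_row.save()
    print(data_row.client_metadata)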
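
Before running an import keyed by file name, it can help to check that the names in the metadata dictionary match the titles in the Dataset. The following is a minimal sketch using plain Python sets. The check_metadata_coverage helper is hypothetical, and the titles list stands in for the titles you would read from the Dataset's data rows.

```python
from typing import Any, Dict, Iterable, List, Tuple


def check_metadata_coverage(
    titles: Iterable[str],
    metadata: Dict[str, Dict[str, Any]],
) -> Tuple[List[str], List[str]]:
    """Return (titles without metadata, metadata keys without a matching title)."""
    title_set = set(titles)
    key_set = set(metadata)
    missing = sorted(title_set - key_set)
    unused = sorted(key_set - title_set)
    return missing, unused


# Hypothetical titles and metadata
titles = ["Image1.png", "Image2.png", "Image5.png"]
metadata = {
    "Image1.png": {"group-id": "A"},
    "Image2.png": {"group-id": "B"},
    "Image3.png": {"group-id": "C"},
}

missing, unused = check_metadata_coverage(titles, metadata)
print("No metadata for:", missing)   # ['Image5.png']
print("No data unit for:", unused)   # ['Image3.png']
```

A non-empty second list usually points to a typo in a file name.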

Import custom metadata (clientMetadata) to all data units in a Dataset

The following code adds the same custom metadata (clientMetadata) to each data unit in the specified Dataset. The code prints the custom metadata along with the data unit's index within the Dataset, so that you can verify that the custom metadata was set correctly.


# Import dependencies
from encord import EncordUserClient
from encord.client import DatasetAccessSettings

# Authenticate with Encord using the path to your private key
client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify a dataset to read or write metadata to
dataset = client.get_dataset("<dataset_hash>")

# Fetch the Dataset's metadata
dataset.set_access_settings(DatasetAccessSettings(fetch_client_metadata=True))

# Add metadata to all data units in the Dataset.
# Replace {"my": "metadata"} with the metadata you want to add
for data_unit, data_row in enumerate(dataset.data_rows):
    data_row.client_metadata = {"my": "metadata"}
    data_row.save()
    print(f"{data_row.client_metadata} - Data Unit: {data_unit}")

Deprecated - Reserved Keywords

Reserved keywords are strings that are set aside for exclusive use. The following keywords are reserved:

keyframes

KEYFRAMES

keyframes is reserved for use with frames of interest in videos. Specifying keyframes on specific frames ensures that those frames import into Index and Active. That means frames specified using keyframes are available to filter your frames and for calculating embeddings on your data.


client_metadata = {
    "keyframes": [<frame_number>, <frame_number>, <frame_number>, <frame_number>, <frame_number>]
}
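
The keyframes dictionary above can be built from a list of frame numbers. The following is a minimal sketch, assuming frame numbers are non-negative integers. The build_keyframes_metadata helper is hypothetical, and removing duplicates and sorting are tidiness choices made here, not documented requirements.

```python
from typing import Dict, Iterable, List


def build_keyframes_metadata(frames: Iterable[int]) -> Dict[str, List[int]]:
    """Validate frame numbers and wrap them in a `keyframes` metadata dict."""
    unique = set()
    for frame in frames:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(frame, bool) or not isinstance(frame, int) or frame < 0:
            raise ValueError(f"Invalid frame number: {frame!r}")
        unique.add(frame)
    return {"keyframes": sorted(unique)}


client_metadata = build_keyframes_metadata([120, 30, 30, 450])
print(client_metadata)  # {'keyframes': [30, 120, 450]}
```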

You can include keyframes while importing your videos or after you import your videos.

Import keyframes to specific data units (Folder)

This code imports keyframes on specific videos in Index. It does not replace all existing custom metadata on a data unit. It overwrites the values of keys that already exist and adds any new keys to the data unit.

# Import dependencies
from encord import EncordUserClient

# Authentication
SSH_PATH = "<private_key_path>"

# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
)

# Define a dictionary with item UUIDs and their keyframes updates
updates = {
    "<data-unit-id>": {"keyframes": [<frame_number>, <frame_number>, <frame_number>, <frame_number>, <frame_number>]},
    "<data-unit-id>": {"keyframes": [<frame_number>, <frame_number>, <frame_number>, <frame_number>, <frame_number>]},
    "<data-unit-id>": {"keyframes": [<frame_number>, <frame_number>, <frame_number>, <frame_number>, <frame_number>]},
    "<data-unit-id>": {"keyframes": [<frame_number>, <frame_number>, <frame_number>, <frame_number>, <frame_number>]}
}

# Update the storage items based on the dictionary
for item_uuid, metadata_update in updates.items():
    item = user_client.get_storage_item(item_uuid=item_uuid)

    # make a copy of the current metadata (empty if none is set) and update it with the new metadata
    curr_metadata = dict(item.client_metadata or {})
    curr_metadata.update(metadata_update)

    # update the item with the new metadata
    item.update(client_metadata=curr_metadata)
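
The copy-then-update pattern in the script above is a shallow dictionary merge: keys present in both dictionaries take the new value, and all other keys are kept. The following is a minimal sketch of that behavior on plain dictionaries. The merge_metadata helper and the sample values are hypothetical.

```python
from typing import Any, Dict, Optional


def merge_metadata(
    current: Optional[Dict[str, Any]],
    update: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a new dict: existing keys are kept, overlapping keys take the new value."""
    merged = dict(current or {})
    merged.update(update)
    return merged


existing = {"location": "depot", "keyframes": [1, 5]}
merged = merge_metadata(existing, {"keyframes": [10, 20], "weather": "rain"})
print(merged)
# {'location': 'depot', 'keyframes': [10, 20], 'weather': 'rain'}
```

Because the merge is shallow, a nested value such as the keyframes list is replaced as a whole rather than combined with the previous list.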

Custom Metadata in Index

Once your custom metadata is imported to a Folder, you can create Collections based on your custom metadata and then create Datasets and Projects based on the Collections. To create a Dataset from an Index Collection:

Log in to the Encord platform.
The landing page for the Encord platform appears.
Go to Index > Files.
The All folders page appears with a list of all folders in Encord.
Click into a folder.
The landing page for the folder appears and the View in Explorer button is enabled.
Click the View in Explorer button.
The Index Explorer page appears.
Search, sort, and filter your data until you have the subset of the data you need.
Select one or more of the images in the Explorer workspace.
A ribbon appears at the top of the Explorer workspace.
Click Select all to select all the images in the subset.
Click Add to a Collection.
Click New Collection.
Specify a meaningful title and description for the Collection.
The title specified here is applied as a tag/label to every selected image.
Click Collections.
The Collections page appears.

Select the checkbox for the Collection to create a Dataset.
Click Create Dataset.
The Create Dataset dialog appears.

Specify meaningful content for the following:

Dataset Title
Dataset Description

If your Collection includes images from a group or sequence, select Split image groups/sequences to extract the images and add each image separately to the Dataset.

Custom Metadata in Active

Once your custom metadata is included in your Annotate Project (Folder or Dataset), you can create Collections based on your custom metadata and then send those Collections to Annotate.

Import your Project that has custom metadata.
Click the Project once import completes.
The Project opens with the Explorer page displaying.
Filter the Project Data, Labels, or Predictions in the Explorer using a Custom Metadata filter.
Continue searching, sorting, and filtering your data/labels/predictions until you have the subset of the data you need.
Select one or more of the images in the Explorer workspace.
A ribbon appears at the top of the Explorer workspace.
Click Select all to select all the images.
Click Add to a Collection.
Click New Collection.
Specify a meaningful title and description for the Collection.
The title specified here is applied as a tag/label to every selected image.
Send the Collection to Annotate.
