Documentation Index
Fetch the complete documentation index at: https://docs.encord.com/llms.txt
Use this file to discover all available pages before exploring further.
DatasetUserRole Objects
class DatasetUserRole(IntEnum)
Legacy dataset user roles.
This enum represents the role a user has on a dataset (for example
admin or standard user). Prefer DatasetUserRoleV2 for
new integrations.
DatasetUserRoleV2 Objects
class DatasetUserRoleV2(CamelStrEnum)
String-based dataset user roles used by the current API.
This enum mirrors DatasetUserRole but uses string values
and is the preferred representation for new code.
dataset_user_role_str_enum_to_int_enum
def dataset_user_role_str_enum_to_int_enum(
str_enum: DatasetUserRoleV2) -> DatasetUserRole
Convert a string-based dataset user role to the legacy integer enum.
This helper maps DatasetUserRoleV2 values to the
corresponding DatasetUserRole values so that existing code
which still relies on the integer-based representation continues to
work with the newer API.
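The mapping can be sketched as follows; the member names and integer values here are illustrative stand-ins, not the SDK's actual definitions:

```python
from enum import Enum, IntEnum

class DatasetUserRole(IntEnum):
    # Legacy integer-based roles (values illustrative).
    ADMIN = 0
    USER = 1

class DatasetUserRoleV2(Enum):
    # String-based roles used by the current API (values illustrative).
    ADMIN = "admin"
    USER = "user"

def dataset_user_role_str_enum_to_int_enum(str_enum: DatasetUserRoleV2) -> DatasetUserRole:
    # Each V2 member has a same-named legacy member, so mapping by name suffices.
    return DatasetUserRole[str_enum.name]

legacy_role = dataset_user_role_str_enum_to_int_enum(DatasetUserRoleV2.ADMIN)
```

Existing integer-based code can then keep comparing against DatasetUserRole members while newer code works with the string enum.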
DatasetUser Objects
class DatasetUser(BaseDTO)
Dataset user membership.
Arguments:
user_email - Email address of the user who has access to the dataset.
user_role - Role of the user on the dataset.
dataset_hash - Identifier of the dataset the user has access to.
DataLinkDuplicatesBehavior Objects
class DataLinkDuplicatesBehavior(Enum)
Behavior when linking data that already exists in a dataset.
Values:
- DUPLICATE: Allow duplicates and create a new link for each request.
- FAIL: Fail the operation if a duplicate link would be created.
- SKIP: Skip data that is already linked and continue with the rest.
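A sketch of how such a policy might be applied when linking items; link_items and its set-based duplicate check are hypothetical illustrations, not the SDK's internal logic:

```python
from enum import Enum
from typing import List, Set

class DataLinkDuplicatesBehavior(Enum):
    DUPLICATE = "DUPLICATE"
    FAIL = "FAIL"
    SKIP = "SKIP"

def link_items(existing: Set[str], incoming: List[str],
               behavior: DataLinkDuplicatesBehavior) -> List[str]:
    """Apply the duplicates policy while linking incoming items."""
    linked = []
    for item in incoming:
        if item in existing:
            if behavior is DataLinkDuplicatesBehavior.FAIL:
                raise ValueError(f"duplicate link: {item}")
            if behavior is DataLinkDuplicatesBehavior.SKIP:
                continue  # already linked; move on to the rest
        linked.append(item)
    return linked
```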
DataClientMetadata Objects
@dataclasses.dataclass(frozen=True)
class DataClientMetadata()
Metadata attached to a data item by the client.
This wrapper is used to pass arbitrary metadata through to the backend, for example custom tags or identifiers maintained by the client application.
Arguments:
payload - Arbitrary JSON-serialisable metadata provided by the client.
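The JSON-serialisable constraint on the payload can be sketched as follows (the dataclass below mirrors the documented shape; the payload values are illustrative):

```python
import dataclasses
import json
from typing import Any

@dataclasses.dataclass(frozen=True)
class DataClientMetadata:
    payload: Any  # must be JSON-serialisable

meta = DataClientMetadata(payload={"tags": ["mri", "review"], "external_id": 1234})

# json.dumps raises TypeError if the payload is not JSON-serialisable,
# so this is a cheap client-side sanity check before sending.
encoded = json.dumps(meta.payload)
```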
ImageData Objects
Information about individual images within a single DataRow of type IMG_GROUP. Get this information using the images property.
file_type
@property
def file_type() -> str
The MIME type of the file.
file_size
@property
def file_size() -> int
The size of the file in bytes.
signed_url
@property
def signed_url() -> Optional[str]
The signed URL if one was generated when this class was created.
DataRow Objects
class DataRow(dict, Formatter)
Each individual DataRow is one upload of a video, image group, single image, or DICOM series.
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
uid
@property
def uid() -> str
The unique identifier for this data row. Note that the setter does not update the data on the server.
title
@property
def title() -> str
The data title.
The setter updates the data title. This queues a request for the backend which will be executed on a call of save().
data_type
@data_type.setter
@deprecated(version="0.1.181")
def data_type(value: DataType) -> None
DEPRECATED. Do not use this setter, as it will never update the data_type on the server.
created_at
@created_at.setter
@deprecated(version="0.1.181")
def created_at(value: datetime) -> None
DEPRECATED. Do not use this setter, as it will never update the created_at on the server.
frames_per_second
@property
def frames_per_second() -> Optional[int]
If the data type is VIDEO, this returns the actual number of frames per second for the video. Otherwise, it returns None, as a frames_per_second field is not applicable.
duration
@property
def duration() -> Optional[int]
If the data type is VIDEO, this returns the actual duration of the video. Otherwise, it returns None, as a duration field is not applicable.
client_metadata
@property
def client_metadata() -> Optional[MappingProxyType]
The currently cached client metadata. To cache the client metadata, use the refetch_data() function.
The setter updates the custom client metadata. This queues a request for the backend which will
be executed on a call of save().
width
@property
def width() -> Optional[int]
The actual width of the data asset. This is None for data types of IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect the images to get the width of individual images.
height
@property
def height() -> Optional[int]
The actual height of the data asset. This is None for data types of IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect the images to get the height of individual images.
file_link
@property
def file_link() -> Optional[str]
A permanent file link for the given data asset. When stored in CORD_STORAGE, this is the internal file path. For private bucket storage locations, this is the full path to the file.
If the data type is DataType.DICOM, this returns None, as no single file is associated with the series.
signed_url
@property
def signed_url() -> Optional[str]
The cached signed url of the given data asset. To cache the signed url, use the
refetch_data() function.
file_size
@property
def file_size() -> int
The file size of the given data asset in bytes.
file_type
@property
def file_type() -> str
The MIME type of the given data asset as a string.
images_data
@property
def images_data() -> Optional[List[ImageData]]
A list of the cached ImageData objects for the given data asset.
Fetch the images with appropriate settings via the refetch_data() function.
If the data type is not IMG_GROUP, this returns None.
is_optimised_image_group
@property
@deprecated("0.1.98", ".is_image_sequence")
def is_optimised_image_group() -> Optional[bool]
If the data type is IMG_GROUP, this returns whether the group is a performance-optimized image group. Returns None for other data types.
DEPRECATED: This method is deprecated and will be removed in an upcoming library version. Please use is_image_sequence() instead.
is_image_sequence
@property
def is_image_sequence() -> Optional[bool]
If the data type is IMG_GROUP, this returns whether the group is an image sequence. Returns None for other data types.
For more details, refer to the documentation on image sequences: https://docs.encord.com/docs/annotate-supported-data#image-sequences
backing_item_uuid
@property
def backing_item_uuid() -> UUID
The id of the StorageItem that underlies this data row.
See also get_storage_item().
refetch_data
def refetch_data(
*,
signed_url: bool = False,
images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
client_metadata: bool = False)
Fetches the most up-to-date data from the server. If any of the parameters are falsy, the corresponding cached values are not updated.
Arguments:
signed_url - If True, this will fetch a generated signed url of the data asset.
images_data_fetch_options - If not None, this will fetch the image data of the data asset. You can
additionally specify what to fetch with the ImagesDataFetchOptions class.
client_metadata - If True, this will fetch the client metadata of the data asset.
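The contract that falsy parameters leave cached values untouched can be sketched with a minimal stand-in; DataRowStub and its hard-coded return values are hypothetical, as the real class fetches from the Encord backend:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImagesDataFetchOptions:
    fetch_signed_urls: bool = False

class DataRowStub:
    """Caches fetched values; refetch_data only refreshes what was requested."""

    def __init__(self) -> None:
        self._signed_url: Optional[str] = None
        self._client_metadata: Optional[dict] = None

    def refetch_data(self, *,
                     signed_url: bool = False,
                     images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
                     client_metadata: bool = False) -> None:
        if signed_url:
            # Stand-in for a server round trip that generates a signed URL.
            self._signed_url = "https://storage.example/signed?token=abc"
        if client_metadata:
            # Stand-in for fetching the client metadata from the server.
            self._client_metadata = {"source": "client-app"}

row = DataRowStub()
row.refetch_data(signed_url=True)  # only the signed URL is refreshed
```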
save
@deprecated(version="0.1.192", alternative="encord.storage.StorageItem.update")
def save() -> None
DEPRECATED: Use update() instead to update the underlying
StorageItem. You can access the UUID of the underlying
StorageItem using backing_item_uuid().
Sync local state to the server, if updates are made. This is a blocking function.
The newest values from the Encord server will update the current DataRow object.
DataRows Objects
@dataclasses.dataclass(frozen=True)
class DataRows(dict, Formatter)
This is a helper class that forms requests for filtered dataset rows. It is not intended to be used directly.
DatasetInfo Objects
@dataclasses.dataclass(frozen=True)
class DatasetInfo()
This class represents a dataset in the context of listing datasets.
Dataset Objects
class Dataset(dict, Formatter)
__init__
def __init__(title: str,
storage_location: str,
data_rows: List[DataRow],
dataset_hash: str,
description: Optional[str] = None,
backing_folder_uuid: Optional[UUID] = None)
DEPRECATED - prefer using the Dataset class instead.
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
DatasetDataInfo Objects
class DatasetDataInfo(BaseDTO)
Minimal information about a single data item in a dataset.
Arguments:
data_hash - Internal identifier of the data item.
title - Human-readable title applied to the data item.
backing_item_uuid - UUID of the storage item that backs this dataset data.
AddPrivateDataResponse Objects
@dataclasses.dataclass(frozen=True)
class AddPrivateDataResponse(Formatter)
Response of add_private_data_to_dataset
CreateDatasetResponse Objects
class CreateDatasetResponse(dict, Formatter)
__init__
def __init__(title: str, storage_location: int, dataset_hash: str,
user_hash: str, backing_folder_uuid: Optional[UUID])
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
StorageLocation Objects
class StorageLocation(IntEnum)
Storage backends supported for datasets and data items.
The enum values indicate where the underlying media is stored, such
as Encord-managed storage or an external cloud provider. Some values
are legacy and may only appear for existing datasets.
Values:
- CORD_STORAGE: Encord-managed storage.
- AWS: AWS S3 bucket.
- GCP: Google Cloud Storage.
- AZURE: Azure Blob Storage.
- S3_COMPATIBLE: S3-compatible storage.
- NEW_STORAGE: This is a placeholder for a new storage location that is not yet supported by your SDK version.
Please update your SDK to the latest version.
DatasetType
For backwards compatibility
DatasetData Objects
class DatasetData(base_orm.BaseORM)
Video base ORM.
SignedVideoURL Objects
class SignedVideoURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedImageURL Objects
class SignedImageURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedImagesURL Objects
class SignedImagesURL(base_orm.BaseListORM)
A signed URL object with supporting information.
SignedAudioURL Objects
class SignedAudioURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedDicomURL Objects
class SignedDicomURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedDicomsURL Objects
class SignedDicomsURL(base_orm.BaseListORM)
A signed URL object with supporting information.
Video Objects
class Video(base_orm.BaseORM)
A video object with supporting information.
ImageGroup Objects
class ImageGroup(base_orm.BaseORM)
An image group object with supporting information.
Image Objects
class Image(base_orm.BaseORM)
An image object with supporting information.
SingleImage Objects
For native single image upload.
Audio Objects
class Audio(base_orm.BaseORM)
An audio object with supporting information.
Images Objects
@dataclasses.dataclass(frozen=True)
class Images()
Helper for uploading multiple images in batch mode.
DicomSeries Objects
@dataclasses.dataclass(frozen=True)
class DicomSeries()
Minimal information about a DICOM series belonging to a dataset.
Arguments:
data_hash - Internal identifier of the DICOM series.
title - Human-readable name or description of the series.
DicomDeidentifyTask Objects
@dataclasses.dataclass(frozen=True)
class DicomDeidentifyTask()
Task describing how to de-identify DICOM data in a dataset.
Arguments:
dicom_urls - List of DICOM object URLs to be de-identified.
integration_hash - Identifier of the integration or configuration used to carry
out the de-identification.
ImageGroupOCR Objects
@dataclasses.dataclass(frozen=True)
class ImageGroupOCR()
OCR results extracted from an image group.
Arguments:
processed_texts - Mapping of identifiers to recognized text blocks produced by
the OCR pipeline.
ReEncodeVideoTaskResult Objects
class ReEncodeVideoTaskResult(BaseDTO)
Result of a video re-encoding task.
Arguments:
data_hash - Identifier of the data item that was re-encoded.
signed_url - Optional signed URL for downloading the re-encoded video. Only
present when using CORD_STORAGE.
bucket_path - Path inside the storage bucket where the re-encoded video is
stored.
ReEncodeVideoTask Objects
class ReEncodeVideoTask(BaseDTO)
A re-encode video task object with supporting information.
DatasetAccessSettings Objects
@dataclasses.dataclass
class DatasetAccessSettings()
Settings for using the dataset object.
Controls whether client metadata is retrieved for each data_row.
ImagesDataFetchOptions Objects
@dataclasses.dataclass
class ImagesDataFetchOptions()
Whether to fetch signed URLs for each individual image. Only set this to True if you need to download the images.
Arguments:
fetch_signed_urls - If True, include signed URLs for image data so that the
media can be downloaded directly from storage.
LongPollingStatus Objects
class LongPollingStatus(str, Enum)
Represents the lifecycle status of a long-polling job submitted through the
Encord SDK or UI. These statuses are returned by asynchronous job endpoints
(for example: data upload, private dataset ingestion) to indicate the current state
of job execution.
This enum is stable and lists all possible job statuses returned
by the long-polling API. Client code should use these values to determine
whether a job is still running, has completed successfully, completed with
errors, or was explicitly canceled.
PENDING
Job will automatically start soon (waiting in queue) or already started processing.
DONE
Job has finished successfully (possibly with errors if ignore_errors=True).
If ignore_errors=False was specified in
add_private_data_to_dataset_start(),
the job will only have the status DONE if there were no errors.
If ignore_errors=True was specified in
add_private_data_to_dataset_start(),
the job will always show the status DONE once complete and will never show
ERROR status if this flag was set to True. There could be errors that were
ignored.
Information about number of errors and stringified exceptions is available in the
units_error_count: int and errors: List[str] attributes.
ERROR
Job has completed with errors. This can only happen if ignore_errors was set to
False. Information about errors is available in the units_error_count: int
and errors: List[str] attributes.
CANCELLED
Job was canceled explicitly by the user through the Encord UI or via the Encord
SDK using the add_data_to_folder_job_cancel method.
In the context of this status:
- The job may have been partially processed, but it was explicitly interrupted
before completion by a user action.
- Cancellation can occur either manually through the Encord UI or programmatically
using the SDK method
add_data_to_folder_job_cancel.
- Once a job is canceled, no further processing will occur, and any processed
data before the cancellation will be available.
- The presence of canceled data units (
units_cancelled_count) indicates that
some data upload units were interrupted and canceled before completion.
- If
ignore_errors was set to True, the job may continue despite errors, and
cancellation will only apply to the unprocessed units.
DataUnitError Objects
class DataUnitError(BaseDTO)
A description of an error for an individual upload item
object_urls
The URLs involved. A single item for videos and images; a list of frames for image groups and DICOM series.
error
The error message
subtask_uuid
Opaque ID of the process. Please quote this when contacting Encord support.
action_description
Human-readable description of the action that failed (e.g. ‘Uploading DICOM series’).
DatasetDataLongPolling Objects
class DatasetDataLongPolling(BaseDTO)
Response of the upload job’s long polling request.
Note: An upload job consists of job units, where a job unit can be a video, an image group, a DICOM series, or a single image.
status
Status of the upload job. Documented in detail in LongPollingStatus()
data_hashes_with_titles
Information about data which was added to the dataset.
errors
Stringified list of exceptions.
data_unit_errors
Structured list of per-item upload errors. See DataUnitError for more details.
units_pending_count
Number of upload job units that have pending status.
units_done_count
Number of upload job units that have done status.
units_error_count
Number of upload job units that have error status.
units_cancelled_count
Number of upload job units that have been canceled.
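Inspecting the unit counters after a job completes can be sketched like this; the dataclass below is a hypothetical stand-in carrying the documented fields, with made-up counts:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DatasetDataLongPolling:
    status: str
    errors: List[str] = field(default_factory=list)
    units_pending_count: int = 0
    units_done_count: int = 0
    units_error_count: int = 0
    units_cancelled_count: int = 0

res = DatasetDataLongPolling(
    status="DONE",  # DONE despite errors implies ignore_errors=True was set
    units_done_count=9,
    units_error_count=1,
    errors=["Traceback: could not decode video"],
)

# The four counters partition the job units.
total_units = (res.units_pending_count + res.units_done_count
               + res.units_error_count + res.units_cancelled_count)
print(total_units)  # 10
```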
DatasetLinkItems Objects
@dataclasses.dataclass(frozen=True)
class DatasetLinkItems()
Mapping between a dataset and its underlying storage items.
Arguments:
items - List of storage item identifiers linked to the dataset.
CreateDatasetPayload Objects
class CreateDatasetPayload(BaseDTO)
Payload for creating a new dataset.
Arguments:
title - Title of the dataset to create.
description - Optional description of the dataset and its intended use.
create_backing_folder - If True, create a legacy “mirror” dataset together with a backing storage folder in a single operation. This behavior is retained for backwards compatibility.
legacy_call - Internal flag used for analytics to detect usage of legacy dataset creation flows. This field will be removed in a future version and should not be set manually.
create_backing_folder
This creates a legacy “mirror” dataset and its backing folder in one go.
legacy_call
This field will be removed soon.
CreateDatasetResponseV2 Objects
class CreateDatasetResponseV2(BaseDTO)
Response returned when creating a dataset (current format).
Arguments:
dataset_uuid - UUID of the newly created dataset.
backing_folder_uuid - Optional UUID of the backing folder created alongside the
dataset, if applicable.
A non-None value indicates a legacy “mirror” dataset was created.
backing_folder_uuid
A non-None value indicates a legacy “mirror” dataset was created.
DatasetsWithUserRolesListParams Objects
class DatasetsWithUserRolesListParams(BaseDTO)
Filter parameters for listing datasets together with user roles.
Arguments:
title_eq - Optional filter to return only datasets whose title exactly
matches the given string.
title_cont - Optional filter to return only datasets whose title contains
the given substring.
created_before - If set, only datasets created before this timestamp are
returned.
created_after - If set, only datasets created on or after this timestamp are
returned.
edited_before - If set, only datasets last edited before this timestamp are
returned.
edited_after - If set, only datasets last edited on or after this timestamp
are returned.
include_org_access - If True, include datasets that are visible through
organisation-level access in addition to user-level sharing.
DatasetWithUserRole Objects
class DatasetWithUserRole(BaseDTO)
Dataset with the role of the current user attached.
Arguments:
dataset_uuid - UUID of the dataset.
title - Title of the dataset.
description - Description of the dataset.
created_at - Timestamp when the dataset was created.
last_edited_at - Timestamp when the dataset was last modified.
user_role - Role of the requesting user on this dataset, if any.
storage_location - Storage location of the dataset’s underlying data, if known.
backing_folder_uuid - UUID of the legacy backing folder if this dataset was created
as a “mirror” dataset.
storage_location
Legacy field: datasets can now contain data from mixed storage locations.
backing_folder_uuid
If set, this indicates a legacy ‘mirror’ dataset.
DatasetsWithUserRolesListResponse Objects
class DatasetsWithUserRolesListResponse(BaseDTO)
Response payload for listing datasets with user roles.
Arguments:
result - List of datasets together with the role of the current user.