
DatasetUserRole Objects

class DatasetUserRole(IntEnum)
Legacy dataset user roles. This enum represents the role a user has on a dataset (for example admin or standard user). Prefer DatasetUserRoleV2 for new integrations.

DatasetUserRoleV2 Objects

class DatasetUserRoleV2(CamelStrEnum)
String-based dataset user roles used by the current API. This enum mirrors DatasetUserRole but uses string values and is the preferred representation for new code.

dataset_user_role_str_enum_to_int_enum

def dataset_user_role_str_enum_to_int_enum(
        str_enum: DatasetUserRoleV2) -> DatasetUserRole
Convert a string-based dataset user role to the legacy integer enum. This helper maps DatasetUserRoleV2 values to the corresponding DatasetUserRole values so that existing code which still relies on the integer-based representation continues to work with the newer API.
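
A minimal usage sketch, assuming both enums expose an ADMIN member and that the names shown are importable from encord.orm.dataset (member names and import paths may differ in your SDK version):

```python
from encord.orm.dataset import (
    DatasetUserRole,
    DatasetUserRoleV2,
    dataset_user_role_str_enum_to_int_enum,
)

# Convert the string-based role to the legacy integer-based role.
legacy_role = dataset_user_role_str_enum_to_int_enum(DatasetUserRoleV2.ADMIN)
assert isinstance(legacy_role, DatasetUserRole)
```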

DatasetUser Objects

class DatasetUser(BaseDTO)
Dataset user membership. Arguments:
  • user_email - Email address of the user who has access to the dataset.
  • user_role - Role of the user on the dataset.
  • dataset_hash - Identifier of the dataset the user has access to.

DataLinkDuplicatesBehavior Objects

class DataLinkDuplicatesBehavior(Enum)
Behavior when linking data that already exists in a dataset (a usage sketch follows this list). Values:
  • DUPLICATE: Allow duplicates and create a new link for each request.
  • FAIL: Fail the operation if a duplicate link would be created.
  • SKIP: Skip data that is already linked and continue with the rest.
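
A hedged sketch of choosing a duplicates policy when linking existing storage items into a dataset. The dataset.link_items call and its duplicates_behaviour parameter are assumptions here and may not match your SDK version exactly:

```python
from uuid import UUID

from encord import EncordUserClient
from encord.orm.dataset import DataLinkDuplicatesBehavior

user_client = EncordUserClient.create_with_ssh_private_key("<ssh-private-key-contents>")
dataset = user_client.get_dataset("<dataset_hash>")

# Link storage items, skipping any that are already part of the dataset.
# `link_items` and `duplicates_behaviour` are assumed names - check your SDK version.
dataset.link_items(
    item_uuids=[UUID("00000000-0000-0000-0000-000000000000")],
    duplicates_behaviour=DataLinkDuplicatesBehavior.SKIP,
)
```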

DataClientMetadata Objects

@dataclasses.dataclass(frozen=True)
class DataClientMetadata()
Metadata attached to a data item by the client. This wrapper is used to pass arbitrary metadata through to the backend, for example custom tags or identifiers maintained by the client application. Arguments:
  • payload - Arbitrary JSON-serialisable metadata provided by the client.
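
A minimal construction sketch, assuming DataClientMetadata is importable from encord.orm.dataset:

```python
from encord.orm.dataset import DataClientMetadata

# Any JSON-serialisable mapping can be carried as the payload.
metadata = DataClientMetadata(payload={"source": "camera-3", "capture_batch": 17})
```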

ImageData Objects

class ImageData()
Information about individual images within a single DataRow of type DataType.IMG_GROUP. Get this information via the images_data property of the DataRow.

file_type

@property
def file_type() -> str
The MIME type of the file.

file_size

@property
def file_size() -> int
The size of the file in bytes.

signed_url

@property
def signed_url() -> Optional[str]
The signed URL if one was generated when this class was created.

DataRow Objects

class DataRow(dict, Formatter)
Each individual DataRow is one upload of a video, image group, single image, or DICOM series. This class has dict-style accessors for backwards compatibility. Clients using this class for the first time are encouraged to use the property accessors and setters instead of the underlying dictionary; mixing the dict-style member functions with the property accessors and setters is discouraged. WARNING: Do NOT use the .data member of this class. Its usage could corrupt the correctness of the data structure.
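
A short sketch of the preferred property-based access pattern, assuming a Dataset obtained through EncordUserClient and its data_rows accessor; the key path and dataset hash are placeholders:

```python
from encord import EncordUserClient

user_client = EncordUserClient.create_with_ssh_private_key("<ssh-private-key-contents>")
dataset = user_client.get_dataset("<dataset_hash>")

row = dataset.data_rows[0]

print(row.uid)        # unique identifier of the data row
print(row.title)      # data title
print(row.file_type)  # MIME type of the underlying file

row.title = "scene-42-front-camera"  # queued locally ...
row.save()                           # ... and pushed to the server here
```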

uid

@property
def uid() -> str
The unique identifier for this data row. Note that the setter does not update the data on the server.

title

@property
def title() -> str
The data title. The setter updates the title. This queues a request for the backend which will be executed on a call of save().

data_type

@data_type.setter
@deprecated(version="0.1.181")
def data_type(value: DataType) -> None
DEPRECATED. Do not use this setter, as it will never update the value on the server.

created_at

@created_at.setter
@deprecated(version="0.1.181")
def created_at(value: datetime) -> None
DEPRECATED. Do not use this setter, as it will never update the created_at on the server.

frames_per_second

@property
def frames_per_second() -> Optional[int]
If the data type is DataType.VIDEO, this returns the number of frames per second of the video. Otherwise, it returns None, as a frames_per_second field is not applicable.

duration

@property
def duration() -> Optional[int]
If the data type is DataType.VIDEO, this returns the duration of the video. Otherwise, it returns None, as a duration field is not applicable.

client_metadata

@property
def client_metadata() -> Optional[MappingProxyType]
The currently cached client metadata. To cache the client metadata, use the refetch_data() function. The setter updates the custom client metadata. This queues a request for the backend which will be executed on a call of save().
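
A sketch of the cache-then-update flow described above; the credential and dataset placeholders are illustrative only:

```python
from encord import EncordUserClient

dataset = EncordUserClient.create_with_ssh_private_key("<ssh-private-key-contents>").get_dataset("<dataset_hash>")
row = dataset.data_rows[0]

row.refetch_data(client_metadata=True)      # cache the current client metadata
current = dict(row.client_metadata or {})   # cached, read-only mapping (or None before a refetch)

row.client_metadata = {**current, "reviewed": True}  # queued locally
row.save()                                           # the queued update is sent here
```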

width

@property
def width() -> Optional[int]
The actual width of the data asset. This is None for data of type DataType.IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect images_data to get the width of individual images.

height

@property
def height() -> Optional[int]
The actual height of the data asset. This is None for data of type DataType.IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect images_data to get the height of individual images.

file_link

@property
def file_link() -> Optional[str]
A permanent file link of the given data asset. When stored in StorageLocation.CORD_STORAGE, this will be the internal file path. For data in a private bucket storage location, this will be the full path to the file. If the data type is DataType.DICOM, this returns None, as no single file is associated with the series.

signed_url

@property
def signed_url() -> Optional[str]
The cached signed url of the given data asset. To cache the signed url, use the refetch_data() function.

file_size

@property
def file_size() -> int
The file size of the given data asset in bytes.

file_type

@property
def file_type() -> str
The MIME type of the given data asset, as a string.

images_data

@property
def images_data() -> Optional[List[ImageData]]
A list of the cached ImageData objects for the given data asset. Fetch the images with the appropriate settings via the refetch_data() function. If the data type is not DataType.IMG_GROUP, this returns None.

is_optimised_image_group

@property
@deprecated("0.1.98", ".is_image_sequence")
def is_optimised_image_group() -> Optional[bool]
If the data type is DataType.IMG_GROUP, returns whether this is a performance-optimised image group. Returns None for other data types. DEPRECATED: This property is deprecated and will be removed in an upcoming library version. Please use is_image_sequence instead.

is_image_sequence

@property
def is_image_sequence() -> Optional[bool]
If the data type is DataType.IMG_GROUP, returns whether this is an image sequence. Returns None for other data types. For more details, refer to the documentation on image sequences: https://docs.encord.com/docs/annotate-supported-data#image-sequences

backing_item_uuid

@property
def backing_item_uuid() -> UUID
The id of the StorageItem that underlies this data row. See also get_storage_item().

refetch_data

def refetch_data(
        *,
        signed_url: bool = False,
        images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
        client_metadata: bool = False)
Fetches the most up-to-date data from the server. If any of the parameters are falsy, the corresponding current values are left unchanged. Arguments (a usage sketch follows this list):
  • signed_url - If True, this will fetch a generated signed url of the data asset.
  • images_data_fetch_options - If not None, this will fetch the image data of the data asset. You can additionally specify what to fetch with the ImagesDataFetchOptions class.
  • client_metadata - If True, this will fetch the client metadata of the data asset.
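
A sketch of a full refresh using all three options; the credential and dataset placeholders are illustrative:

```python
from encord import EncordUserClient
from encord.orm.dataset import ImagesDataFetchOptions

dataset = EncordUserClient.create_with_ssh_private_key("<ssh-private-key-contents>").get_dataset("<dataset_hash>")
row = dataset.data_rows[0]

row.refetch_data(
    signed_url=True,
    images_data_fetch_options=ImagesDataFetchOptions(fetch_signed_urls=True),
    client_metadata=True,
)

print(row.signed_url)                 # now cached
if row.images_data is not None:       # only populated for image groups
    print(row.images_data[0].signed_url)
```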

save

def save() -> None
Sync local state to the server, if updates are made. This is a blocking function. The newest values from the Encord server will update the current DataRow object.

DataRows Objects

@dataclasses.dataclass(frozen=True)
class DataRows(dict, Formatter)
A helper class that forms the request for filtered dataset rows. Not intended to be used directly.

DatasetInfo Objects

@dataclasses.dataclass(frozen=True)
class DatasetInfo()
This class represents a dataset in the context of listing datasets.

Dataset Objects

class Dataset(dict, Formatter)

__init__

def __init__(title: str,
             storage_location: str,
             data_rows: List[DataRow],
             dataset_hash: str,
             description: Optional[str] = None,
             backing_folder_uuid: Optional[UUID] = None)
DEPRECATED - prefer using the Dataset class instead. This class has dict-style accessors for backwards compatibility. Clients using this class for the first time are encouraged to use the property accessors and setters instead of the underlying dictionary; mixing the dict-style member functions with the property accessors and setters is discouraged. WARNING: Do NOT use the .data member of this class. Its usage could corrupt the correctness of the data structure.

DatasetDataInfo Objects

class DatasetDataInfo(BaseDTO)
Minimal information about a single data item in a dataset. Arguments:
  • data_hash - Internal identifier of the data item.
  • title - Human-readable title applied to the data item.
  • backing_item_uuid - UUID of the storage item that backs this dataset data.

AddPrivateDataResponse Objects

@dataclasses.dataclass(frozen=True)
class AddPrivateDataResponse(Formatter)
Response of add_private_data_to_dataset

CreateDatasetResponse Objects

class CreateDatasetResponse(dict, Formatter)

__init__

def __init__(title: str, storage_location: int, dataset_hash: str,
             user_hash: str, backing_folder_uuid: Optional[UUID])
This class has dict-style accessors for backwards compatibility. Clients using this class for the first time are encouraged to use the property accessors and setters instead of the underlying dictionary; mixing the dict-style member functions with the property accessors and setters is discouraged. WARNING: Do NOT use the .data member of this class. Its usage could corrupt the correctness of the data structure.

StorageLocation Objects

class StorageLocation(IntEnum)
Storage backends supported for datasets and data items. The enum values indicate where the underlying media is stored, such as Encord-managed storage or an external cloud provider. Some values are legacy and may only appear for existing datasets. Values:
  • CORD_STORAGE: Encord-managed storage.
  • AWS: AWS S3 bucket.
  • GCP: Google Cloud Storage.
  • AZURE: Azure Blob Storage.
  • S3_COMPATIBLE: S3-compatible storage.
  • NEW_STORAGE: Placeholder for a new storage location that is not yet supported by your SDK version; update your SDK to the latest version. (A defensive handling sketch follows this list.)
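
A defensive handling sketch for the NEW_STORAGE placeholder; the helper function below is illustrative and not part of the SDK:

```python
from encord.orm.dataset import StorageLocation

def describe_location(location: StorageLocation) -> str:
    # NEW_STORAGE means "a backend this SDK version does not know about yet",
    # so treat it as unknown rather than as a concrete provider.
    if location == StorageLocation.NEW_STORAGE:
        return "unknown storage backend - update the encord SDK"
    return location.name
```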

DatasetType

For backwards compatibility

DatasetData Objects

class DatasetData(base_orm.BaseORM)
Video base ORM.

SignedVideoURL Objects

class SignedVideoURL(base_orm.BaseORM)
A signed URL object with supporting information.

SignedImageURL Objects

class SignedImageURL(base_orm.BaseORM)
A signed URL object with supporting information.

SignedImagesURL Objects

class SignedImagesURL(base_orm.BaseListORM)
A signed URL object with supporting information.

SignedAudioURL Objects

class SignedAudioURL(base_orm.BaseORM)
A signed URL object with supporting information.

SignedDicomURL Objects

class SignedDicomURL(base_orm.BaseORM)
A signed URL object with supporting information.

SignedDicomsURL Objects

class SignedDicomsURL(base_orm.BaseListORM)
A signed URL object with supporting information.

Video Objects

class Video(base_orm.BaseORM)
A video object with supporting information.

ImageGroup Objects

class ImageGroup(base_orm.BaseORM)
An image group object with supporting information.

Image Objects

class Image(base_orm.BaseORM)
An image object with supporting information.

SingleImage Objects

class SingleImage(Image)
For native single image upload.

Audio Objects

class Audio(base_orm.BaseORM)
An audio object with supporting information.

Images Objects

@dataclasses.dataclass(frozen=True)
class Images()
Used when uploading multiple images in batch mode.

DicomSeries Objects

@dataclasses.dataclass(frozen=True)
class DicomSeries()
Minimal information about a DICOM series belonging to a dataset. Arguments:
  • data_hash - Internal identifier of the DICOM series.
  • title - Human-readable name or description of the series.

DicomDeidentifyTask Objects

@dataclasses.dataclass(frozen=True)
class DicomDeidentifyTask()
Task describing how to de-identify DICOM data in a dataset. Arguments:
  • dicom_urls - List of DICOM object URLs to be de-identified.
  • integration_hash - Identifier of the integration or configuration used to carry out the de-identification.

ImageGroupOCR Objects

@dataclasses.dataclass(frozen=True)
class ImageGroupOCR()
OCR results extracted from an image group. Arguments:
  • processed_texts - Mapping of identifiers to recognized text blocks produced by the OCR pipeline.

ReEncodeVideoTaskResult Objects

class ReEncodeVideoTaskResult(BaseDTO)
Result of a video re-encoding task. Arguments:
  • data_hash - Identifier of the data item that was re-encoded.
  • signed_url - Optional signed URL for downloading the re-encoded video. Only present when using CORD_STORAGE.
  • bucket_path - Path inside the storage bucket where the re-encoded video is stored.

ReEncodeVideoTask Objects

class ReEncodeVideoTask(BaseDTO)
A video re-encoding task object with supporting information.

DatasetAccessSettings Objects

@dataclasses.dataclass
class DatasetAccessSettings()
Settings for using the dataset object.

fetch_client_metadata

Whether client metadata should be retrieved for each data_row.

ImagesDataFetchOptions Objects

@dataclasses.dataclass
class ImagesDataFetchOptions()
Options controlling whether to fetch signed URLs for each individual image. Only set this to True if you need to download the images. Arguments:
  • fetch_signed_urls - If True, include signed URLs for image data so that the media can be downloaded directly from storage.

LongPollingStatus Objects

class LongPollingStatus(str, Enum)
Represents the lifecycle status of a long-polling job submitted through the Encord SDK or UI. These statuses are returned by asynchronous job endpoints (for example data upload or private dataset ingestion) to indicate the current state of job execution. This enum is stable and lists all possible job statuses returned by the long-polling API. Client code should use these values to determine whether a job is still running, has completed successfully, completed with errors, or was explicitly cancelled (a handling sketch follows the list below). Values:
  • PENDING: The job will start automatically soon (it is waiting in the queue) or has already started processing.
  • DONE: The job has finished successfully (possibly with errors if ignore_errors=True). If ignore_errors=False was specified in add_private_data_to_dataset_start(), the job only has status DONE if there were no errors. If ignore_errors=True was specified, the job always shows status DONE once complete and never shows ERROR, although there may have been errors that were ignored. Information about the number of errors and the stringified exceptions is available in the units_error_count: int and errors: List[str] attributes.
  • ERROR: The job has completed with errors. This can only happen if ignore_errors was set to False. Information about the errors is available in the units_error_count: int and errors: List[str] attributes.
  • CANCELLED: The job was cancelled explicitly by the user through the Encord UI or via the Encord SDK using the add_data_to_folder_job_cancel method. In the context of this status:
  • The job may have been partially processed, but it was explicitly interrupted before completion by a user action.
  • Cancellation can occur either manually through the Encord UI or programmatically using the SDK method add_data_to_folder_job_cancel.
  • Once a job is cancelled, no further processing will occur, and any processed data before the cancellation will be available.
  • The presence of cancelled data units (units_cancelled_count) indicates that some data upload units were interrupted and cancelled before completion.
  • If ignore_errors was set to True, the job may continue despite errors, and cancellation will only apply to the unprocessed units.
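
A hedged sketch of starting a private-data upload and interpreting the terminal status. The add_private_data_to_dataset_start / add_private_data_to_dataset_get_result calls and their parameters are assumptions based on the descriptions above; check your SDK version for the exact signatures:

```python
from encord import EncordUserClient
from encord.orm.dataset import LongPollingStatus

user_client = EncordUserClient.create_with_ssh_private_key("<ssh-private-key-contents>")
dataset = user_client.get_dataset("<dataset_hash>")

# Assumed method names and parameters - may differ in your SDK version.
upload_job_id = dataset.add_private_data_to_dataset_start(
    integration_id="<integration_hash>",
    private_files="path/to/upload_spec.json",
    ignore_errors=True,
)
result = dataset.add_private_data_to_dataset_get_result(upload_job_id)

if result.status == LongPollingStatus.DONE:
    # With ignore_errors=True, DONE can still carry per-unit failures.
    print(f"finished with {result.units_error_count} failed unit(s)")
    print(result.errors)
elif result.status == LongPollingStatus.ERROR:
    print("upload failed:", result.errors)
elif result.status == LongPollingStatus.CANCELLED:
    print(f"cancelled after {result.units_done_count} unit(s) were processed")
```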

DataUnitError Objects

class DataUnitError(BaseDTO)
A description of an error for an individual upload item.

object_urls

The URLs involved: a single item for videos and images; a list of frames for image groups and DICOM series.

error

The error message

subtask_uuid

Opaque ID of the process. Please quote this when contacting Encord support.

action_description

Human-readable description of the action that failed (e.g. ‘Uploading DICOM series’).

DatasetDataLongPolling Objects

class DatasetDataLongPolling(BaseDTO)
Response of the upload job's long-polling request. Note: an upload job consists of job units, where each job unit can be a video, an image group, a DICOM series, or a single image.

status

Status of the upload job. Documented in detail in LongPollingStatus()

data_hashes_with_titles

Information about data which was added to the dataset.

errors

Stringified list of exceptions.

data_unit_errors

Structured list of per-item upload errors. See DataUnitError for more details.

units_pending_count

Number of upload job units that have pending status.

units_done_count

Number of upload job units that have done status.

units_error_count

Number of upload job units that have error status.

units_cancelled_count

Number of upload job units that have been cancelled.

DatasetLinkItems Objects

@dataclasses.dataclass(frozen=True)
class DatasetLinkItems()
Mapping between a dataset and its underlying storage items. Arguments:
  • items - List of storage item identifiers linked to the dataset.

CreateDatasetPayload Objects

class CreateDatasetPayload(BaseDTO)
Payload for creating a new dataset. Arguments:
  • title - Title of the dataset to create.
  • description - Optional description of the dataset and its intended use.
  • create_backing_folder - If True, create a legacy “mirror” dataset together with a backing storage folder in a single operation. This behavior is retained for backwards compatibility.
  • legacy_call - Internal flag used for analytics to detect usage of legacy dataset creation flows. This field will be removed in a future version and should not be set manually.

create_backing_folder

this creates a legacy “mirror” dataset and its backing folder in one go

legacy_call

this field will be removed soon

CreateDatasetResponseV2 Objects

class CreateDatasetResponseV2(BaseDTO)
Response returned when creating a dataset (current format). Arguments:
  • dataset_uuid - UUID of the newly created dataset.
  • backing_folder_uuid - Optional UUID of the backing folder created alongside the dataset, if applicable. A non-None value indicates that a legacy “mirror” dataset was created.

backing_folder_uuid

a non-None value indicates that a legacy “mirror” dataset was created

DatasetsWithUserRolesListParams Objects

class DatasetsWithUserRolesListParams(BaseDTO)
Filter parameters for listing datasets together with user roles. Arguments (a construction sketch follows this list):
  • title_eq - Optional filter to return only datasets whose title exactly matches the given string.
  • title_cont - Optional filter to return only datasets whose title contains the given substring.
  • created_before - If set, only datasets created before this timestamp are returned.
  • created_after - If set, only datasets created on or after this timestamp are returned.
  • edited_before - If set, only datasets last edited before this timestamp are returned.
  • edited_after - If set, only datasets last edited on or after this timestamp are returned.
  • include_org_access - If True, include datasets that are visible through organisation-level access in addition to user-level sharing.
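
A construction sketch, assuming the remaining filter fields are optional and default to None:

```python
from datetime import datetime, timezone

from encord.orm.dataset import DatasetsWithUserRolesListParams

params = DatasetsWithUserRolesListParams(
    title_cont="traffic",                                     # substring match on the dataset title
    created_after=datetime(2024, 1, 1, tzinfo=timezone.utc),  # only datasets created on or after this time
    include_org_access=False,                                  # only datasets shared with the user directly
)
```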

DatasetWithUserRole Objects

class DatasetWithUserRole(BaseDTO)
Dataset with the role of the current user attached. Arguments:
  • dataset_uuid - UUID of the dataset.
  • title - Title of the dataset.
  • description - Description of the dataset.
  • created_at - Timestamp when the dataset was created.
  • last_edited_at - Timestamp when the dataset was last modified.
  • user_role - Role of the requesting user on this dataset, if any.
  • storage_location - Storage location of the dataset’s underlying data, if known.
  • backing_folder_uuid - UUID of the legacy backing folder if this dataset was created as a “mirror” dataset.

storage_location

legacy field: a dataset can now contain data from mixed storage locations

backing_folder_uuid

if set, this indicates a legacy ‘mirror’ dataset

DatasetsWithUserRolesListResponse Objects

class DatasetsWithUserRolesListResponse(BaseDTO)
Response payload for listing datasets with user roles. Arguments:
  • result - List of datasets together with the role of the current user.