Documentation Index
Fetch the complete documentation index at: https://docs.encord.com/llms.txt
Use this file to discover all available pages before exploring further.
DatasetUserRole Objects
class DatasetUserRole(IntEnum)
Legacy dataset user roles.
This enum represents the role a user has on a dataset (for example
admin or standard user). Prefer DatasetUserRoleV2 for
new integrations.
DatasetUserRoleV2 Objects
class DatasetUserRoleV2(CamelStrEnum)
String-based dataset user roles used by the current API.
This enum mirrors DatasetUserRole but uses string values
and is the preferred representation for new code.
dataset_user_role_str_enum_to_int_enum
def dataset_user_role_str_enum_to_int_enum(
str_enum: DatasetUserRoleV2) -> DatasetUserRole
Convert a string-based dataset user role to the legacy integer enum.
This helper maps DatasetUserRoleV2 values to the
corresponding DatasetUserRole values so that existing code
which still relies on the integer-based representation continues to
work with the newer API.
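The mapping can be sketched as follows; the member names and integer values here are illustrative stand-ins, not the SDK's actual definitions:

```python
from enum import Enum, IntEnum

class DatasetUserRole(IntEnum):
    # Legacy integer-based roles (values illustrative).
    ADMIN = 0
    USER = 1

class DatasetUserRoleV2(Enum):
    # String-based roles used by the current API (values illustrative).
    ADMIN = "admin"
    USER = "user"

def dataset_user_role_str_enum_to_int_enum(str_enum: DatasetUserRoleV2) -> DatasetUserRole:
    # Each V2 member has a same-named legacy member, so mapping by name suffices.
    return DatasetUserRole[str_enum.name]

legacy_role = dataset_user_role_str_enum_to_int_enum(DatasetUserRoleV2.ADMIN)
```

Existing integer-based code can then keep comparing against DatasetUserRole members while newer code works with the string enum.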
DatasetUser Objects
class DatasetUser(BaseDTO)
Dataset user membership.
Arguments:
user_email - Email address of the user who has access to the dataset.
user_role - Role of the user on the dataset.
dataset_hash - Identifier of the dataset the user has access to.
DataLinkDuplicatesBehavior Objects
class DataLinkDuplicatesBehavior(Enum)
Behavior when linking data that already exists in a dataset.
Values:
- DUPLICATE: Allow duplicates and create a new link for each request.
- FAIL: Fail the operation if a duplicate link would be created.
- SKIP: Skip data that is already linked and continue with the rest.
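A sketch of how such a policy might be applied when linking items; link_items and its set-based duplicate check are hypothetical illustrations, not the SDK's internal logic:

```python
from enum import Enum
from typing import List, Set

class DataLinkDuplicatesBehavior(Enum):
    DUPLICATE = "DUPLICATE"
    FAIL = "FAIL"
    SKIP = "SKIP"

def link_items(existing: Set[str], incoming: List[str],
               behavior: DataLinkDuplicatesBehavior) -> List[str]:
    """Apply the duplicates policy while linking incoming items."""
    linked = []
    for item in incoming:
        if item in existing:
            if behavior is DataLinkDuplicatesBehavior.FAIL:
                raise ValueError(f"duplicate link: {item}")
            if behavior is DataLinkDuplicatesBehavior.SKIP:
                continue  # already linked; move on to the rest
        linked.append(item)
    return linked
```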
DataClientMetadata Objects
@dataclasses.dataclass(frozen=True)
class DataClientMetadata()
Metadata attached to a data item by the client.
This wrapper is used to pass arbitrary metadata through to the backend, for example custom tags or identifiers maintained by the client application.
Arguments:
payload - Arbitrary JSON-serialisable metadata provided by the client.
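The JSON-serialisable constraint on the payload can be sketched as follows (the dataclass below mirrors the documented shape; the payload values are illustrative):

```python
import dataclasses
import json
from typing import Any

@dataclasses.dataclass(frozen=True)
class DataClientMetadata:
    payload: Any  # must be JSON-serialisable

meta = DataClientMetadata(payload={"tags": ["mri", "review"], "external_id": 1234})

# json.dumps raises TypeError if the payload is not JSON-serialisable,
# so this is a cheap client-side sanity check before sending.
encoded = json.dumps(meta.payload)
```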
ImageData Objects
Information about individual images within a single DataRow of type IMG_GROUP. Get this information using the images property.
file_type
@property
def file_type() -> str
The MIME type of the file.
file_size
@property
def file_size() -> int
The size of the file in bytes.
signed_url
@property
def signed_url() -> Optional[str]
The signed URL if one was generated when this class was created.
DataRow Objects
class DataRow(dict, Formatter)
Each individual DataRow is one upload of a video, image group, single image, or DICOM series.
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
uid
@property
def uid() -> str
The unique identifier for this data row. Note that the setter does not update the data on the server.
title
@property
def title() -> str
The data title.
The setter updates the data title. This queues a request for the backend which will be executed on a call of save().
data_type
@data_type.setter
@deprecated(version="0.1.181")
def data_type(value: DataType) -> None
DEPRECATED. Do not use this setter, as it will never update the data_type on the server.
created_at
@created_at.setter
@deprecated(version="0.1.181")
def created_at(value: datetime) -> None
DEPRECATED. Do not use this setter, as it will never update the created_at on the server.
frames_per_second
@property
def frames_per_second() -> Optional[int]
If the data type is VIDEO, this returns the actual number of frames per second for the video. Otherwise, it returns None, as a frames_per_second field is not applicable.
duration
@property
def duration() -> Optional[int]
If the data type is VIDEO, this returns the actual duration of the video. Otherwise, it returns None, as a duration field is not applicable.
client_metadata
@property
def client_metadata() -> Optional[MappingProxyType]
The currently cached client metadata. To cache the client metadata, use the refetch_data() function.
The setter updates the custom client metadata. This queues a request for the backend which will
be executed on a call of save().
width
@property
def width() -> Optional[int]
The actual width of the data asset. This is None for data types of IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect the images to get the width of individual images.
height
@property
def height() -> Optional[int]
The actual height of the data asset. This is None for data types of IMG_GROUP where is_image_sequence is False, because each image in the group can have different dimensions. Inspect the images to get the height of individual images.
file_link
@property
def file_link() -> Optional[str]
A permanent file link for the given data asset. When stored in CORD_STORAGE, this is the internal file path. For private bucket storage locations, this is the full path to the file.
If the data type is DataType.DICOM, this returns None, as no single file is associated with the series.
signed_url
@property
def signed_url() -> Optional[str]
The cached signed url of the given data asset. To cache the signed url, use the
refetch_data() function.
file_size
@property
def file_size() -> int
The file size of the given data asset in bytes.
file_type
@property
def file_type() -> str
The MIME type of the given data asset as a string.
images_data
@property
def images_data() -> Optional[List[ImageData]]
A list of the cached ImageData objects for the given data asset.
Fetch the images with appropriate settings via the refetch_data() function.
If the data type is not IMG_GROUP, this returns None.
is_optimised_image_group
@property
@deprecated("0.1.98", ".is_image_sequence")
def is_optimised_image_group() -> Optional[bool]
If the data type is IMG_GROUP, this returns whether the group is a performance-optimized image group. Returns None for other data types.
DEPRECATED: This method is deprecated and will be removed in an upcoming library version. Please use is_image_sequence() instead.
is_image_sequence
@property
def is_image_sequence() -> Optional[bool]
If the data type is IMG_GROUP, this returns whether the group is an image sequence. Returns None for other data types.
For more details, refer to the documentation on image sequences: https://docs.encord.com/docs/annotate-supported-data#image-sequences
backing_item_uuid
@property
def backing_item_uuid() -> UUID
The id of the StorageItem that underlies this data row.
See also get_storage_item().
refetch_data
def refetch_data(
*,
signed_url: bool = False,
images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
client_metadata: bool = False)
Fetches the most up-to-date data from the server. If any of the parameters are falsy, the corresponding cached values are not updated.
Arguments:
signed_url - If True, this will fetch a generated signed url of the data asset.
images_data_fetch_options - If not None, this will fetch the image data of the data asset. You can
additionally specify what to fetch with the ImagesDataFetchOptions class.
client_metadata - If True, this will fetch the client metadata of the data asset.
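The contract that falsy parameters leave cached values untouched can be sketched with a minimal stand-in; DataRowStub and its hard-coded return values are hypothetical, as the real class fetches from the Encord backend:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImagesDataFetchOptions:
    fetch_signed_urls: bool = False

class DataRowStub:
    """Caches fetched values; refetch_data only refreshes what was requested."""

    def __init__(self) -> None:
        self._signed_url: Optional[str] = None
        self._client_metadata: Optional[dict] = None

    def refetch_data(self, *,
                     signed_url: bool = False,
                     images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
                     client_metadata: bool = False) -> None:
        if signed_url:
            # Stand-in for a server round trip that generates a signed URL.
            self._signed_url = "https://storage.example/signed?token=abc"
        if client_metadata:
            # Stand-in for fetching the client metadata from the server.
            self._client_metadata = {"source": "client-app"}

row = DataRowStub()
row.refetch_data(signed_url=True)  # only the signed URL is refreshed
```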
save
@deprecated(version="0.1.192", alternative="encord.storage.StorageItem.update")
def save() -> None
DEPRECATED: Use update() instead to update the underlying
StorageItem. You can access the UUID of the underlying
StorageItem using backing_item_uuid().
Sync local state to the server, if updates are made. This is a blocking function.
The newest values from the Encord server will update the current DataRow object.
DataRows Objects
@dataclasses.dataclass(frozen=True)
class DataRows(dict, Formatter)
This is a helper class that forms requests for filtered dataset rows. It is not intended to be used directly.
DatasetInfo Objects
@dataclasses.dataclass(frozen=True)
class DatasetInfo()
This class represents a dataset in the context of listing datasets.
Dataset Objects
class Dataset(dict, Formatter)
__init__
def __init__(title: str,
storage_location: str,
data_rows: List[DataRow],
dataset_hash: str,
description: Optional[str] = None,
backing_folder_uuid: Optional[UUID] = None)
DEPRECATED - prefer using the Dataset class instead.
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
DatasetDataInfo Objects
class DatasetDataInfo(BaseDTO)
Minimal information about a single data item in a dataset.
Arguments:
data_hash - Internal identifier of the data item.
title - Human-readable title applied to the data item.
backing_item_uuid - UUID of the storage item that backs this dataset data.
AddPrivateDataResponse Objects
@dataclasses.dataclass(frozen=True)
class AddPrivateDataResponse(Formatter)
Response of add_private_data_to_dataset
CreateDatasetResponse Objects
class CreateDatasetResponse(dict, Formatter)
__init__
def __init__(title: str, storage_location: int, dataset_hash: str,
user_hash: str, backing_folder_uuid: Optional[UUID])
This class has dict-style accessors for backwards compatibility.
First-time users of this class are encouraged to use the property accessors and setters instead of the underlying dictionary. Mixing the dict-style member functions with the property accessors and setters is discouraged.
WARNING: Do NOT use the .data member of this class. Using it can corrupt the data structure.
StorageLocation Objects
class StorageLocation(IntEnum)
Storage backends supported for datasets and data items.
The enum values indicate where the underlying media is stored, such
as Encord-managed storage or an external cloud provider. Some values
are legacy and may only appear for existing datasets.
Values:
- CORD_STORAGE: Encord-managed storage.
- AWS: AWS S3 bucket.
- GCP: Google Cloud Storage.
- AZURE: Azure Blob Storage.
- S3_COMPATIBLE: S3-compatible storage.
- NEW_STORAGE: This is a placeholder for a new storage location that is not yet supported by your SDK version.
Please update your SDK to the latest version.
DatasetType
For backwards compatibility
DatasetData Objects
class DatasetData(base_orm.BaseORM)
Video base ORM.
SignedVideoURL Objects
class SignedVideoURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedImageURL Objects
class SignedImageURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedImagesURL Objects
class SignedImagesURL(base_orm.BaseListORM)
A signed URL object with supporting information.
SignedAudioURL Objects
class SignedAudioURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedDicomURL Objects
class SignedDicomURL(base_orm.BaseORM)
A signed URL object with supporting information.
SignedDicomsURL Objects
class SignedDicomsURL(base_orm.BaseListORM)
A signed URL object with supporting information.
Video Objects
class Video(base_orm.BaseORM)
A video object with supporting information.
ImageGroup Objects
class ImageGroup(base_orm.BaseORM)
An image group object with supporting information.
Image Objects
class Image(base_orm.BaseORM)
An image object with supporting information.
SingleImage Objects
For native single image upload.
Audio Objects
class Audio(base_orm.BaseORM)
An audio object with supporting information.
Images Objects
@dataclasses.dataclass(frozen=True)
class Images()
Helper for uploading multiple images in batch mode.
DicomSeries Objects
@dataclasses.dataclass(frozen=True)
class DicomSeries()
Minimal information about a DICOM series belonging to a dataset.
Arguments:
data_hash - Internal identifier of the DICOM series.
title - Human-readable name or description of the series.
DicomDeidentifyTask Objects
@dataclasses.dataclass(frozen=True)
class DicomDeidentifyTask()
Task describing how to de-identify DICOM data in a dataset.
Arguments:
dicom_urls - List of DICOM object URLs to be de-identified.
integration_hash - Identifier of the integration or configuration used to carry
out the de-identification.
ImageGroupOCR Objects
@dataclasses.dataclass(frozen=True)
class ImageGroupOCR()
OCR results extracted from an image group.
Arguments:
processed_texts - Mapping of identifiers to recognized text blocks produced by
the OCR pipeline.
ReEncodeVideoTaskResult Objects
class ReEncodeVideoTaskResult(BaseDTO)
Result of a video re-encoding task.
Arguments:
data_hash - Identifier of the data item that was re-encoded.
signed_url - Optional signed URL for downloading the re-encoded video. Only
present when using CORD_STORAGE.
bucket_path - Path inside the storage bucket where the re-encoded video is
stored.
ReEncodeVideoTask Objects
class ReEncodeVideoTask(BaseDTO)
A re-encode video task object with supporting information.
DatasetAccessSettings Objects
@dataclasses.dataclass
class DatasetAccessSettings()
Settings for using the dataset object.
Controls whether client metadata is retrieved for each data_row.
ImagesDataFetchOptions Objects
@dataclasses.dataclass
class ImagesDataFetchOptions()
Whether to fetch signed URLs for each individual image. Only set this to True if you need to download the images.
Arguments:
fetch_signed_urls - If True, include signed URLs for image data so that the
media can be downloaded directly from storage.
LongPollingStatus Objects
class LongPollingStatus(str, Enum)
Represents the lifecycle status of a long-polling job submitted through the
Encord SDK or UI. These statuses are returned by asynchronous job endpoints
(for example: data upload, private dataset ingestion) to indicate the current state
of job execution.
This enum is stable and lists all possible job statuses returned
by the long-polling API. Client code should use these values to determine
whether a job is still running, has completed successfully, completed with
errors, or was explicitly canceled.
PENDING
Job will automatically start soon (waiting in queue) or already started processing.
DONE
Job has finished successfully (possibly with errors if ignore_errors=True).
If ignore_errors=False was specified in
add_private_data_to_dataset_start(),
the job will only have the status DONE if there were no errors.
If ignore_errors=True was specified in
add_private_data_to_dataset_start(),
the job will always show the status DONE once complete and will never show
ERROR status if this flag was set to True. There could be errors that were
ignored.
Information about number of errors and stringified exceptions is available in the
units_error_count: int and errors: List[str] attributes.
ERROR
Job has completed with errors. This can only happen if ignore_errors was set to
False. Information about errors is available in the units_error_count: int
and errors: List[str] attributes.
CANCELLED
Job was canceled explicitly by the user through the Encord UI or via the Encord
SDK using the add_data_to_folder_job_cancel method.
In the context of this status:
- The job may have been partially processed, but it was explicitly interrupted
before completion by a user action.
- Cancellation can occur either manually through the Encord UI or programmatically
using the SDK method
add_data_to_folder_job_cancel.
- Once a job is canceled, no further processing will occur, and any processed
data before the cancellation will be available.
- The presence of canceled data units (
units_cancelled_count) indicates that
some data upload units were interrupted and canceled before completion.
- If
ignore_errors was set to True, the job may continue despite errors, and
cancellation will only apply to the unprocessed units.
DataUnitError Objects
class DataUnitError(BaseDTO)
A description of an error for an individual upload item
object_urls
The URLs involved. A single item for videos and images; a list of frames for image groups and DICOM series.
error
The error message
subtask_uuid
Opaque ID of the process. Please quote this when contacting Encord support.
action_description
Human-readable description of the action that failed (e.g. ‘Uploading DICOM series’).
DatasetDataLongPolling Objects
class DatasetDataLongPolling(BaseDTO)
Response of the upload job’s long polling request.
Note: An upload job consists of job units, where a job unit can be a video, an image group, a DICOM series, or a single image.
status
Status of the upload job. Documented in detail in LongPollingStatus()
data_hashes_with_titles
Information about data which was added to the dataset.
errors
Stringified list of exceptions.
data_unit_errors
Structured list of per-item upload errors. See DataUnitError for more details.
units_pending_count
Number of upload job units that have pending status.
units_done_count
Number of upload job units that have done status.
units_error_count
Number of upload job units that have error status.
units_cancelled_count
Number of upload job units that have been canceled.
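Inspecting the unit counters after a job completes can be sketched like this; the dataclass below is a hypothetical stand-in carrying the documented fields, with made-up counts:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DatasetDataLongPolling:
    status: str
    errors: List[str] = field(default_factory=list)
    units_pending_count: int = 0
    units_done_count: int = 0
    units_error_count: int = 0
    units_cancelled_count: int = 0

res = DatasetDataLongPolling(
    status="DONE",  # DONE despite errors implies ignore_errors=True was set
    units_done_count=9,
    units_error_count=1,
    errors=["Traceback: could not decode video"],
)

# The four counters partition the job units.
total_units = (res.units_pending_count + res.units_done_count
               + res.units_error_count + res.units_cancelled_count)
print(total_units)  # 10
```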
DatasetLinkItems Objects
@dataclasses.dataclass(frozen=True)
class DatasetLinkItems()
Mapping between a dataset and its underlying storage items.
Arguments:
items - List of storage item identifiers linked to the dataset.
CreateDatasetPayload Objects
class CreateDatasetPayload(BaseDTO)
Payload for creating a new dataset.
Arguments:
title - Title of the dataset to create.
description - Optional description of the dataset and its intended use.
create_backing_folder - If True, create a legacy “mirror” dataset together with a backing storage folder in a single operation. This behavior is retained for backwards compatibility.
legacy_call - Internal flag used for analytics to detect usage of legacy dataset creation flows. This field will be removed in a future version and should not be set manually.
create_backing_folder
This creates a legacy “mirror” dataset and its backing folder in one go.
legacy_call
This field will be removed soon.
CreateDatasetResponseV2 Objects
class CreateDatasetResponseV2(BaseDTO)
Response returned when creating a dataset (current format).
Arguments:
dataset_uuid - UUID of the newly created dataset.
backing_folder_uuid - Optional UUID of the backing folder created alongside the
dataset, if applicable.
A non-None value indicates a legacy “mirror” dataset was created.
backing_folder_uuid
A non-None value indicates a legacy “mirror” dataset was created.
DatasetsWithUserRolesListParams Objects
class DatasetsWithUserRolesListParams(BaseDTO)
Filter parameters for listing datasets together with user roles.
Arguments:
title_eq - Optional filter to return only datasets whose title exactly
matches the given string.
title_cont - Optional filter to return only datasets whose title contains
the given substring.
created_before - If set, only datasets created before this timestamp are
returned.
created_after - If set, only datasets created on or after this timestamp are
returned.
edited_before - If set, only datasets last edited before this timestamp are
returned.
edited_after - If set, only datasets last edited on or after this timestamp
are returned.
include_org_access - If True, include datasets that are visible through
organisation-level access in addition to user-level sharing.
DatasetWithUserRole Objects
class DatasetWithUserRole(BaseDTO)
Dataset with the role of the current user attached.
Arguments:
dataset_uuid - UUID of the dataset.
title - Title of the dataset.
description - Description of the dataset.
created_at - Timestamp when the dataset was created.
last_edited_at - Timestamp when the dataset was last modified.
user_role - Role of the requesting user on this dataset, if any.
storage_location - Storage location of the dataset’s underlying data, if known.
backing_folder_uuid - UUID of the legacy backing folder if this dataset was created
as a “mirror” dataset.
storage_location
Legacy field: datasets can now contain data from mixed storage locations.
backing_folder_uuid
If set, this indicates a legacy ‘mirror’ dataset.
DatasetsWithUserRolesListResponse Objects
class DatasetsWithUserRolesListResponse(BaseDTO)
Response payload for listing datasets with user roles.
Arguments:
result - List of datasets together with the role of the current user.