ℹ️

Note

This documentation is only relevant for customers with early access to Encord Storage. Contact [email protected] to learn more and gain access.

If you do not have access to Storage, please see our documentation here to learn about Datasets.

Creating Datasets

See our video on creating Datasets, or follow the step-by-step tutorial below to learn how to create Datasets in Encord.

  1. In the Datasets section under the Index heading, click the New dataset button to create a new Dataset.
  2. Give your Dataset a meaningful title and description. A clear title and description keep your data organized.

👍

Tip

Toggle Create a synced folder with this dataset to sync this Dataset with a folder. Creating a synced folder allows you to upload files to the Dataset directly. Any files added to the Dataset will appear in the corresponding synced folder, and any files added to the synced folder will automatically appear in the Dataset.

  3. Click Create dataset to create the Dataset.

Attach data

Data can be added to a Dataset once it has been created.

👍

Tip

We recommend uploading data in smaller batches: limit each upload to at most 100 videos and 1000 images. You can create multiple Datasets and link all of them to a single Project. Familiarize yourself with our limits and best practices for data import before uploading data to Encord.
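
As a rough illustration of the batch sizes recommended above, the following Python sketch splits a list of local filenames into batches of at most 100 videos and 1000 images. It is not part of any Encord tooling: the function names and the extension lists are hypothetical and only serve the example.

```python
from pathlib import Path
from typing import Iterator

# Batch limits taken from the tip above.
MAX_VIDEOS_PER_BATCH = 100
MAX_IMAGES_PER_BATCH = 1000

# Hypothetical extension lists; adjust them to the formats you actually use.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def chunk(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items` with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batches(filenames: list[str]) -> list[dict[str, list[str]]]:
    """Group files into upload batches that respect both limits."""
    videos = [f for f in filenames if Path(f).suffix.lower() in VIDEO_EXTENSIONS]
    images = [f for f in filenames if Path(f).suffix.lower() in IMAGE_EXTENSIONS]
    video_chunks = list(chunk(videos, MAX_VIDEOS_PER_BATCH))
    image_chunks = list(chunk(images, MAX_IMAGES_PER_BATCH))

    batches = []
    for i in range(max(len(video_chunks), len(image_chunks))):
        batches.append({
            "videos": video_chunks[i] if i < len(video_chunks) else [],
            "images": image_chunks[i] if i < len(image_chunks) else [],
        })
    return batches
```

Each entry returned by `plan_batches` can then be uploaded as one batch, for example through the upload dialog described in the sections that follow.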

👍

Tip

You can add data to Datasets from Storage.

Unsynced Datasets

  1. Select the Dataset you want to add data to.

  2. Click +Attach data.

  3. Select the data you want to add. Click Attach data to confirm your selection.

ℹ️

Note

You can add data from synced and unsynced folders.

Synced Datasets

Datasets that are synced are indicated by the icon and are linked to a specific folder. This ensures that any data added to this folder is automatically included in the Dataset. Additionally, you can upload data directly to the Dataset, and it will be automatically saved in the linked folder, which shares the same name.
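
The two-way behavior described above can be pictured with a small toy model in Python. This is only a conceptual sketch, not how Encord implements syncing: the class and method names are hypothetical.

```python
class SyncedPair:
    """Toy model of a synced Dataset and its linked folder (same name)."""

    def __init__(self, name: str) -> None:
        self.name = name  # shared by the Dataset and the folder
        self.dataset_files: set[str] = set()
        self.folder_files: set[str] = set()

    def add_to_folder(self, filename: str) -> None:
        # Data added to the folder is automatically included in the Dataset.
        self.folder_files.add(filename)
        self.dataset_files.add(filename)

    def upload_to_dataset(self, filename: str) -> None:
        # Data uploaded to the Dataset is saved in the linked folder.
        self.dataset_files.add(filename)
        self.folder_files.add(filename)
```

Whichever side receives a file, both sides end up with the same contents.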

  1. Select the Dataset you want to add data to.

  2. Click +Add data.

  3. Select the type of data you want to add:
  • A - Upload: Drag and drop local images and / or videos. Click Upload to finish.
  • B - Batch images as: Create an image group or image sequence from local images. See the section on creating image groups and image sequences for more information. Click Upload and batch images to finish.
  • C - DICOM: Create a DICOM series from local DICOM files. Click Upload to finish.
  • D - Import from private cloud: Add any data stored in your cloud storage. At least one data integration is required to upload cloud data. Learn how to upload private cloud data here. Click Import to finish.

Upload cloud storage data

  1. Create a correctly formatted JSON or CSV file specifying the data you would like to add to the Dataset.

  2. Click the upload area, or drag and drop the JSON or CSV file that specifies which cloud storage data to upload. Your stored objects may include file types that Encord does not support, which can produce errors on upload. To skip these, enable the Ignore individual file errors toggle. Click Add data when you're ready.
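
Step 1 above can be scripted. The following Python sketch builds a JSON specification from lists of object URLs and writes it to disk. The key names used here (`videos`, `images`, `objectUrl`) are assumptions made for illustration; confirm the exact schema in the private cloud import documentation before uploading. The function names are hypothetical.

```python
import json
from pathlib import Path


def build_upload_spec(video_urls: list[str], image_urls: list[str]) -> dict:
    """Build a dictionary describing the cloud objects to add.

    The key names ("videos", "images", "objectUrl") are assumptions;
    check the cloud data import documentation for the exact schema.
    """
    return {
        "videos": [{"objectUrl": url} for url in video_urls],
        "images": [{"objectUrl": url} for url in image_urls],
    }


def write_upload_spec(spec: dict, path: str) -> Path:
    """Serialize the specification to a JSON file and return its path."""
    out = Path(path)
    out.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    return out
```

For example, `write_upload_spec(build_upload_spec(video_urls, image_urls), "upload.json")` produces a file that can be dragged into the upload area in step 2.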

ℹ️

Note

The data is fetched from your cloud storage and processed asynchronously. This processing involves fetching appropriate metadata and other file information to help the Encord platform render the files appropriately and to check for any framerate inconsistencies. We do not store your files in any way.

ℹ️

Note

For information on how to add data from your private cloud, go here.


Entity relationships

The following diagram illustrates how Datasets relate to other entities in Encord.

  • Projects bring together Ontologies, Datasets, Workflows, and collaborators.
  • A Project can have multiple Datasets attached to it, but only one Ontology.
  • One Ontology can be attached to multiple Projects.
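
The relationships listed above can also be written down as a small data model. The Python sketch below is purely illustrative and is not the Encord SDK: the class names mirror the entities, Workflows and collaborators are left out for brevity, and all fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    title: str


@dataclass
class Dataset:
    title: str


@dataclass
class Project:
    title: str
    ontology: Ontology  # exactly one Ontology per Project
    datasets: list[Dataset] = field(default_factory=list)  # any number of Datasets

    def attach_dataset(self, dataset: Dataset) -> None:
        """Attach a Dataset, ignoring one that is already attached."""
        if dataset not in self.datasets:
            self.datasets.append(dataset)
```

Because a `Project` only holds a reference to its `Ontology`, the same `Ontology` instance can be passed to several Projects, which matches the last bullet above.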

Roles and permissions

Collaborator permissions can be set in the Team section of the Dataset Settings.

Permission        Admin   Viewer
View dataset      Yes     Yes
Add data          Yes     No
Adjust settings   Yes     No

Manage Datasets

Use the Datasets tab in the Navigation bar to manage your Datasets.

Click a Dataset to:

  • Upload additional files to the Dataset.
  • Remove files from the Dataset.
  • Manage who has access to the Dataset.

The dashboard is split into two tabs: Data and Settings.


Data tab

Use the Data tab to upload new files and manage existing ones.

  • A - Click and select (or drag-and-drop) files into the area highlighted below to upload files to a Dataset.

  • B - Manage files contained in the Dataset.

    • Edit the filename by clicking the icon.
    • Select a file by clicking the checkbox next to the file name.
    • Select a file and press the delete icon to remove the file from the Dataset.
    • Re-encode a file by selecting it and pressing the Re-encode(auto) button.

Settings tab

Team

The Team pane shows a list of collaborators on the Dataset.

  • Invite collaborators by clicking the + Invite collaborators button and adding their emails.
  • New collaborators are assigned the 'Viewer' role by default. A 'Viewer' cannot make changes to the Dataset; only an 'Admin' can.
  • Collaborators can be upgraded to 'Admin' using the three dots to the right of their name.
  • Click the icon to delete a collaborator.

ℹ️

Note

An 'Admin' cannot be reverted to a 'Viewer'. To downgrade a user, you must delete and re-invite them.
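
The role rules described in this section can be summarized in a short Python sketch. It is a conceptual model only, not Encord code: the `Team` class and its methods are hypothetical and exist to show the default role, the upgrade path, and the delete-and-re-invite route back to 'Viewer'.

```python
from enum import Enum


class Role(Enum):
    VIEWER = "Viewer"
    ADMIN = "Admin"


class Team:
    """Toy model of the collaborators on a single Dataset."""

    def __init__(self) -> None:
        self._roles: dict[str, Role] = {}

    def invite(self, email: str) -> None:
        # New collaborators are assigned the Viewer role by default.
        self._roles.setdefault(email, Role.VIEWER)

    def upgrade_to_admin(self, email: str) -> None:
        if email not in self._roles:
            raise KeyError(f"{email} is not a collaborator")
        self._roles[email] = Role.ADMIN

    def remove(self, email: str) -> None:
        self._roles.pop(email, None)

    def revert_to_viewer(self, email: str) -> None:
        # There is no in-place downgrade: delete, then re-invite.
        self.remove(email)
        self.invite(email)

    def role_of(self, email: str) -> Role:
        return self._roles[email]

    def can_modify(self, email: str) -> bool:
        # Only an Admin can make changes to the Dataset.
        return self._roles.get(email) is Role.ADMIN
```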

Linked Projects

The Projects pane shows a list of Projects using the Dataset.

Click on View to navigate to that Project.


Danger zone (Delete Datasets)

Use the Danger zone pane to delete Datasets.

Click the Delete dataset button to delete the entire Dataset. You are prompted to type the word 'delete' into the resulting pop-up to delete the Dataset.

🚧

Caution

Deleting a Dataset can't be undone. Make sure you want to perform this action before continuing.