Datasets are subsets of your files that can be attached to one or more Projects for annotation. Datasets are created from files you upload to Encord.
After a Dataset has been created, you can attach data.
To ensure smoother uploads and faster completion times, and to avoid hitting absolute file limits, we recommend adding data in smaller batches. Limit each upload to at most 100 videos or 1,000 images. You can also create multiple Datasets, all of which can be linked to a single Project. Familiarize yourself with our limits and best practices for data import and registration before adding data to Encord.
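If you prepare uploads with a script, the batching recommendation above can be sketched as follows. This is a minimal illustration, not part of any Encord SDK: the function names `chunk` and `plan_batches` and the extension sets are assumptions made for the example, and it interprets the guidance as keeping videos and images in separate batches.

```python
from pathlib import Path
from typing import Iterator

# Batch sizes taken from the recommendation above.
MAX_VIDEOS_PER_BATCH = 100
MAX_IMAGES_PER_BATCH = 1_000

# Assumed extension sets for illustration; adjust to the formats you use.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def chunk(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batches(paths: list[str]) -> list[list[str]]:
    """Group paths into video-only and image-only batches within the limits.

    Files with an unrecognized extension are skipped.
    """
    videos = [p for p in paths if Path(p).suffix.lower() in VIDEO_EXTENSIONS]
    images = [p for p in paths if Path(p).suffix.lower() in IMAGE_EXTENSIONS]
    return [
        *chunk(videos, MAX_VIDEOS_PER_BATCH),
        *chunk(images, MAX_IMAGES_PER_BATCH),
    ]
```

Each returned batch can then be uploaded on its own, using whichever upload method you prefer.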
Select the folders containing the files you want to attach to the Dataset. To select individual files, double-click a folder to see its contents, and select the files you want to add to the Dataset.
Click Attach data to attach the selected files to the Dataset.
Select the Dataset you want to add data to.
Click +Upload files.
Select a folder to store the files in, or create a new folder.
Select the Import from private cloud tab and select the integration you want to use.
Mirrored Datasets provide a continuity solution that retains the organization of data prior to the release of Index. With the transition to Index, all existing data within Datasets has been transferred to Files in the form of Mirrored Datasets. Mirrored Datasets can be managed using both the Files and Datasets sections of the Encord platform.
For example, moving a file named “chicken.mp4” from a mirrored Dataset called “Animal videos” to another mirrored Dataset called “Chicken videos” makes “chicken.mp4” visible in all Projects associated with “Chicken videos”.
The following diagram illustrates how Datasets relate to other entities in Encord.
The following diagram shows how entities in Encord are organized.
Collaborator permissions can be set in the Team section of the Dataset Settings.
| Permission | Admin | Viewer |
|---|---|---|
| View dataset | ✅ | ✅ |
| Add data | ✅ | ❌ |
| Adjust settings | ✅ | ❌ |
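The permission table above can also be read as a simple lookup. The sketch below is only an illustrative model of that table, for example for documenting or checking access rules in your own tooling. The names `Role`, `PERMISSIONS`, and `is_allowed` are hypothetical and do not correspond to an Encord API.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "Admin"
    VIEWER = "Viewer"


# Hypothetical action names; each maps to the roles allowed in the table above.
PERMISSIONS: dict[str, set[Role]] = {
    "view_dataset": {Role.ADMIN, Role.VIEWER},
    "add_data": {Role.ADMIN},
    "adjust_settings": {Role.ADMIN},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if `role` may perform `action`; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())
```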
Use the Datasets tab in the navigation bar to manage your Datasets.
Click a Dataset to:
The dashboard is split into two tabs:
Use the Data tab to upload new files and manage existing files.
A - Click to select files, or drag and drop them into the area highlighted below, to add them to the Dataset.
B - Manage files contained in the Dataset.
The Team pane shows a list of collaborators on the Dataset.
The Projects pane shows a list of Projects using the Dataset.
Click View to navigate to that Project.
Use the Danger zone pane to delete Datasets.
Click the Delete dataset button to delete the entire Dataset. To confirm the deletion, type the word ‘delete’ into the pop-up that appears.
Organization Admins can search for and join any Dataset that exists within the Organization.
When an Organization Admin joins a Dataset, they are automatically assigned the Admin user role for that Dataset.