Create Datasets
Creating a Dataset and adding files to a Dataset are two distinct steps. To add data to an existing Dataset, see the documentation on adding data to Datasets.
Datasets cannot be deleted using the SDK or the API. Use the Encord platform to delete Datasets.
The following example creates a Dataset called “Houses” that expects data hosted on AWS S3.
- Substitute `<private_key_path>` with the file path for your private key.
- Replace “Houses” with the name you want your Dataset to have.
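A minimal sketch of the example described above, assuming the Encord Python SDK (`encord`) is installed and the client authenticates with an SSH private key. The wrapper function `create_houses_dataset` is a hypothetical name used for illustration. The SDK imports are deferred into the function body so the snippet can be loaded without contacting Encord.

```python
def create_houses_dataset(private_key_path: str, title: str = "Houses"):
    """Create an empty Dataset that expects data hosted on AWS S3."""
    # Deferred imports: the SDK is only needed when the function is called.
    from encord import EncordUserClient
    from encord.orm.dataset import StorageLocation

    # Authenticate with the SSH private key registered on the Encord platform.
    user_client = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=private_key_path
    )

    # This only creates the Dataset; adding files is a separate step.
    return user_client.create_dataset(title, StorageLocation.AWS)


# Usage (requires a valid private key and network access):
# dataset = create_houses_dataset("<private_key_path>")
# print(dataset)
```

To target a different storage provider, swap `StorageLocation.AWS` for one of the arguments listed in the table of storage locations.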
Storage location | StorageLocation method argument | Represented by |
---|---|---|
AWS S3 | AWS | 1 |
GCP | GCP | 2 |
Azure blob | AZURE | 3 |
Open Telekom Cloud | OTC | 4 |
Encord storage | CORD_STORAGE | 0 |
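
The mapping in the table can be sketched as a small enum, which helps when a response reports the storage location as an integer. `StorageLocationSketch` and `storage_location_name` are hypothetical names used for illustration only. In real code, prefer the SDK's own `StorageLocation` (importable from `encord.orm.dataset`).

```python
from enum import IntEnum


class StorageLocationSketch(IntEnum):
    """Illustrative stand-in that mirrors the table; not the SDK's class."""

    CORD_STORAGE = 0  # Encord storage
    AWS = 1           # AWS S3
    GCP = 2           # GCP
    AZURE = 3         # Azure blob
    OTC = 4           # Open Telekom Cloud


def storage_location_name(value: int) -> str:
    """Return the argument name for an integer storage location code."""
    return StorageLocationSketch(value).name


print(storage_location_name(1))  # AWS
```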
Create a Dataset from Label Rows
Use the following script to create a new Dataset from the label rows of a specific Project.
- Replace `<private_key_path>` with the path to your private key.
- Replace `<project_hash>` with the hash of the Project containing the data units you want to create a new Dataset from.
- Replace `My new Dataset` with the name you want to give your new Dataset.
If `create_backing_folder` is `True`, a mirrored Dataset is created. Mirrored Datasets sync the content of the backing Folder with the Dataset.
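
The steps above could look roughly like the following sketch. It is not the official script. It rests on several assumptions that should be checked against the SDK reference for your installed version: that each label row exposes a `backing_item_uuid` attribute, that `Dataset.link_items` attaches existing storage items to a Dataset, and that `StorageLocation.CORD_STORAGE` is the appropriate storage location for your data. The function name `create_dataset_from_label_rows` is hypothetical.

```python
def create_dataset_from_label_rows(
    private_key_path: str,
    project_hash: str,
    title: str = "My new Dataset",
    create_backing_folder: bool = False,
):
    """Create a Dataset containing the data units behind a Project's label rows."""
    # Deferred imports: the SDK is only needed when the function is called.
    from encord import EncordUserClient
    from encord.orm.dataset import StorageLocation

    user_client = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=private_key_path
    )

    # Collect the storage item behind every label row in the Project.
    project = user_client.get_project(project_hash)
    # ASSUMPTION: label rows expose `backing_item_uuid`.
    item_uuids = [row.backing_item_uuid for row in project.list_label_rows_v2()]

    # ASSUMPTION: CORD_STORAGE suits the data; pick the location that matches yours.
    response = user_client.create_dataset(
        title,
        StorageLocation.CORD_STORAGE,
        create_backing_folder=create_backing_folder,
    )

    # ASSUMPTION: `link_items` attaches existing storage items to the new Dataset.
    dataset = user_client.get_dataset(response.dataset_hash)
    dataset.link_items(item_uuids)
    return dataset


# Usage (requires a valid private key and network access):
# create_dataset_from_label_rows("<private_key_path>", "<project_hash>")
```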
List existing Datasets
Use the `get_datasets()` method of `EncordUserClient` to query and list the Datasets available to the user.
The following example fetches all Datasets available to the user. Substitute `<private_key_path>` with the file path for your private key.
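
A minimal sketch of such a listing, assuming the Encord Python SDK is installed and the client authenticates with an SSH private key. The wrapper function `list_datasets` is a hypothetical name used for illustration. The exact shape of each returned entry depends on the SDK version, so the usage lines simply print whatever comes back.

```python
def list_datasets(private_key_path: str):
    """Return all Datasets available to the authenticated user."""
    # Deferred import: the SDK is only needed when the function is called.
    from encord import EncordUserClient

    user_client = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=private_key_path
    )

    # Called without filters, get_datasets() returns every accessible Dataset.
    return user_client.get_datasets()


# Usage (requires a valid private key and network access):
# for entry in list_datasets("<private_key_path>"):
#     print(entry)
```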
The Dataset hash can be found within the URL once a Dataset has been selected:
app.encord.com/projects/view/\<dataset_hash>/summary
or app.us.encord.com/projects/view/\<dataset_hash>/summary
The `type` attribute in the output refers to the `StorageLocation` used when creating the Dataset.