Your data is registered to the Files section of Index where it is organized into folders and sub-folders.
Registering your cloud data with Encord only references the files. Your data is not stored on Encord servers.
Registering your data into Encord is a multi-step process:
Before you can do anything with the Encord platform and cloud storage, you need to configure your cloud storage to work with Encord. Once the integration between Encord and your cloud storage is complete, you can then use your data in Encord.
In order to integrate with AWS S3, you need to:
You have the following options to integrate AWS and Encord:
In the Integrations section of the Encord platform, click +New integration to create a new integration.
Select AWS S3 at the top of the chooser.
All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the same way, by using a JSON or CSV file. The file includes links to all images, image groups, videos and DICOM files in your cloud storage.
Encord enforces the following upload limits for each JSON file used for file registration:
Optimal upload chunking can vary depending on your data type and the amount of associated metadata. For tailored recommendations, contact Encord support. We recommend starting with smaller uploads and gradually increasing the size based on how quickly jobs are processed. Generally, smaller chunks result in faster data reflection within the platform.
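The chunking advice above can be sketched in Python. This is a minimal illustration, not Encord's own tooling: the top-level "videos" key, the example URLs, and the chunk size are assumptions made for the sketch, so check them against the JSON format reference before use.

```python
import json
from typing import Iterator


def chunked(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_video_specs(urls: list[str], chunk_size: int) -> list[str]:
    """Return one JSON document (as a string) per chunk of URLs."""
    specs = []
    for chunk in chunked(urls, chunk_size):
        # "videos" is an assumed top-level key; "objectUrl" is the field
        # named in this document.
        spec = {"videos": [{"objectUrl": url} for url in chunk]}
        specs.append(json.dumps(spec, indent=2))
    return specs


# Hypothetical object URLs, for illustration only.
urls = [f"https://example-bucket.s3.amazonaws.com/clip_{i}.mp4" for i in range(5)]
specs = build_video_specs(urls, chunk_size=2)
print(len(specs))
```

Each string in specs can be saved as its own JSON file and registered separately, which makes it easy to start small and raise chunk_size once jobs are processing quickly.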
clientMetadata) to specify key frames, custom metadata, and custom embeddings. For more information go here or here for information on using the SDK.
For detailed information about the JSON file format used for import go here.
The information provided about each of the following data types is designed to get you up and running as quickly as possible without going too deeply into the why or how. Look at the template for each data type, then the examples, and adjust the examples to suit your needs.
If skip_duplicate_urls is set to true, all object URLs that exactly match existing images/videos in the dataset are skipped.
Audio files
The following is an example JSON file for uploading two audio files to Encord.
audiometadata flag. When the audiometadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.
Text Files
The following is an example JSON file for uploading text files to Encord.
Single images
For detailed information about the JSON file format used for import go here.
The JSON structure for single images parallels that of videos.
Template: Provides the proper JSON format to import images into Encord.
Examples:
Image groups
For detailed information about the JSON file format used for import go here.
If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.
objectUrl_{position_number}).
Template: Provides the proper JSON format to import image groups into Encord.
Examples:
Image sequences
For detailed information about the JSON file format used for import go here.
Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.
Image sequences differ from image groups in requiring the createVideo flag to be set to true. Both use the key image_groups.
objectUrl_{position_number}).
Template: Provides the proper JSON format to import image sequences into Encord.
Examples:
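As a rough sketch of the structure described above, the following Python builds one image group and one image sequence under the image_groups key. The "title" field name, the zero-based position numbering, and the file URLs are assumptions for illustration; only image_groups, objectUrl_{position_number}, createVideo, and skip_duplicate_urls are named in this document.

```python
import json


def image_group_entry(title: str, urls: list[str], create_video: bool) -> dict:
    """Build one image_groups entry with objectUrl_{position_number} keys."""
    # "title" is an assumed field name for this sketch.
    entry = {"title": title, "createVideo": create_video}
    for position, url in enumerate(urls):
        # Numbering from 0 is an assumption; adjust if the format starts at 1.
        entry[f"objectUrl_{position}"] = url
    return entry


# Hypothetical URLs, for illustration only.
group = image_group_entry(
    "Group 1",
    ["https://example-bucket/frame1.jpg", "https://example-bucket/frame2.jpg"],
    create_video=False,  # false: stays an image group
)
sequence = image_group_entry(
    "Sequence 1",
    ["https://example-bucket/frame3.jpg", "https://example-bucket/frame4.jpg"],
    create_video=True,  # true: treated as an image sequence
)

spec = {"image_groups": [group, sequence], "skip_duplicate_urls": True}
print(json.dumps(spec, indent=2))
```

The only difference between the two entries is the createVideo value, which mirrors the distinction drawn in the text.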
DICOM
For detailed information about the JSON file format used for import go here.
The dicom_series element can contain one or more DICOM series.
If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset will be skipped.
.dcm file and does not have to be specified during the upload to Encord. The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to an individual DICOM series.
For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries files are displayed in the Encord application.
Multiple file types
You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.
When using a Multi-Region Access Point for your AWS S3 buckets the JSON file has to be slightly different from the examples provided. Instead of an object’s URL, objects are specified using the ARN of the Multi-Region Access Point followed by the object name. The example below shows how video files from a Multi-Region Access Point would be specified.
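A minimal sketch of building such references in Python follows. The "/" separator between the ARN and the object name, the "videos" key, and the placeholder account ID and alias are all assumptions; substitute your own Multi-Region Access Point ARN and confirm the exact format before registering data.

```python
import json


def mrap_object_ref(mrap_arn: str, object_name: str) -> str:
    """Join a Multi-Region Access Point ARN and an object name."""
    # Assumes a single "/" separates the ARN from the object name.
    return f"{mrap_arn.rstrip('/')}/{object_name.lstrip('/')}"


# Placeholder ARN, for illustration only.
MRAP_ARN = "arn:aws:s3::123456789012:accesspoint/example-alias.mrap"

object_names = ["videos/clip_1.mp4", "videos/clip_2.mp4"]
spec = {
    # "videos" is an assumed top-level key for this sketch.
    "videos": [{"objectUrl": mrap_object_ref(MRAP_ARN, name)} for name in object_names]
}
print(json.dumps(spec, indent=2))
```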
In the CSV file format, the column headers specify which type of data is being uploaded. You can add a single file format at a time, or combine multiple data types in a single CSV file.
Details for each data format are given in the sections below.
ObjectUrl column is interpreted as a request for video upload. If your objects are of a different type (for example, images), this error displays: “Expected a video, got a file of type XXX”.
Videos
A CSV file containing videos should contain two columns with the following mandatory column headings: ‘ObjectURL’ and ‘Video title’. All headings are case-insensitive.
The ‘ObjectURL’ column contains the objectUrl. This field is mandatory for each file, as it specifies the full URL of the video resource.
The ‘Video title’ column contains the video_title. If left blank, the original file name is used.
In the example below files 1, 2 and 4 will be assigned the names in the title column, while file 3 will keep its original file name.
ObjectUrl | Video title |
---|---|
path/to/storage-location/frame1.mp4 | Video 1 |
path/to/storage-location/frame2.mp4 | Video 2 |
path/to/storage-location/frame3.mp4 | |
path/to/storage-location/frame4.mp4 | Video 3 |
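The table above can be produced with Python's standard csv module. This is a sketch that assumes a comma-delimited file; the paths are the placeholder paths from the table.

```python
import csv
import io


def videos_csv(rows: list[tuple[str, str]]) -> str:
    """Return CSV text with the two mandatory video headings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ObjectUrl", "Video title"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ("path/to/storage-location/frame1.mp4", "Video 1"),
    ("path/to/storage-location/frame2.mp4", "Video 2"),
    ("path/to/storage-location/frame3.mp4", ""),  # blank: original file name is kept
    ("path/to/storage-location/frame4.mp4", "Video 3"),
]
csv_text = videos_csv(rows)
print(csv_text)
```

Using csv.writer rather than string concatenation keeps titles that contain commas or quotes correctly escaped.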
Single images
A CSV file containing single images should contain two columns with the following mandatory headings: ‘ObjectURL’ and ‘Image title’. All headings are case-insensitive.
The ‘ObjectURL’ column contains the objectUrl. This field is mandatory for each file, as it specifies the full URL of the image resource.
The ‘Image title’ column contains the image_title. If left blank, the original file name is used.
In the example below files 1, 2 and 4 will be assigned the names in the title column, while file 3 will keep its original file name.
ObjectUrl | Image title |
---|---|
path/to/storage-location/frame1.jpg | Image 1 |
path/to/storage-location/frame2.jpg | Image 2 |
path/to/storage-location/frame3.jpg | |
path/to/storage-location/frame4.jpg | Image 3 |
Image groups
A CSV file containing image groups should contain three columns with the following mandatory headings: ‘ObjectURL’, ‘Image group title’, and ‘Create video’. All three headings are case-insensitive.
The ‘ObjectURL’ column contains the objectUrl. This field is mandatory for each file, as it specifies the full URL of the resource.
The ‘Image group title’ column contains the image_group_title. This field is mandatory, as it determines which image group a file will be assigned to.
In the example below the first two URLs are grouped together into ‘Group 1’, while the following two files are grouped together into ‘Group 2’.
ObjectUrl | Image group title | Create video |
---|---|---|
path/to/storage-location/frame1.jpg | Group 1 | false |
path/to/storage-location/frame2.jpg | Group 1 | false |
path/to/storage-location/frame3.jpg | Group 2 | false |
path/to/storage-location/frame4.jpg | Group 2 | false |
Image sequences
A CSV file containing image sequences should contain three columns with the following mandatory headings: ‘ObjectURL’, ‘Image group title’, and ‘Create video’. All three headings are case-insensitive.
The ‘ObjectURL’ column contains the objectUrl. This field is mandatory for each file, as it specifies the full URL of the resource.
The ‘Image group title’ column contains the image_group_title. This field is mandatory, as it determines which image sequence a file will be assigned to. The dimensions of the image sequence are determined by the first file in the sequence.
The ‘Create video’ column can be left blank, as the default value is ‘true’.
In the example below the first two URLs are grouped together into ‘Sequence 1’, while the second two files are grouped together into ‘Sequence 2’.
ObjectUrl | Image group title | Create video |
---|---|---|
path/to/storage-location/frame1.jpg | Sequence 1 | true |
path/to/storage-location/frame2.jpg | Sequence 1 | true |
path/to/storage-location/frame3.jpg | Sequence 2 | true |
path/to/storage-location/frame4.jpg | Sequence 2 | true |
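A short Python sketch for generating this three-column layout from a mapping of titles to files is shown here. It assumes a comma-delimited file and writes the ‘Create video’ value explicitly on every row; the function name and input shape are hypothetical.

```python
import csv
import io


def grouped_images_csv(groups: dict[str, list[str]], create_video: bool) -> str:
    """Return CSV text for image sequences (True) or image groups (False)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ObjectUrl", "Image group title", "Create video"])
    flag = "true" if create_video else "false"
    for title, urls in groups.items():
        # Files sharing a title end up in the same sequence or group.
        for url in urls:
            writer.writerow([url, title, flag])
    return buffer.getvalue()


sequences = {
    "Sequence 1": ["path/to/storage-location/frame1.jpg", "path/to/storage-location/frame2.jpg"],
    "Sequence 2": ["path/to/storage-location/frame3.jpg", "path/to/storage-location/frame4.jpg"],
}
sequence_csv = grouped_images_csv(sequences, create_video=True)
print(sequence_csv)
```

Calling the same function with create_video=False yields the image group layout, since the two formats differ only in that column.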
DICOM
A CSV file containing DICOM files should contain two columns with the following mandatory headings: ‘ObjectURL’ and ‘Dicom title’. Both headings are case-insensitive.
The ‘ObjectURL’ column contains the objectUrl. This field is mandatory for each file, as it specifies the full URL of the resource.
The ‘Series title’ column contains the dicom_title. When two files are given the same title they are grouped into the same DICOM series. If left blank, the original file name is used.
In the example below the first two files are grouped into ‘dicom series 1’, the next two files are grouped into ‘dicom series 2’, while the final file will remain separated as ‘dicom series 3’.
ObjectUrl | Series title |
---|---|
path/to/storage-location/frame1.dcm | dicom series 1 |
path/to/storage-location/frame2.dcm | dicom series 1 |
path/to/storage-location/frame3.dcm | dicom series 2 |
path/to/storage-location/frame4.dcm | dicom series 2 |
path/to/storage-location/frame5.dcm | dicom series 3 |
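Before registering a DICOM CSV, it can help to preview how files will be grouped. The sketch below is a local check only, not Encord's parser: it assumes a comma-delimited file, accepts either ‘Series title’ or ‘Dicom title’ as the title heading (this page uses both), and falls back to the file name when the title is blank.

```python
import csv
import io
from collections import defaultdict
from pathlib import PurePosixPath


def preview_series(csv_text: str) -> dict[str, list[str]]:
    """Group object URLs by series title, mimicking the rules described above."""
    series: dict[str, list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Headings are case-insensitive, so normalise them before lookup.
        fields = {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        url = fields["objecturl"]
        title = (
            fields.get("series title")
            or fields.get("dicom title")
            or PurePosixPath(url).name  # blank title: use the file name
        )
        series[title].append(url)
    return dict(series)


sample = """ObjectUrl,Series title
path/to/storage-location/frame1.dcm,dicom series 1
path/to/storage-location/frame2.dcm,dicom series 1
path/to/storage-location/frame3.dcm,dicom series 2
path/to/storage-location/frame4.dcm,dicom series 2
path/to/storage-location/frame5.dcm,dicom series 3
"""
grouped = preview_series(sample)
for title, files in grouped.items():
    print(title, len(files))
```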
Multiple file types
You can upload multiple file types with a single CSV file by using a new header each time there is a change of file type. Three headings will be required if image sequences are included.
Because the ‘Create video’ column defaults to true, all files that are not image sequences must contain the value false.
The example below shows a CSV file for the following:
ObjectUrl | Image group title | Create video |
---|---|---|
path/to/storage-location/frame1.jpg | Sequence 1 | true |
path/to/storage-location/frame2.jpg | Sequence 1 | true |
path/to/storage-location/frame3.jpg | Sequence 2 | true |
path/to/storage-location/frame4.jpg | Sequence 2 | true |
path/to/storage-location/frame5.jpg | Group 1 | false |
path/to/storage-location/frame6.jpg | Group 1 | false |
ObjectUrl | Image title | Create video |
path/to/storage-location/frame1.jpg | Image 1 | false |
ObjectUrl | Video title | Create video |
full/storage/path/video.mp4 | Video 1 | false |
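The "new header on each change of file type" rule can be sketched as follows. This is an illustrative helper, not an official tool: the kind labels and the HEADERS mapping are hypothetical, and the ‘Video title’ heading for the video block is an assumption carried over from the single-type video format.

```python
import csv
import io

# Assumed heading sets per file type, each with the third 'Create video' column.
HEADERS = {
    "image_group": ["ObjectUrl", "Image group title", "Create video"],
    "image": ["ObjectUrl", "Image title", "Create video"],
    "video": ["ObjectUrl", "Video title", "Create video"],
}


def mixed_csv(items: list[tuple[str, str, str, bool]]) -> str:
    """Write (kind, url, title, create_video) items, repeating the header on type change."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    previous_kind = None
    for kind, url, title, create_video in items:
        if kind != previous_kind:
            writer.writerow(HEADERS[kind])
            previous_kind = kind
        # Only image sequences carry true; everything else must say false.
        writer.writerow([url, title, "true" if create_video else "false"])
    return buffer.getvalue()


items = [
    ("image_group", "path/to/storage-location/frame1.jpg", "Sequence 1", True),
    ("image_group", "path/to/storage-location/frame2.jpg", "Sequence 1", True),
    ("image_group", "path/to/storage-location/frame5.jpg", "Group 1", False),
    ("image", "path/to/storage-location/frame1.jpg", "Image 1", False),
    ("video", "full/storage/path/video.mp4", "Video 1", False),
]
mixed_text = mixed_csv(items)
print(mixed_text)
```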