Supported Data
Comparing file formats
The following video tutorial explains all the different file formats in Encord.
Comparison of image file types
Feature | Single Image | Image Group | Image Sequence |
---|---|---|---|
‘Write’ permissions required in cloud storage? | No | No | Yes |
Multiple images per task? | No | Yes | Yes |
Data loss during creation? | No | No | Yes* |
ML features available for labeling? | No | No | Yes |
* Data is lost during the creation of an image sequence when images of varying dimensions are combined into a single image sequence. For more information about this, please see the section on image sequences below.
Supported file formats
Property | Single Image | Image Group | Image Sequence | Video |
---|---|---|---|---|
Supported file formats | .jpeg .png .webp .avif .bmp .tiff * .tif * | .jpeg .png .webp .avif .bmp .tiff * .tif * | .jpeg .png .webp .avif .bmp | .mp4 .mov * .webm .mkv |
* .mov files are only supported in the Safari browser. However, we strongly recommend using Chrome whenever possible.
Single images
Single images are individual images that are uploaded to Encord as separate files. Each single image constitutes its own data package, hence images will be listed individually.
We support the following file types for single images:
.jpeg, .png, .webp, .avif, .bmp, .tiff*, .tif*
* Due to Chromium-based browser limitations, TIFF files can only be viewed in the label editor using the Safari browser.
Image groups
Image groups are collections of images that are grouped together into a single data unit. They can contain images of varying orientations and no data is lost in the process of creating an image group.
For faster uploads, use image groups instead of image sequences. Image groups bundle multiple images into a single data unit, while image sequences convert images into a video, which requires more processing power and time.
We support the following file types for image groups:
.jpeg, .png, .webp, .avif, .bmp, .tiff*, .tif*
Image sequences
Image sequences are collections of images that are grouped together into a single data unit, and annotated in the same way as videos. As a result, image sequences are able to make use of powerful machine learning features such as automated labeling.
Only images with the same dimensions can be combined into an image sequence. If your upload includes images with varying dimensions, a separate image sequence is created for each unique dimension.
For faster uploads, use image groups instead of image sequences. Image groups bundle multiple images into a single data unit, while image sequences convert images into a video, which requires more processing power and time.
We support the following file types for image sequences:
.jpeg, .png, .webp, .avif, .bmp, .tiff*, .tif*
Creating image sequences
Image sequences can be created when importing files into Encord, or by batching images already sorted in Files into image sequences.
Only images with the same dimensions can be combined into an image sequence. If your upload includes images with varying dimensions, a separate image sequence is created for each unique dimension.
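The splitting rule above can be pictured with a short sketch. This is not Encord's implementation: the function name `group_by_dimensions` and the input shape (a mapping from file name to a `(width, height)` tuple) are assumptions made purely for illustration.

```python
from collections import defaultdict


def group_by_dimensions(image_sizes):
    """Group image names by their (width, height) in pixels.

    image_sizes: mapping of file name -> (width, height).
    Returns a dict mapping each unique (width, height) to the list of
    file names with that size, in input order.
    """
    groups = defaultdict(list)
    for name, size in image_sizes.items():
        groups[tuple(size)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical upload: two 1920x1080 images and one 1280x720 image.
    sizes = {
        "a.png": (1920, 1080),
        "b.png": (1280, 720),
        "c.png": (1920, 1080),
    }
    # Each printed group corresponds to one would-be image sequence.
    for dims, names in group_by_dimensions(sizes).items():
        print(dims, names)
```

Running a check like this before uploading shows how many image sequences to expect from a mixed folder.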
Videos
Video files can be uploaded onto the Encord platform and annotated frame by frame, enabling you to track elements within the video. Even though frames are treated as individual images when being annotated, the video itself constitutes a single data package.
The following video formats, containers, and codecs are supported.
Video Formats and Containers | Supported Codecs |
---|---|
matroska, webm | vp8 |
matroska, webm | vp9 |
matroska, webm | libdav1d |
mov, mp4, m4a, 3gp, 3g2, mj2 | libdav1d |
mov, mp4, m4a, 3gp, 3g2, mj2 | h264 |
mov, mp4, m4a, 3gp, 3g2, mj2 | h265 (hevc) |
For optimized performance we recommend using the following video formats:
- .mp4 file with h.265.
- .mp4 with vp9.
File uploads must be less than 1 GB or 100,000 frames (roughly 1 hour at 30 FPS). Please see our guidelines for optimal performance for more details.
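As a rough pre-upload check, the limits above can be expressed in a few lines. This is only a sketch under stated assumptions: it reads "1 GB" as 10**9 bytes, treats both limits as strict upper bounds that must hold together, and the function names are hypothetical.

```python
MAX_BYTES = 1_000_000_000  # assumption: "1 GB" interpreted as 10**9 bytes
MAX_FRAMES = 100_000       # frame limit quoted in the text above


def estimated_frames(duration_seconds, fps):
    """Estimate the frame count of a constant-frame-rate video."""
    return round(duration_seconds * fps)


def within_upload_limits(size_bytes, frame_count):
    """Return True if both the size and the frame count are under the limits."""
    return size_bytes < MAX_BYTES and frame_count < MAX_FRAMES


if __name__ == "__main__":
    # A hypothetical 55-minute clip at 30 fps weighing 800 MB.
    frames = estimated_frames(55 * 60, 30)
    print(frames, within_upload_limits(800_000_000, frames))
```

A full 60 minutes at 30 fps comes to 108,000 frames, so clips close to an hour long are worth checking before upload.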
Resizing videos
Videos can be re-sized using ffmpeg to fit the required specifications in the following ways:
- Lowering their resolution.
- Lowering their frame rate.
- Specifying a key frame interval.
Follow the steps below to reduce the video file size by adjusting the resolution, as well as the number of frames per second (fps).
- Download and install ffmpeg.
- Open Command Line (on Windows) or Terminal (on Mac & Linux).
- Run the command shown below, substituting the values for your desired resolution as well as fps, followed by the path to the video you’d like to downsize.
ffmpeg -i normal-video.mp4 -c:v libx265 -an -movflags faststart -tune zerolatency -f mp4 -vf "scale=-1:1080,fps=30" /Users/Desktop/normal-video-downsampled.mp4
The example above takes normal-video.mp4 and writes a copy at 30 fps with a resolution of -1:1080 to normal-video-downsampled.mp4. This means that the video height is set to 1080px, while the -1 adjusts the width so that the aspect ratio remains the same as in the original.
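If you want to script the downsampling, the command can be assembled in Python. The sketch below mirrors the flags of the command above. The helper names `build_downsample_command` and `scaled_width` are hypothetical, and `scaled_width` only approximates the width ffmpeg derives from `-1` (ffmpeg's own rounding may differ by a pixel).

```python
import shlex


def build_downsample_command(src, dst, height=1080, fps=30):
    """Build the ffmpeg argument list for resizing and resampling a video."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx265",
        "-an",                              # drop the audio track
        "-movflags", "faststart",
        "-tune", "zerolatency",
        "-f", "mp4",
        "-vf", f"scale=-1:{height},fps={fps}",
        dst,
    ]


def scaled_width(orig_width, orig_height, new_height):
    """Approximate the width chosen by scale=-1:<new_height>."""
    return round(orig_width * new_height / orig_height)


if __name__ == "__main__":
    cmd = build_downsample_command("normal-video.mp4", "normal-video-downsampled.mp4")
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmd))
    print(scaled_width(3840, 2160, 1080))
```

To actually run the command, pass the list to `subprocess.run(cmd, check=True)`. Passing a list avoids shell quoting problems with paths that contain spaces.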
The following example sets a generic key frame interval using the -g option (a key frame every 30 frames).
ffmpeg -i normal-video.mp4 -c:v libx265 -an -movflags faststart -tune zerolatency -f mp4 -g 30 /Users/Desktop/normal-video-downsampled.mp4
The following example sets the key frame interval with a codec-specific option, -x264opts keyint= (every 30 frames for the H.264 codec).
ffmpeg -i normal-video.mp4 -c:v libx264 -x264opts keyint=30 -an -movflags faststart -tune zerolatency -f mp4 /Users/Desktop/normal-video-downsampled.mp4
Pixel aspect ratio
To ensure accurate labels, only videos with a pixel aspect ratio of 1:1 should be used in Encord.
To check the pixel aspect ratio of your file, run the following ffprobe command, where input-file.mp4 is the full path to the file you want to check. ffprobe refers to pixel aspect ratio as ‘sample aspect ratio (SAR)’.
ffprobe -v error -select_streams v:0 -show_entries stream=sample_aspect_ratio -of default=noprint_wrappers=1:nokey=1 input-file.mp4
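ffprobe prints the ratio as `num:den` (for example `1:1`). When checking many files, the output can be interpreted with a small helper. The sketch below is illustrative only: the function names are hypothetical, and it treats an undefined ratio as "unknown" rather than guessing.

```python
from fractions import Fraction


def parse_sar(ffprobe_output):
    """Parse ffprobe's sample_aspect_ratio output into a Fraction.

    Returns None when the ratio is undefined ('N/A', empty, or zero terms).
    """
    text = ffprobe_output.strip()
    if not text or text == "N/A":
        return None
    num, den = (int(part) for part in text.split(":"))
    if num == 0 or den == 0:
        return None
    return Fraction(num, den)


def has_square_pixels(ffprobe_output):
    """Return True for a 1:1 ratio, False otherwise, None if undefined."""
    sar = parse_sar(ffprobe_output)
    return None if sar is None else sar == 1


if __name__ == "__main__":
    for sample in ("1:1", "4:3", "N/A"):
        print(sample, has_square_pixels(sample))
```

The output string can be captured with `subprocess.run([...], capture_output=True, text=True).stdout` using the ffprobe arguments shown above.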
ffprobe
command will output ‘N/A’ is the file’s SAR is not defined in the video’s metadata.Converting videos to images
In machine learning, most infrastructures and codebases are tailored for image data rather than video. Therefore, when working with videos, a common prerequisite is to convert videos into a series of images. This process is essential for training ML models that do not directly support video inputs. We outline two approaches for extracting images from videos, each catering to different needs.
Approach 1: Extracting Every Frame as an Image
This method involves decompressing each frame of a video into separate image files. It’s straightforward and useful when you need to analyze or process every frame individually. Navigate to the directory your video is in and run the following command in your Terminal, replacing your_video.mp4 with the name of your video file, and /path/to/output/dir with the path to your output directory.
ffmpeg -hide_banner -i your_video.mp4 -start_number 0 /path/to/output/dir/frame_%d.png
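Because the `%d` pattern produces unpadded numbers (frame_0.png, frame_1.png, ..., frame_10.png), a plain alphabetical sort puts frame_10.png before frame_2.png. The following sketch sorts the extracted files by frame number instead. The helper names are hypothetical and the pattern assumes the `frame_%d.png` naming used in the command above.

```python
import re
from pathlib import Path

# Matches the file names produced by the frame_%d.png output pattern.
_FRAME_RE = re.compile(r"frame_(\d+)\.png$")


def frame_index(path):
    """Extract the integer frame number from a frame_<n>.png path."""
    match = _FRAME_RE.search(Path(path).name)
    if match is None:
        raise ValueError(f"not an extracted frame file: {path}")
    return int(match.group(1))


def sort_frames(paths):
    """Return the paths ordered by frame number rather than alphabetically."""
    return sorted(paths, key=frame_index)


if __name__ == "__main__":
    names = ["frame_10.png", "frame_2.png", "frame_0.png"]
    print(sort_frames(names))
```

To apply this to a real directory, pass in the result of `Path("/path/to/output/dir").glob("frame_*.png")`.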
Approach 2: In-Memory Frame Iteration with Python
For scenarios where saving individual image files is unnecessary, you can iterate over video frames directly in memory using Python. This method is efficient for immediate processing or conversion to formats like TFRecords.
Make sure the pyav Python library is installed.
import av

# Open the video file
with av.open('your_video.mp4') as container:
    # Loop through the decoded frames of the first video stream
    for frame in container.decode(video=0):
        # Process the frame. For example, convert the frame to a PIL Image and show it
        img = frame.to_image()
        img.show()
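Processing every single frame is often more than a training pipeline needs. As a hedged variation on the loop above, the sketch below keeps only every Nth item of any frame iterator. The helper name `every_nth` is hypothetical, and note that the decoder still decodes every frame; only the downstream processing is skipped.

```python
from itertools import islice


def every_nth(frames, step):
    """Yield every `step`-th item from an iterable of frames, starting at the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return islice(frames, 0, None, step)


if __name__ == "__main__":
    # Stand-in for decoded frames; with PyAV this would be
    # container.decode(video=0) inside the `with av.open(...)` block.
    fake_frames = range(10)
    print(list(every_nth(fake_frames, 3)))
```

With a 30 fps video, `every_nth(container.decode(video=0), 30)` would keep roughly one frame per second.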
DICOM
Encord provides native support for Digital Imaging and Communications in Medicine (DICOM) browser rendering and data annotations. Our Ontologies allow you to create any type of labeling protocol - for example RECIST, which requires measuring the longest diameter of a lesion.
With the DICOM editor, you can:
- Annotate modalities such as CT, X-ray, and MRI.
- Label using any annotation type in 2D (with 3D in the works), and seamlessly toggle between axial, coronal, and sagittal views.
- Render 20,000 pixel intensities and set custom window widths & levels.
- Natively display DICOM metadata.
- Track and interpolate objects between slices.
- Reduce manual annotations with automation features.
Multi-frame DICOM Support
We support multi-frame DICOMs, making it even easier and faster to load data onto our platform! Multi-frame DICOM support can save time and streamline workflows due to a reduced amount of header data repetition.
Simply upload your multi-frame DICOM files like you would any other files, either via the user interface or using our Python SDK.
Caching DICOM data
Caching allows frequently accessed DICOM files to be stored securely in your local browser memory, allowing them to be retrieved quickly and efficiently. As a result you can expect faster load times when accessing medical imaging data.
This is particularly beneficial for busy healthcare professionals who require quick access to patient records and diagnostic imaging.
Re-encode your data
If we detect that an uploaded video cannot be added to a dataset, the video must be re-encoded. This simply means bringing the video in line with Encord’s requirements to avoid frame synchronization issues and ensure accurate labels.
Your video needs re-encoding if a warning icon appears next to it after a successful upload. Hover over the icon to read what issues we have found with the video.
You can always check the progress of any process by clicking the bell icon in the top left corner of the platform. The number displayed in red signifies how many new notifications you have.
Re-encode (Auto)
Auto re-encoding resolves common frame synchronization issues and ensures there is no quality loss during re-encoding. If, for example, the only issue is that audio is present in the video, auto re-encoding only removes the audio while keeping the video as-is.
To automatically re-encode your video, simply select the file in question and click the Re-encode (auto) button to prevent any frame synchronization issues from arising. A notification telling you your video is re-encoded should appear at the top of your screen.
Re-encode (Full)
The Force full re-encoding button converts the video to a specific frame rate and video codec regardless of why frame synchronization issues are occurring. A full re-encoding takes longer and can lead to issues with existing labels.
Re-encode locally
Some cases, such as corrupted metadata, might require you to re-encode your data locally before uploading it to the Encord platform.
To do this, try the following ffmpeg command, replacing “video.mp4” with the name of the file you would like to re-encode, and “re-encoded-video.mp4” with the name you’d like the re-encoded file to have:
ffmpeg -err_detect aggressive -fflags discardcorrupt -i video.mp4 -c:v libx264 -movflags faststart -an -tune zerolatency re-encoded-video.mp4
Frame synchronization
Our servers use the FFmpeg project to extract frames from a video one by one. That way every label for a video is mapped to a specific frame index within that video, providing a reliable way to map labels to the specific frames in the video.
This process is not as straightforward in the browser. When a video gets embedded in the browser, we have no control over how exactly the individual frames are displayed when the video is played. Even if you upload a video which has a constant frame rate of 30 fps, this unfortunately does not guarantee that the browser will play exactly 30 frames in a second.
In order to seek individual frames we have to seek specific timestamps, and thus we are at the mercy of the browser to decide which frames start and end at which timestamps. We have worked hard to create the necessary safety mechanisms to detect possible issues and provide a robust solution to ensure labels are saved to the correct frame index.
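To make the relationship between frame indices and timestamps concrete, here is a minimal sketch for the idealized constant-frame-rate case. It is not how Encord or any browser maps frames internally. The function names are hypothetical, and exact fractions are used because floating-point rounding can land a timestamp just before a frame boundary.

```python
import math
from fractions import Fraction


def frame_to_timestamp(index, fps):
    """Start time in seconds of frame `index` at a constant frame rate."""
    return Fraction(index) / Fraction(fps)


def timestamp_to_frame(seconds, fps):
    """Index of the frame displayed at `seconds`, at a constant frame rate."""
    return math.floor(Fraction(seconds) * Fraction(fps))


if __name__ == "__main__":
    ntsc = Fraction(30000, 1001)  # the common "29.97 fps" rate
    t = frame_to_timestamp(17, ntsc)
    print(t, timestamp_to_frame(t, ntsc))
```

With variable frame rate video this simple arithmetic no longer holds, which is one reason re-encoding to a constant frame rate helps keep labels on the intended frame.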
Frame synchronization issue causes and solutions
We report frame synchronization issues in the dataset view after a video has been uploaded. Hover over the yellow Warning icon in the video row to see which issues have been identified.
We also report frame synchronization issues with a warning when the video is loaded in the label editor.
Our recommended solution is to re-encode problematic videos in your dataset. You can use Encord to re-encode your videos using the following steps:
- Navigate to your dataset in the dataset settings.
- Select all the videos that need re-encoding.
- Hit the Re-encode button.
- Wait until you see the re-encoded videos added to the dataset. The re-encoded videos will have the same title as the selected videos, with “_normalized” appended in the title.
- Optionally remove the previously selected videos to avoid having problematic videos being labeled.
When you click the re-encode button, our platform applies the least invasive transformation to the video to fix the issues. This keeps the re-encoded video and the original video as similar as possible, which reduces friction in your integration pipeline with the Encord platform. You can read more about what happens during re-encoding in the examples below.
To detect some of these problems you can inspect your video with ffprobe. You will need to download FFmpeg to do so.
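One check that can be scripted is a comparison of the stream's nominal and average frame rates, which ffprobe reports as `r_frame_rate` and `avg_frame_rate`. The sketch below parses ffprobe's JSON output (for example from `ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate,avg_frame_rate -of json video.mp4`). Treat it as a heuristic, not as the check Encord runs: the function names and the 1% tolerance are assumptions, and a mismatch only suggests a variable frame rate.

```python
import json
from fractions import Fraction


def parse_rate(text):
    """Parse an ffprobe rate such as '30000/1001'; None if undefined."""
    num, _, den = text.partition("/")
    den_value = int(den) if den else 1
    if den_value == 0 or int(num) == 0:
        return None
    return Fraction(int(num), den_value)


def looks_variable_frame_rate(ffprobe_json, tolerance=Fraction(1, 100)):
    """Heuristic: True if nominal and average frame rates differ noticeably.

    Returns None when either rate is missing or undefined.
    """
    stream = json.loads(ffprobe_json)["streams"][0]
    nominal = parse_rate(stream.get("r_frame_rate", "0/0"))
    average = parse_rate(stream.get("avg_frame_rate", "0/0"))
    if nominal is None or average is None:
        return None
    return abs(nominal - average) / nominal > tolerance


if __name__ == "__main__":
    sample = '{"streams": [{"r_frame_rate": "30/1", "avg_frame_rate": "24/1"}]}'
    print(looks_variable_frame_rate(sample))
```

A result of `True` is a prompt to inspect the file more closely, not proof of a problem.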
Existing labels before re-encoding
In certain scenarios, you may have pre-existing labels that you want to upload to our platform. If we advise you to re-encode the video, there is a possibility that these labels might become misaligned. This situation can arise if:
- The labels were created incorrectly before, because of frame synchronization issues on other platforms.
- You have labels for a variable frame rate video and now the frame rate has changed.
If you upload labels and need to re-encode your video, we strongly encourage you to double-check the accuracy of your pre-existing labels after re-encoding the video.
If you believe that your labels are misaligned, please do reach out to the Encord team, so we can help you with the uploading process.
Fixing cloud storage buffering issues
A number of factors determine how fast remote content is rendered and displayed in a web browser:
- Geographical distance from host bucket.
- Local internet speed.
- Use of a Virtual Private Network (VPN).
- The bitrate of the video file being requested.
While these factors are outside of Encord’s influence, we have put together a set of common issues when loading data from a private cloud, along with possible solutions to improve your user experience.
Geographical distance from host
The physical distance between you and the server hosting your data affects how quickly content can be accessed. If, for example, an Amazon S3 bucket is hosted in the ‘EU (London) eu-west-2’ region, then users in India or Brazil might experience latency issues because the data has to travel a long way.
The solution is to make sure that users always connect to their closest server, reducing the time it takes for data to travel from one place to another - however, setting this up differs between different cloud storage providers.
Local internet speed
Test your internet connection to make sure it is fast enough to download your video from cloud storage without buffering. Consistently low speeds might mean your internet connection is not fast enough to load large quantities of data from your cloud storage.
You can see the speed at which a video is being downloaded at the bottom of the ‘editor pane’ in the ‘label editor’ on Encord’s platform. It also displays the download speed required to display a video without buffering.
VPNs
VPNs can slow down your connection by requiring data to travel through the VPN’s server before reaching you.
We recommend you turn off your VPN when using Encord’s platform if you are experiencing buffering issues.
High bitrates in videos
Video content itself varies widely in terms of frame rate, compression, definition and size. The most important factor when determining how quickly a remote video will load in the browser is the bitrate. The higher the bitrate, the more data will need to be loaded per second of video.
To make sure videos load and display quickly in the Label Editor:
- Make sure each annotator’s download speed from private cloud storage is at least 3-4 times the video bitrate.
- Include enough keyframes to ensure smooth frame-by-frame navigation (roughly every 50 frames for high-resolution videos).
Lowering a video’s bitrate will reduce the time it takes to load from your cloud storage. This can be achieved using tools such as FFmpeg. Please see our section on resizing videos for information on how this can be achieved.
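The download-speed guideline above is simple arithmetic, sketched here for convenience. The function names are hypothetical, the average bitrate is estimated from file size and duration (real bitrates fluctuate over the length of a video), and the default factor of 3 is the lower end of the 3-4x range mentioned above.

```python
def average_bitrate_bps(size_bytes, duration_seconds):
    """Estimate a video's average bitrate in bits per second."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds


def required_download_bps(bitrate_bps, factor=3):
    """Download speed suggested by the 'at least 3-4 times the bitrate' rule."""
    return bitrate_bps * factor


def can_stream_smoothly(download_bps, bitrate_bps, factor=3):
    """Return True if the download speed meets the guideline."""
    return download_bps >= required_download_bps(bitrate_bps, factor)


if __name__ == "__main__":
    # Hypothetical 75 MB, 60-second clip and a 40 Mbit/s connection.
    bitrate = average_bitrate_bps(75_000_000, 60)
    print(bitrate, can_stream_smoothly(40_000_000, bitrate))
```

If the check fails, lowering the bitrate (for example by reducing resolution or frame rate) brings the requirement down proportionally.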
Supported browsers
We recommend the Google Chrome or Brave web browsers when using Encord.
Additionally, we recommend disabling hardware acceleration in Chrome. Hardware acceleration can introduce another layer of uncertainty to video rendering in the browser, potentially leading to unexpected behavior. By disabling it, you can ensure a more stable and consistent experience when working with videos on our platform.