Re-encoding videos

You can use Encord's Python SDK to re-encode videos. See our detailed documentation on re-encoding for more information.

ℹ️

Note

If your data is hosted on a private cloud, Encord requires write permissions on your cloud storage to re-encode videos.

Use the dataset.re_encode_data() method to re-encode a list of videos, replacing <video1_data_hash> and <video2_data_hash> below with the hashes of the videos to be re-encoded.

# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord using the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(ssh_private_key_path="<private_key_path>")

# Specify the dataset you want to re-encode data from by using its dataset hash
dataset = user_client.get_dataset("<dataset_hash>")

# Specify the data hashes of the files to be re-encoded
task_id = dataset.re_encode_data(
    [
        "<video1_data_hash>",
        "<video2_data_hash>",
    ]
)

# Print the task ID for the re-encoding job
print(task_id)

Example output:

1337

The output is the task ID of the re-encoding job. Use this task ID to monitor the progress of the task.

ℹ️

Note

Ensure that the list contains only videos from the Dataset on which re_encode_data() is called. Any videos that do not belong to that Dataset are ignored.
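Because hashes from other Datasets are ignored rather than rejected, it can help to check the list before submitting it. The following is a minimal sketch, not part of the Encord SDK: split_by_membership is a hypothetical helper, and how you collect the Dataset's own data hashes (for example, from its data rows) is an assumption you should confirm against the SDK reference.

```python
from typing import Iterable, List, Tuple


def split_by_membership(
    requested: Iterable[str], dataset_hashes: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split requested data hashes into those in the Dataset and those not in it.

    Hypothetical helper: both arguments are plain strings, so this function
    does not call the Encord SDK itself.
    """
    known = set(dataset_hashes)
    in_dataset: List[str] = []
    not_in_dataset: List[str] = []
    for data_hash in requested:
        # Preserve the caller's order in both output lists.
        if data_hash in known:
            in_dataset.append(data_hash)
        else:
            not_in_dataset.append(data_hash)
    return in_dataset, not_in_dataset
```

Only the first list needs to be passed to dataset.re_encode_data(). The second list shows which hashes would have been ignored.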


Check the status of a re-encoding task

Use the dataset.re_encode_data_status() method to get the status of an existing re-encoding task. Replace <task_id> in the sample below with the ID of the re-encoding task whose status you want to check.

# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord using the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(ssh_private_key_path="<private_key_path>")

# Specify the dataset that contains the re-encoded data by using its dataset hash
dataset = user_client.get_dataset("<dataset_hash>")

# Fetch and print the status of the re-encoding task, using the task's ID
task = dataset.re_encode_data_status("<task_id>")
print(task)

Example output:

ReEncodeVideoTask(
    status="DONE",
    result=[
        ReEncodeVideoTaskResult(
            data_hash="<data_hash>",
            signed_url="<signed_url>",
            bucket_path="<bucket_path>",
        ),
        ...
    ]
)

The ReEncodeVideoTask object contains a status field, which can take the following values:

  • "SUBMITTED": the task is still in progress. Check the status again later.

  • "DONE": the task completed successfully. The result field contains metadata about each re-encoded video.

  • "ERROR": the task failed and the re-encoding could not be completed.
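Since a "SUBMITTED" task has to be checked again later, a small polling loop is a natural way to wait for the result. The following is a minimal sketch, not part of the Encord SDK: wait_for_re_encode, poll_seconds, and timeout_seconds are hypothetical names, and the status function is passed in as a callable so the loop makes no assumptions beyond the three status values listed above.

```python
import time
from typing import Any, Callable


def wait_for_re_encode(
    get_status: Callable[[Any], Any],
    task_id: Any,
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
) -> Any:
    """Poll until the task is "DONE" and return the task object.

    Raises RuntimeError on "ERROR" and TimeoutError if the task is still
    "SUBMITTED" once timeout_seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        task = get_status(task_id)
        # Assumption: status is either a plain string or an enum whose
        # .value is that string; handle both.
        status = getattr(task.status, "value", task.status)
        if status == "DONE":
            return task
        if status == "ERROR":
            raise RuntimeError(f"Re-encoding task {task_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Re-encoding task {task_id} did not finish in time")
        time.sleep(poll_seconds)


# Example usage with the objects from the samples above (not executed here):
# task = wait_for_re_encode(dataset.re_encode_data_status, task_id)
# print(task.result)
```

A fixed polling interval keeps the sketch short. For long videos, a longer interval or a backoff may be more appropriate.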

Re-encode locally

In some cases, such as corrupted metadata, you might need to re-encode your data locally before uploading it to the Encord platform.

Use the following ffmpeg command, replacing "video.mp4" with the name of the file you want to re-encode, and "re-encoded-video.mp4" with the name you want the re-encoded file to have:

ffmpeg -err_detect aggressive -fflags discardcorrupt -i video.mp4 -c:v libx264 -movflags faststart -an -tune zerolatency re-encoded-video.mp4
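In this command, -c:v libx264 encodes the video stream as H.264, -an drops the audio stream, and -movflags faststart moves the file's index to the start of the file. If you need to re-encode several files, you can run the same command from Python. The following is a minimal sketch, assuming ffmpeg is installed and on your PATH. build_ffmpeg_command and re_encode_locally are hypothetical helper names, not part of the Encord SDK.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: Path, dst: Path) -> List[str]:
    """Return the ffmpeg command shown above as an argument list."""
    return [
        "ffmpeg",
        "-err_detect", "aggressive",
        "-fflags", "discardcorrupt",
        "-i", str(src),
        "-c:v", "libx264",
        "-movflags", "faststart",
        "-an",
        "-tune", "zerolatency",
        str(dst),
    ]


def re_encode_locally(src: Path, dst: Path) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


# Example usage (not executed here):
# re_encode_locally(Path("video.mp4"), Path("re-encoded-video.mp4"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.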