
Upload private cloud data

All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the exact same way.

Use the script below to upload your private cloud data to a specified Dataset.

Tip

If the following script returns "Upload is still in progress, try again later!", check the upload status later, as described in Check data upload below.
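
The script expects a JSON file describing the cloud objects to register. As a minimal, hypothetical sketch (the "images" and "objectUrl" keys and the bucket URL below are placeholders; the exact JSON format depends on your data type and integration), such a file could be generated like this:

# Import dependencies
import json

# Hypothetical upload specification registering a single image.
# "images" and "objectUrl" are placeholder keys - consult the JSON format
# documentation for your specific integration and data type
upload_spec = {
    "images": [
        {"objectUrl": "https://my-bucket.s3.eu-west-2.amazonaws.com/funny_image.jpg"}
    ]
}

# Write the specification to the file passed to the upload script below
with open("path/to/json/file.json", "w") as f:
    json.dump(upload_spec, f, indent=2)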


# Import dependencies
from encord import EncordUserClient
from encord.orm.dataset import LongPollingStatus

# Instantiate user client. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(ssh_private_key_path="<private_key_path>")

# Specify the dataset you want to upload data to by replacing <dataset_hash> with the dataset hash
dataset = user_client.get_dataset("<dataset_hash>")

# Specify the integration used to access your cloud data by replacing <integration_title> with the integration title
integrations = user_client.get_cloud_integrations()
# .index() raises a ValueError if no integration with that title exists
integration_idx = [i.title for i in integrations].index("<integration_title>")
integration = integrations[integration_idx].id

# Initiate cloud data upload. Replace path/to/json/file.json with the path to your JSON file
upload_job_id = dataset.add_private_data_to_dataset_start(
    integration, "path/to/json/file.json"
)

# timeout_seconds sets how long this call waits for the upload to finish before returning the current status
res = dataset.add_private_data_to_dataset_get_result(upload_job_id, timeout_seconds=5)
print(f"Execution result: {res}")


if res.status == LongPollingStatus.PENDING:
    print("Upload is still in progress, try again later!")
elif res.status == LongPollingStatus.DONE:
    print("Upload completed without errors")
else:
    print(f"Errors: {res.errors}")
The script prints output similar to the following:

add_private_data_to_dataset job started with upload_job_id=c4026edb-4fw2-40a0-8f05-a1af7f465727.
SDK process can be terminated, this will not affect successful job execution.
You can follow the progress in the web app via notifications.
add_private_data_to_dataset job completed with upload_job_id=c4026edb-4fw2-40a0-8f05-a1af7f465727.
Execution result: DatasetDataLongPolling(status=<LongPollingStatus.DONE: 'DONE'>, data_hashes_with_titles=[DatasetDataInfo(data_hash='cd42333d-8014-46q7-837b-5bf68b9b5', title='funny_image.jpg')], errors=[], units_pending_count=0, units_done_count=1, units_error_count=0)
Upload completed without errors
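
Once the job reports DONE, the result object lists the data units that were created. A minimal sketch, reusing the res object from the script above:

# Assuming res.status == LongPollingStatus.DONE, print the data units
# created by the upload, as shown in the example output above
for info in res.data_hashes_with_titles:
    print(f"{info.data_hash}: {info.title}")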

Check data upload

If the code returns "Upload is still in progress, try again later!", run the following code to query the Encord server again. Replace <upload_job_id> with the job ID output by the previous script. In the example above, the job ID is c4026edb-4fw2-40a0-8f05-a1af7f465727.

# Import dependencies
from encord import EncordUserClient
from encord.orm.dataset import LongPollingStatus

# Instantiate user client. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(ssh_private_key_path="<private_key_path>")

# Specify the dataset you uploaded to by replacing <dataset_hash> with the dataset hash
dataset = user_client.get_dataset("<dataset_hash>")

# Replace <upload_job_id> with the job ID returned when the upload was initiated
upload_job_id = "<upload_job_id>"

# Check upload status
res = dataset.add_private_data_to_dataset_get_result(upload_job_id, timeout_seconds=5)
print(f"Execution result: {res}")

if res.status == LongPollingStatus.PENDING:
    print("Upload is still in progress, try again later!")
elif res.status == LongPollingStatus.DONE:
    print("Upload completed without errors")
else:
    print(f"Errors: {res.errors}")

Tip

Omitting the timeout_seconds argument from the add_private_data_to_dataset_get_result() method makes the call check the status repeatedly until the upload has finished.
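
For example, the following call, assuming the dataset and upload_job_id from the script above, blocks until the job reaches a terminal state:

# Without timeout_seconds, this call waits until the upload job has
# finished (DONE or ERROR) before returning
res = dataset.add_private_data_to_dataset_get_result(upload_job_id)
print(f"Final status: {res.status}")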