Point Cloud Data Projects

Point Cloud Data (PCD) Projects are multi-modal projects that involve labeling and reviewing 3D point cloud data. Autonomous driving, robotics, and drone technology are all examples of Projects that PCD Projects were built to support.

Point Cloud Data Supported

Encord supports the following formats for Point Cloud Data:

.pcd - Point Cloud Data
.ply - Polygon File Format
.las - LAS point cloud (up to LAS v1.3)
.laz - Compressed LAS (up to LAS v1.3)
.mcap - MCAP container
.bag - ROS bag files
.db3 - ROS2 SQLite bag files

The Encord platform supports many more Point Cloud Data file formats. If you do not see a format you want supported, contact us at support@encord.com.

Data for PCD Projects

PCD Projects use “Scenes”. Scenes are the data units that your Taskers, and possibly your Agents, work with to create and review your labels. A Scene can be one of the following:

A PCD file (.pcd, .ply, .las, .laz, and so on)
A group of videos and PCD files bound together as a coherent group.

Encord supports the following for multimodal Scenes:

Videos up to 1GB in size
Up to 9 videos in a Scene
Up to 100 frames in each video
Up to 10 million cloud data points per frame

The Encord platform supports many more Point Cloud Data file formats. If you do not see a format you want supported, contact us at support@encord.com.

To register/import point cloud data into Encord, the data needs to be mirrored exactly in the cloud and locally. The main.py script to create Scenes in Encord for an autonomous driving project.

main.py

# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "matplotlib>=3.10.3",
#   "np>=1.0.2",
#   "nuscenes-devkit>=1.1.9",
#   "pillow>=11.3.0",
#   "pydantic>=2.11.5",
#   "pypcd4>=1.2.1",
#   "requests>=2.32.3",
#   "scipy>=1.15.3",
#   "tqdm>=4.67.1",
# ]
# ///
from __future__ import annotations

import argparse
import json
import os
import pathlib
import re
import shutil
import tarfile
from dataclasses import dataclass
from enum import StrEnum, auto
from math import floor
from typing import Annotated, Any, Literal

import numpy as np
import pypcd4
import requests
import tqdm
from nuscenes import nuscenes
from pydantic import BaseModel, ConfigDict, Field
from scipy.spatial.transform import Rotation

"""
This script processes scenes from the nuScenes (https://www.nuscenes.org/) dataset and converts them into a
the Encord upload JSON format for visualization and annotation. It can handle
lidar, radar, and camera data, as well as 3D annotations and ego-vehicle poses.
The script downloads the nuScenes minisplit if not found locally, and processes it, including:
- Converting the point cloud data from .bin to .pcd
- Timestamps are normalized to start from 0 at the beginning of the scene
- Converting positions so that the vehicle's starting position is treated as the origin (0, 0, 0)
"""


def snake2camel(snake: str, start_lower: bool = True) -> str:
    """
    Converts a snake_case string to camelCase.

    The `start_lower` argument determines whether the first letter in the generated camelcase should
    be lowercase (if `start_lower` is True), or capitalized (if `start_lower` is False).
    """
    camel = snake.title()
    camel = re.sub("([0-9A-Za-z])_(?=[0-9A-Z])", lambda m: m.group(1), camel)
    if start_lower:
        camel = re.sub("(^_*[A-Z])", lambda m: m.group(1).lower(), camel)
    return camel


class CamelModel(BaseModel):
    model_config = ConfigDict(alias_generator=snake2camel, populate_by_name=True)


@dataclass
class CameraIntrinsics:
    fx: Annotated[float, Field(description="Focal length x")]
    fy: Annotated[float, Field(description="Focal length y")]
    ox: Annotated[float, Field(description="Principal point offset x")]
    oy: Annotated[float, Field(description="Principal point offset y")]
    s: Annotated[float, Field(description="Axis skew")]


@dataclass
class CameraExtrinsics:
    rotation: Annotated[
        tuple[float, float, float, float, float, float, float, float, float],
        Field(description="Rotation matrix R"),
    ]
    position: Annotated[
        tuple[float, float, float], Field(description="Translation vector T")
    ]


@dataclass
class CameraParams:
    width_px: int
    height_px: int
    intrinsics: Annotated[CameraIntrinsics, Field(description="The intrinsic matrix K")]
    extrinsics: Annotated[
        CameraExtrinsics, Field(description="The extrinsic 4x4 matrix R|T")
    ]


@dataclass
class FrameOfReference:
    id: Annotated[str, Field(description="ID of this frame of reference")]
    parent_FOR: Annotated[
        str | None, Field(description="ID of a parent frame of reference")
    ]
    rotation: tuple[float, float, float, float, float, float, float, float, float]
    position: tuple[float, float, float]


Position = tuple[float, float, float]
EulerOrientation = tuple[float, float, float]
Size = tuple[float, float, float]


class Pose(CamelModel):
    position: Position
    orientation: EulerOrientation


class CuboidGeometry(CamelModel):
    type: Literal["cuboid"] = "cuboid"
    pose: Pose
    size: Size


@dataclass
class _FORIdMixin:
    frame_of_reference_id: Annotated[
        str | None, Field(description="ID of the frame of reference the entity is in")
    ] = None


@dataclass
class _URIMixin:
    uri: str


@dataclass
class _EventMixin:
    timestamp: float | None = None


class URIEvent(CamelModel, _EventMixin, _URIMixin):
    pass


class CameraParamsEvent(CamelModel, _EventMixin, CameraParams):
    pass


class FOREvent(CamelModel, _EventMixin, FrameOfReference):
    pass


class ModelEvent(CamelModel, _EventMixin):
    geometries: list[CuboidGeometry]


class CompositeScene(CamelModel):
    type: Literal["composite"] = "composite"
    streams: dict[str, EventStream]


class EntityType(StrEnum):
    POINT_CLOUD = auto()
    FRAME_OF_REFERENCE = auto()
    IMAGE = auto()
    MODEL = auto()
    CAMERA_PARAMETERS = auto()


class PCDStream(CamelModel, _FORIdMixin):
    entity_type: Literal[EntityType.POINT_CLOUD] = EntityType.POINT_CLOUD
    events: Annotated[list[URIEvent], Field(description="List of point cloud events")]


class CameraStream(CamelModel, _FORIdMixin):
    entity_type: Literal[EntityType.CAMERA_PARAMETERS] = EntityType.CAMERA_PARAMETERS
    events: list[CameraParamsEvent]


class ImageStream(CamelModel):
    entity_type: Literal[EntityType.IMAGE] = EntityType.IMAGE
    events: list[URIEvent]
    camera_id: Annotated[
        str | None,
        Field(
            description="ID of the camera associated with the image. Used to position the image in-scene"
        ),
    ]


class ModelStream(CamelModel):
    entity_type: Literal[EntityType.MODEL] = EntityType.MODEL
    events: list[URIEvent | ModelEvent]
    camera_id: str | None


class FORStream(CamelModel):
    entity_type: Literal[EntityType.FRAME_OF_REFERENCE] = EntityType.FRAME_OF_REFERENCE
    events: Annotated[
        list[FOREvent], Field(description="List of frame of reference events")
    ]


class EventStream(CamelModel):
    type: Literal["event"] = "event"
    id: str
    stream: Annotated[
        PCDStream | CameraStream | FORStream | ImageStream | ModelStream,
        Field(discriminator="entity_type"),
    ]


DATASET_DIR = pathlib.Path("./dataset")


class Config:
    env: str
    output_dir: pathlib.Path
    base_url: str

    def __init__(self):
        self.env = "remote"
        self.output_dir = pathlib.Path("./scenes")
        self.base_url = (
            "https://storage.cloud.google.com/my-bucket-name/scenes/nuscenes" # Replace this with the file path in your bucket to the dataset
        )


config = Config()


def ensure_scene_available(
    root_dir: pathlib.Path, dataset_version: str, scene_name: str
) -> None:
    """
    Ensure that the specified scene is available.

    Downloads minisplit into root_dir if scene_name is part of it and root_dir is empty.

    Raises ValueError if scene is not available and cannot be downloaded.
    """
    try:
        nusc = nuscenes.NuScenes(
            version=dataset_version, dataroot=str(root_dir), verbose=False
        )
    except AssertionError:  # dataset initialization failed
        if dataset_version == "v1.0-mini":
            download_minisplit(root_dir)
            nusc = nuscenes.NuScenes(
                version=dataset_version, dataroot=str(root_dir), verbose=False
            )
        else:
            print(
                f"Could not find dataset at {root_dir} and could not automatically download specified scene."
            )
            exit()

    scene_names = [s["name"] for s in nusc.scene]
    if scene_name not in scene_names:
        raise ValueError(f"{scene_name=} not found in dataset")


def nuscene_sensor_names(nusc: nuscenes.NuScenes, scene_name: str) -> list[str]:
    """Return all sensor names in the scene."""

    sensor_names = set()

    scene = next(s for s in nusc.scene if s["name"] == scene_name)
    first_sample = nusc.get("sample", scene["first_sample_token"])
    for sample_data_token in first_sample["data"].values():
        sample_data = nusc.get("sample_data", sample_data_token)
        if sample_data["sensor_modality"] == "camera":
            current_camera_token = sample_data_token
            while current_camera_token != "":
                sample_data = nusc.get("sample_data", current_camera_token)
                sensor_name = sample_data["channel"]
                sensor_names.add(sensor_name)
                current_camera_token = sample_data["next"]

    # For a known set of cameras, order the sensors in a circle.
    ordering = {
        "CAM_FRONT_LEFT": 0,
        "CAM_FRONT": 1,
        "CAM_FRONT_RIGHT": 2,
        "CAM_BACK_RIGHT": 3,
        "CAM_BACK": 4,
        "CAM_BACK_LEFT": 5,
    }
    return sorted(
        sensor_names, key=lambda sensor_name: ordering.get(sensor_name, float("inf"))
    )


# Write all uri assets required for the scene to a separate output directory
def write_asset(path: pathlib.Path):
    shutil.copyfile(path, pathlib.Path("./output") / path.name)


def write_nuscenes_json(scene: CompositeScene, name: str):
    OUTPUT_FILE = config.output_dir / "nuscenes.json"
    with open(OUTPUT_FILE, "w") as f:
        dummy_json = scene.model_dump_json(by_alias=True, indent=2)
        f.write(dummy_json)
        print("Wrote to", OUTPUT_FILE)


def write_upload_json(scenes: list[tuple[CompositeScene, str]]):
    scenes_final = []
    for scene, name in scenes:
        streams = list(scene.model_dump(by_alias=True)["streams"].values())
        scenes_final.append(
            {
                "title": name,
                "streams": streams,
            }
        )

    final = {"scenes": scenes_final}

    OUTPUT_FILE = config.output_dir / "upload.json"
    with open(OUTPUT_FILE, "w") as f:
        json.dump(final, f, indent=2)
        print("Wrote to", OUTPUT_FILE)


first_timestamp = 0
first_position = [0, 0, 0]
hz = 0


def sub(a, b) -> tuple[float, float, float]:
    return [a[i] - b[i] for i in range(len(a))]


def log_nuscenes(
    nusc: nuscenes.NuScenes, scene_name: str, max_time_sec: float, sample_hz: float
) -> CompositeScene:
    """Log nuScenes scene."""
    print(f"Logging scene {scene_name}")

    result = CompositeScene(streams={})

    scene = next(s for s in nusc.scene if s["name"] == scene_name)

    location = nusc.get("log", scene["log_token"])["location"]

    # Get the first sample
    first_sample_token = scene["first_sample_token"]
    first_sample = nusc.get("sample", scene["first_sample_token"])

    # Get the timestamp (in seconds)
    global first_timestamp
    first_timestamp = first_sample["timestamp"] / 1e6
    global first_position
    first_position = (0, 0, 0)
    global hz
    hz = sample_hz

    first_lidar_tokens = []
    first_radar_tokens = []
    first_camera_tokens = []
    for sample_data_token in first_sample["data"].values():
        sample_data = nusc.get("sample_data", sample_data_token)
        log_sensor_calibration(result, sample_data, nusc)

        if sample_data["sensor_modality"] == "lidar":
            first_lidar_tokens.append(sample_data_token)
        elif sample_data["sensor_modality"] == "radar":
            first_radar_tokens.append(sample_data_token)
        elif sample_data["sensor_modality"] == "camera":
            first_camera_tokens.append(sample_data_token)

    first_timestamp_us = nusc.get("sample_data", first_lidar_tokens[0])["timestamp"]
    max_timestamp_us = first_timestamp_us + 1e6 * max_time_sec

    log_lidar_and_ego_pose(result, location, first_lidar_tokens, nusc, max_timestamp_us)
    log_cameras(result, first_camera_tokens, nusc, max_timestamp_us)
    log_radars(result, first_radar_tokens, nusc, max_timestamp_us)
    log_annotations(result, location, first_sample_token, nusc, max_timestamp_us)

    return result


def log_cameras(
    scene: CompositeScene,
    first_camera_tokens: list[str],
    nusc: nuscenes.NuScenes,
    max_timestamp_us: float,
) -> None:
    """Log camera data."""
    for first_camera_token in first_camera_tokens:
        current_camera_token = first_camera_token
        last_logged_timestamp = -10000
        while current_camera_token != "":
            sample_data = nusc.get("sample_data", current_camera_token)
            if max_timestamp_us < sample_data["timestamp"]:
                break
            sensor_name = sample_data["channel"]

            if sensor_name not in scene.streams:
                scene.streams[sensor_name] = EventStream(
                    id=sensor_name,
                    stream=ImageStream(
                        events=[],
                        camera_id=sensor_name + "-camera",
                        frame_of_reference_id=sensor_name + "-calibration",
                    ),
                )

            timestamp = sample_data["timestamp"] * 1e-6 - first_timestamp
            if hz > 0:
                timestamp *= hz
                timestamp = floor(timestamp)
            if hz > 0 and timestamp - last_logged_timestamp < 1.0:
                current_camera_token = sample_data["next"]
                continue
            last_logged_timestamp = timestamp

            data_file_path = nusc.dataroot / sample_data["filename"]

            # write_asset(data_file_path)
            event = URIEvent(
                uri=config.base_url + "/" + str(data_file_path),
                timestamp=timestamp,
            )
            scene.streams[sensor_name].stream.events.append(event)

            current_camera_token = sample_data["next"]


def log_lidar_and_ego_pose(
    scene: CompositeScene,
    location: str,
    first_lidar_token: list[str],
    nusc: nuscenes.NuScenes,
    max_timestamp_us: float,
) -> None:
    """Log lidar data and vehicle pose."""

    scene.streams["ego_vehicle"] = EventStream(
        id="ego_vehicle",
        stream=FORStream(events=[]),
    )

    last_logged_timestamp = -10000

    for current_lidar_token in first_lidar_token:
        while current_lidar_token != "":
            sample_data = nusc.get("sample_data", current_lidar_token)
            sensor_name = sample_data["channel"]

            if max_timestamp_us < sample_data["timestamp"]:
                break

            timestamp = sample_data["timestamp"] * 1e-6 - first_timestamp
            if hz > 0:
                timestamp *= hz
                timestamp = floor(timestamp)
            if hz > 0 and timestamp - last_logged_timestamp < 1.0:
                current_lidar_token = sample_data["next"]
                continue
            last_logged_timestamp = timestamp

            ego_pose = nusc.get("ego_pose", sample_data["ego_pose_token"])
            rotation = (
                Rotation.from_quat(ego_pose["rotation"], scalar_first=True)
                .as_matrix()
                .transpose()
                .flatten()
            )
            position = ego_pose["translation"]
            if timestamp == 0:
                global first_position
                first_position = position

            event = FOREvent(
                id="ego_vehicle",
                parent_FOR="root",
                position=sub(position, first_position),
                rotation=rotation,
                timestamp=timestamp,
            )
            scene.streams["ego_vehicle"].stream.events.append(event)

            current_lidar_token = sample_data["next"]

            data_file_path = nusc.dataroot / sample_data["filename"]

            if sensor_name not in scene.streams:
                scene.streams[sensor_name] = EventStream(
                    id=sensor_name,
                    stream=PCDStream(
                        events=[], frame_of_reference_id=sensor_name + "-calibration"
                    ),
                )

            data_file_path = nusc.dataroot / sample_data["filename"]
            pointcloud = nuscenes.LidarPointCloud.from_file(str(data_file_path))
            points = pointcloud.points[:3].T

            fields = ("x", "y", "z")
            types = (
                np.float32,
                np.float32,
                np.float32,
            )

            pc = pypcd4.PointCloud.from_points(points, fields, types)

            # strip .bin extension from filename
            new_path = str(data_file_path.parent / data_file_path.stem)
            pc.save(new_path)

            event = URIEvent(
                uri=config.base_url + "/" + new_path,
                timestamp=timestamp,
            )
            scene.streams[sensor_name].stream.events.append(event)


def log_radars(
    scene: CompositeScene,
    first_radar_tokens: list[str],
    nusc: nuscenes.NuScenes,
    max_timestamp_us: float,
) -> None:
    """Log radar data to the scene"""
    for first_radar_token in first_radar_tokens:
        current_camera_token = first_radar_token
        last_logged_timestamp = -10000
        while current_camera_token != "":
            sample_data = nusc.get("sample_data", current_camera_token)
            if max_timestamp_us < sample_data["timestamp"]:
                break
            sensor_name = sample_data["channel"]

            if sensor_name not in scene.streams:
                scene.streams[sensor_name] = EventStream(
                    id=sensor_name,
                    stream=PCDStream(
                        events=[], frame_of_reference_id=sensor_name + "-calibration"
                    ),
                )

            timestamp = sample_data["timestamp"] * 1e-6 - first_timestamp
            if hz > 0:
                timestamp *= hz
                timestamp = floor(timestamp)
            if hz > 0 and timestamp - last_logged_timestamp < 1.0:
                current_camera_token = sample_data["next"]
                continue
            last_logged_timestamp = timestamp

            data_file_path = nusc.dataroot / sample_data["filename"]
            current_camera_token = sample_data["next"]
            # write_asset(data_file_path)
            event = URIEvent(
                uri=config.base_url + "/" + str(data_file_path),
                timestamp=timestamp,
            )
            scene.streams[sensor_name].stream.events.append(event)


def log_sensor_calibration(
    scene: CompositeScene, sample_data: dict[str, Any], nusc: nuscenes.NuScenes
) -> None:
    """Log sensor calibration (pinhole camera, sensor poses, etc.) to the scene"""
    sensor_name = sample_data["channel"]
    calibrated_sensor_token = sample_data["calibrated_sensor_token"]
    calibrated_sensor = nusc.get("calibrated_sensor", calibrated_sensor_token)
    rotation = (
        Rotation.from_quat(calibrated_sensor["rotation"], scalar_first=True)
        .as_matrix()
        .transpose()
        .flatten()
        .tolist()
    )

    id = sensor_name + "-calibration"
    scene.streams[id] = EventStream(
        id=id,
        stream=FORStream(events=[]),
    )
    position = sub(calibrated_sensor["translation"], first_position)
    event = FOREvent(
        id=id,
        parent_FOR="ego_vehicle",  # "ego_vehicle",
        position=position,
        rotation=rotation,
    )
    scene.streams[id].stream.events.append(event)

    if len(calibrated_sensor["camera_intrinsic"]) != 0:
        intrinsic = calibrated_sensor["camera_intrinsic"]
        camera_id = sensor_name + "-camera"
        scene.streams[camera_id] = EventStream(
            id=camera_id,
            stream=CameraStream(
                events=[],
                frame_of_reference_id=id,  # might be "root"
            ),
        )

        event = CameraParamsEvent(
            timestamp=0,
            width_px=1600,
            height_px=900,
            intrinsics=CameraIntrinsics(
                fx=intrinsic[0][0],
                fy=intrinsic[1][1],
                ox=intrinsic[0][2],
                oy=intrinsic[1][2],
                s=intrinsic[0][1],
            ),
            extrinsics=CameraExtrinsics(
                position=(0, 0, 0),
                rotation=(0, 0, 1, -1, 0, 0, 0, -1, 0),
            ),
        )
        scene.streams[camera_id].stream.events.append(event)


def log_annotations(
    scene: CompositeScene,
    location: str,
    first_sample_token: str,
    nusc: nuscenes.NuScenes,
    max_timestamp_us: float,
) -> None:
    """Log 3D cuboids to the scene"""

    scene.streams["anns"] = EventStream(
        id="anns",
        stream=ModelStream(events=[], camera_id=None),
    )

    current_sample_token = first_sample_token
    last_logged_timestamp = -10000
    while current_sample_token != "":
        sample_data = nusc.get("sample", current_sample_token)
        if max_timestamp_us < sample_data["timestamp"]:
            break

        timestamp = sample_data["timestamp"] * 1e-6 - first_timestamp
        if hz > 0:
            timestamp *= hz
            timestamp = floor(timestamp)
        if hz > 0 and timestamp - last_logged_timestamp < 1.0:
            current_sample_token = sample_data["next"]
            continue
        last_logged_timestamp = timestamp

        ann_tokens = sample_data["anns"]
        geometries = []
        for ann_token in ann_tokens:
            ann = nusc.get("sample_annotation", ann_token)

            width, length, height = ann["size"]

            # Convert rotation to euler angles
            rotation = Rotation.from_quat(ann["rotation"], scalar_first=True).as_euler(
                "XYZ"
            )

            geometries.append(
                CuboidGeometry(
                    pose=Pose(
                        position=sub(ann["translation"], first_position),
                        orientation=rotation,
                    ),
                    size=(length, width, height),
                )
            )

        event = ModelEvent(
            timestamp=timestamp,
            geometries=geometries,
        )
        scene.streams["anns"].stream.events.append(event)

        current_sample_token = sample_data["next"]


def download_file(url: str, dst_file_path: pathlib.Path) -> None:
    """Download file from url to dst_fpath."""
    dst_file_path.parent.mkdir(parents=True, exist_ok=True)
    print(f"Downloading {url} to {dst_file_path}")
    response = requests.get(url, stream=True)
    with tqdm.tqdm.wrapattr(
        open(dst_file_path, "wb"),
        "write",
        miniters=1,
        total=int(response.headers.get("content-length", 0)),
        desc=f"Downloading {dst_file_path.name}",
    ) as f:
        for chunk in response.iter_content(chunk_size=4096):
            f.write(chunk)


def untar_file(
    tar_file_path: pathlib.Path, dst_path: pathlib.Path, keep_tar: bool = True
) -> bool:
    """Untar tar file at tar_file_path to dst."""
    print(f"Untar file {tar_file_path}")
    try:
        with tarfile.open(tar_file_path, "r") as tf:
            tf.extractall(dst_path)
    except Exception as error:
        print(f"Error unzipping {tar_file_path}, error: {error}")
        return False
    if not keep_tar:
        os.remove(tar_file_path)
    return True


def download_minisplit(root_dir: pathlib.Path) -> None:
    """
    Download nuScenes minisplit.

    Adopted from <https://colab.research.google.com/github/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/nuscenes_tutorial.ipynb>
    """
    MINISPLIT_URL = "https://www.nuscenes.org/data/v1.0-mini.tgz"

    zip_file_path = pathlib.Path("./v1.0-mini.tgz")
    if not zip_file_path.is_file():
        download_file(MINISPLIT_URL, zip_file_path)
    untar_file(zip_file_path, root_dir, keep_tar=True)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Visualizes the nuScenes dataset ")
    parser.add_argument(
        "--root-dir",
        type=pathlib.Path,
        default=DATASET_DIR,
        help="Root directory of nuScenes dataset",
    )
    parser.add_argument(
        "--scene-name",
        type=str,
        default="scene-0061",
        help="Scene name to visualize (typically of form 'scene-xxxx')",
    )
    parser.add_argument(
        "--dataset-version", type=str, default="v1.0-mini", help="Scene id to visualize"
    )
    parser.add_argument(
        "--seconds",
        type=float,
        default=float("inf"),
        help="If specified, limits the number of seconds logged",
    )
    parser.add_argument(
        "--all",
        "-A",
        action="store_true",
        help="If specified, logs all scenes",
    )
    parser.add_argument(
        "--hz",
        type=float,
        default=0.0,
        help="Limit the sample rate",
    )
    args = parser.parse_args()

    # ensure_scene_available(
    #     root_dir=args.root_dir,
    #     dataset_version=args.dataset_version,
    #     scene_name=args.scene_name,
    # )

    nusc = nuscenes.NuScenes(
        version=args.dataset_version, dataroot=args.root_dir, verbose=False
    )

    scene_names: list[str] = [args.scene_name]

    if args.all:
        scene_names = [s["name"] for s in nusc.scene]

    scenes = [
        (
            log_nuscenes(
                nusc, scene_name, max_time_sec=args.seconds, sample_hz=args.hz
            ),
            scene_name,
        )
        for scene_name in scene_names
    ]
    write_upload_json(scenes)

PCD Ontologies

PCD Projects support the following object label types:

Cuboids
Segmentation
Polylines
Keypoints

Project Settings

Configure Label Editor templates to streamline the annotation and review experience for your Taskers.

Settings and Controls

Pin new issue

Hot key: Ctrl + ICreates a new issue on the data unit.

Center content

Hot key: Shift + CThe POV centers on the object (vehicle, robot, drone) that captured the PCD.

Use geometric center

Use this feature together with Center content.

ON: All points in the PCD workspace (on all axis) are averaged. Using Center content brings the POV to the center of all axis in the workspace.
OFF: Using Center content brings to POV to the object that captured the PCD.

Zoom in

Hot key: Shift + up arrow.Zooms into the workspace.

Zoom out

Hot key: Shift + down arrowZooms out of the workspace.

Ground height

Specifies the “ground height” in the workspace. You can specify any value between the highest and lowest points, of the PCD, on the X axis.

Radius indicators

Hot key: Option + RRadius indicators are useful guides when annotating in 3D space. For example, you might only want to annotate anything that comes within 3 meters of your object (vehicle, robot, drone). You would then use a radius of three and only annotate anything within that radius.

ON: Displays one or more radii centered on the object (vehicle, robot, drone) that captured PCD. You can select the color used for all radii.
OFF: Hides all the radii.

Toggle merged cloud point view

Hot key: Option + M

ON: Displays all PCD from all frames in the workspace at once.
OFF: Displays the PCD for that specific frame.

Scene slicer

Hot key: Option + Shift + SThe Scene slicer is a planar point cloud data selection guard. Place and rotate a Scene slicer to confine point cloud data selection. Point cloud data on the opposite side cannot be selected for annotation.

View options

Display or hide various elements in the PCD space.

Show top view: Displays view from the top in the right hand work area.
Show left view: Displays view from the left in the right hand work area.
Show right view: Displays view from the right in the right hand work area.
Show control hints: Displays workspace navigation hints.
Show camera switcher: Displays video views for the Scene.

Editor Settings - Scene

Input device

Specify using a mouse or trackpad to rotate or pan in the workspace.

We STRONGLY recommend using a mouse to rotate or pan in the workspace.

Display grid

Show or hide a grid to aid in annotating PCD.

Display camera helpers

Show or hide a camera helper lines to aid in annotating PCD.

Point cloud size

Specify the size of points in the workspace.

Point cloud colour

Specify the color of PCD in the workspace based on a number of options.

Display point clouds

ON: Show all PCD in the workspace.
OFF: Hide all PCD in the workspace.

Display background

ON: Displays background in the workspace.
OFF: Hides background in the workspace.

Merged point cloud view

ON: Displays all PCD from all frames in the workspace at once.
OFF: Displays the PCD for that specific frame.

Point cloud opacity in 2D views

Specifies the opacity of PCD in 2D views.

The default value is 0. PCD does not display in 2D views with a value of 0.

Display cloud points from specific locations

Specify PCD source to display in the workspace.

Editor

Click into the PCD workspace to navigate inside the workspace. The keyboard hints display when you can navigate the workspace.

Move

Use the WASD keys to move along the X axis in the workspace.

Elevate

Use the QE keys to move along the Y axis in the workspace.

Rotate

Hold down the scroll wheel and move your mouse to rotate in the workspace.

Pan

Hold down the right mouse button and move your mouse to pan around the workspace.

Label and Review PCD Data

We strongly recommend that Taskers use a mouse when annotating or reviewing Scenes. Using a mouse makes annotating or reviewing Scenes significantly easier.

Click Start task or Initiate to annotate a PCD data unit. The Label Editor opens with a PCD data unit ready for annotation.
Use the Editor controls and Toolbar buttons to navigate the PCD workspace.
Use the General Settings to customize and streamline the PCD workspace.
Select an object label from the left-hand menu and begin annotating the PCD data unit.
Use your input device (mouse or trackpad) to create a label in the PCD workspace.
Select the label and adjust the label from the right-hand space and from the Toolbar.

Copy labels from one frame to another using Command + C and Command + V.
Copy labels from within the same frame using Command + C and Command + Shift + V.

Scene Formats

Encord supports various ways of importing/registering Scenes. All examples use the InputScene format as the root structure.

URL to PCD File

A Scene consisting of a single PCD file, in cloud storage: https://example.com/left_001.pcd This format automatically validates that the URL has a supported file extension

Stream of PCD Files without Timestamps

This format consists of multiple PCD files organized into a stream structure. For multiple PCD files without timing information, Encord assigns implicit timestamps of 1, 2, 3. At every time point T, Encord displays the last PCD available at or before T.

{
  "left_lidar": {
    "type": "point_cloud",
    "events": [
      {
        "uri": "https://example.com/left_001.pcd"
      },
      {
        "uri": "https://example.com/left_002.pcd"
      }
    ]
  }
}

Stream of PCD Files with Timestamps

The main difference between datasets that have timestamps and that do not is that Encord treats items in “frames” or sees time as continuous. If temporal information is missing from the PCD stream you can add the timestamp:

{
  "left_lidar": {
    "type": "point_cloud",
    "events": [
      {
        "uri": "https://example.com/left_001.pcd",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/left_002.pcd",
        "timestamp": 1634567890.223
      }
    ]
  }
}

A number of timestamp formats are supported (see Timestamp Formats).

Multiple PCD Streams (Left and Right Sensors)

Two synchronized PCD streams from different sensors:

{
  "left_lidar": {
    "type": "point_cloud",
    "events": [
      {
        "uri": "https://example.com/left_001.pcd",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/left_002.pcd",
        "timestamp": 1634567890.223
      }
    ]
  },
  "right_lidar": {
    "type": "point_cloud",
    "events": [
      {
        "uri": "https://example.com/right_001.pcd",
        "timestamp": 1634567890.125
      },
      {
        "uri": "https://example.com/right_002.pcd",
        "timestamp": 1634567890.225
      }
    ]
  }
}

Add Frame of Reference for Ego Vehicle

Add a frame of reference stream to represent an ego vehicle’s pose over time:

{
  "ego_vehicle": {
    "type": "frame_of_reference",
    "id": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      },
      {
        "timestamp": 1634567890.223,
        "pose": {
          "position": {
            "x": 1.5,
            "y": 0.2,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.1,
            "w": 0.995
          }
        }
      }
    ]
  },
  "left_lidar": {
    "type": "point_cloud",
    "frameOfReference": "ego_vehicle",
    "events": [
      {
        "uri": "https://example.com/left_001.pcd",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/left_002.pcd",
        "timestamp": 1634567890.223
      }
    ]
  },
  "right_lidar": {
    "type": "point_cloud",
    "frameOfReference": "ego_vehicle",
    "events": [
      {
        "uri": "https://example.com/right_001.pcd",
        "timestamp": 1634567890.125
      },
      {
        "uri": "https://example.com/right_002.pcd",
        "timestamp": 1634567890.225
      }
    ]
  }
}

Sensor-Specific Frames of Reference with Calibration

Create individual frames of reference for each sensor, with calibration relative to the ego vehicle:

{
  "ego_vehicle": {
    "type": "frame_of_reference",
    "id": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      },
      {
        "timestamp": 1634567890.223,
        "pose": {
          "position": {
            "x": 1.5,
            "y": 0.2,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.1,
            "w": 0.995
          }
        }
      }
    ]
  },
  "left_lidar_frame": {
    "type": "frame_of_reference",
    "id": "left_lidar_frame",
    "parentForId": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": 0.5,
            "z": 1.8
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      }
    ]
  },
  "right_lidar_frame": {
    "type": "frame_of_reference",
    "id": "right_lidar_frame",
    "parentForId": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": -0.5,
            "z": 1.8
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      }
    ]
  },
  "left_lidar": {
    "type": "point_cloud",
    "frameOfReference": "left_lidar_frame",
    "events": [
      {
        "uri": "https://example.com/left_001.pcd",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/left_002.pcd",
        "timestamp": 1634567890.223
      }
    ]
  },
  "right_lidar": {
    "type": "point_cloud",
    "frameOfReference": "right_lidar_frame",
    "events": [
      {
        "uri": "https://example.com/right_001.pcd",
        "timestamp": 1634567890.125
      },
      {
        "uri": "https://example.com/right_002.pcd",
        "timestamp": 1634567890.225
      }
    ]
  }
}

Image Stream without Camera Parameters

Add an image stream using an existing camera reference:

{
  "ego_vehicle": {
    "type": "frame_of_reference",
    "id": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      },
      {
        "timestamp": 1634567890.223,
        "pose": {
          "position": {
            "x": 1.5,
            "y": 0.2,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.1,
            "w": 0.995
          }
        }
      }
    ]
  },
  "left_lidar_frame": {},
  "right_lidar_frame": {},
  "left_lidar": {},
  "right_lidar": {},
  "front_camera_images": {
    "type": "image",
    "frameOfReference": "ego_vehicle",
    "camera": "front_camera_id",
    "events": [
      {
        "uri": "https://example.com/image_001.jpg",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/image_002.jpg",
        "timestamp": 1634567890.223
      }
    ]
  }
}

Image Stream with Camera Calibration

Add full camera calibration parameters with the image stream:

{
  "ego_vehicle": {
    "type": "frame_of_reference",
    "id": "ego_vehicle",
    "events": [
      {
        "timestamp": 1634567890.123,
        "pose": {
          "position": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.0,
            "w": 1.0
          }
        }
      },
      {
        "timestamp": 1634567890.223,
        "pose": {
          "position": {
            "x": 1.5,
            "y": 0.2,
            "z": 0.0
          },
          "rotation": {
            "x": 0.0,
            "y": 0.0,
            "z": 0.1,
            "w": 0.995
          }
        }
      }
    ]
  },
  "left_lidar_frame": {},
  "right_lidar_frame": {},
  "left_lidar": {},
  "right_lidar": {},
  "front_camera_images": {
    "type": "image",
    "frameOfReference": "ego_vehicle",
    "camera": {
      "widthPx": 1920,
      "heightPx": 1080,
      "intrinsics": {
        "type": "simple",
        "fx": 1000.0,
        "fy": 1000.0,
        "ox": 960.0,
        "oy": 540.0,
        "model": {
          "type": "pinhole"
        }
      },
      "extrinsics": {
        "position": {
          "x": 0.0,
          "y": 0.0,
          "z": 2.0
        },
        "rotation": {
          "x": 0.0,
          "y": 0.0,
          "z": 0.0,
          "w": 1.0
        }
      }
    },
    "events": [
      {
        "uri": "https://example.com/image_001.jpg",
        "timestamp": 1634567890.123
      },
      {
        "uri": "https://example.com/image_002.jpg",
        "timestamp": 1634567890.223
      }
    ]
  }
}

Format Details

Pose Representations

Poses can be specified in multiple formats. Named Position + Quaternion:

{
  "position": {
    "x": 1.0,
    "y": 2.0,
    "z": 3.0
  },
  "rotation": {
    "x": 0.0,
    "y": 0.0,
    "z": 0.0,
    "w": 1.0
  }
}

Named Position + Euler Angles:

{
  "position": {
    "x": 1.0,
    "y": 2.0,
    "z": 3.0
  },
  "rotation": {
    "x": 0.1,
    "y": 0.2,
    "z": 0.3
  }
}

4x4 Affine Transform Matrix (column-major):

//(@formatter:off)
[
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  1, 2, 3, 1
]

Timestamp Formats

Encord supports multiple timestamp formats:

Unix timestamp (float): 1634567890.123
Unix timestamp (int): 1634567890
ISO datetime string: “2021-10-18T10:31:30.123Z”
Time-only string: “10:31:30.123”

Scene Configuration

You can specify coordinate system conventions:

{
  "worldConvention": {
    "x": "right",
    "y": "forward",
    "z": "up"
  },
  "cameraConvention": {
    "x": "right",
    "y": "down",
    "z": "forward"
  },
  "content": {...} // Your scene content here
}

PCD Concepts

Sensor Data Streams

Streams comprise a sequence of messages coming out of a sensor (LiDAR, camera, accelerometer) at discrete moments in time.

Stream Rendering and Data Access

Encord renders the latest available data per stream, where at any given time point T, the most recent data available at or before T from that stream is displayed. For streams without explicit timestamps, Encord assigns implicit sequential timestamps (1, 2, 3, etc.), allowing for consistent temporal ordering while maintaining flexibility in data ingestion.

Frame of Reference Hierarchies

Hierarchical Coordinate System Organization

Frame of reference hierarchies establish spatial relationships between different coordinate systems in a tree structure, with a root frame at the top. A typical hierarchy includes:

World Frame: The global reference frame, often representing a fixed point in the environment
Ego Vehicle Frame: The coordinate system of the primary platform (car, robot, etc.)
Sensor Frames: Individual coordinate systems for each sensor, positioned relative to the vehicle frame

Static vs Dynamic Transformations

Frame relationships can be either static (fixed relative position/orientation) or dynamic (changing over time). Static transforms are used for rigidly mounted sensors, while dynamic transforms represent moving parts or the motion of the entire system through the world. Dynamic transforms are represented as messages in a stream.

Coordinate System Conventions

Different domains use different coordinate system conventions. The format allows specification of both world and camera coordinate conventions:

World Convention: Typically right-handed systems where axes x,y,z represent directions like “right,” “forward,” and “up”
Camera Convention: Often follows computer vision conventions where axes might represent “right,” “down,” and “forward”

Camera Calibration and Image Distortion

Intrinsic Camera Parameters

To project 3D world points onto the 2D image plane, we need camera calibration that involves determining both intrinsic and extrinsic parameters. Intrinsic parameters are specific to the camera hardware and include:

Focal Length (fx, fy): The distance between the camera lens and the image sensor, typically measured in pixels
Principal Point (cx, cy): The coordinates of the image center where the optical axis intersects the image plane
Skew Coefficient: Accounts for non-rectangular pixels (rarely used in modern cameras)

Extrinsic Camera Parameters

Extrinsic parameters define the camera’s position and orientation in 3D space relative to the world coordinate system:

Rotation Matrix (R): Describes the camera’s orientation using a 3x3 rotation matrix
Translation Vector (t): Specifies the camera’s position in world coordinates

Lens Distortion Correction

Camera lenses can introduce distortions that cause straight lines to appear curved in images. The format supports distortion correction through distortion coefficients that model:

Radial Distortion: Caused by light rays bending more near the lens edges, creating barrel or pincushion effects
Tangential Distortion: Results from lens misalignment with the image sensor

Data Format Architecture

The format supports two primary data organization approaches: Message-Based JSON: Every stream and message within a stream is in a big JSON scene file. The data of point clouds or images themselves are URIs to external files. Container-Based Storage: Single files (MCAP, ROS bag, DB3) contain multiple sensor streams and their messages and include the point cloud and image data

Get Started

General

Index

Annotate

Active

Other

Point Cloud Data Projects

Point Cloud Data Supported

Data for PCD Projects

PCD Ontologies

Project Settings

Settings and Controls

Editor Settings - Scene

Editor

Label and Review PCD Data

Scene Formats

URL to PCD File

Stream of PCD Files without Timestamps

Stream of PCD Files with Timestamps

Multiple PCD Streams (Left and Right Sensors)

Add Frame of Reference for Ego Vehicle

Sensor-Specific Frames of Reference with Calibration

Image Stream without Camera Parameters

Image Stream with Camera Calibration

Format Details

Pose Representations

Timestamp Formats

Scene Configuration

PCD Concepts

Sensor Data Streams

Stream Rendering and Data Access

Frame of Reference Hierarchies

Hierarchical Coordinate System Organization

Static vs Dynamic Transformations

Coordinate System Conventions

Camera Calibration and Image Distortion

Intrinsic Camera Parameters

Extrinsic Camera Parameters

Lens Distortion Correction

Data Format Architecture

Get Started

General

Index

Annotate

Active

Other

​Point Cloud Data Supported

​Data for PCD Projects

​PCD Ontologies

​Project Settings

​Settings and Controls

​Toolbar

​Editor Settings - Scene

​Editor

​Label and Review PCD Data

​Scene Formats

​URL to PCD File

​Stream of PCD Files without Timestamps

​Stream of PCD Files with Timestamps

​Multiple PCD Streams (Left and Right Sensors)

​Add Frame of Reference for Ego Vehicle

​Sensor-Specific Frames of Reference with Calibration

​Image Stream without Camera Parameters

​Image Stream with Camera Calibration

​Format Details

​Pose Representations

​Timestamp Formats

​Scene Configuration

​PCD Concepts

​Sensor Data Streams

​Stream Rendering and Data Access

​Frame of Reference Hierarchies

​Hierarchical Coordinate System Organization

​Static vs Dynamic Transformations

​Coordinate System Conventions

​Camera Calibration and Image Distortion

​Intrinsic Camera Parameters

​Extrinsic Camera Parameters

​Lens Distortion Correction

​Data Format Architecture

Point Cloud Data Supported

Data for PCD Projects

PCD Ontologies

Project Settings

Settings and Controls

Toolbar

Editor Settings - Scene

Editor

Label and Review PCD Data

Scene Formats

URL to PCD File

Stream of PCD Files without Timestamps

Stream of PCD Files with Timestamps

Multiple PCD Streams (Left and Right Sensors)

Add Frame of Reference for Ego Vehicle

Sensor-Specific Frames of Reference with Calibration

Image Stream without Camera Parameters

Image Stream with Camera Calibration

Format Details

Pose Representations

Timestamp Formats

Scene Configuration

PCD Concepts

Sensor Data Streams

Stream Rendering and Data Access

Frame of Reference Hierarchies

Hierarchical Coordinate System Organization

Static vs Dynamic Transformations

Coordinate System Conventions

Camera Calibration and Image Distortion

Intrinsic Camera Parameters

Extrinsic Camera Parameters

Lens Distortion Correction

Data Format Architecture