# Physical AI Data Lifecycle

> Understand and design the end-to-end data lifecycle for Physical AI systems, from raw sensor ingestion to continuous model improvement.

Physical AI systems depend on **high-fidelity data pipelines** that preserve spatial, temporal, and multimodal context. This page outlines a six-stage lifecycle for managing data in robotics, autonomy, and other real-world AI systems.

## 1. Data ingestion and synchronization

Physical AI starts with complex sensor inputs:

* Multi-camera video
* LiDAR / point clouds
* Audio, documents, telemetry, and metadata
* Time-synchronized sensor streams

The goal at this stage is to **ingest and organize raw data without losing context**.
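
As an illustration of what "without losing context" can mean for time-synchronized streams, the sketch below pairs each camera frame with the nearest LiDAR sweep by timestamp. It is a minimal, platform-independent example: the function names, the millisecond timestamps, and the 50 ms tolerance are assumptions made for illustration, not part of any Encord API.

```python
from bisect import bisect_left


def nearest_match(timestamps, target, tolerance):
    """Return the index of the timestamp closest to `target`.

    `timestamps` must be sorted ascending. Returns None when the closest
    timestamp is further away than `tolerance`.
    """
    if not timestamps:
        return None
    pos = bisect_left(timestamps, target)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(timestamps)]
    best = min(candidates, key=lambda i: abs(timestamps[i] - target))
    return best if abs(timestamps[best] - target) <= tolerance else None


def synchronize(camera_ts, lidar_ts, tolerance_ms=50):
    """Pair every camera frame with the nearest LiDAR sweep, or None if unmatched."""
    return [
        (cam_idx, nearest_match(lidar_ts, t, tolerance_ms))
        for cam_idx, t in enumerate(camera_ts)
    ]


# Hypothetical timestamps in milliseconds.
camera_ts = [0, 100, 200, 300]
lidar_ts = [10, 120, 450]
pairs = synchronize(camera_ts, lidar_ts)
print(pairs)
```

Keeping unmatched frames as `None` rather than dropping them makes gaps in a sensor stream visible at ingestion time instead of hiding them.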

**Recommended docs**

* [Register Cloud Data](/platform-documentation/Curate/add-files/index-register-cloud-data)
* [Files](/platform-documentation/Curate/index-files)
* [Supported Data](/platform-documentation/General/general-supported-data)

***

## 2. Structuring datasets for iteration

Once ingested, data should be structured so teams can iterate quickly:

* Group related sensor streams
* Preserve timelines across modalities
* Attach metadata for environment, conditions, and scenarios

This enables efficient filtering and targeted labeling later.
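
The structure described above can be sketched as a small record type that keeps related streams together and carries scenario metadata. This is a hypothetical in-memory model for illustration only; `SensorGroup`, `filter_groups`, the file paths, and the metadata keys are assumed names, not the platform's data model.

```python
from dataclasses import dataclass, field


@dataclass
class SensorGroup:
    """One recorded scene: its sensor streams plus descriptive metadata."""
    scene_id: str
    streams: dict  # modality name -> file path or URI
    metadata: dict = field(default_factory=dict)


def filter_groups(groups, **criteria):
    """Return the groups whose metadata matches every key/value in `criteria`."""
    return [
        g for g in groups
        if all(g.metadata.get(key) == value for key, value in criteria.items())
    ]


groups = [
    SensorGroup(
        "scene-001",
        {"front_camera": "scene-001/front.mp4", "lidar": "scene-001/lidar.pcd"},
        {"weather": "rain", "time_of_day": "night"},
    ),
    SensorGroup(
        "scene-002",
        {"front_camera": "scene-002/front.mp4", "lidar": "scene-002/lidar.pcd"},
        {"weather": "clear", "time_of_day": "night"},
    ),
]

night_rain = filter_groups(groups, weather="rain", time_of_day="night")
print([g.scene_id for g in night_rain])
```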

**Recommended docs**

* [Data Groups](/end-to-end/Features/e2e-data-groups)
* [Custom Metadata](/platform-documentation/Curate/custom-metadata/index-metadata-schema)
* [Datasets](/platform-documentation/Annotate/annotate-datasets/annotate-datasets)

***

## 3. Intelligent data curation

Not all data should be labeled.

Curation ensures effort is spent on:

* Edge cases
* Rare failure modes
* Under-represented scenarios
* High-impact samples

Use filtering, embeddings, and collections to *intentionally* select what matters.
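
One simple way to use embeddings for selection is to rank unlabeled samples by how far they sit from anything already labeled, so under-represented scenarios surface first. The sketch below assumes embeddings are already computed and uses plain Euclidean distance; the function names, clip names, and two-dimensional vectors are illustrative assumptions, and real embeddings would have many more dimensions.

```python
import math


def novelty_scores(candidates, labeled):
    """Score each candidate by its distance to the nearest labeled embedding."""
    return {
        name: min(math.dist(vec, ref) for ref in labeled)
        for name, vec in candidates.items()
    }


def select_for_labeling(candidates, labeled, budget):
    """Return the `budget` candidates that are least like the labeled data."""
    scores = novelty_scores(candidates, labeled)
    return sorted(scores, key=scores.get, reverse=True)[:budget]


# Hypothetical 2-D embeddings.
labeled = [(0.0, 0.0), (1.0, 0.0)]
candidates = {
    "clip_a": (0.1, 0.0),  # close to existing labeled data
    "clip_b": (5.0, 5.0),  # far from everything labeled so far
    "clip_c": (1.0, 2.0),
}

selected = select_for_labeling(candidates, labeled, budget=2)
print(selected)
```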

**Recommended docs**

* [Getting Started with Index](/platform-documentation/Curate/index-getting-started)
* [Embedding Plots](/platform-documentation/Curate/embedding-plots)
* [Collections](/platform-documentation/Curate/curation-basics)

***

## 4. Annotation and review

For Physical AI, annotation must respect:

* Temporal continuity
* Cross-sensor consistency
* 3D and spatial constraints
* Evolving label definitions

Annotation workflows should be paired with structured review and QA from the start.
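
As an example of an automated QA check for temporal continuity, the sketch below flags object tracks that disappear for several frames and then reappear, which often indicates missed labels. The `(frame, track_id)` tuple format and the `find_track_gaps` helper are assumptions for illustration, not an export format or API of the platform.

```python
from collections import defaultdict


def find_track_gaps(annotations, max_gap=1):
    """Find frame ranges where a track is missing between two labeled frames.

    `annotations` is an iterable of (frame_number, track_id) pairs.
    Returns {track_id: [(last_frame_before_gap, first_frame_after_gap), ...]}.
    """
    frames_by_track = defaultdict(list)
    for frame, track_id in annotations:
        frames_by_track[track_id].append(frame)

    gaps = {}
    for track_id, frames in frames_by_track.items():
        ordered = sorted(set(frames))
        missing = [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]
        if missing:
            gaps[track_id] = missing
    return gaps


# Hypothetical labels: car_1 is not annotated on frames 3 and 4.
annotations = [
    (0, "car_1"), (1, "car_1"), (2, "car_1"), (5, "car_1"),
    (0, "ped_7"), (1, "ped_7"),
]

gaps = find_track_gaps(annotations)
print(gaps)
```

A gap is not always an error, since an object may genuinely leave the scene, so the output is best treated as a review queue rather than an automatic rejection.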

**Recommended docs**

* [Get Started with Annotate](/platform-documentation/Annotate/annotate-gettingstarted/data-annotation)
* [Annotate & Review](/platform-documentation/Annotate/annotate-label-editor/annotate-label-editor-annotate)
* [Create a Project](/platform-documentation/GettingStarted/gettingstarted-create-project)

***

## 5. Evaluation and feedback loops

Model outputs should feed directly back into the data pipeline:

* Identify failure cases
* Compare predictions against ground truth
* Re-prioritize data for re-labeling

This closes the loop between deployment and training.
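
The comparison step can be sketched with intersection over union (IoU) on 2D bounding boxes: samples whose prediction overlaps poorly with the ground truth, or has no prediction at all, become candidates for re-labeling. The box format `(x1, y1, x2, y2)`, the 0.5 threshold, and the helper names are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def failure_cases(predictions, ground_truth, threshold=0.5):
    """Return (sample_id, iou) pairs below `threshold`, worst first.

    A sample with no prediction counts as IoU 0.0.
    """
    failures = []
    for sample_id, gt_box in ground_truth.items():
        pred_box = predictions.get(sample_id)
        score = iou(pred_box, gt_box) if pred_box is not None else 0.0
        if score < threshold:
            failures.append((sample_id, score))
    return sorted(failures, key=lambda item: item[1])


# Hypothetical single-object frames.
ground_truth = {"f1": (0, 0, 10, 10), "f2": (0, 0, 10, 10), "f3": (0, 0, 10, 10)}
predictions = {"f1": (0, 0, 10, 10), "f2": (5, 5, 15, 15)}  # f3 has no prediction

failures = failure_cases(predictions, ground_truth)
print(failures)
```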

**Recommended docs**

* [Quality Metrics](/platform-documentation/Validation/label-validation-basics#quality-metrics)
* [Model Evaluation](/platform-documentation/Validation/active-how-to/active-model-predictions-eval)
* [Analytics View](/platform-documentation/Validation/label-validation-basics#analytics-view)

***

## 6. Continuous improvement at scale

At scale, Physical AI programs require:

* Automation for repetitive tasks
* Measurable quality standards
* Distributed teams with clear roles
* Auditable workflows

This is where automation and agent-based workflows provide leverage.
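
A measurable quality standard can be as simple as a minimum review approval rate per annotator. The sketch below computes that rate from a review log and lists who falls below the bar; the log format, the names, and the 90% threshold are all hypothetical and would be replaced by whatever your review workflow records.

```python
from collections import Counter


def approval_rates(review_log):
    """Compute the fraction of approved tasks per annotator.

    `review_log` is an iterable of (annotator, outcome) pairs, where
    outcome is "approved" or "rejected".
    """
    totals, approved = Counter(), Counter()
    for annotator, outcome in review_log:
        totals[annotator] += 1
        if outcome == "approved":
            approved[annotator] += 1
    return {name: approved[name] / totals[name] for name in totals}


def below_standard(review_log, minimum=0.9):
    """Return annotators whose approval rate is below `minimum`, sorted by name."""
    rates = approval_rates(review_log)
    return sorted(name for name, rate in rates.items() if rate < minimum)


# Hypothetical review outcomes.
review_log = (
    [("alice", "approved")] * 19 + [("alice", "rejected")]
    + [("bob", "approved")] * 7 + [("bob", "rejected")] * 3
)

flagged = below_standard(review_log)
print(flagged)
```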

**Recommended docs**

* [Agents](/platform-documentation/Annotate/automated-labeling/annotate-agents-overview)

***

## Key takeaway

A strong Physical AI system isn’t just a model — it’s a **data flywheel**:

> Ingest → Structure → Curate → Annotate → Evaluate → Refine → Repeat

The faster and more intentionally you move through this loop, the faster your models improve.
