Metadata

Docent supports metadata at multiple levels:
  • Collection metadata attached to the collection itself
  • Agent run metadata attached to an AgentRun
  • Transcript group metadata attached to a TranscriptGroup
  • Transcript metadata attached to a Transcript
Any metadata should be JSON serializable. When metadata is rendered or stored, Docent converts it to JSON-compatible values using Pydantic's serializer, which supports common Python collections and nested Pydantic models.
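As a quick sanity check, you can round-trip a metadata dict through the standard library's json module before attaching it (a minimal sketch; Docent itself performs the conversion via Pydantic, which also handles types plain json does not):

```python
import json

metadata = {
    "dataset": "helpdesk_jan_2026",
    "tags": ["support", "english"],
    "config": {"model": "gpt-5", "prompt_version": "v3"},
}

# json.dumps raises TypeError for values that are not JSON serializable,
# so a successful round-trip confirms the dict is safe to attach.
assert json.loads(json.dumps(metadata)) == metadata
```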

Choosing a metadata level

  • Use collection metadata for information shared by the entire collection, such as dataset provenance, eval configuration, environment, or model family.
  • Use agent run metadata for values that vary run to run, especially scores or other fields you want to analyze across a collection.
  • Use transcript group or transcript metadata for finer-grained context within a single run.

Collection metadata

Collection metadata lives on the collection rather than on individual runs. It is a good fit for collection-wide configuration and provenance.

From the Python SDK

from docent import Docent

client = Docent()
collection_id = "..."

client.update_collection_metadata(
    collection_id,
    {
        "dataset": "helpdesk_jan_2026",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    },
)

metadata = client.get_collection_metadata(collection_id)

metadata_after_delete, not_found = client.delete_collection_metadata_keys(
    collection_id,
    ["config.prompt_version"],
)
Updates are deep-merged into the existing collection metadata, so patching config.model does not remove unrelated keys under config. Deletions support dot paths for nested keys.
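The merge and deletion semantics described above can be illustrated in plain Python (deep_merge and delete_path here are illustrative helpers, not part of the SDK):

```python
def deep_merge(base: dict, patch: dict) -> dict:
    """Recursively merge patch into base; nested dicts merge, other values overwrite."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def delete_path(data: dict, dotted_key: str) -> dict:
    """Remove a nested key addressed by a dot path like 'config.prompt_version'."""
    result = dict(data)
    head, _, rest = dotted_key.partition(".")
    if rest and isinstance(result.get(head), dict):
        result[head] = delete_path(result[head], rest)
    else:
        result.pop(head, None)
    return result


existing = {"dataset": "helpdesk_jan_2026", "config": {"model": "gpt-4o", "prompt_version": "v3"}}

# Patching config.model leaves the unrelated config.prompt_version key in place.
patched = deep_merge(existing, {"config": {"model": "gpt-5"}})
assert patched["config"] == {"model": "gpt-5", "prompt_version": "v3"}

# Deleting by dot path removes only the nested key.
after_delete = delete_path(patched, "config.prompt_version")
assert after_delete["config"] == {"model": "gpt-5"}
```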

From tracing

from docent.trace import collection_metadata, initialize_tracing

initialize_tracing("customer-support-evals")

collection_metadata(
    {
        "dataset": "helpdesk_jan_2026",
        "environment": "staging",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    }
)
You can call collection_metadata() any time after initialize_tracing(). Unlike agent_run_metadata(), it does not require an active agent run or transcript context.

Agent run, transcript group, and transcript metadata

We recommend including metric and score information in metadata, along with other details about the agent or task setup. Scoring fields are useful for tracking metrics like task completion or reward, but they are a convention rather than a required schema: neither AgentRun nor Transcript enforces required metadata keys. Here's an example of what a typical agent run metadata dict might look like:
metadata = {
    # Optional conventional fields
    "scores": {"reward_1": 0.1, "reward_2": 0.5, "reward_3": 0.8},
    # Custom fields
    "episode": 42,
    "policy_version": "v1.2.3",
    "training_step": 12500,
}
If you're using Inspect, the docent.loaders.load_inspect module also provides a load_inspect_log function that reads the standard scoring and metadata information from Inspect logs and copies it into Docent metadata.