Metadata

Docent supports metadata at multiple levels:
  • Collection metadata attached to the collection itself
  • Agent run metadata attached to an AgentRun
  • Transcript group metadata attached to a TranscriptGroup
  • Transcript metadata attached to a Transcript
Any metadata should be JSON serializable. When metadata is rendered or stored, Docent converts it to JSON-compatible values using Pydantic's serializer, which supports common Python collections and nested Pydantic models.
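As a quick sanity check, you can round-trip a metadata dict through the standard library's json module before attaching it (a minimal sketch; Docent itself performs the conversion via Pydantic, which also handles types plain json does not):

```python
import json

metadata = {
    "dataset": "helpdesk_jan_2026",
    "tags": ["support", "english"],
    "config": {"model": "gpt-5", "prompt_version": "v3"},
}

# json.dumps raises TypeError for values that are not JSON serializable,
# so a successful round-trip confirms the dict is safe to attach.
assert json.loads(json.dumps(metadata)) == metadata
```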

Choosing a metadata level

  • Use collection metadata for information shared by the entire collection, such as dataset provenance, eval configuration, environment, or model family.
  • Use agent run metadata for values that vary run to run, especially scores or other fields you want to analyze across a collection.
  • Use transcript group or transcript metadata for finer-grained context within a single run.

Collection metadata

Collection metadata lives on the collection rather than on individual runs. It is a good fit for collection-wide configuration and provenance.

From the Python SDK

from docent import Docent

client = Docent()
collection_id = "..."

client.update_collection_metadata(
    collection_id,
    {
        "dataset": "helpdesk_jan_2026",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    },
)

metadata = client.get_collection_metadata(collection_id)

metadata_after_delete, not_found = client.delete_collection_metadata_keys(
    collection_id,
    ["config.prompt_version"],
)
Updates are deep-merged into the existing collection metadata, so patching config.model does not remove unrelated keys under config. Deletions support dot paths for nested keys.
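The merge and deletion semantics described above can be illustrated in plain Python (deep_merge and delete_path here are illustrative helpers, not part of the SDK):

```python
def deep_merge(base: dict, patch: dict) -> dict:
    """Recursively merge patch into base; nested dicts merge, other values overwrite."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def delete_path(data: dict, dotted_key: str) -> dict:
    """Remove a nested key addressed by a dot path like 'config.prompt_version'."""
    result = dict(data)
    head, _, rest = dotted_key.partition(".")
    if rest and isinstance(result.get(head), dict):
        result[head] = delete_path(result[head], rest)
    else:
        result.pop(head, None)
    return result


existing = {"dataset": "helpdesk_jan_2026", "config": {"model": "gpt-4o", "prompt_version": "v3"}}

# Patching config.model leaves the unrelated config.prompt_version key in place.
patched = deep_merge(existing, {"config": {"model": "gpt-5"}})
assert patched["config"] == {"model": "gpt-5", "prompt_version": "v3"}

# Deleting by dot path removes only the nested key.
after_delete = delete_path(patched, "config.prompt_version")
assert after_delete["config"] == {"model": "gpt-5"}
```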

From tracing

from docent.trace import collection_metadata, initialize_tracing

initialize_tracing("customer-support-evals")

collection_metadata(
    {
        "dataset": "helpdesk_jan_2026",
        "environment": "staging",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    }
)
You can call collection_metadata() any time after initialize_tracing(). Unlike agent_run_metadata(), it does not require an active agent run or transcript context.

Agent run, transcript group, and transcript metadata

We recommend including metric and score information in metadata, along with other details about the agent or task setup. Scoring fields are useful for tracking metrics like task completion or reward, but they are a convention rather than a required schema: neither AgentRun nor Transcript enforces required metadata keys. Here's an example of what a typical agent run metadata dict might look like:
metadata = {
    # Optional conventional fields
    "scores": {"reward_1": 0.1, "reward_2": 0.5, "reward_3": 0.8},
    # Custom fields
    "episode": 42,
    "policy_version": "v1.2.3",
    "training_step": 12500,
}
If you're using Inspect, the docent.loaders.load_inspect module also provides a load_inspect_log function that reads the standard scoring and metadata information from Inspect logs and copies it into Docent metadata.