Performance Data Model

This file expands the performance tables listed in data-model.md.

Agent Profiles

  • id
  • manager_id
  • name
  • agent_type
  • provider
  • model
  • status
  • created_at
  • updated_at

Agent profiles are managed-agent records. They let Performance group sessions, scores, token volume, and cost by the developer responsible for each agent.

Agent Spans

  • id
  • developer_id
  • session_id
  • task_id
  • agent_profile_id
  • span_type
  • name
  • status
  • provider
  • model
  • input_tokens
  • output_tokens
  • cache_read_tokens
  • cache_write_tokens
  • total_tokens
  • cost_usd
  • telemetry_source
  • cost_source
  • verified
  • raw_usage_hash
  • source_event_id
  • submitted_cost_usd
  • cost_mismatch
  • latency_ms
  • created_at

Agent spans store usage and performance counters only. Spans are parsed from structured Claude/Codex/OpenAI harness, provider, proxy, or OTel payloads and keep a SHA-256 hash of the raw payload for auditability. Do not store raw prompts, model responses, or full telemetry payloads in these rows.
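The counters-plus-hash rule above can be illustrated with a minimal sketch. It assumes the raw payload arrives as a JSON string that already carries the split counters; the `RawUsagePayload` shape, the `SpanUsageRow` type, and the `toSpanUsageRow` helper are hypothetical names for illustration, not part of the actual ingestion code. Only the column names come from the field list above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical payload shape: real harness/provider/OTel payloads will differ.
interface RawUsagePayload {
  input_tokens: number;
  output_tokens: number;
  cache_read_tokens: number;
  cache_write_tokens: number;
  total_tokens: number;
  [extra: string]: unknown; // prompts, responses, etc. are never copied
}

// Subset of the agent_spans columns that this sketch fills in.
interface SpanUsageRow {
  input_tokens: number;
  output_tokens: number;
  cache_read_tokens: number;
  cache_write_tokens: number;
  total_tokens: number;
  raw_usage_hash: string;
}

// Hash the payload exactly as received, then keep only counters and the digest.
function toSpanUsageRow(rawPayload: string): SpanUsageRow {
  const usage = JSON.parse(rawPayload) as RawUsagePayload;
  const raw_usage_hash = createHash("sha256")
    .update(rawPayload, "utf8")
    .digest("hex");
  return {
    input_tokens: usage.input_tokens,
    output_tokens: usage.output_tokens,
    cache_read_tokens: usage.cache_read_tokens,
    cache_write_tokens: usage.cache_write_tokens,
    total_tokens: usage.total_tokens,
    raw_usage_hash,
  };
}
```

Hashing the string as received, rather than a re-serialized object, keeps the digest comparable against the original payload if it is ever audited.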

cost_source = 'unknown' means the token counters are verified but the server cannot produce a trusted cost. This happens either because no trusted pricing rule or provider estimate exists, or because the source sent only an aggregate total_tokens value without input, output, and cache buckets. Performance rollups keep those tokens as unpricedTokens. Aggregate-only rows also roll up as splitUnknownTokens, so the UI can report "token split unknown" instead of implying that the model price itself is missing. Spend totals and cost-per-task use priced spend only and surface unknown cost as a warning, not as free usage. Codex local sync submits only split response.completed usage counters; if a Codex source cannot provide the split counters, the companion should skip the Codex usage event instead of creating an aggregate-only row.
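The rollup behavior for unknown-cost rows can be sketched as follows. This is an illustration under stated assumptions, not the actual rollup code: it assumes aggregate-only rows are represented by null input/output counters, and the `SpanCostRow` type, the `rollupSpend` function, the `pricedSpendUsd`, `pricedTokens`, and `unknownCostWarning` fields, and the `'pricing_rule'` value used in the usage lines are hypothetical. Only `unpricedTokens`, `splitUnknownTokens`, and `cost_source = 'unknown'` come from the text above.

```typescript
// Hypothetical projection of agent_spans columns used for spend rollups.
interface SpanCostRow {
  input_tokens: number | null;  // null is assumed to mark an aggregate-only row
  output_tokens: number | null;
  total_tokens: number;
  cost_usd: number | null;
  cost_source: string;
}

interface SpendRollup {
  pricedSpendUsd: number;      // spend from rows with a trusted cost only
  pricedTokens: number;
  unpricedTokens: number;      // tokens whose cost is unknown
  splitUnknownTokens: number;  // subset of unpricedTokens with no token split
  unknownCostWarning: boolean; // surfaced as a warning, never as free usage
}

function rollupSpend(rows: SpanCostRow[]): SpendRollup {
  const out: SpendRollup = {
    pricedSpendUsd: 0,
    pricedTokens: 0,
    unpricedTokens: 0,
    splitUnknownTokens: 0,
    unknownCostWarning: false,
  };
  for (const row of rows) {
    if (row.cost_source === "unknown" || row.cost_usd === null) {
      out.unpricedTokens += row.total_tokens;
      out.unknownCostWarning = true;
      // Aggregate-only rows are additionally counted as "token split unknown".
      if (row.input_tokens === null || row.output_tokens === null) {
        out.splitUnknownTokens += row.total_tokens;
      }
      continue;
    }
    out.pricedSpendUsd += row.cost_usd;
    out.pricedTokens += row.total_tokens;
  }
  return out;
}

// Usage: one priced row, one unpriced row with a split, one aggregate-only row.
const example = rollupSpend([
  { input_tokens: 80, output_tokens: 20, total_tokens: 100, cost_usd: 0.25, cost_source: "pricing_rule" },
  { input_tokens: 40, output_tokens: 10, total_tokens: 50, cost_usd: null, cost_source: "unknown" },
  { input_tokens: null, output_tokens: null, total_tokens: 30, cost_usd: null, cost_source: "unknown" },
]);
console.log(example.unpricedTokens, example.splitUnknownTokens); // 80 30
```

Keeping unpriced tokens in separate counters, instead of adding them to spend at zero cost, is what lets spend totals and cost-per-task stay restricted to priced spend.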

Derived Views

Performance trend, weekly operating pulse, and activity-grid data is derived from tasks, work_events, usage_events, agent_sessions, and agent_spans.

Daily grid rollups include completed tasks, work events, usage events, spans, tokens, cost, and active developers. No separate dashboard state is stored.
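As an illustration of deriving a grid at read time instead of storing dashboard state, the sketch below folds span rows into per-day cells. It covers spans only and assumes days are bucketed in UTC from created_at; the `SpanRecord` and `DailyCell` types and the `dailyGrid` function are hypothetical names, and a fuller rollup would also fold in tasks, work events, and usage events.

```typescript
// Hypothetical projection of agent_spans columns used by the activity grid.
interface SpanRecord {
  developer_id: string;
  total_tokens: number;
  cost_usd: number | null; // null when the cost is unknown
  created_at: string;      // ISO 8601 timestamp
}

interface DailyCell {
  day: string; // YYYY-MM-DD, UTC (assumption)
  spans: number;
  tokens: number;
  costUsd: number; // priced spend only
  activeDevelopers: number;
}

function dailyGrid(spans: SpanRecord[]): DailyCell[] {
  const byDay = new Map<
    string,
    { spans: number; tokens: number; costUsd: number; devs: Set<string> }
  >();
  for (const span of spans) {
    const day = new Date(span.created_at).toISOString().slice(0, 10);
    let acc = byDay.get(day);
    if (!acc) {
      acc = { spans: 0, tokens: 0, costUsd: 0, devs: new Set<string>() };
      byDay.set(day, acc);
    }
    acc.spans += 1;
    acc.tokens += span.total_tokens;
    acc.costUsd += span.cost_usd ?? 0; // unknown cost adds nothing to spend
    acc.devs.add(span.developer_id);
  }
  // Recomputed on every read; nothing here is persisted.
  return Array.from(byDay.entries())
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([day, acc]) => ({
      day,
      spans: acc.spans,
      tokens: acc.tokens,
      costUsd: acc.costUsd,
      activeDevelopers: acc.devs.size,
    }));
}
```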

Performance leaderboards, team scorecards, operating-target attainment, managed-agent registry health, agent insight reports, spend allocation, selectable developer profiles, manager drilldowns, review cards, per-developer task ledgers, manager briefs, live monitors, alert triage, and action reports are derived at read time from the same records.

These derived views store no manager notes or recommendations as source-of-truth state.
