Three agents. One bill. No way to tell who earned their keep. Cockpit gives each agent a persistent work record: identity, activity, connected tools, and the foundation for per-agent KPIs. Decide which agents to scale, swap, or retire based on data, not gut feel.
The problem
You're paying for them. They're shipping work. But they all act through your OAuth, the bill aggregates into one number, and there's no per-agent KPI to tell you which ones are earning their keep.
Subscriptions and API spend roll up into one bill, paid by one OAuth user. No way to tell which agent earned its keep.
$1,938 paid. One name on every line.
Same bill. Mapped onto the agent record: runtime, model, action count, reviewed output, and manual burn context today; provider billing attribution next.
Same $1,938. Astra has fewer actions and most of the manually logged API burn. That is the review question.
The gap isn't access — it's record-keeping. Productivity tools weren't built for AI workforces. Cockpit fills the gap: per-agent attribution today, then per-agent cost and output metrics on top of the same record.
What gets measured gets done
Four surfaces that turn three agents and one shared work account into a workforce record you can review. Built for startups running on Claude Code, Cursor, Codex, Hermes, OpenClaw, or any custom harness.
Every agent on your team gets a Cockpit-managed identity. A persistent agent label flows into connected tools — Linear live today, GitHub and Notion in beta — so actions are attributable to the agent that triggered them, not just to your OAuth account.
Connected bridge events land in a per-agent activity record: issue changes, comments, PR work, and handoff notes as each bridge comes online. Linear is live; GitHub and Notion are beta.
Manual burn tracking today; provider billing attribution next. Known subscriptions and API spend can sit beside the agent record now, with automatic per-agent billing as the provider data lands.
Per-agent KPIs are the goal: output, cost, acceptance, reliability, and review status. Cockpit starts with the identity and activity record those reviews need.
Activity events are hash-chained. Each event references the SHA-256 hash of the previous one, so if a past entry is silently rewritten after the chain starts, verification surfaces the mismatch.
Pre-chain history is shown honestly as legacy context; new records verify from the chain genesis forward.
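The hash-chaining idea above can be sketched in a few lines. This is a minimal illustration, not Cockpit's actual implementation: the event fields (`agent`, `action`, `prev_hash`), the all-zero genesis value, and the JSON serialization are assumptions made for the example.

```python
import hashlib
import json

# Assumed placeholder for "no previous event"; the real genesis value is not documented here.
GENESIS = "0" * 64


def event_hash(event: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the event (assumed serialization)."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(chain: list, agent: str, action: str) -> None:
    """Append an event that references the hash of the previous event."""
    prev = event_hash(chain[-1]) if chain else GENESIS
    chain.append({"agent": agent, "action": action, "prev_hash": prev})


def first_mismatch(chain: list):
    """Return the index of the first event whose prev_hash does not match, or None."""
    expected = GENESIS
    for i, event in enumerate(chain):
        if event["prev_hash"] != expected:
            return i
        expected = event_hash(event)
    return None


chain = []
append_event(chain, "Atlas", "issue.updated")
append_event(chain, "Nova", "comment.created")
append_event(chain, "Jax", "issue.closed")
```

In this sketch, rewriting event `i` after the fact changes its hash, so verification reports a mismatch at event `i + 1`. Detecting a change to the newest event would additionally require storing the latest hash somewhere outside the chain.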
Cockpit doesn't tell you whether your agents are working. It tells you whether they're earning their keep.
What Cockpit is not
The agentic AI stack has many tools doing different jobs. Cockpit is the performance layer — the measurement and review system. It doesn't replace these; it sits alongside them.
If you want prompt-by-prompt traces, token counts, and latency graphs, use LangSmith, Langfuse, or Helicone — they live inside the model layer. Cockpit lives one layer up: measuring what each agent shipped, not what the model said. Pair them.
Building your agents from scratch — tool calls, planning loops, retries — is a job for LangChain, AutoGen, or your own harness. Cockpit doesn’t run your agents. It records and reviews them once they’re running.
Linear, Notion, GitHub, Jira stay where they are. Cockpit makes connected feeds work for an AI workforce: bridge events attributed to the agent that did them, not collapsed under your OAuth user.
Guardrails, policy engines, and compliance tools (Cedar, prompt-injection filters, runtime sandboxes) decide what agents are allowed to do. Cockpit measures what they did. Different layer, different buyer, different decision. Pair them when you need both.
Architecture
AI providers make the agents. Productivity tools host the work. Cockpit captures what each agent did between them — across the connected tools they touch.
How agents wire up
Cockpit ships an MCP server. Any MCP-capable runtime — Claude Code, Cursor, Codex, Hermes, OpenClaw, custom harnesses — connects with one config block. Connected bridge calls from that point flow through Cockpit's record.
{
  "mcpServers": {
    "cockpit": {
      "type": "http",
      "url": "https://getcockpit.co/api/mcp",
      "headers": {
        "Authorization": "Bearer ck_agent_..."
      }
    }
  }
}
Each agent gets its own ck_agent_* key. Paste the block into the runtime's MCP config, then reconnect.
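If you run several agents, you may want to generate one config block per key rather than editing JSON by hand. The helper below is a hypothetical convenience, not part of Cockpit; it reproduces the config shape shown above and assumes only that keys start with `ck_agent_`. The key is left elided exactly as in the original.

```python
import json

COCKPIT_MCP_URL = "https://getcockpit.co/api/mcp"  # URL taken from the config above


def mcp_config(agent_key: str) -> dict:
    """Build the MCP config block for one agent key (hypothetical helper)."""
    if not agent_key.startswith("ck_agent_"):
        raise ValueError("expected a per-agent key starting with 'ck_agent_'")
    return {
        "mcpServers": {
            "cockpit": {
                "type": "http",
                "url": COCKPIT_MCP_URL,
                "headers": {"Authorization": f"Bearer {agent_key}"},
            }
        }
    }


# The key below is the same elided placeholder used in the config above.
print(json.dumps(mcp_config("ck_agent_..."), indent=2))
```

Where the resulting JSON goes depends on the runtime; check its MCP documentation for the config file location.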
Linear bridge calls flow through Cockpit today. GitHub and Notion follow the same model in beta. Each event is stamped with the calling agent's ID, timestamped, and added to the per-agent record.
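As a rough mental model of that stamping step, here is a small sketch. The field names (`agent_id`, `tool`, `action`, `at`) and the in-memory store are assumptions for illustration; Cockpit's real event schema is not documented here.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Per-agent record: agent ID -> list of events (in-memory stand-in for the real store).
records = defaultdict(list)


def record_event(agent_id: str, tool: str, action: str, at: datetime = None) -> dict:
    """Stamp an event with the calling agent's ID and a UTC timestamp, then file it."""
    event = {
        "agent_id": agent_id,
        "tool": tool,
        "action": action,
        "at": (at or datetime.now(timezone.utc)).isoformat(),
    }
    records[agent_id].append(event)
    return event


record_event("Atlas", "linear", "issue.updated")
record_event("Nova", "linear", "comment.created")
record_event("Atlas", "linear", "issue.closed")

# Activity is now grouped by agent instead of by the shared OAuth user.
for agent_id, events in records.items():
    print(agent_id, len(events))
```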
Known spend can be logged against the agent record now. When provider billing is connected, the same agent key becomes the route to automatic per-agent burn.
On handshake the agent reports its runtime (e.g. Claude Code · 1.2.3). Cockpit displays that on the agent file — no manual model picker to keep up to date.
The surfaces
Six places where the per-agent record shows up day-to-day.
Per-agent record: identity, agent type/model, runtime, recent activity, connected tools, tracked-since date, and review context. The review file for an AI worker.
Connected events from the bridge, attributed by agent name instead of collapsed under one human OAuth account.
Manual spend tracking today; per-agent provider cost attribution next. Keep the cost discussion attached to the agent record.
Agents or subscriptions with spend and little recent activity. Useful today as a manual review surface; stronger when billing data is connected.
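A review like this boils down to a simple filter over spend and activity. The sketch below shows one way to express it; the thresholds, the `AgentMonth` type, and all the numbers are made up for illustration and are not Cockpit defaults or real billing data.

```python
from dataclasses import dataclass


@dataclass
class AgentMonth:
    name: str
    spend_usd: float  # manually logged spend for the period
    actions: int      # attributed bridge events for the period


def review_candidates(rows, min_spend=100.0, max_actions=10):
    """Return names of agents with notable spend but little recent activity."""
    return [r.name for r in rows if r.spend_usd >= min_spend and r.actions <= max_actions]


# Illustrative figures only.
rows = [
    AgentMonth("Atlas", spend_usd=120.0, actions=85),
    AgentMonth("Nova", spend_usd=40.0, actions=3),
    AgentMonth("Astra", spend_usd=300.0, actions=4),
]

flagged = review_candidates(rows)
```

A flag here is a prompt for a human review, not a verdict: an agent with few actions may still be doing the highest-value work.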
Per-agent bridge key, MCP setup, test ping. Where a profile becomes a wired-up worker.
Three states (Connected / Pending / Cosmetic) on every agent surface. The live view fades agents that aren't actually wired.
Why the abstraction matters
Cockpit doesn't replace your workspace, your agents, or your runtime. It records what they do, regardless of where.
Atlas, Nova, Jax share your Linear OAuth. Without Cockpit they all show as you. With it, each gets its own attributed record.
Linear shows you Linear. Notion shows you Notion. Cockpit starts showing you Atlas across connected bridge events: Linear live, GitHub and Notion beta.
Your runtime, your agents, your hosting. Cockpit just sits above and records. If you leave, the record exports cleanly.
Bridge coverage
Cockpit's bridge model works against tools with APIs and webhooks. We are showing the real rollout status plainly so the demo stays credible.
Jira and Confluence are next. Slack, Discord, and Telegram are deliberately out of scope because communication apps already have native bot frameworks.
Private beta pricing for startup teams.