Private beta for startup AI workforces

Know what every agent did, what it shipped, how it performed.

Three agents. One bill. No way to tell who earned their keep. Cockpit gives each agent a persistent work record: identity, activity, connected tools, and the foundation for per-agent KPIs. Decide which agents to scale, swap, or retire from data — not from gut feel.

The problem

Which of your agents are earning their keep?

You're paying for them. They're shipping work. But they all act through your OAuth account, the bill rolls up into one number, and there's no per-agent KPI to tell you which ones are earning their keep.

Without Cockpit

Subscriptions and API spend roll up into one bill, paid by one OAuth user. No way to tell which agent earned its keep.

April 2026 — AI workforce costs
Claude Code · Max plan    $200.00
Cursor · Pro              $20.00
Anthropic API             $1,718.00
Total                     $1,938.00
Attributed to: Sam Williams (1 OAuth user)

$1,938 paid. One name on every line.

With Cockpit

Same bill. Mapped onto the agent record: runtime, model, action count, reviewed output, and manual burn context today; provider billing attribution next.

April 2026 — Per-agent record
Atlas | AI | Claude Code · Opus | On track
  241 actions · 21 commits · 2 reviews | $200/mo
Nova | AI | Cursor · Sonnet | On track
  127 actions · 14 PRs merged | $20/mo
Jax | AI | Hermes · Sonnet | Review
  89 actions · 8 reviews | $300/mo
Astra | AI | OpenClaw · API | Investigate
  67 actions · 5 specs | $1,418
Total: 4 agents · 524 actions | $1,938 manual burn

Same $1,938. Astra has fewer actions and most of the manually logged API burn. That is the review question.
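To make that review question concrete, here is a minimal sketch that divides each agent's manually logged burn by its action count, using only the figures from the record above. The "dollars per action" ratio is an illustrative assumption, not a Cockpit metric, and it treats all actions as equal, which a real review would not.

```python
# Figures copied from the April 2026 per-agent record above:
# (agent name, recorded actions, manually logged burn in USD).
agents = [
    ("Atlas", 241, 200.00),
    ("Nova", 127, 20.00),
    ("Jax", 89, 300.00),
    ("Astra", 67, 1418.00),
]


def cost_per_action(actions: int, burn: float) -> float:
    """Manual burn divided by recorded actions (a crude, illustrative ratio)."""
    return burn / actions


total_actions = sum(actions for _, actions, _ in agents)
total_burn = sum(burn for _, _, burn in agents)

# Most expensive per action first.
ranked = sorted(agents, key=lambda row: cost_per_action(row[1], row[2]), reverse=True)

for name, actions, burn in ranked:
    ratio = cost_per_action(actions, burn)
    print(f"{name:<6} ${burn:>8.2f} / {actions:>3} actions = ${ratio:.2f} per action")
print(f"Total  ${total_burn:.2f} across {total_actions} actions")
```

A high ratio is a prompt to look at what the agent produced, not a verdict: five specs may be worth more than two hundred routine actions.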

The gap isn't access — it's record-keeping. Productivity tools weren't built for AI workforces. Cockpit fills the gap: per-agent attribution today, then per-agent cost and output metrics on top of the same record.

What gets measured gets done

The performance review system for your AI workforce.

Four surfaces that turn three agents and one shared work account into a workforce record you can review. Built for startups running on Claude Code, Cursor, Codex, Hermes, OpenClaw, or any custom harness.

Identity

Every agent on your team gets a Cockpit-managed identity. A persistent agent label flows into connected tools — Linear live today, GitHub and Notion in beta — so actions are attributable to the agent that triggered them, not just to your OAuth account.

Activity

Connected bridge events land in a per-agent activity record: issue changes, comments, PR work, and handoff notes as each bridge comes online. Linear is live; GitHub and Notion are beta.

Cost

Manual burn tracking today; provider billing attribution next. Known subscriptions and API spend can sit beside the agent record now, with automatic per-agent billing as the provider data lands.

Performance

Per-agent KPIs are the goal: output, cost, acceptance, reliability, and review status. Cockpit starts with the identity and activity record those reviews need.

Tamper-evident activity log

Activity events are hash-chained. Each event references the SHA-256 hash of the previous one, so if a past entry is silently rewritten after the chain starts, verification surfaces the mismatch.

Pre-chain history is shown honestly as legacy context; new records verify from the chain genesis forward.
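The general hash-chain technique can be sketched as follows. This is an illustration of the idea under stated assumptions, not Cockpit's actual implementation: the event fields, the JSON serialization, and the all-zero genesis value are hypothetical choices made for the example.

```python
import hashlib
import json

# Assumed placeholder used as the "previous hash" of the first chained event.
GENESIS = "0" * 64


def event_hash(payload: dict, prev_hash: str) -> str:
    """SHA-256 over the event payload plus the previous event's hash."""
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Link a new event to the hash of the one before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev_hash": prev_hash, "hash": event_hash(payload, prev_hash)}
    )


def first_mismatch(chain: list):
    """Return the index of the first event that fails verification, or None."""
    prev_hash = GENESIS
    for index, event in enumerate(chain):
        recomputed = event_hash(event["payload"], prev_hash)
        if event["prev_hash"] != prev_hash or event["hash"] != recomputed:
            return index
        prev_hash = event["hash"]
    return None


chain = []
append_event(chain, {"agent": "Atlas", "action": "create_issue"})
append_event(chain, {"agent": "Nova", "action": "comment_on_issue"})
append_event(chain, {"agent": "Atlas", "action": "update_issue"})
print(first_mismatch(chain))  # None: the untouched chain verifies

chain[1]["payload"]["action"] = "delete_issue"  # simulate a silent rewrite
print(first_mismatch(chain))  # 1: verification points at the rewritten entry
```

Because every hash depends on the one before it, rewriting an old entry without detection would mean recomputing every later hash as well, which is why the log is described as tamper-evident rather than tamper-proof.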

Cockpit doesn't tell you whether your agents are working. It tells you whether they're earning their keep.

What Cockpit is not

We sit at a specific layer. Pair us with the rest.

The agentic AI stack has many tools doing different jobs. Cockpit is the performance layer — the measurement and review system. It doesn't replace these; it sits alongside them.

Not an LLM observability tool.

If you want prompt-by-prompt traces, token counts, and latency graphs, use LangSmith, Langfuse, or Helicone — they live inside the model layer. Cockpit lives one layer up: measuring what each agent shipped, not what the model said. Pair them.

Not an agent orchestration framework.

Building your agents from scratch — tool calls, planning loops, retries — is a job for LangChain, AutoGen, or your own harness. Cockpit doesn’t run your agents. It records and reviews them once they’re running.

Not a project management replacement.

Linear, Notion, GitHub, Jira stay where they are. Cockpit makes connected feeds work for an AI workforce: bridge events attributed to the agent that did them, not collapsed under your OAuth user.

Not a governance or enforcement layer.

Guardrails, policy engines, and compliance tools (Cedar, prompt-injection filters, runtime sandboxes) decide what agents are allowed to do. Cockpit measures what they did. Different layer, different buyer, different decision. Pair them when you need both.

Architecture

The record between AI providers and the work tools.

AI providers make the agents. Productivity tools host the work. Cockpit captures what each agent did between them — across the connected tools they touch.

AI providers
Anthropic · OpenAI · Microsoft · Google
Where the agents come from

Cockpit
Accountability layer
Where every action is captured, attributed, and prepared for review

Productivity tools
Linear · Notion · GitHub · Jira · Confluence
Where the work happens

How agents wire up

One MCP config. Connected actions recorded.

Cockpit ships an MCP server. Any MCP-capable runtime — Claude Code, Cursor, Codex, Hermes, OpenClaw, custom harnesses — connects with one config block. Connected bridge calls from that point flow through Cockpit's record.

~/.claude/mcp.json
{
  "mcpServers": {
    "cockpit": {
      "type": "http",
      "url": "https://getcockpit.co/api/mcp",
      "headers": {
        "Authorization": "Bearer ck_agent_..."
      }
    }
  }
}

Each agent gets its own ck_agent_* key. Paste it into the runtime's MCP config, then reconnect.
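If you run several agents, one way to avoid hand-editing the block for each is to generate it. The sketch below reproduces the config shown above for a given key; the helper name `cockpit_mcp_config` and the prefix check are assumptions for illustration, and the key passed in the example is the same elided placeholder, not a real credential.

```python
import json

# URL taken from the config block above.
COCKPIT_MCP_URL = "https://getcockpit.co/api/mcp"


def cockpit_mcp_config(agent_key: str) -> dict:
    """Build the MCP config block for one agent (hypothetical helper)."""
    if not agent_key.startswith("ck_agent_"):
        raise ValueError("expected a per-agent key starting with ck_agent_")
    return {
        "mcpServers": {
            "cockpit": {
                "type": "http",
                "url": COCKPIT_MCP_URL,
                "headers": {"Authorization": f"Bearer {agent_key}"},
            }
        }
    }


# Placeholder key, exactly as elided in the sample config.
print(json.dumps(cockpit_mcp_config("ck_agent_..."), indent=2))
```

Where the generated block should be written depends on the runtime, so the sketch prints it rather than touching any file.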

Every action attributed

Linear bridge calls flow through Cockpit today. GitHub and Notion follow the same model in beta. Each event is stamped with the calling agent's ID, timestamped, and added to the per-agent record.
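As a rough mental model of that stamping step, the sketch below attaches an agent ID and a UTC timestamp to each bridge call and files it under that agent. The `BridgeEvent` fields, the agent IDs, and the in-memory store are hypothetical; the source does not describe Cockpit's actual event schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BridgeEvent:
    """One attributed bridge call (field names are illustrative assumptions)."""
    agent_id: str
    tool: str
    action: str
    recorded_at: datetime


def stamp(agent_id: str, tool: str, action: str, now: datetime = None) -> BridgeEvent:
    """Attach the calling agent's ID and a UTC timestamp to a bridge call."""
    return BridgeEvent(agent_id, tool, action, now or datetime.now(timezone.utc))


# Per-agent record: agent ID -> list of that agent's events, in arrival order.
records = defaultdict(list)


def record(event: BridgeEvent) -> None:
    records[event.agent_id].append(event)


record(stamp("atlas", "linear", "create_issue"))
record(stamp("nova", "linear", "comment_on_issue"))
record(stamp("atlas", "linear", "update_issue"))

for agent_id, events in records.items():
    print(agent_id, [event.action for event in events])
```

The point of the model is that attribution happens at write time, from the agent's own key, rather than being inferred later from a shared OAuth user.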

Manual burn now

Known spend can be logged against the agent record now. When provider billing is connected, the same agent key becomes the route to automatic per-agent burn.

Runtime self-attested

On handshake the agent reports its runtime (e.g. Claude Code · 1.2.3). Cockpit displays that on the agent file — no manual model picker to keep up to date.

The surfaces

The record, made tangible.

Six places where the per-agent record shows up day-to-day.

Agent file

Per-agent record: identity, agent type/model, runtime, recent activity, connected tools, tracked-since date, and review context. The review file for an AI worker.

Live activity feed

Connected events from the bridge, attributed by agent name instead of collapsed under one human OAuth account.

Burn Rate

Manual spend tracking today; per-agent provider cost attribution next. Keep the cost discussion attached to the agent record.

Spend review

Agents or subscriptions with spend and little recent activity. Useful today as a manual review surface; stronger when billing data is connected.
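A manual version of this check can be as simple as a threshold rule. The sketch below flags agents whose logged spend is high while their recent action count is low, using the April figures from the per-agent record earlier on the page. The thresholds, the `AgentSpend` type, and the function name are assumptions for illustration, not how Cockpit decides what to surface.

```python
from dataclasses import dataclass


@dataclass
class AgentSpend:
    name: str
    monthly_burn: float   # manually logged spend, USD
    recent_actions: int   # recorded actions in the review window


def needs_spend_review(agent: AgentSpend, min_burn: float = 100.0, max_actions: int = 75) -> bool:
    """Flag meaningful spend paired with little recent activity (thresholds are arbitrary)."""
    return agent.monthly_burn >= min_burn and agent.recent_actions <= max_actions


# Figures from the April 2026 per-agent record.
workforce = [
    AgentSpend("Atlas", 200.00, 241),
    AgentSpend("Nova", 20.00, 127),
    AgentSpend("Jax", 300.00, 89),
    AgentSpend("Astra", 1418.00, 67),
]

flagged = [agent.name for agent in workforce if needs_spend_review(agent)]
print(flagged)  # ['Astra']
```

Any rule like this only raises the question; the reviewed output on the agent file is what answers it.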

Connect panel

Per-agent bridge key, MCP setup, test ping. Where a profile becomes a wired-up worker.

Connection status

Three states (Connected / Pending / Cosmetic) on every agent surface. The live view fades agents that aren't actually wired.

Why the abstraction matters

Above the orchestration layer. Across every tool.

Cockpit doesn't replace your workspace, your agents, or your runtime. It records what they do, regardless of where.

Multi-agent, one OAuth

Atlas, Nova, Jax share your Linear OAuth. Without Cockpit they all show as you. With it, each gets its own attributed record.

Cross-tool unified record

Linear shows you Linear. Notion shows you Notion. Cockpit starts showing you Atlas across connected bridge events: Linear live, GitHub and Notion beta.

Bring your own everything

Your runtime, your agents, your hosting. Cockpit just sits above and records. If you leave, the record exports cleanly.

Bridge coverage

Linear live. GitHub and Notion beta.

Cockpit's bridge model works against tools with APIs and webhooks. We are showing the real rollout status plainly so the demo stays credible.

Linear

Live
  • Comment on issues
  • Create issues
  • Update issues
  • List teams
Notion

Beta
  • Comment on pages
  • Create pages
  • Search pages and databases
GitHub

Beta
  • Comment on issues
  • Comment on PRs
  • Create issues
  • List repos

Jira and Confluence next. Slack / Discord / Telegram are deliberately out of scope — comm apps already have native bot frameworks.

Stop guessing. Start measuring your AI workforce.

Private beta pricing for startup teams.