$ backgroundclaude
blog · 2026-04-10 · 8 min read

Claude Code stream-json: the output format that changes everything

--output-format stream-json is the third value most scripts leave off. It's also what turns claude -p from a batch script into something you can build a live dashboard against. Here's what it actually emits, how to consume it, and why it's the quiet load-bearing feature behind every background agent worth shipping.

The three output formats

From the Claude Code docs, --output-format takes three values:

text: the default, just the final response as plain text, printed once at the end
json: a single JSON object with the result and metadata, printed once at the end
stream-json: newline-delimited JSON events, printed as they happen

The first two are batch. The third is a stream. That is the entire article, but let's earn it.

What stream-json actually emits

To get token-level streaming, you need three flags together:

claude -p "Explain recursion" \
  --output-format stream-json \
  --verbose \
  --include-partial-messages

What comes out on stdout is newline-delimited JSON. Each line is one event with at minimum a type field, often a subtype, and the payload. Events cover everything from session start to the final result: tool calls and tool results, partial assistant messages (token deltas), permission requests, and retries.

A concrete event shape: system/api_retry

One of the most useful event shapes is the one Claude Code emits when an upstream API request fails and it's about to retry. From the docs, the shape is:

{
  "type": "system",
  "subtype": "api_retry",
  "attempt": 1,
  "max_retries": 5,
  "retry_delay_ms": 2000,
  "error_status": 429,
  "error": "rate_limit",
  "uuid": "…",
  "session_id": "…"
}

The error categories are documented: rate_limit, server_error, authentication_failed, billing_error, invalid_request, max_output_tokens, unknown. If you're building any kind of supervisor over long-running Claude Code jobs, this one event alone is the difference between “my run just hung for 90 seconds” and “my run is waiting out a rate limit, here's the progress bar.”

The jq one-liner worth memorizing

The docs ship this jq filter, and it's the single most useful starter consumer for stream-json. It tails the stream and prints assistant text tokens as they arrive, nothing else:

claude -p "Write a poem" \
  --output-format stream-json --verbose --include-partial-messages \
  | jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'

The -r outputs raw strings instead of JSON-quoted ones; -j additionally suppresses the newline jq would print after each output. Together they let tokens stream continuously to stdout the way a human would read them. Pipe that into tee, ts, a websocket, a Linear comment — whatever.

A minimal Node consumer

If jq isn't your thing, here's a complete stream-json consumer in about 30 lines of Node. It handles line-buffering correctly (events can straddle chunk boundaries, which is the #1 bug in naive implementations) and dispatches on type:

import { spawn } from "node:child_process";

const child = spawn("claude", [
  "-p", "Review this diff",
  "--output-format", "stream-json",
  "--verbose",
  "--include-partial-messages",
]);

let buf = "";
child.stdout.on("data", (chunk) => {
  buf += chunk.toString("utf8");
  let nl;
  while ((nl = buf.indexOf("\n")) !== -1) {
    const line = buf.slice(0, nl);
    buf = buf.slice(nl + 1);
    if (!line.trim()) continue;
    let ev;
    try { ev = JSON.parse(line); } catch { continue; }
    onEvent(ev);
  }
});

child.on("exit", (code) => {
  if (buf.trim()) { try { onEvent(JSON.parse(buf)); } catch {} }
  process.exit(code ?? 0);
});

function onEvent(ev) {
  switch (ev.type) {
    case "stream_event":
      if (ev.event?.delta?.type === "text_delta") {
        process.stdout.write(ev.event.delta.text);
      }
      break;
    case "system":
      if (ev.subtype === "api_retry") {
        console.error(`[retry ${ev.attempt}/${ev.max_retries}] ${ev.error} in ${ev.retry_delay_ms}ms`);
      }
      break;
    case "result":
      console.error(`[done] session=${ev.session_id}`);
      break;
  }
}

The whole pattern fits on a napkin. Buffer chunks, split on newline, parse each line as JSON, dispatch. That's it. The hardest bug here is forgetting to carry the partial line across chunks — the loop above handles it.

Why this changes everything

Because once you have a typed stream of events, you can build things you can't build against batch output: a live dashboard that renders tokens as they arrive, a supervisor that can tell a hung run from one waiting out a rate limit, a progress bar for retries, real-time agent activity streamed into an issue tracker.

The production angle

Everything above is the “you could build this” version. The interesting question is what happens when you actually do. The answer is: you've built one tenth of a background agent, and the rest of it is webhooks, OAuth, worktree isolation, two-way sync with the issue tracker, and a UI that renders the stream you're now buffering.

That's the whole shape of the Linear background agent loop. stream-json is the engine that makes the agent narratable. The rest is a lot of glue nobody enjoys writing.

Takeaways

Use --output-format stream-json together with --verbose and --include-partial-messages to get token-level streaming. What comes out is newline-delimited JSON: buffer chunks, split on newline, parse each line, dispatch on type. Watch system/api_retry so a rate-limit backoff never looks like a hang. Everything past that is glue.

The glue, already shipped

Cyrus consumes stream-json for you. It streams agent activity back to Linear and GitHub issues in real time, with isolated git worktrees per run, rich interactions including approvals, and a zero-cost Community self-hosted plan. BYOK across Claude, Codex, Cursor, and Gemini.

Try Cyrus free →