# Claude Code stream-json: the output format that changes everything
`--output-format stream-json` is the third value most scripts leave off. It's also what turns `claude -p` from a batch script into something you can build a live dashboard against. Here's what it actually emits, how to consume it, and why it's the quiet load-bearing feature behind every background agent worth shipping.
## The three output formats
From the Claude Code docs, `--output-format` takes three values:

- `text` — plain prose, the default. Human-readable, machine-hostile.
- `json` — one envelope at the end with `result`, `session_id`, and metadata. Great for scripts that want "what was the answer" as a single value.
- `stream-json` — newline-delimited JSON. Each line is a standalone event, emitted as it happens. Great for anything that needs to know what Claude is doing, right now.
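As a quick sketch of consuming the batch envelope (the helper name is ours; `result` and `session_id` are the documented fields, everything else is ignored here):

```javascript
// `--output-format json` prints one JSON envelope when the run finishes.
// Field names below come from the docs; the helper itself is illustrative.
function parseEnvelope(stdout) {
  const envelope = JSON.parse(stdout);
  return {
    answer: envelope.result,        // "what was the answer", as a single value
    sessionId: envelope.session_id, // worth keeping for logs and correlation
  };
}
```

Spawn `claude -p "…" --output-format json`, buffer stdout to a string, and call `parseEnvelope` once the process exits.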
The first two are batch. The third is a stream. That is the entire article, but let's earn it.
## What stream-json actually emits
To get token-level streaming, you need three flags together:
```shell
claude -p "Explain recursion" \
  --output-format stream-json \
  --verbose \
  --include-partial-messages
```

What comes out on stdout is newline-delimited JSON. Each line is one event with, at minimum, a `type` field, often a `subtype`, and the payload. Events cover everything from session start, tool calls and tool results, partial assistant messages (token deltas), permission requests, and retries, through to the final result.
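To make the shape concrete, a few illustrative lines (heavily abridged; the `type`/`subtype` discriminators and the `event.delta` path match what the jq filter later in this article selects on, while the `…` values are placeholders):

```json
{"type":"system","subtype":"init","session_id":"…"}
{"type":"stream_event","event":{"delta":{"type":"text_delta","text":"Recursion is"}},"session_id":"…"}
{"type":"stream_event","event":{"delta":{"type":"text_delta","text":" when a function…"}},"session_id":"…"}
{"type":"result","result":"…","session_id":"…"}
```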
## A concrete event shape: `system/api_retry`
One of the most useful event shapes is the one Claude Code emits when an upstream API request fails and it's about to retry. From the docs, the shape is:
```json
{
  "type": "system",
  "subtype": "api_retry",
  "attempt": 1,
  "max_retries": 5,
  "retry_delay_ms": 2000,
  "error_status": 429,
  "error": "rate_limit",
  "uuid": "…",
  "session_id": "…"
}
```

The error categories are documented: `rate_limit`, `server_error`, `authentication_failed`, `billing_error`, `invalid_request`, `max_output_tokens`, `unknown`. If you're building any kind of supervisor over long-running Claude Code jobs, this one event alone is the difference between "my run just hung for 90 seconds" and "my run is waiting out a rate limit, here's the progress bar."
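As a minimal sketch of what a supervisor can do with that event (the function and message wording are ours; the field names are the documented ones):

```javascript
// Turn a system/api_retry event into a supervisor-friendly status line.
// Returns null for every other event so it can sit in a dispatch loop.
function describeRetry(ev) {
  if (ev.type !== "system" || ev.subtype !== "api_retry") return null;
  const eta = (ev.retry_delay_ms / 1000).toFixed(1);
  return `attempt ${ev.attempt}/${ev.max_retries}: ${ev.error} (HTTP ${ev.error_status}), retrying in ${eta}s`;
}
```

Feed it every parsed line; post the non-null results to your progress surface of choice.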
## The jq one-liner worth memorizing
The docs ship this jq filter, and it's the single most useful starter consumer for stream-json. It tails the stream and prints assistant text tokens as they arrive, nothing else:
```shell
claude -p "Write a poem" \
  --output-format stream-json --verbose --include-partial-messages \
  | jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'
```

The `-r` flag outputs raw strings; `-j` joins without newlines. Together they let tokens stream continuously to stdout the way a human would read them. Pipe that into `tee`, `ts`, a websocket, a Linear comment — whatever.
## A minimal Node consumer
If jq isn't your thing, here's a complete `stream-json` consumer in about 30 lines of Node. It handles line-buffering correctly (events can straddle chunk boundaries, which is the #1 bug in naive implementations) and dispatches on `type`:
```javascript
import { spawn } from "node:child_process";

const child = spawn("claude", [
  "-p", "Review this diff",
  "--output-format", "stream-json",
  "--verbose",
  "--include-partial-messages",
]);

let buf = "";
child.stdout.on("data", (chunk) => {
  buf += chunk.toString("utf8");
  let nl;
  // Events can straddle chunk boundaries: only parse complete lines.
  while ((nl = buf.indexOf("\n")) !== -1) {
    const line = buf.slice(0, nl);
    buf = buf.slice(nl + 1);
    if (!line.trim()) continue;
    let ev;
    try { ev = JSON.parse(line); } catch { continue; }
    onEvent(ev);
  }
});

child.on("exit", (code) => {
  // Flush a trailing line that arrived without a final newline.
  if (buf.trim()) { try { onEvent(JSON.parse(buf)); } catch {} }
  process.exit(code ?? 0);
});

function onEvent(ev) {
  switch (ev.type) {
    case "stream_event":
      if (ev.event?.delta?.type === "text_delta") {
        process.stdout.write(ev.event.delta.text);
      }
      break;
    case "system":
      if (ev.subtype === "api_retry") {
        console.error(`[retry ${ev.attempt}/${ev.max_retries}] ${ev.error} in ${ev.retry_delay_ms}ms`);
      }
      break;
    case "result":
      console.error(`[done] session=${ev.session_id}`);
      break;
  }
}
```

The whole pattern fits on a napkin. Buffer chunks, split on newline, parse each line as JSON, dispatch. That's it. The hardest bug here is forgetting to carry the partial line across chunks — the loop above handles it.
## Why this changes everything
Because once you have a typed stream of events, you can build things you can't build against batch output:
- Live progress in the issue tracker. Every tool call, every file read, every model turn streamed as an update on the Linear or GitHub issue that triggered the run. The agent narrates itself.
- Cancellation that actually works. If you can see a `stream_event` land, you can also decide to send `SIGINT` to the child process based on what you just saw. Batch output gives you pass/fail at the end; streams give you a kill switch.
- Progress bars in CI. A 40-turn run is ~3 minutes of silence in batch mode. In streaming mode it's a ticker tape — your GitHub Actions annotation can update every few seconds with the current tool call.
- Retry visibility. The `system/api_retry` event means you can tell the difference between "Claude is thinking" and "Anthropic's API is rate-limiting us, here's the ETA."
- Replay and audit. Every event carries a `session_id`. Log the stream and you can reconstruct exactly what the agent did, step by step, months later. This is the audit trail your security team keeps asking about.
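The cancellation and retry-visibility points compose into a small policy function. A sketch under stated assumptions: the stall threshold, the state shape, and the decision names are all invented here; only the `system`/`api_retry` fields and the `result` event type come from the docs.

```javascript
// Decide what a supervisor should do after each event, or after a quiet tick
// (call with ev = null from a timer). `state` carries the last event time and
// any retry delay the stream has promised us.
function superviseTick(state, ev, now, stallMs = 90_000) {
  if (ev) {
    state.lastEventAt = now;
    // A retry event is a promise of silence: extend the stall budget.
    if (ev.type === "system" && ev.subtype === "api_retry") {
      state.quietUntil = now + ev.retry_delay_ms;
    }
    if (ev.type === "result") return "done";
    return "running";
  }
  // No event this tick: distinguish "waiting out a rate limit" from "hung".
  if (state.quietUntil && now < state.quietUntil) return "waiting-on-retry";
  if (now - state.lastEventAt > stallMs) return "kill"; // SIGINT the child
  return "running";
}
```

Wire it up by calling it from `onEvent` and from a `setInterval` with `ev = null`; when it returns `"kill"`, send `child.kill("SIGINT")`.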
## The production angle
Everything above is the “you could build this” version. The interesting question is what happens when you actually do. The answer is: you've built one tenth of a background agent, and the rest of it is webhooks, OAuth, worktree isolation, two-way sync with the issue tracker, and a UI that renders the stream you're now buffering.
That's the whole shape of the Linear background agent loop. `stream-json` is the engine that makes the agent narratable. The rest is a lot of glue nobody enjoys writing.
## Takeaways
- `--output-format stream-json` emits newline-delimited JSON events as they happen. Always pair it with `--verbose` and `--include-partial-messages` if you want token-level streaming.
- Every event has a `type`; the interesting ones are `stream_event` (token deltas, tool calls) and `system` (with subtype `api_retry`, etc.).
- A correct consumer buffers bytes, splits on newline, parses per line. ~30 lines in Node.
- If you find yourself wanting progress bars, cancellation, replay, or live issue comments, you're describing a background agent. Consider buying the webhooks and OAuth instead of writing them.
Cyrus consumes `stream-json` for you. It streams agent activity back to Linear and GitHub issues in real time, with isolated git worktrees per run, rich interactions including approvals, and a zero-cost Community self-hosted plan. BYOK across Claude, Codex, Cursor, and Gemini.
Try Cyrus free →