# Claude Code stream-json: the output format that changes everything
`--output-format stream-json` is the third value most scripts leave off. It's also what turns `claude -p` from a batch script into something you can build a live dashboard against. Here's what it actually emits, how to consume it, and why it's the quiet load-bearing feature behind every background agent worth shipping.
## The three output formats
From the Claude Code docs, `--output-format` takes three values:

- `text` — plain prose, the default. Human-readable, machine-hostile.
- `json` — one envelope at the end with `result`, `session_id`, and metadata. Great for scripts that want "what was the answer" as a single value.
- `stream-json` — newline-delimited JSON. Each line is a standalone event, emitted as it happens. Great for anything that needs to know what Claude is doing, right now.
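As a quick sketch of consuming the batch envelope (the helper name is ours; `result` and `session_id` are the documented fields, everything else is ignored here):

```javascript
// `--output-format json` prints one JSON envelope when the run finishes.
// Field names below come from the docs; the helper itself is illustrative.
function parseEnvelope(stdout) {
  const envelope = JSON.parse(stdout);
  return {
    answer: envelope.result,        // "what was the answer", as a single value
    sessionId: envelope.session_id, // worth keeping for logs and correlation
  };
}
```

Spawn `claude -p "…" --output-format json`, buffer stdout to a string, and call `parseEnvelope` once the process exits.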
The first two are batch. The third is a stream. That is the entire article, but let's earn it.
## What stream-json actually emits
To get token-level streaming, you need three flags together:
```shell
claude -p "Explain recursion" \
  --output-format stream-json \
  --verbose \
  --include-partial-messages
```

What comes out on stdout is newline-delimited JSON. Each line is one event with, at minimum, a `type` field, often a `subtype`, and the payload. Events cover everything from session start, tool calls and tool results, partial assistant messages (token deltas), permission requests, and retries, through to the final result.
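To make the shape concrete, a few illustrative lines (heavily abridged; the `type`/`subtype` discriminators and the `event.delta` path match what the jq filter later in this article selects on, while the `…` values are placeholders):

```json
{"type":"system","subtype":"init","session_id":"…"}
{"type":"stream_event","event":{"delta":{"type":"text_delta","text":"Recursion is"}},"session_id":"…"}
{"type":"stream_event","event":{"delta":{"type":"text_delta","text":" when a function…"}},"session_id":"…"}
{"type":"result","result":"…","session_id":"…"}
```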
## A concrete event shape: `system/api_retry`
One of the most useful event shapes is the one Claude Code emits when an upstream API request fails and it's about to retry. From the docs, the shape is:
```json
{
  "type": "system",
  "subtype": "api_retry",
  "attempt": 1,
  "max_retries": 5,
  "retry_delay_ms": 2000,
  "error_status": 429,
  "error": "rate_limit",
  "uuid": "…",
  "session_id": "…"
}
```

The error categories are documented: `rate_limit`, `server_error`, `authentication_failed`, `billing_error`, `invalid_request`, `max_output_tokens`, `unknown`. If you're building any kind of supervisor over long-running Claude Code jobs, this one event alone is the difference between "my run just hung for 90 seconds" and "my run is waiting out a rate limit, here's the progress bar."
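As a minimal sketch of what a supervisor can do with that event (the function and message wording are ours; the field names are the documented ones):

```javascript
// Turn a system/api_retry event into a supervisor-friendly status line.
// Returns null for every other event so it can sit in a dispatch loop.
function describeRetry(ev) {
  if (ev.type !== "system" || ev.subtype !== "api_retry") return null;
  const eta = (ev.retry_delay_ms / 1000).toFixed(1);
  return `attempt ${ev.attempt}/${ev.max_retries}: ${ev.error} (HTTP ${ev.error_status}), retrying in ${eta}s`;
}
```

Feed it every parsed line; post the non-null results to your progress surface of choice.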
## The jq one-liner worth memorizing
The docs ship this jq filter, and it's the single most useful starter consumer for stream-json. It tails the stream and prints assistant text tokens as they arrive, nothing else:
```shell
claude -p "Write a poem" \
  --output-format stream-json --verbose --include-partial-messages \
  | jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'
```

The `-r` flag outputs raw strings; `-j` joins without newlines. Together they let tokens stream continuously to stdout the way a human would read them. Pipe that into `tee`, `ts`, a websocket, a Linear comment — whatever.
## A minimal Node consumer
If jq isn't your thing, here's a complete `stream-json` consumer in about 30 lines of Node. It handles line-buffering correctly (events can straddle chunk boundaries, which is the #1 bug in naive implementations) and dispatches on `type`:
```javascript
import { spawn } from "node:child_process";

const child = spawn("claude", [
  "-p", "Review this diff",
  "--output-format", "stream-json",
  "--verbose",
  "--include-partial-messages",
]);

let buf = "";
child.stdout.on("data", (chunk) => {
  buf += chunk.toString("utf8");
  let nl;
  // Events can straddle chunk boundaries: only parse complete lines.
  while ((nl = buf.indexOf("\n")) !== -1) {
    const line = buf.slice(0, nl);
    buf = buf.slice(nl + 1);
    if (!line.trim()) continue;
    let ev;
    try { ev = JSON.parse(line); } catch { continue; }
    onEvent(ev);
  }
});

child.on("exit", (code) => {
  // Flush a trailing line that arrived without a final newline.
  if (buf.trim()) { try { onEvent(JSON.parse(buf)); } catch {} }
  process.exit(code ?? 0);
});

function onEvent(ev) {
  switch (ev.type) {
    case "stream_event":
      if (ev.event?.delta?.type === "text_delta") {
        process.stdout.write(ev.event.delta.text);
      }
      break;
    case "system":
      if (ev.subtype === "api_retry") {
        console.error(`[retry ${ev.attempt}/${ev.max_retries}] ${ev.error} in ${ev.retry_delay_ms}ms`);
      }
      break;
    case "result":
      console.error(`[done] session=${ev.session_id}`);
      break;
  }
}
```

The whole pattern fits on a napkin. Buffer chunks, split on newline, parse each line as JSON, dispatch. That's it. The hardest bug here is forgetting to carry the partial line across chunks — the loop above handles it.
## Why this changes everything
Because once you have a typed stream of events, you can build things you can't build against batch output:
- Live progress in the issue tracker. Every tool call, every file read, every model turn streamed as an update on the Linear or GitHub issue that triggered the run. The agent narrates itself.
- Cancellation that actually works. If you can see a `stream_event` land, you can also decide to send `SIGINT` to the child process based on what you just saw. Batch output gives you pass/fail at the end; streams give you a kill switch.
- Progress bars in CI. A 40-turn run is ~3 minutes of silence in batch mode. In streaming mode it's a ticker tape — your GitHub Actions annotation can update every few seconds with the current tool call.
- Retry visibility. The `system/api_retry` event means you can tell the difference between "Claude is thinking" and "Anthropic's API is rate-limiting us, here's the ETA."
- Replay and audit. Every event carries a `session_id`. Log the stream and you can reconstruct exactly what the agent did, step by step, months later. This is the audit trail your security team keeps asking about.
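The cancellation and retry-visibility points compose into a small policy function. A sketch under stated assumptions: the stall threshold, the state shape, and the decision names are all invented here; only the `system`/`api_retry` fields and the `result` event type come from the docs.

```javascript
// Decide what a supervisor should do after each event, or after a quiet tick
// (call with ev = null from a timer). `state` carries the last event time and
// any retry delay the stream has promised us.
function superviseTick(state, ev, now, stallMs = 90_000) {
  if (ev) {
    state.lastEventAt = now;
    // A retry event is a promise of silence: extend the stall budget.
    if (ev.type === "system" && ev.subtype === "api_retry") {
      state.quietUntil = now + ev.retry_delay_ms;
    }
    if (ev.type === "result") return "done";
    return "running";
  }
  // No event this tick: distinguish "waiting out a rate limit" from "hung".
  if (state.quietUntil && now < state.quietUntil) return "waiting-on-retry";
  if (now - state.lastEventAt > stallMs) return "kill"; // SIGINT the child
  return "running";
}
```

Wire it up by calling it from `onEvent` and from a `setInterval` with `ev = null`; when it returns `"kill"`, send `child.kill("SIGINT")`.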
## The production angle
Everything above is the “you could build this” version. The interesting question is what happens when you actually do. The answer is: you've built one tenth of a background agent, and the rest of it is webhooks, OAuth, worktree isolation, two-way sync with the issue tracker, and a UI that renders the stream you're now buffering.
That's the whole shape of the Linear background agent loop. `stream-json` is the engine that makes the agent narratable. The rest is a lot of glue nobody enjoys writing.
## Takeaways
- `--output-format stream-json` emits newline-delimited JSON events as they happen. Always pair it with `--verbose` and `--include-partial-messages` if you want token-level streaming.
- Every event has a `type`; the interesting ones are `stream_event` (token deltas, tool calls) and `system` (with subtype `api_retry`, etc.).
- A correct consumer buffers bytes, splits on newline, parses per line. ~30 lines in Node.
- If you find yourself wanting progress bars, cancellation, replay, or live issue comments, you're describing a background agent. Consider buying the webhooks and OAuth instead of writing them.
Cyrus consumes `stream-json` for you. It streams agent activity back to Linear and GitHub issues in real time, with isolated git worktrees per run, rich interactions including approvals, and a zero-cost Community self-hosted plan. BYOK across Claude, Codex, Cursor, and Gemini.
Try Cyrus free →