Halt is the observability and intervention layer for production AI agents. See every step, kill bad runs in flight, and replay any execution down to the token.
Three primitives wired into your SDK. No proxies, no sidecars, no schema rewrites. Drop in one import and the rest comes with opinionated defaults.
Every prompt, tool call, retrieval, and token — laid out as a structured tree. Search by intent, not by log line.
Define guardrails in code or in the UI. When an execution crosses a line, Halt severs the run mid-flight, in under 80 ms.
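To make "guardrails in code" concrete, a guardrail can be pictured as a predicate checked before each agent step. The sketch below is plain Python and is not Halt's SDK API: `Guardrail`, `RunHalted`, `check_step`, and the shape of the step dictionary are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


class RunHalted(Exception):
    """Raised to stop an agent run when a guardrail is violated."""


@dataclass
class Guardrail:
    name: str
    violated: Callable[[dict], bool]  # returns True when the step crosses the line


def check_step(step: dict, guardrails: list[Guardrail]) -> None:
    """Raise RunHalted on the first guardrail that the step violates."""
    for g in guardrails:
        if g.violated(step):
            raise RunHalted(f"{g.name} tripped on step {step.get('tool')!r}")


# Hypothetical rule: refunds above 500 are never allowed.
refund_cap = Guardrail(
    name="refund-cap",
    violated=lambda s: s.get("tool") == "refund" and s.get("amount", 0) > 500,
)

check_step({"tool": "refund", "amount": 40}, [refund_cap])  # passes silently
```

Raising an exception at the check is one simple way to model a run being cut off before the offending step executes.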
Re-run any past trace against new prompts, models, or tools. Diff the outputs side-by-side. Ship the change with proof.
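The side-by-side diff idea can be approximated with Python's standard `difflib`. This is only a sketch of comparing two outputs, not how Halt computes or renders its diffs; `diff_outputs` and the sample strings are made up for the example.

```python
import difflib


def diff_outputs(baseline: str, candidate: str) -> list[str]:
    """Return a unified diff between two agent outputs, line by line."""
    return list(
        difflib.unified_diff(
            baseline.splitlines(),
            candidate.splitlines(),
            fromfile="baseline",
            tofile="candidate",
            lineterm="",
        )
    )


# Hypothetical outputs from the original run and a replay with a new prompt.
original = "Refund issued for order 8214.\nAmount: $40"
replayed = "Refund issued for order 8214.\nAmount: $40\nConfirmation email sent."

for line in diff_outputs(original, replayed):
    print(line)
```

Identical outputs produce an empty diff, which makes "no behavioral change" easy to assert in a test.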
A structured filter language that maps 1:1 to SQL. Save common slices as views — your team sees the same picture you do.
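As a rough picture of what a 1:1 mapping to SQL could look like, the sketch below compiles simple `(field, op, value)` filters into a parameterized `WHERE` clause. The filter shape, the column names, and `to_sql` are assumptions for illustration; Halt's actual filter syntax is not shown in this page.

```python
ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}
ALLOWED_FIELDS = {"agent", "model", "cost_usd", "status"}  # hypothetical columns


def to_sql(filters: list[tuple[str, str, object]]) -> tuple[str, list[object]]:
    """Compile (field, op, value) filters into a parameterized WHERE clause."""
    clauses, params = [], []
    for field, op, value in filters:
        # Whitelist fields and operators so only values are user-controlled.
        if field not in ALLOWED_FIELDS or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported filter: {field} {op}")
        clauses.append(f"{field} {op} ?")
        params.append(value)
    return " AND ".join(clauses), params


where, params = to_sql([("agent", "=", "support.v3"), ("cost_usd", ">", 1.0)])
print(where)   # agent = ? AND cost_usd > ?
print(params)  # ['support.v3', 1.0]
```

Keeping values as bound parameters rather than interpolating them is what lets a saved view be shared safely across a team.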
Per-agent, per-customer, per-feature. Set hard caps and Halt enforces them at the SDK boundary.
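A hard cap enforced at the call boundary can be sketched as a small ledger that refuses any charge that would exceed the limit. This assumes the caps are spend limits in USD, which the text above does not state explicitly; `Budget`, `BudgetExceeded`, and the scope naming are hypothetical and not part of Halt's SDK.

```python
class BudgetExceeded(Exception):
    """Raised when a call would push a scope past its hard cap."""


class Budget:
    def __init__(self, caps_usd: dict[str, float]):
        self.caps_usd = caps_usd
        self.spent_usd: dict[str, float] = {}

    def charge(self, scope: str, cost_usd: float) -> None:
        """Record a cost, refusing it if the scope's cap would be exceeded."""
        spent = self.spent_usd.get(scope, 0.0)
        cap = self.caps_usd.get(scope, float("inf"))  # uncapped scopes always pass
        if spent + cost_usd > cap:
            raise BudgetExceeded(f"{scope}: {spent + cost_usd:.2f} > cap {cap:.2f}")
        self.spent_usd[scope] = spent + cost_usd


budget = Budget({"agent:support.v3": 5.00})
budget.charge("agent:support.v3", 3.25)  # accepted
```

Checking before recording means a rejected call never counts against the budget.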
Built-in evaluators for faithfulness, toxicity, and tool-correctness, or bring your own function.
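A custom evaluator can be as small as a function from a trace to a score. The sketch below assumes a trace is a dictionary with a `tool_calls` list; that shape, and the `tool_correctness` signature, are illustrative guesses rather than Halt's evaluator interface.

```python
def tool_correctness(trace: dict, expected_tool: str) -> float:
    """Score 1.0 if the trace's first tool call matches the expected tool, else 0.0."""
    calls = trace.get("tool_calls", [])
    if not calls:
        return 0.0
    return 1.0 if calls[0].get("name") == expected_tool else 0.0


# Hypothetical trace for the "refund order 8214" request.
trace = {"input": "refund order 8214", "tool_calls": [{"name": "issue_refund"}]}
print(tool_correctness(trace, "issue_refund"))
```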
Slack, PagerDuty, Linear, or a plain HTTP endpoint. Stream new guardrail events to the same place your humans live.
Halt wraps your model client. Anything that goes through it gets a trace, a guardrail check, and a kill switch — automatically.
# pip install halt-sdk
from halt import Halt
from openai import OpenAI

client = Halt.wrap(OpenAI(), project="support-prod")

# `tools` is your existing list of tool definitions.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "refund order 8214"}],
    tools=tools,
)
# every call now traced. guardrails enforced. cost tracked.
// npm i @halt/sdk
import { Halt } from "@halt/sdk";
import OpenAI from "openai";

const client = Halt.wrap(new OpenAI(), { project: "support-prod" });

// `tools` is your existing array of tool definitions.
const res = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "refund order 8214" }],
  tools,
});
# send a trace from any language
curl https://ingest.halt.dev/v1/traces \
  -H "Authorization: Bearer $HALT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent":"support.v3","model":"gpt-4o","input":"refund 8214"}'
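For readers curious how wrapping a model client can capture every call without changing call sites, here is a minimal sketch of the general proxy pattern in plain Python. It is not Halt's implementation: `TracingProxy`, the event fields, and the fake client classes are hypothetical stand-ins so the example runs without network access.

```python
import time
from typing import Any, Callable


class TracingProxy:
    """Wrap any object and record every method call made through it."""

    def __init__(self, target: Any, sink: list[dict], path: str = ""):
        self._target = target
        self._sink = sink
        self._path = path

    def __getattr__(self, name: str) -> Any:
        # Only invoked for names not found on the proxy itself.
        attr = getattr(self._target, name)
        path = f"{self._path}.{name}" if self._path else name
        if callable(attr):
            return self._traced(attr, path)
        if hasattr(attr, "__dict__"):
            # Nested namespace such as client.chat.completions: keep wrapping.
            return TracingProxy(attr, self._sink, path)
        return attr

    def _traced(self, fn: Callable[..., Any], path: str) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            self._sink.append(
                {"call": path, "kwargs": kwargs, "seconds": time.perf_counter() - start}
            )
            return result

        return wrapper


# Stand-in client with the same attribute nesting as the quickstart call.
class FakeCompletions:
    def create(self, **kwargs: Any) -> dict:
        return {"content": "ok", "model": kwargs.get("model")}


class FakeChat:
    completions = FakeCompletions()


class FakeClient:
    chat = FakeChat()


events: list[dict] = []
client = TracingProxy(FakeClient(), events)
client.chat.completions.create(model="gpt-4o")
print(events[0]["call"])  # chat.completions.create
```

Because the proxy forwards attribute access, existing code keeps calling `client.chat.completions.create(...)` unchanged while each call is recorded on the way through.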
Traditional observability tells you something broke after the fact. By then your agent has refunded the wrong customer, looped through $40 of tokens, or written to prod.
No per-event billing. No retention games. Halt's guardrails and replay are included on every tier — because killing a bad run shouldn't cost extra.
Drop in the SDK, watch traces stream in. If the first guardrail saves you a postmortem, the rest of the year is free.