Tool call approval

Scenario. The model wants to call sensitive tools (send_email, delete_user). Before they run, we ask the user. The reply is “approve” (run it), “deny” (don’t run it; surface why), or nothing at all (treat as cancelled). Whatever the verdict, every function_call ends up with a matching function_call_output in history — the wire-protocol invariant every modern provider enforces.

The human-in-the-loop (HITL) integration is explicit stream composition. Approval helpers split calls into “run this” and “return this synthetic result”; the recipe then chooses how to combine Toolkit.executeAll with the rejected outputs. Nothing in the executor knows about the approval policy.

Two transport flavors

Same primitives, different approval planner:

  • HTTP (primary) — approvals arrive synchronously bundled with the next request. fromApprovalMap(predicate, approvals)(calls) returns { approved, rejected }; missing entries synthesize cancelled. Pure function. No queue, no router fiber. The simplest path.
  • Queue (enhancement) — long-lived channel (WebSocket / SSE). fromVerdictQueue(predicate, queue)(calls) returns safe calls up front, emits ApprovalRequested events, and exposes a decision stream for gated calls as verdicts arrive.

Pick HTTP if your transport is request-shaped. Pick the queue variant if you’ve got a persistent connection and want a streaming UI.

HTTP variant (primary)

import { fromApprovalMap, type ApprovalMapEntry } from "@effect-uai/core/Resolvers"
import { toFunctionCallOutput } from "@effect-uai/core/Outcome"
import * as Toolkit from "@effect-uai/core/Toolkit"
streamUntilComplete<State, ToolEvent>((state, turn) =>
  Effect.sync(() => {
    const calls = Turn.functionCalls(turn)
    // No requested tools means there is nothing to approve or execute.
    if (calls.length === 0) return stop
    const plan = fromApprovalMap(isSensitive, approvals)(calls)
    const events = Stream.merge(
      // Approved calls run; denied/missing approvals still become outputs.
      Toolkit.executeAll(allTools, plan.approved),
      Toolkit.outputEvents(plan.rejected),
    )
    return Toolkit.nextStateFrom(events, (results) =>
      Turn.appendTurn(state, turn, results.map(toFunctionCallOutput)),
    )
  }),
)

approvals is a ReadonlyMap<string, ApprovalMapEntry> keyed by call_id. Entries are either { decision: "approve" } or { decision: "deny", reason?: string }. A gated call without an entry is placed in plan.rejected as a cancelled result. Toolkit.outputEvents turns those results into Output events, and nextStateFrom keeps them in history so the next provider request stays well-formed.
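To make the split concrete, here is a minimal self-contained sketch of the semantics described above, in plain TypeScript. The types are simplified assumptions for illustration; this is not the @effect-uai/core implementation.

```typescript
// Simplified stand-ins for the library types (assumptions, not the real API).
type FunctionCall = { call_id: string; name: string; arguments: string }
type ApprovalMapEntry = { decision: "approve" } | { decision: "deny"; reason?: string }
type RejectedResult =
  | { call: FunctionCall; kind: "denied"; reason?: string }
  | { call: FunctionCall; kind: "cancelled" }

// Pure planner: non-sensitive calls always run; gated calls run only with
// an "approve" entry. "deny" and missing entries become synthetic results.
const fromApprovalMap =
  (
    isSensitive: (call: FunctionCall) => boolean,
    approvals: ReadonlyMap<string, ApprovalMapEntry>,
  ) =>
  (calls: ReadonlyArray<FunctionCall>) => {
    const approved: FunctionCall[] = []
    const rejected: RejectedResult[] = []
    for (const call of calls) {
      if (!isSensitive(call)) {
        approved.push(call)
        continue
      }
      const entry = approvals.get(call.call_id)
      if (entry === undefined) {
        // No verdict arrived with this request: synthesize a cancellation.
        rejected.push({ call, kind: "cancelled" })
      } else if (entry.decision === "approve") {
        approved.push(call)
      } else {
        rejected.push({ call, kind: "denied", reason: entry.reason })
      }
    }
    return { approved, rejected }
  }
```

Because the planner is a pure function of the calls and the map, it is trivial to unit-test without any transport in the loop.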

Reconciling history before the next request

If the previous request left orphan function_calls (user navigated away, server crashed, approvals timed out), the next request has to synthesize closure outputs before submitting. That’s an entry-point concern, not the recipe’s:

import {
  cancelAllPending,
  findUnansweredCalls,
} from "@effect-uai/core/HistoryCheck"
import { toFunctionCallOutput } from "@effect-uai/core/Outcome"

// In your HTTP route handler, before invoking httpConversation:
const stored = await store.load(req.sessionId)
const closures = cancelAllPending(stored, "user moved on")
const reconciledHistory = [
  ...stored,
  ...closures.map(toFunctionCallOutput),
  Items.userText(req.body.message),
]
return httpConversation(req.body.approvals, { history: reconciledHistory })

Use this whenever a checkpoint, a timeout, or a new user message could leave function_calls without matching outputs.
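The reconciliation step itself is small enough to sketch. The version below uses simplified history-item types and mirrors the names imported above, but it is an illustration of the idea, not the library's implementation:

```typescript
// Simplified history items (assumptions for illustration).
type HistoryItem =
  | { type: "function_call"; call_id: string; name: string }
  | { type: "function_call_output"; call_id: string; output: string }
  | { type: "message"; role: "user" | "assistant"; text: string }

// Calls that never received an output anywhere in the stored history.
const findUnansweredCalls = (history: ReadonlyArray<HistoryItem>) => {
  const answered = new Set(
    history.flatMap((item) =>
      item.type === "function_call_output" ? [item.call_id] : [],
    ),
  )
  return history.filter(
    (item): item is Extract<HistoryItem, { type: "function_call" }> =>
      item.type === "function_call" && !answered.has(item.call_id),
  )
}

// Synthesize a cancelled output for every orphan call, so the next
// provider request stays well-formed.
const cancelAllPending = (
  history: ReadonlyArray<HistoryItem>,
  reason: string,
): HistoryItem[] =>
  findUnansweredCalls(history).map((call) => ({
    type: "function_call_output",
    call_id: call.call_id,
    output: JSON.stringify({ kind: "cancelled", reason }),
  }))
```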

Queue variant (enhancement)

For long-lived connections where verdicts arrive over time. Safe calls start immediately. Gated calls emit ApprovalRequested, then each one waits until its specific verdict lands on the queue.

import { fromVerdictQueue } from "@effect-uai/core/Resolvers"
streamUntilComplete<State, ToolEvent>((state, turn) =>
  Effect.sync(() => {
    const calls = Turn.functionCalls(turn)
    // No requested tools means there is nothing to approve or execute.
    if (calls.length === 0) return stop
    // `Stream.unwrap` supplies the Scope that fromVerdictQueue's router
    // fiber lives in. The router stays alive as long as the consumer
    // pulls events.
    const events = Stream.unwrap(
      Effect.gen(function* () {
        const { approved, decisions, announce } = yield* fromVerdictQueue(
          isSensitive,
          verdicts,
        )(calls)
        return Stream.merge(
          announce,
          Stream.merge(
            // Safe calls start immediately; gated calls resume as verdicts arrive.
            Toolkit.executeAll(allTools, approved),
            decisions.pipe(
              Stream.flatMap((decision) =>
                decision._tag === "Approved"
                  ? Toolkit.executeAll(allTools, [decision.call])
                  : Stream.succeed(Toolkit.outputEvent(decision.result)),
              ),
            ),
          ),
        )
      }),
    )
    return Toolkit.nextStateFrom(events, (results) =>
      Turn.appendTurn(state, turn, results.map(toFunctionCallOutput)),
    )
  }),
)

The consumer side typically taps ApprovalRequested events, posts verdicts onto the same queue, and renders Output results as they arrive.
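The per-call gating itself can be sketched without Effect. Below is a hypothetical VerdictQueue in plain TypeScript whose names and shapes are assumptions for illustration, not the library's API: post is the consumer side (e.g. a WebSocket message handler), waitFor is what each gated call suspends on.

```typescript
type Verdict = { call_id: string; decision: "approve" | "deny"; reason?: string }

// Hypothetical per-call verdict router (plain Promises, not Effect fibers):
// each gated call suspends until the matching verdict is posted.
class VerdictQueue {
  private waiters = new Map<string, (verdict: Verdict) => void>()
  private early = new Map<string, Verdict>()

  // Consumer side: the UI posts a decision for one call_id.
  post(verdict: Verdict): void {
    const resume = this.waiters.get(verdict.call_id)
    if (resume) {
      this.waiters.delete(verdict.call_id)
      resume(verdict)
    } else {
      // Verdict arrived before anyone asked for it; hold it.
      this.early.set(verdict.call_id, verdict)
    }
  }

  // Executor side: wait for the verdict gating one specific call.
  waitFor(call_id: string): Promise<Verdict> {
    const held = this.early.get(call_id)
    if (held) {
      this.early.delete(call_id)
      return Promise.resolve(held)
    }
    return new Promise((resolve) => this.waiters.set(call_id, resolve))
  }
}
```

Keying waiters by call_id is what lets safe calls and already-decided calls proceed while undecided ones stay parked, regardless of the order verdicts arrive in.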

What you get in state.history

Whatever the path — approved and executed, denied, or cancelled — every gated call ends up with a FunctionCallOutput carrying a structured payload:

  • Approved + executed: the tool’s own structured return value.
  • Denied: { "kind": "denied", "reason": "..." }
  • Cancelled (HTTP): { "kind": "cancelled" }
  • Tool execution error: { "kind": "execution_error", "reason": "..." }

History stays well-formed; the model reads the synthesized outputs on the next turn and self-corrects if needed.
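The invariant being preserved is easy to state: a history is well-formed when every function_call is followed, somewhere later, by an output with the same call_id. A tiny checker over simplified item types (an illustration, not part of the library) makes it testable:

```typescript
// Simplified items (assumptions): only the call/output pairing matters here.
type Item =
  | { type: "function_call"; call_id: string }
  | { type: "function_call_output"; call_id: string; output: string }

// True when every function_call has a matching output later in history.
const isWellFormed = (history: ReadonlyArray<Item>): boolean =>
  history.every(
    (item, i) =>
      item.type !== "function_call" ||
      history
        .slice(i + 1)
        .some(
          (later) =>
            later.type === "function_call_output" &&
            later.call_id === item.call_id,
        ),
  )
```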

Run it

OPENAI_API_KEY=sk-... pnpm tsx recipes/tool-call-approval/run.ts

The runner drives the queue variant since it’s the more visual demo (simulated user posting verdicts after a delay). The HTTP variant is better exercised by tests or in a real HTTP handler.

The full source lives next to this README at index.ts.