§ 01 · The problem with "human in the loop"
Every serious agent deployment ends up with a human in the loop somewhere. The phrase has become a checkbox. It is treated as a safety property you either have or do not have. In production it is neither binary nor free.
A human in the loop has a cost, and the cost is latency plus attention. If the agent escalates everything, the human becomes the bottleneck the agent was supposed to remove. If the agent escalates nothing, the human is decoration. Most teams land in a worse place than either: the agent escalates an unpredictable mix of trivial and critical items, the human learns that most items are trivial, and they start approving without reading. Now you have a human in the loop who is functionally not in the loop. You have built a rubber stamp and called it governance.
The approval queue pattern exists to treat the human's attention as expensive on purpose, and to spend it only where it changes the outcome.
§ 02 · The core idea
An approval queue is a durable, ordered list of proposed actions that the agent has decided it should not execute on its own authority. The agent does the reasoning and the drafting. It stops at the action. A human approves, edits, or rejects. The decision and the human who made it are recorded.
Three properties make this a pattern rather than a button:
◆ It is durable. An approval item survives a crash, a restart, and the human going home for the weekend. It lives in a table, not in memory.
◆ It is ordered and bounded. Every item has a priority and an age. An item that sits unapproved past its SLA is itself an event: it triggers escalation, not silence.
◆ It is reversible at the boundary, not after. The human decides before the irreversible action, not after a notification that it already happened.
The unit of the queue is the proposed action, never the conversation. You are not asking a human to review a transcript. You are asking them to approve one specific, executable thing.
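The three properties can be sketched as a single table. This is a minimal illustration using Python's built-in sqlite3 module, not a reference implementation. The table name, the column names, the "lower number is more urgent" priority convention, and the helper functions are all assumptions made for the example. Editing an item before approval is left out to keep the sketch short.

```python
import sqlite3

# Hypothetical schema: one row per proposed action, never per conversation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS approval_items (
    id          INTEGER PRIMARY KEY,
    action      TEXT NOT NULL,
    priority    INTEGER NOT NULL,            -- assumed: lower number = more urgent
    created_at  REAL NOT NULL,               -- epoch seconds
    sla_seconds REAL NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    decided_by  TEXT,
    decided_at  REAL
)
"""

def open_queue(path=":memory:"):
    # Pass a file path in real use so items survive a crash or restart.
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def propose(conn, action, priority, sla_seconds, now):
    cur = conn.execute(
        "INSERT INTO approval_items (action, priority, created_at, sla_seconds) "
        "VALUES (?, ?, ?, ?)",
        (action, priority, now, sla_seconds),
    )
    conn.commit()
    return cur.lastrowid

def pending(conn):
    # Ordered: most urgent first, then oldest first.
    return conn.execute(
        "SELECT id, action FROM approval_items "
        "WHERE status = 'pending' ORDER BY priority, created_at"
    ).fetchall()

def decide(conn, item_id, status, decided_by, now):
    # Records the decision and who made it. Only a pending item can be decided,
    # so the action is gated before execution, not after.
    if status not in ("approved", "rejected"):
        raise ValueError("status must be 'approved' or 'rejected'")
    cur = conn.execute(
        "UPDATE approval_items SET status = ?, decided_by = ?, decided_at = ? "
        "WHERE id = ? AND status = 'pending'",
        (status, decided_by, now, item_id),
    )
    conn.commit()
    return cur.rowcount == 1

def breached(conn, now):
    # Pending items older than their SLA. The caller escalates these to a person.
    rows = conn.execute(
        "SELECT id FROM approval_items "
        "WHERE status = 'pending' AND ? - created_at > sla_seconds "
        "ORDER BY priority, created_at",
        (now,),
    ).fetchall()
    return [row[0] for row in rows]
```

The caller passes the clock in as `now` so the SLA logic can be exercised without waiting. Nothing in the sketch auto-approves: `breached` only reports, and what happens next is an escalation to a person.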
§ 03 · The four fields every approval item must carry
An approval item that contains only "the agent wants to do X, approve?" forces the human to reconstruct context they do not have. They will either over-investigate (slow) or approve blind (dangerous). Every item must carry four fields.
◆ The action. The exact, executable operation, in plain language and in its concrete form. Not "send outreach." Instead: "Send this email, shown in full, to this address." The human approves the artifact, not a description of it.
◆ The justification. Why the agent proposes this now. The trigger, the rule, the data that led here. One or two sentences. If the agent cannot state why, that is itself a reason to reject.
◆ The blast radius. What this action touches and how hard it is to undo. "One outbound email, not recallable" is a different decision than "one row update, fully reversible." The human is pricing risk. Give them the price.
◆ The confidence and the alternative. How sure the agent is, and what it would do if this were rejected. A low-confidence proposal with an obvious fallback is a fast approve or fast reject. A high-confidence proposal with no fallback deserves the human's full attention.
These four fields turn a vague ask into a decision a busy person can make in seconds without becoming a rubber stamp.
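One way to enforce the four fields is to make them the shape of the item itself and reject anything incomplete before it reaches a human. The sketch below is an assumption about how that could look. The class name, the field names, the separate `reversible` flag, and the 0-to-1 confidence scale are illustrative choices, not part of the pattern.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ApprovalItem:
    action: str          # the exact artifact, e.g. the full email and its recipient
    justification: str   # trigger, rule, or data that led here
    blast_radius: str    # what it touches, in words
    reversible: bool     # assumed extra flag: can the action be undone?
    confidence: float    # assumed scale: 0.0 to 1.0
    alternative: str     # what the agent does if this is rejected

def missing_fields(item):
    """Return the names of fields that are empty or out of range."""
    problems = []
    for name in ("action", "justification", "blast_radius", "alternative"):
        if not getattr(item, name).strip():
            problems.append(name)
    if not 0.0 <= item.confidence <= 1.0:
        problems.append("confidence")
    return problems

item = ApprovalItem(
    action="Send this email, shown in full, to this address.",
    justification="",
    blast_radius="One outbound email, not recallable",
    reversible=False,
    confidence=0.9,
    alternative="Hold the draft and retry after the next data refresh.",
)
print(missing_fields(item))  # the empty justification is reported
```

A check like this belongs on the agent's side of the queue, so that a thin proposal is bounced back as a defect rather than handed to an operator.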
§ 04 · What goes in the queue, and what does not
The discipline is in what you do not route. An approval queue that contains everything is a denial-of-service attack on your own operators.
Route to the queue when at least one of the following is true: the action is irreversible or expensive to undo, the action is externally visible (a customer sees it, money moves, a record leaves your system), or the agent's confidence is below a threshold you set per action class.
Do not route when the action is internal, reversible, and inside the agent's defined scope. Reading data, drafting, summarizing, updating a record the agent owns and can roll back: these are inside the boundary. If you route them, you train your operators to stop reading, and the one item that mattered slips through behind forty that did not.
The threshold is a dial, not a default. Set it per action class. Outbound customer communication might require approval at any confidence. An internal tag update might require approval only below 70 percent. Write the thresholds down. They are part of the system, not a runtime guess.
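Written down, the per-class thresholds amount to a small lookup table. The following is a sketch under assumptions: the action class names, the `RoutingRule` type, and the choice to route unknown action classes to the queue are all hypothetical. The 0.70 threshold mirrors the internal tag update example above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RoutingRule:
    mode: str                          # "always", "never", or "below"
    threshold: Optional[float] = None  # only used when mode == "below"

# Hypothetical action classes. In practice this table lives in version control.
RULES = {
    "outbound_customer_email": RoutingRule("always"),
    "internal_tag_update": RoutingRule("below", 0.70),
    "draft_summary": RoutingRule("never"),
}

def needs_approval(action_class, confidence, rules=RULES):
    rule = rules.get(action_class)
    if rule is None:
        # Assumption: an action class nobody wrote a rule for fails closed.
        return True
    if rule.mode == "always":
        return True
    if rule.mode == "never":
        return False
    if rule.mode == "below":
        return confidence < rule.threshold
    raise ValueError(f"unknown routing mode: {rule.mode}")
```

Failing closed on unknown classes is a design choice for the sketch: a new action type reaches a human until someone decides, in writing, that it should not.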
§ 05 · The failure modes
Rubber stamp. The queue fills with low-stakes items, the operator approves in bulk, and the queue stops being a control. Fix: aggressively reduce what enters the queue, and measure approval time per item. If median approval time drops below the time it takes to read the action, your operators are not reading.
Queue as graveyard. Items pile up unapproved because no one owns the queue. The agent stalls or, worse, starts taking the unapproved actions because a timeout was wired to "proceed." Fix: every queue has an owner and an SLA, and the SLA breach escalates to a person, never to auto-approve.
Context starvation. Operators approve blind because the four fields are missing or thin. Fix: treat a proposal with a missing field or a weak justification as a defect in the agent, not a judgment call for the human.
Silent scope creep. The set of actions that bypass the queue grows over time, each addition reasonable, until the agent is doing things no one is reviewing. Fix: the routing rules live in version control and are reviewed on the same cadence as the trust boundary.
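The rubber stamp check can be turned into a metric. The sketch below compares median time-to-decision against an estimated median reading time. The reading speed of 4 words per second and the function names are assumptions chosen for illustration, and a word count is only a rough proxy for how long an action takes to read.

```python
from statistics import median

WORDS_PER_SECOND = 4.0  # assumed reading speed; tune for your operators

def min_read_seconds(action_text, words_per_second=WORDS_PER_SECOND):
    # Rough lower bound on the time needed to read the action once.
    return len(action_text.split()) / words_per_second

def looks_like_rubber_stamp(decisions):
    """decisions: list of (action_text, seconds_from_display_to_decision)."""
    if not decisions:
        return False
    median_decide = median(seconds for _, seconds in decisions)
    median_read = median(min_read_seconds(text) for text, _ in decisions)
    return median_decide < median_read
```

A true result is a signal to reduce what enters the queue. It is not, on its own, evidence against any individual operator.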
§ 06 · The contract behind the queue
Create an APPROVALS.md in the agent repository. It lists every action class the agent can propose, the routing rule for each (always, never, or below a confidence threshold), the SLA for each priority level, the escalation target when an SLA is breached, and the owner of the queue. It records the last review date.
This is the document your operations lead reads on day one and your auditor reads on the worst day. An approval queue without a written contract is not a control. It is a habit, and habits drift.
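Because the contract is a fixed set of fields, it can be checked mechanically, for example in CI. The sketch below assumes the contents of APPROVALS.md have already been parsed into a structure. The `Contract` type, its field names, the single escalation target, and the 90-day review window are all hypothetical simplifications.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Contract:
    routing: dict            # action class -> "always", "never", or a threshold rule
    sla_minutes: dict        # priority level -> minutes
    escalation_target: str   # simplification: one target for every SLA breach
    owner: str
    last_reviewed: date

def contract_problems(contract, known_action_classes, today, max_age_days=90):
    """Return human-readable gaps in the contract. An empty list means none found."""
    problems = []
    for cls in sorted(set(known_action_classes) - set(contract.routing)):
        problems.append(f"no routing rule for action class: {cls}")
    if not contract.sla_minutes:
        problems.append("no SLA defined")
    if not contract.owner.strip():
        problems.append("queue has no owner")
    if not contract.escalation_target.strip():
        problems.append("no escalation target")
    if (today - contract.last_reviewed).days > max_age_days:
        problems.append(f"last review is older than {max_age_days} days")
    return problems
```

Checking the contract against the list of action classes the agent can actually propose is one way to catch silent scope creep: an action class that exists in code but not in the contract shows up as a gap.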
── End of pattern ──
◆ A human in the loop is a cost. Spend it only where it changes the outcome.
◆ The unit of the queue is one executable action with four fields, never a transcript.
If your median approval time is shorter than the time to read the action, you do not have a human in the loop. You have a rubber stamp.
ORBIRESEARCH