_private/qwestly-docs/Features/qwestly-agent/human-in-the-loop.md
Human-in-the-Loop (HITL) Patterns
How to design an agentic system where some actions require human approval before executing. This document covers where the gating logic lives, how frameworks support it, and what this means for Qwestly's "update my profile" vs "tell me a fact" distinction.
1. The Core Question
Not all agent actions are equal:
- "What do you know about me?" → Read-only. No human needed.
- "Generate a draft About section" → Creative. Show result, no approval needed.
- "Update my LinkedIn headline" → Write. **Must** get human approval first.
- "Save this as my active card" → Write. Probably needs confirmation.
- "Delete my profile data" → Destructive. **Must** get confirmation.
The question is: where does this gating logic live?
The naive answer is "in the tool definition" — but that's only part of the story. The full answer spans tool definitions, the agent loop, the conversation state, and the UI.
2. Three Levels of Human-in-the-Loop
Level 1: Tool-Level Gating (The Tool Refuses)
The tool itself checks whether it has user confirmation before executing.
@tool
async def update_linkedin_headline(user_id: str, headline: str) -> UpdateResult:
    """Update the user's LinkedIn headline.

    IMPORTANT: This tool must only be called AFTER the user has explicitly
    confirmed the new headline. If called without confirmation, return an error
    asking the agent to first show the proposed change and get approval.
    """
    # Only the unconfirmed path is shown here. Once confirmation is detected
    # (section 4 covers how to track it), the tool performs the update instead.
    return UpdateResult(
        status="requires_confirmation",
        preview=headline,
        message="Please confirm you want to update your headline to this."
    )
How it works in the agent loop:
1. User: "Update my headline to 'CTO at Qwestly'"
2. Agent calls update_linkedin_headline → returns requires_confirmation
3. Agent sees the result, asks user: "I'd suggest changing your headline to
'CTO at Qwestly'. Should I go ahead?"
4. User: "Yes, do it"
5. Agent calls update_linkedin_headline again → this time it has
confirmation context → tool executes
Pros: Simple. Self-documenting (the tool description tells the LLM the rule).
Cons: Relies on the LLM following instructions correctly. If the LLM ignores the "must confirm first" instruction and calls the tool anyway, the tool returns an error, but you've already paid for the LLM call that made the wrong choice.
Level 2: Orchestrator-Level Gating (The Framework Pauses)
The agent framework has a built-in concept of "this action requires human approval" and can pause execution mid-stream, wait for user input, then resume.
This is LangGraph's interrupt pattern, where dangerous tool calls are routed to a node that calls interrupt():
from langgraph.graph import StateGraph
from langgraph.types import interrupt

# human_review pauses the graph and waits for user input.
# interrupt() only works when the graph is compiled with a checkpointer.
def human_review(state):
    decision = interrupt("User approval needed")
    return {"approved": decision}

graph = StateGraph(AgentState)
graph.add_node("call_llm", ...)
graph.add_node("execute_tool", ...)
graph.add_node("human_review", human_review)  # registered once; pauses here

# Before executing certain tools, route through human review
def route_after_llm(state):
    # Check every proposed tool call, not just the first one
    if any(call.name in DANGEROUS_TOOLS for call in state["tool_calls"]):
        return "human_review"
    return "execute_tool"

graph.add_conditional_edges("call_llm", route_after_llm, {
    "human_review": "human_review",
    "execute_tool": "execute_tool",
})
How it works:
1. User: "Update my headline to 'CTO at Qwestly'"
2. LLM decides to call update_linkedin_headline
3. Router sees "update_linkedin_headline" in the DANGEROUS_TOOLS list
4. Graph pauses at "human_review" node
5. User sees: "Proposed change: update headline to 'CTO at Qwestly'. Approve?"
6. User approves → graph resumes → tool executes
Pros: The LLM can still make the wrong choice, but the framework enforces the gate: no gated tool runs without human approval, regardless of what the LLM does. This is more reliable than Level 1.
Cons: More infrastructure. Requires state management (checkpointing the agent's state so it can resume after the pause).
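The pause-and-resume flow above can be sketched without any framework. This is a minimal illustration under stated assumptions, not LangGraph's API: `GatedExecutor`, `PendingApproval`, and the contents of `DANGEROUS_TOOLS` are hypothetical names, and the in-memory dict stands in for a real checkpoint store.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical set of tools that must never run without approval.
DANGEROUS_TOOLS = {"update_linkedin_headline"}


@dataclass
class PendingApproval:
    """A tool call that was paused and is waiting for a human decision."""
    approval_id: str
    tool_name: str
    args: dict[str, Any]


@dataclass
class GatedExecutor:
    """Runs tool calls, parking dangerous ones until a human decides."""
    tools: dict[str, Callable[..., Any]]
    pending: dict[str, PendingApproval] = field(default_factory=dict)

    def submit(self, tool_name: str, args: dict[str, Any]) -> Any:
        """Execute safe tools immediately; park dangerous ones."""
        if tool_name in DANGEROUS_TOOLS:
            approval = PendingApproval(uuid.uuid4().hex, tool_name, args)
            self.pending[approval.approval_id] = approval
            return approval  # the caller shows this to the user and waits
        return self.tools[tool_name](**args)

    def resolve(self, approval_id: str, approved: bool) -> Any:
        """Resume a parked call. Each approval can be used only once."""
        approval = self.pending.pop(approval_id)  # KeyError if unknown or reused
        if not approved:
            return {"status": "rejected", "tool": approval.tool_name}
        return self.tools[approval.tool_name](**approval.args)
```

The point of the design is that the LLM never holds the power to execute: it can only call `submit`, and only application code reacting to a real user decision calls `resolve`.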
Level 3: UI-Level Gating (The Application Enforces)
The UI intercepts the agent's proposed action and shows a confirmation dialog before anything is written.
LLM response comes back as structured data, not as final text:
{
  "type": "proposed_action",
  "action": "update_linkedin_headline",
  "params": {"headline": "CTO at Qwestly"},
  "preview": "I suggest changing your headline to 'CTO at Qwestly'"
}
The UI renders this as:
┌─────────────────────────────────────┐
│ 💡 I suggest updating your headline │
│ │
│ "CTO at Qwestly" │
│ │
│ [Approve] [Edit] [Deny] │
└─────────────────────────────────────┘
Only if the user clicks "Approve" does the actual write happen.
Pros: Hardest to bypass. The user sees every proposed mutation in a structured UI, not buried in prose.
Cons: More frontend work. The agent doesn't stream free-form text as easily; it needs to produce structured action objects that the UI can render.
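A rough sketch of the application side of this level follows. It assumes the message shape shown in the JSON above; `ProposedAction`, `parse_agent_message`, `handle_decision`, and the `handlers` mapping are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ProposedAction:
    action: str
    params: dict[str, Any]
    preview: str


def parse_agent_message(raw: str) -> ProposedAction | None:
    """Return a ProposedAction if the message is one, else None (plain text)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("type") != "proposed_action":
        return None
    return ProposedAction(data["action"], data["params"], data["preview"])


def handle_decision(
    proposal: ProposedAction,
    decision: str,
    handlers: dict[str, Callable[..., Any]],
    edited_params: dict[str, Any] | None = None,
) -> Any:
    """Perform the write only after the user clicked Approve (or Edit)."""
    if decision == "deny":
        return {"status": "denied"}
    if decision == "edit":
        if edited_params is None:
            raise ValueError("edit requires edited_params")
        params = edited_params
    elif decision == "approve":
        params = proposal.params
    else:
        raise ValueError(f"unknown decision: {decision!r}")
    # KeyError here means the agent proposed an action the app does not expose.
    handler = handlers[proposal.action]
    return handler(**params)
```

Because the write handlers live only in `handlers`, which the agent cannot call, a proposal that never receives a button click never turns into a mutation.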
3. Which Level for Which Action?
Not every action needs the same level of gate:
| Action | Level | Why |
|---|---|---|
| Read profile data | None (L0) | No side effects |
| Generate a draft card | None (L0) | Agent can create drafts freely; user reviews results |
| Suggest an About section | None (L0) | Same: drafts are free |
| Write to LinkedIn | Orchestrator (L2) | Side effect on an external platform. Must be gated. |
| Save card as "active" | Orchestrator (L2) | Changes user-facing state. Deserves a confirmation gate. |
| Delete profile data | Orchestrator (L2) or UI (L3) | Destructive. Higher bar. |
| Post to LinkedIn on user's behalf | UI (L3) | Highest stakes. Should require explicit structured approval, not just a "sure, go ahead" in chat. |
For Qwestly v0, Level 1 (tool-level) is the pragmatic start. The tool refuses unless the user has confirmed. It's simple, works within any framework, and is "good enough" for an MVP. Upgrade to Level 2 for write operations when the agent loop justifies the extra infrastructure.
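One way to keep this mapping out of prompts and in code is a small policy table. The sketch below is illustrative only: the action names are hypothetical identifiers for the write and read rows above, and both the "unknown actions get the strictest gate" default and the "a stricter approval satisfies a weaker requirement" rule are assumptions rather than anything the table prescribes.

```python
from enum import IntEnum


class Gate(IntEnum):
    NONE = 0          # L0: no approval needed
    TOOL = 1          # L1: the tool refuses until confirmed
    ORCHESTRATOR = 2  # L2: the framework pauses
    UI = 3            # L3: structured approval in the UI


# Hypothetical action identifiers mirroring rows of the table above.
GATE_POLICY = {
    "read_profile": Gate.NONE,
    "write_to_linkedin": Gate.ORCHESTRATOR,
    "save_card_as_active": Gate.ORCHESTRATOR,
    "delete_profile_data": Gate.ORCHESTRATOR,
    "post_to_linkedin": Gate.UI,
}


def required_gate(action: str) -> Gate:
    """Look up the gate for an action; unknown actions get the strictest gate."""
    return GATE_POLICY.get(action, Gate.UI)


def approval_is_sufficient(action: str, approved_at: Gate) -> bool:
    """Assumes a stricter approval also satisfies a weaker requirement."""
    return approved_at >= required_gate(action)
```

Defaulting to the strictest gate means a newly added tool is gated until someone deliberately classifies it.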
4. Confirmation Tracking in Practice
The trick: how does the tool know the user has confirmed? You need to track confirmation state across turns.
Pattern: Conversation Context
@tool
async def update_linkedin_headline(ctx: RunContext, user_id: str, headline: str) -> UpdateResult:
    """Update LinkedIn headline. Requires user confirmation first."""
    # Check if user already confirmed in this conversation
    if ctx.messages and _has_recent_confirmation(ctx.messages, "headline", headline):
        # User confirmed: execute
        await linkedin_api.update_headline(user_id, headline)
        return UpdateResult(status="done", new_headline=headline)
    else:
        # No confirmation: ask
        return UpdateResult(
            status="needs_confirmation",
            preview=headline,
            prompt="I can update your headline. Say 'yes' or 'confirm' to proceed."
        )
The agent sees the needs_confirmation result, asks the user, and the next turn includes the user's confirmation in the conversation history. When the agent calls the tool again, the confirmation context is in the messages.
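The `_has_recent_confirmation` helper is not defined in this document. One possible shape is sketched below, assuming messages are plain dicts with `role` and `content` keys and that the caller passes the history up to and including the latest user turn. The affirmative-word list is an arbitrary illustration and would need tuning.

```python
import re

# Hypothetical list of replies treated as approval.
AFFIRMATIVE = re.compile(
    r"^\s*(yes|yep|confirm|confirmed|go ahead|do it|approve)\b",
    re.IGNORECASE,
)


def has_recent_confirmation(messages: list[dict], field_name: str, value: str) -> bool:
    """True only if the latest user turn approves the exact value just proposed.

    Requires that the assistant message immediately before the last user
    message mentioned the field and quoted the exact value, so a "yes" given
    for a different proposal does not count.
    """
    if len(messages) < 2:
        return False
    prior, last = messages[-2], messages[-1]
    if prior["role"] != "assistant" or last["role"] != "user":
        return False
    proposed = (
        field_name.lower() in prior["content"].lower()
        and value in prior["content"]
    )
    return proposed and bool(AFFIRMATIVE.match(last["content"]))
```

Binding the check to the exact proposed value matters: if the agent changes the headline between the proposal and the second tool call, the stale "yes" no longer authorizes it.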
Pattern: Explicit Confirmation Parameter
@tool
async def update_linkedin_headline(
    ctx: RunContext,
    user_id: str,
    headline: str,
    user_confirmed: bool = False,  # <-- explicit gate
) -> UpdateResult:
    """Update LinkedIn headline. Set user_confirmed=True only after the user
    has explicitly agreed to the proposed headline."""
    if not user_confirmed:
        return UpdateResult(
            status="needs_confirmation",
            preview=headline,
        )
    # Execute
    ...
This makes the confirmation requirement visible in the tool's schema. The LLM literally has to pass user_confirmed=True. The tool description tells the LLM when to set it.
Caveat: An LLM can still set user_confirmed=True without actual confirmation if it's confused. This is why Level 2 (orchestrator-enforced gating) is more robust for high-stakes actions.
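One way to narrow that gap without a full orchestrator is to replace the boolean with a token the LLM cannot forge. The sketch below is an assumption-laden illustration, not part of the design above: `issue_token` is meant to be called only by application code when the user actually approves, and `verify_token` by the tool. Both names, and the per-process `SECRET_KEY`, are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: a per-process secret. A real deployment would load this from config.
SECRET_KEY = secrets.token_bytes(32)


def _digest(tool_name: str, args: dict) -> str:
    """HMAC over the tool name and its exact arguments."""
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True).encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(tool_name: str, args: dict) -> str:
    """Called by application code when the user approves; never exposed as a tool."""
    return _digest(tool_name, args)


def verify_token(tool_name: str, args: dict, token: str) -> bool:
    """Called inside the tool: the token must match this tool and these arguments."""
    expected = _digest(tool_name, args)
    return hmac.compare_digest(expected.encode(), token.encode())
```

A token issued for one headline does not validate a different one, so a confused model passing a made-up value fails the check. This sketch does not prevent replay of a valid token; a real implementation would also track single use or expiry.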
5. Framework-Specific Support
LangGraph — First-Class HITL
LangGraph has the most mature HITL support:
from langgraph.types import interrupt

# Route dangerous tool calls through the approval node
def should_approve(state):
    for call in state["tool_calls"]:
        if call["name"] in REQUIRES_APPROVAL:
            return "human_approval"
    return "execute_tool"

# interrupt() pauses execution and waits. Its return value is whatever the
# caller resumes the graph with (assumed here to be a bool).
def human_approval(state):
    approved = interrupt({
        "action": "approve_tool_calls",
        "tool_calls": state["tool_calls"],
    })
    return {"approved": approved}

# Register the approval node once
graph.add_node("human_approval", human_approval)

# Resume with user's decision
graph.add_conditional_edges(
    "human_approval",
    lambda state: "execute_tool" if state["approved"] else "reject",
)
Key features:
- interrupt() pauses and persists the entire graph state (checkpointing)
- Resume with new input: the graph picks up where it paused
- Works with LangSmith for a built-in approval UI
- Can interrupt before a node (interrupt_before=["execute_dangerous_tool"])
Pydantic AI — No Built-In HITL, But Easy to Build
Pydantic AI doesn't have a built-in interrupt primitive. Instead, you use the Level 1 pattern, where the tool returns a structured result that says "needs approval":
from typing import Literal

from pydantic import BaseModel

class UpdateHeadlineResult(BaseModel):
    status: Literal["done", "needs_confirmation"]
    preview: str | None = None
    confirmation_token: str | None = None

@agent.tool
async def update_linkedin_headline(
    ctx: RunContext,
    user_id: str,
    headline: str,
) -> UpdateHeadlineResult:
    if not _user_has_confirmed(ctx.messages, "headline", headline):
        return UpdateHeadlineResult(
            status="needs_confirmation",
            preview=headline,
        )
    # ... execute
    return UpdateHeadlineResult(status="done", ...)
You could build Level 2 on top by wrapping the agent run in a loop that checks for confirmation markers and pauses, but it's manual.
OpenAI Agents SDK — Guardrails
The SDK has output guardrails that run after the LLM responds but before the response goes to the user:
from agents import Agent, GuardrailFunctionOutput, output_guardrail

@output_guardrail
async def linkedin_guardrail(ctx, agent, output):
    """Trip when the agent's final output proposes a LinkedIn write."""
    if hasattr(output, 'tool_calls'):
        for call in output.tool_calls:
            if call.name in DANGEROUS_TOOLS:
                # GuardrailFunctionOutput carries output_info, not a message field
                return GuardrailFunctionOutput(
                    output_info="This requires human approval. Ask the user first.",
                    tripwire_triggered=True,
                )
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

agent = Agent(
    name="Orchestrator",
    instructions=...,
    tools=[...],
    output_guardrails=[linkedin_guardrail],
)
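This is a hybrid of Level 1 and 2: the check runs in application code, not in a persisted state machine. Note that output guardrails inspect the agent's final output, so any tool calls made during the run have already executed by the time the guardrail sees it. For the guardrail to act as a gate, the agent has to return the proposed write as structured output instead of calling the write tool directly.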
6. Designing the Approval UX
How the user sees and acts on requests for approval matters as much as the backend logic.
Pattern A: Inline Chat Confirmation (v0, simplest)
User: Update my headline to "CTO at Qwestly"
Agent: I can do that. Here's the proposed change:
Headline: "CTO at Qwestly"
Should I go ahead? (yes/no)
User: Yes, do it
Agent: ✅ Done! Your headline has been updated.
Implementation: Simple. The tool returns needs_confirmation, agent asks in plain text, user responds in chat.
Pros: Zero frontend work. Works with any chat UI.
Cons: Easy for the user to miss or misread. No visual distinction between "info" and "approval required."
Pattern B: Structured Action Card (better UX)
┌─────────────────────────────────────────┐
│ ✏️ Profile Change Requested │
│ │
│ Field: Headline │
│ Current: "Engineer at startup" │
│ Proposed: "CTO at Qwestly" │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Approve │ │ Edit │ │ Deny │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────┘
Implementation: The agent returns a structured JSON block (not plain text). The frontend detects it and renders a card instead of raw text.
Pros: Clear visual distinction. Hard to accidentally approve. Buttons prevent ambiguity.
Cons: More frontend work. You need your chat UI to support custom renderers for structured content.
Pattern D: Draft-Review-Approve (the rich review UX)
This pattern is closer to a code review or document editing experience than a simple yes/no gate. It's what Qwestly needs for meaty artifacts like generated cards, About sections, or full profile rewrites.
The flow looks like this:
User: "Write me a new LinkedIn About section"
Agent generates draft ──────────────────┐
▼
┌──────────────────────────────────────┐
│ Draft: About Section │
│ │
│ +---------------------------------+ │
│ | Current: | │
│ | "Software engineer building | │
│ | cool things at a startup..." | │
│ +---------------------------------+ │
│ │
│ +---------------------------------+ │
│ | Proposed: | │
│ | "CTO at Qwestly, where I | │
│ | lead the engineering org | │
│ | building AI-powered career | │
│ | tools that help professionals | │
│ | tell their stories..." | │
│ +---------------------------------+ │
│ │
│ [Approve] [Edit Draft] [Try Again] │
│ [ Start typing feedback... ] │
└──────────────────────────────────────┘
User clicks "Edit Draft" or types "Make it more technical"
Agent revises ─────────────────────────┐
▼
┌────────────────────────────────────┐
│ Draft v2: About Section (revised) │
│ │
│ (shows updated text with diff │
│ highlighting if possible) │
│ │
│ [Approve] [Edit Again] [Try Again] │
└────────────────────────────────────┘
User approves → agent saves/publishes
This is distinct from Patterns A-C in several ways:
| Aspect | Simple confirm (A-C) | Draft-review-approve (D) |
|---|---|---|
| What's shown | A single proposed value | Full rendered content with diff |
| User action | Approve / Deny | Approve, edit inline, request revision, reject |
| Iteration | None (one-shot) | Multiple rounds of revision |
| Diff | "Old → New" text | Side-by-side or track-changes for complex content |
| Applies to | Simple fields (headline, name) | Long-form content (cards, About sections, full profiles) |
What the frontend needs to support
This is the most involved pattern on the frontend side. The chat UI needs to handle:
1. Structured action blocks
The agent response contains a JSON block the frontend renders as a review card, not as raw text. The chat UI needs a custom renderer for action: "review_draft" messages.
2. Diff rendering
For iterative edits, show what changed, not just the new version. A simple approach is a two-column layout (current / proposed) or inline highlighting (green for additions, red for removals). For v0, even a simple "Version 1 → Version 2" text comparison works.
3. Inline feedback
Let the user type feedback directly into the review card ("Make the second paragraph shorter") rather than sending a separate chat message. This keeps the iteration context-bound.
4. Version history
Track which draft version the user is looking at. If the agent revises twice, the user should be able to compare v1, v2, and v3, not just see the latest.
5. Partial acceptance
For a Qwestly Card with 5 sections, the user might want to accept 3 sections, request changes on 1, and rewrite 1 themselves. This requires section-level granularity in the review UI.
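For the diff-rendering item, a v0 backend could hand the frontend a word-level diff computed with Python's standard `difflib` module. This is a minimal sketch: `render_diff` is a hypothetical helper name, and splitting on whitespace is a simplifying assumption that ignores punctuation and paragraph structure.

```python
import difflib


def render_diff(current: str, proposed: str) -> list[str]:
    """Word-level diff between two drafts.

    Each entry is prefixed with '- ' (removed), '+ ' (added), or
    '  ' (unchanged), which a frontend can map to red/green highlighting.
    """
    diff = difflib.ndiff(current.split(), proposed.split())
    # ndiff may emit '? ' hint lines for near-matches; they are not words.
    return [entry for entry in diff if not entry.startswith("?")]
```

Calling `render_diff("Engineer at startup", "CTO at Qwestly")` marks "Engineer" and "startup" as removed, "CTO" and "Qwestly" as added, and leaves "at" unchanged.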
The agent side
The agent needs to understand it's in a draft iteration loop:
@agent.tool
async def draft_about_section(
    ctx: RunContext,
    user_id: str,
    feedback: str | None = None,
    draft_version: int | None = None,
) -> DraftResult:
    """Generate or revise a LinkedIn About section draft.

    First call: generate initial draft (no feedback, no draft_version).
    Subsequent calls: revise based on user feedback (pass feedback + draft_version).
    Never publish without the user explicitly approving after seeing the draft.

    Returns:
        DraftResult with draft text, version number, and a requires_approval flag.
    """
    profile = await get_user_profile(user_id)
    if feedback:
        # Revision: incorporate user feedback
        draft = await revise_draft(profile, feedback, draft_version)
    else:
        # Initial generation
        draft = await generate_initial_draft(profile)
    return DraftResult(
        status="draft_ready",
        draft_text=draft,
        version=(draft_version or 0) + 1,
        requires_approval=True,
    )
Implementation complexity
| Feature | v0 effort | Approach |
|---|---|---|
| Agent returns structured draft blocks | Low | Return JSON via structured output |
| Frontend renders review card | Medium | Custom chat message renderer |
| Diff view (current vs proposed) | Medium | Simple text diff library (diff-match-patch) |
| Inline feedback form | Medium | Text input embedded in the review card |
| Iteration loop (revise → re-render) | Medium | Agent maintains draft_version counter |
| Side-by-side / section-level review | High | Requires section-aware diff in the card format |
Recommendation: Build the inline chat flow for v0 (Pattern A, which is phase 1 of the two-phase Pattern C below), and graduate to Pattern D when users consistently want to iterate on generated content. The backend pattern supports both: the agent returns structured draft data regardless of whether the frontend renders it as text or as a card.
How this connects to versioning and evals
Every draft iteration is a data point. Store them (final_outcome is one of "approved", "rejected", or "abandoned"):
{
  "draft_id": "draft_abc123",
  "user_id": "usr_42",
  "tool": "draft_about_section",
  "versions": [
    {"v": 1, "text": "...", "user_feedback": null},
    {"v": 2, "text": "...", "user_feedback": "Make it more technical"},
    {"v": 3, "text": "...", "user_feedback": "Add mention of AI"}
  ],
  "final_outcome": "approved",
  "approval_delay_seconds": 120
}
This data lets you:
- See which drafts users approve vs abandon (quality signal)
- See how many revision rounds a typical draft needs (UX improvement signal)
- Build an eval dataset from approved drafts (ground truth for regression testing)
- Identify when the agent consistently misses the mark on first drafts and needs prompt improvements
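The signals listed above can be computed directly from the stored records. The following sketch assumes records shaped like the JSON example in this section. `summarize_drafts` and `eval_examples` are hypothetical helper names, and counting revision rounds as "versions minus one" is one reasonable reading, not a fixed definition.

```python
from statistics import mean


def summarize_drafts(records: list[dict]) -> dict:
    """Approval rate over all drafts, and mean revision rounds for approved ones."""
    if not records:
        return {"approval_rate": 0.0, "mean_revision_rounds": 0.0}
    approved = [r for r in records if r["final_outcome"] == "approved"]
    # A draft approved at v1 took zero revision rounds.
    rounds = [len(r["versions"]) - 1 for r in approved]
    return {
        "approval_rate": len(approved) / len(records),
        "mean_revision_rounds": mean(rounds) if rounds else 0.0,
    }


def eval_examples(records: list[dict]) -> list[dict]:
    """Final text of each approved draft, usable as regression ground truth."""
    return [
        {"tool": r["tool"], "approved_text": r["versions"][-1]["text"]}
        for r in records
        if r["final_outcome"] == "approved" and r["versions"]
    ]
```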
Pattern C: Two-Phase (recommended for Qwestly)
Combine A and B — use inline chat for v0, upgrade to action cards later:
Phase 1 (v0): Agent says "Can I update your headline to X?" in plain text.
User types "yes" or "no".
Phase 2 (later): Same backend, but the agent response includes structured
action blocks that the UI renders as approval cards.
The backend pattern doesn't change between phases — the tool always requires confirmation. The only difference is whether the frontend renders the approval request as text or as a card.
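To make the "same backend, different rendering" point concrete, here is a small sketch. It assumes the tool result is a dict with `status`, `field`, and `preview` keys; the `field` key, the card shape, and the function names are hypothetical.

```python
def render_as_text(result: dict) -> str:
    """Phase 1: plain chat prompt."""
    return f'Can I update your {result["field"]} to "{result["preview"]}"? (yes/no)'


def render_as_card(result: dict) -> dict:
    """Phase 2: structured payload the chat UI can draw as an approval card."""
    return {
        "type": "approval_card",
        "field": result["field"],
        "proposed": result["preview"],
        "buttons": ["Approve", "Edit", "Deny"],
    }


def render(result: dict, supports_cards: bool):
    """Pick the presentation; the tool result itself is identical in both phases."""
    if result["status"] != "needs_confirmation":
        return result.get("message", "Done.")
    return render_as_card(result) if supports_cards else render_as_text(result)
```

Moving from phase 1 to phase 2 then amounts to flipping `supports_cards` once the frontend has a card renderer.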
7. The Qwestly Classification
Here's the full matrix of what should and shouldn't require human approval:
| Action | HITL? | Level | Rationale |
|---|---|---|---|
| Retrieve profile info | ❌ No | — | Read-only. Safe. |
| Answer "what do you know about me" | ❌ No | — | Synthesizing existing data. Safe. |
| Generate a card draft | ❌ No | — | Creates a draft artifact. User can discard. |
| Generate an About section draft | ❌ No | — | Same — draft, not published. |
| Suggest headline / photo / summary | ❌ No | — | Suggestions are drafts. |
| Ingest LinkedIn data | ✅ Yes | L1 (tool) | Pulls external data into the system. Should notify user. |
| Save card as "active" | ✅ Yes | L1 (tool) | Changes user-facing state. Confirm before making it "live." |
| Update LinkedIn profile | ✅ Yes | L2 (orchestrator minimum) | Publishes to an external platform. Must be gated. An ungated write means accepting legal/product risk. |
| Delete profile data | ✅ Yes | L2 or L3 | Destructive. Higher bar than writes. |
| Share data with third party | ✅ Yes | L3 (UI) | Sensitive. Should require explicit structured approval, not just a chat "yes." |
Draft vs. Published distinction
This is the key design principle for Qwestly:
- Drafts: Never need approval. The agent generates freely. The user reviews afterwards.
- Published / Saved / Shared: Always need approval. The agent proposes, the user disposes.
The tool description for a "publish" action should explicitly state this:
@tool
async def publish_linkedin_update(user_id: str, content: str) -> PublishResult:
    """POST content to LinkedIn. This is irreversible — it actually updates
    the user's LinkedIn profile.

    **CRITICAL**: Never call this tool unless the user has explicitly and
    unambiguously confirmed they want this content published. Always show
    the user the exact content first and wait for their confirmation.
    """
    ...
8. Summary: What to Build for Qwestly v0
| Component | Approach |
|---|---|
| Draft generation | No HITL. Agent generates freely. User can iterate. |
| Save/publish actions | Tool-level gating (L1). Tool returns "needs_confirmation" unless user has confirmed. |
| LinkedIn write actions | Tool-level gating + guardrail warning in the prompt. Upgrade to orchestrator-enforced when you add the state infrastructure. |
| Confirmation UX | Inline chat for v0. "Can I update X to Y? (yes/no)". Structured action cards later. |
| Who enforces the gate | The tool itself (v0). The framework (v1). |
The tool-level pattern is the pragmatic start for v0. It's simple, works with any framework (Pydantic AI, LangGraph, OpenAI SDK), and is "good enough" for an MVP. The risk is the LLM calling a dangerous tool without proper confirmation — mitigated by writing good tool descriptions and testing that the agent follows them. When you see that risk materialize in production, upgrade to orchestrator-level gating (Level 2) for those specific tools.
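"Testing that the agent follows them" can start as a check over recorded runs. The sketch below assumes each run is logged as an ordered list of events with a `type` of `user_confirmed` or `tool_executed`, plus `tool` and `args`. That log format, `GATED_TOOLS`, and `find_unconfirmed_executions` are all hypothetical.

```python
# Hypothetical set of tools whose side effects require prior confirmation.
GATED_TOOLS = frozenset({"update_linkedin_headline", "publish_linkedin_update"})


def find_unconfirmed_executions(events: list[dict]) -> list[dict]:
    """Return gated executions that no matching user confirmation preceded.

    A confirmation matches only the same tool with the same arguments,
    and each confirmation covers exactly one execution.
    """
    confirmed: list[tuple] = []
    violations: list[dict] = []
    for event in events:
        key = (event["tool"], event["args"])
        if event["type"] == "user_confirmed":
            confirmed.append(key)
        elif event["type"] == "tool_executed" and event["tool"] in GATED_TOOLS:
            if key in confirmed:
                confirmed.remove(key)  # consume the confirmation
            else:
                violations.append(event)
    return violations
```

Running this over eval transcripts or production logs gives a concrete count of how often the risk materializes, which is the signal for upgrading specific tools to Level 2.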