Hooks for safer Claude Code and Codex agents

Hooks for coding agents are the layer that turns a security recommendation into an executable boundary. The agent still reasons, reads files, calls tools and tries to self-correct. The difference is that sensitive commands pass through predictable gates before touching network, secrets, shell, MCP or critical files.

In the current documentation, Claude Code organizes hooks into three cadences: once per session, once per turn and on every tool call inside the agentic loop (Claude Code Docs, "Hooks reference", retrieved 2026-07-05). That detail changes the architecture: security stops being a request written in a prompt and becomes an interception point.

Practical summary

Use PreToolUse to deny risk before execution.

Use PostToolUse to collect evidence and correct course.

Use Stop to require proof before the final summary.

Treat MCP, setup and network as permission boundaries.

[Figure: abstract flow of a coding agent moving through gates, sandboxing and review.]

Why did hooks become a security layer?

In the current documentation, PreToolUse and PostToolUse run on every tool call inside the agentic loop (Claude Code Docs, "Hooks reference", retrieved 2026-07-05). Hooks became a security layer because the real risk appears when the agent tries to execute, not when it promises to follow instructions.

The practical problem is simple: strong agents follow context, setup notes, terminal errors and local documentation. That is excellent for productivity. It also creates a surface where untrusted content can push a dangerous action that looks like normal maintenance.

In June 2026, 0din published "Clone This Repo and I Own Your Machine", describing an attack where a normal-looking repository led an agent to open a reverse shell through setup, routine error handling and a payload fetched through DNS (0din, "Clone This Repo and I Own Your Machine", retrieved 2026-07-05). The point is not one specific tool. It is the risk class.

If you already built a harness for reliable agent PRs, hooks are the next natural step. The harness defines criteria. The hook enforces those criteria at the right moment.

Citation capsule: Hooks for coding agents reduce risk because they intercept action at execution time. Claude Code documentation separates hooks by session, turn and tool call, making it possible to deny commands before shell execution, record evidence after action and block loop completion without proof.

Where does the hook fit in the agentic loop?

In the current documentation, Claude Code's event table lists events such as SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, SubagentStart, Stop and SessionEnd (Claude Code Docs, "Hooks reference", retrieved 2026-07-05). The hook belongs where a decision changes cost, risk or evidence.

Think of the loop as four moments. First, the agent receives intent and context. Then it chooses a tool. Next, it observes the result. Finally, it decides whether to continue, correct itself or stop. You do not need hooks everywhere; you need hooks at the boundary where failure would be expensive.

PreToolUse is the permission gate. It can deny git push, unrestricted curl calls, npm install in an unknown repo, .env reads and state-changing cloud commands. That decision should be deterministic, short and auditable.
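
A minimal Python sketch of such a gate, under stated assumptions. The payload fields `tool_name` and `tool_input.command`, and the `hookSpecificOutput` / `permissionDecision` response shape, reflect my reading of the hooks reference and should be verified against the current documentation. `DENY_PATTERNS`, `decide` and `handle` are hypothetical names, and the patterns are illustrative rather than a complete policy.

```python
import json
import re
from typing import Optional

# Hypothetical deny list: (pattern, reason). Illustrative, not exhaustive.
DENY_PATTERNS = [
    (re.compile(r"\bgit\s+push\b"), "git push is blocked in agent sessions"),
    (re.compile(r"\b(curl|wget)\b"), "network access through shell is blocked"),
    (re.compile(r"(^|[\s/])\.env\b"), "reading .env files is blocked"),
]


def decide(payload: dict) -> Optional[dict]:
    """Return a deny decision for a risky Bash command, or None to stay silent."""
    # Assumed payload shape: {"tool_name": "Bash", "tool_input": {"command": "..."}}
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, reason in DENY_PATTERNS:
        if pattern.search(command):
            # Assumed response shape for a PreToolUse denial.
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": reason,
                }
            }
    return None


def handle(raw_stdin: str) -> str:
    """Map the hook's JSON input to the text the hook would print on stdout."""
    decision = decide(json.loads(raw_stdin))
    return json.dumps(decision) if decision else ""
```

In a real hook script, the entry point would read standard input and print the result of `handle`. Returning an empty string for unmatched commands keeps the hook silent, so the normal permission flow still applies.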

PostToolUse is the evidence gate. It can run lint on an edited file, save a diff, require the nearest test or mark that an MCP tool returned external data. If the evidence fails, the agent gets a concrete reason to try a safer route.
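
A minimal Python sketch of the evidence side, assuming the PostToolUse payload carries `tool_name`, `tool_input` and `tool_response`, and that MCP tools are named with an `mcp__` prefix. Both are assumptions to check against the hooks reference. `evidence_record`, `append_evidence` and the log location are hypothetical.

```python
import json
import time
from pathlib import Path


def evidence_record(payload: dict, max_output_chars: int = 500) -> dict:
    """Reduce a PostToolUse payload to the facts worth keeping."""
    tool_name = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})
    output = str(payload.get("tool_response", ""))
    return {
        "ts": time.time(),
        "tool": tool_name,
        "command": tool_input.get("command"),
        "file": tool_input.get("file_path"),
        # Assumption: MCP tool names start with "mcp__", so their output is external data.
        "external_data": tool_name.startswith("mcp__"),
        "output_summary": output[:max_output_chars],
    }


def append_evidence(log_path: Path, record: dict) -> None:
    """Append one JSON line per tool call so the log is easy to grep and diff."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

An append-only JSON Lines file keeps the hook fast and gives a later Stop check something concrete to read.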

Stop is the done gate. It should not become ceremony; it should prevent a final summary that says "done" without test evidence, diff context, residual risk and next step. This connects directly to PR evals in CI.
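
A minimal Python sketch of a done gate, under stated assumptions. It takes the final summary as a plain string (how the hook obtains that text is left out), and it assumes a Stop hook can block with a `{"decision": "block", "reason": ...}` response and receives a `stop_hook_active` flag. Verify both against the hooks reference. The `REQUIRED_EVIDENCE` patterns are hypothetical done criteria.

```python
import re
from typing import Optional

# Hypothetical done criteria: label -> pattern that must appear in the summary.
REQUIRED_EVIDENCE = {
    "test evidence": re.compile(r"\btests?\b[^\n]*\b(passed|failed|not run)\b", re.I),
    "diff context": re.compile(r"\b(diff|files? changed)\b", re.I),
    "residual risk": re.compile(r"\bresidual risk\b", re.I),
    "next step": re.compile(r"\bnext steps?\b", re.I),
}


def missing_evidence(summary: str) -> list:
    """List the evidence labels whose pattern is absent from the summary."""
    return [label for label, pattern in REQUIRED_EVIDENCE.items()
            if not pattern.search(summary)]


def stop_decision(summary: str, stop_hook_active: bool = False) -> Optional[dict]:
    """Return a block decision when evidence is missing, else None."""
    # Block at most once per stop attempt to avoid an endless loop;
    # a stricter policy could count retries instead.
    if stop_hook_active:
        return None
    missing = missing_evidence(summary)
    if not missing:
        return None
    return {"decision": "block",
            "reason": "Final summary is missing: " + ", ".join(missing)}
```

For example, `stop_decision("Done.")` returns a block decision naming all four items, while a summary that reports test results, changed files, residual risk and a next step passes silently.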

[Figure: abstract loop of action, gate, evidence and retry for coding agents.]

Citation capsule: The right place for a hook is the decision boundary. PreToolUse blocks before action, PostToolUse turns results into evidence, and Stop prevents completion without proof. That split keeps the agent autonomous while removing implicit permission to cross risk without control.

What should be deny, ask or allow?

In the current documentation, Claude Code permissions use allow, ask and deny rules, evaluated in the order deny, then ask, then allow (Claude Code Docs, "Configure permissions", retrieved 2026-07-05). That order matters because a broad denial should not depend on the model's cooperation.

Use allow for repetitive, local and reversible commands. Tests, typecheck, reads and temporary artifact generation usually fit here. Even then, prefer exact commands. Allowing exactly npm test is safer than allowing anything that starts with npm.

Use ask when the action can be legitimate but needs human intent. Updating a lockfile, installing a dependency, opening network access, running a local migration and changing a CI workflow are good candidates. The goal is not to freeze the agent; it is to mark a risk transition.

Use deny for boundaries that do not belong in a normal agentic session. Reading secrets, writing to .git, deploying, publishing a package, changing credentials, deleting broad directories and accessing an unexpected domain should fail early.

In my practice, I start with a small matrix. One row for shell. One for sensitive files. One for network. One for MCP. If the rule does not fit that matrix, it is probably still an opinion, not an operating policy.

Surface	Initial default	Reason
Local tests	`allow`, exact commands only	Gives autonomy with low risk.
Install and network	`ask`	Setup can hide external payloads.
Secrets and credentials	`deny`	The agent does not need them to code.
Deploy and push	`deny` in local development	External changes require a human.
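
The precedence idea can be sketched in a few lines of Python. This is an illustration of deny-before-ask-before-allow evaluation, not Claude Code's own matcher: the `POLICY` entries use shell-style wildcards through `fnmatch`, and every pattern and name here is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policy with shell-style wildcards (not Claude Code rule syntax).
POLICY = {
    "deny": ["git push*", "cat .env*", "npm publish*"],
    "ask": ["npm install*", "curl *"],
    "allow": ["npm test", "npm run typecheck"],
}


def evaluate(command: str, policy: dict = POLICY, default: str = "ask") -> str:
    """Check deny first, then ask, then allow; unmatched commands get `default`."""
    for decision in ("deny", "ask", "allow"):
        patterns = policy.get(decision, [])
        if any(fnmatchcase(command, pattern) for pattern in patterns):
            return decision
    return default
```

Because deny is checked first, a broad allow such as `npm *` cannot override a narrow deny such as `npm publish*`. Falling back to ask for unmatched commands is a design choice in this sketch: unknown actions become a question rather than a silent pass.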

Citation capsule: The allow, ask and deny matrix avoids confusing autonomy with unrestricted permission. Claude Code documentation says deny, ask and allow have defined precedence, so irreversible commands belong in deny, while local tests can receive narrow and auditable allow rules.

How does MCP change the threat?

In the 2025-06-18 specification, MCP defines three roles (hosts, clients and servers), and servers expose resources, prompts and tools (Model Context Protocol, "Specification", retrieved 2026-07-05). MCP changes the threat because an external tool becomes part of the agent loop.

MCP is excellent when it provides context on demand, such as codebase search, issue tracker data, read-only database access or internal documentation. The risk appears when a server combines write access, network, credentials and tool descriptions that the agent treats as trusted.

The specification itself warns that tools represent code execution paths and require caution, consent and authorization. That needs to become local policy. A read-focused MCP server should not become a write server just because the tool exists.

Start with a server allowlist. Give read permission to context servers. Require confirmation for tools that write, create issues, comment on PRs or trigger workflows. Block unknown servers in sensitive projects.
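
A minimal Python sketch of that allowlist, assuming MCP tools reach the hook with names shaped like `mcp__<server>__<tool>` (my understanding of Claude Code's naming, to be verified). The server names, the `WRITE_VERBS` heuristic and the function names are hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical server allowlist and write-like tool name prefixes.
READ_ONLY_SERVERS = {"codebase", "docs"}
CONFIRM_SERVERS = {"issues"}
WRITE_VERBS = ("create", "update", "delete", "comment", "trigger", "write")


def parse_mcp_tool(tool_name: str) -> Optional[Tuple[str, str]]:
    """Split 'mcp__<server>__<tool>' into (server, tool); None for other tools."""
    parts = tool_name.split("__", 2)
    if len(parts) != 3 or parts[0] != "mcp":
        return None
    return parts[1], parts[2]


def mcp_decision(tool_name: str) -> Optional[str]:
    """Return allow, ask or deny for an MCP tool; None when it is not an MCP tool."""
    parsed = parse_mcp_tool(tool_name)
    if parsed is None:
        return None  # leave non-MCP tools to other rules
    server, tool = parsed
    is_write = tool.lower().startswith(WRITE_VERBS)
    if server in READ_ONLY_SERVERS:
        # A read-focused server does not become a write server.
        return "deny" if is_write else "allow"
    if server in CONFIRM_SERVERS:
        return "ask" if is_write else "allow"
    return "deny"  # unknown servers are blocked
```

Classifying writes by tool name prefix is a crude heuristic; an explicit per-tool list is safer when the server's tool set is known.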

That pattern deepens the idea behind codebase RAG over MCP for agents. MCP is best when it reduces context and improves precision; it becomes dangerous when it expands authority without a gate.

Citation capsule: MCP is not only context; it is a tool surface. The 2025-06-18 specification defines hosts, clients and servers, and warns that tools can open execution paths. That is why MCP servers should enter the harness with allowlists, read scope and confirmation for writes.

How do you design a hook without security theater?

In the current documentation, hooks can return decisions such as denying a tool call in PreToolUse, while a hook that returns no decision leaves the normal permission flow in place (Claude Code Docs, "Hooks reference", retrieved 2026-07-05). A good hook is small, testable and intentionally boring.

Avoid hooks that try to "understand intent" through vague language. That belongs to the agent. The hook should check facts: command, path, domain, tool type, branch, changed file, secret presence or test evidence.
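
As one example of checking a fact rather than intent, here is a minimal Python sketch that extracts the domains named in a command and compares them with an allowlist. `ALLOWED_DOMAINS` and the function names are hypothetical. The check only sees literal http and https URLs, so it complements network sandboxing and does not replace it.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of domains the project expects to reach.
ALLOWED_DOMAINS = {"registry.npmjs.org", "github.com"}

# Literal http(s) URLs only; other network paths need a sandbox.
URL_RE = re.compile(r"https?://[^\s'\"<>]+")


def domains_in(command: str) -> set:
    """Return the set of hostnames that appear as URLs in the command."""
    hosts = {urlsplit(url).hostname for url in URL_RE.findall(command)}
    return hosts - {None}


def unexpected_domains(command: str, allowed: set = ALLOWED_DOMAINS) -> set:
    """Return hostnames in the command that are not on the allowlist."""
    return domains_in(command) - allowed
```

A PreToolUse hook could deny or ask whenever `unexpected_domains` returns a non-empty set, and include the hostnames in the reason so the decision stays auditable.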

A shell PreToolUse can block broad deletion, network through shell and deploy. A PostToolUse can record command, summarized output, touched files and status. A Stop hook can deny completion if the agent changed code without a test or explanation.

The useful insight is to separate "the agent decides the plan" from "the harness decides permission." When those roles mix, you fall back to prompt trust. When they stay separate, the agent can be creative inside a clear box.

In long flows, context also becomes a cost and a risk. I use RemoteCode as a context extension for Claude Code and Codex in long agentic flows when I want the agent to go further without dumping the whole history into the main prompt. Disclosure: RemoteCode is my own tool.

Citation capsule: A good hook does not replace human judgment or agent reasoning. It verifies observable facts: command, path, domain, tool and evidence. That separation lets Claude Code, Codex and subagents work autonomously while the harness applies deterministic limits.

What is the minimum design to start?

In the current documentation, Claude Code settings have user, project, local and managed scopes, and managed scope has the highest precedence (Claude Code Docs, "Claude Code settings", retrieved 2026-07-05). The minimum design should start at the project level and move to managed policy when it becomes a team standard.

Start with three files or blocks: permissions, hooks and done criteria. Permissions define what belongs in allow, ask and deny. Hooks enforce the edges. Done criteria tell the Stop hook which evidence it must find.

For a TypeScript repository, I would start by allowing npm test and npm run typecheck, asking before npm install, denying .env reads, denying git push and blocking shell network in a freshly cloned repository.
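
A minimal sketch of that starting point, written as a Python script that renders a permissions block as JSON. The `permissions` keys and the `Tool(specifier)` rule strings follow my understanding of Claude Code's settings format and are assumptions to verify against the permissions documentation before use. Denying curl and wget covers only the most obvious shell network paths.

```python
import json

# Assumed rule syntax: Tool(specifier). Verify every string against the docs.
settings = {
    "permissions": {
        "allow": ["Bash(npm test)", "Bash(npm run typecheck)"],
        "ask": ["Bash(npm install:*)"],
        "deny": [
            "Read(./.env)",
            "Bash(git push:*)",
            "Bash(curl:*)",
            "Bash(wget:*)",
        ],
    }
}


def render_settings(data: dict) -> str:
    """Serialize the settings as indented JSON with a trailing newline."""
    return json.dumps(data, indent=2) + "\n"


print(render_settings(settings))
```

The printed JSON would go into a project-level settings file, so the policy travels with the repository and can be reviewed like any other change.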

Then add a PostToolUse that runs the nearest test when files under src/ change. If the test fails, the agent gets the error and tries to correct it. If there is no nearby test, it must state the reason in the final summary.
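
Finding the nearest test can be sketched as a directory walk in Python. The naming convention in `TEST_SUFFIXES`, the `__tests__` folder lookup and the function name are hypothetical; adapt them to how the repository actually lays out its tests.

```python
from pathlib import Path
from typing import Optional

# Hypothetical naming convention for test files.
TEST_SUFFIXES = (".test.ts", ".spec.ts", ".test.tsx", ".spec.tsx")


def nearest_test(changed: Path, repo_root: Path) -> Optional[Path]:
    """Search beside the changed file, then upward to repo_root, for its test."""
    changed = changed.resolve()
    repo_root = repo_root.resolve()
    stem = changed.name.split(".")[0]
    directory = changed.parent
    while True:
        # Check the directory itself and a sibling __tests__ folder.
        for folder in (directory, directory / "__tests__"):
            for suffix in TEST_SUFFIXES:
                candidate = folder / (stem + suffix)
                if candidate.is_file() and candidate != changed:
                    return candidate
        # Stop at the repository root or the filesystem root.
        if directory == repo_root or directory == directory.parent:
            return None
        directory = directory.parent
```

When the function returns None, the hook has a concrete fact to hand back: no nearby test exists, so the final summary has to say why.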

[Figure: abstract matrix of permission, review and blocked zones for coding agents.]

This minimum pairs well with a lean AGENTS.md for coding agents. AGENTS.md guides behavior. Hooks and permissions enforce boundaries. CI confirms proof.

Citation capsule: A minimum hook design starts with narrow permissions, secret blocking, confirmation for network and a final evidence gate. Because Claude Code separates user, project, local and managed scopes, teams can test in the project before promoting policy to everyone.

FAQ about hooks for coding agents

Does a hook replace human review?

No. In the current documentation, hooks intercept events such as PreToolUse, PostToolUse and Stop, but they do not evaluate product impact alone. Use hooks to deny mechanical risk and require evidence. Use human review for architecture, business behavior and merge decisions.

Do I need hooks if I already have CI?

Yes, when the agent can execute actions before CI. CI catches the PR. The hook catches local commands, MCP tools, setup and session completion. In practice, hooks reduce damage before commit; CI validates the result afterward.

What belongs in AGENTS.md and what becomes a hook?

AGENTS.md should guide choices, commands and criteria. A hook should enforce limits. If the rule is "prefer the nearest test", write it in AGENTS.md. If the rule is "do not read .env", put it in a permission rule or hook. A prompt is not a security boundary.

Do hooks work with subagents?

Yes, when the tool exposes subagent events. Claude Code documentation lists SubagentStart and SubagentStop, along with tool events. Use those events to record fan-out, limit scope and require each subagent to return evidence, not just a conclusion.

How do I avoid slow hooks?

Use narrow matchers. Claude Code documentation lets you filter events by tool, agent or pattern. Do not run a heavy script on every call. Block before risk, collect evidence after writes and leave expensive tests to CI when local feedback does not change the decision.

Closing

Coding agents need autonomy, but autonomy without edges becomes accidental permission. Hooks solve that because they act at the point where the agent tries to cross a boundary: shell, network, secrets, MCP, subagents, writes and completion.

Start small. Block secrets, ask before network, allow exact tests and require proof before the final summary. Then connect the same pattern to your PR evals, codebase MCP and subagent fan-out. The agent stays fast. It just stops running loose where damage is expensive.

Sources consulted

Claude Code Docs, "Hooks reference", retrieved 2026-07-05, https://code.claude.com/docs/en/hooks
Claude Code Docs, "Configure permissions", retrieved 2026-07-05, https://code.claude.com/docs/en/permissions
Claude Code Docs, "Claude Code settings", retrieved 2026-07-05, https://code.claude.com/docs/en/settings
Model Context Protocol, "Specification 2025-06-18", retrieved 2026-07-05, https://modelcontextprotocol.io/specification/2025-06-18
0din, "Clone This Repo and I Own Your Machine", retrieved 2026-07-05, https://0din.ai/blog/clone-this-repo-and-i-own-your-machine
