Claude Code runs with your full user permissions — filesystem, shell, git, everything. The model will usually follow your safety instructions. Usually isn't good enough for production. Hooks are the deterministic enforcement layer that most developers skip, and the reason they eventually have a bad day.
A developer asked Claude to document their Azure OpenAI configuration.
Claude did exactly that. It wrote clear, readable documentation — the kind of documentation that would make any engineering manager nod approvingly. It included the connection details. The endpoint. The deployment name. And the actual API key, hardcoded directly into the markdown file.
The documentation got committed. It got pushed to the public repository. It sat there for eleven days. Then $30,000 in fraudulent API charges appeared on the account.
Claude didn't go rogue. It didn't misunderstand. It did precisely what it was asked: document the configuration. The developer's mental model of "document the config" didn't include "and obviously don't write the actual secret into a public file." Claude's didn't either — because that constraint wasn't expressed in a way Claude could enforce.
This is the core problem with how most developers think about AI coding safety. And it has a solution most of them haven't tried.
The Permission Model Nobody Explains
When you install Claude Code and run it on a project, here's what you're giving it:
Full filesystem read and write access, scoped to wherever it's running. Shell access — the ability to execute arbitrary commands with your user permissions. Git access to stage, commit, and push code. Network access for any MCP server you've configured.
This is the design. The AI needs these permissions to do useful work. You can't have an AI coding assistant that can't write files or run builds. The permissions that let Claude fix your database migration are the same permissions that let Claude accidentally drop the table.
A senior engineer operating with these same permissions applies fifteen years of accumulated judgment to every action. They've internalized, through painful experience, the list of things you don't do: you don't force-push to main, you don't store secrets in code, you don't run rm -rf on a production path without double-checking. That judgment is tacit knowledge, not written down anywhere, applied automatically to everything.
Claude has no such learned inhibitions. It has a remarkable ability to reason about code. It does not have the accumulated scar tissue of watching someone accidentally take down production.
Why "Just Tell It What Not To Do" Fails
The standard response to this is CLAUDE.md. You add a section: "Never push directly to main." "Never commit secrets or API keys." "Always run tests before committing." And Claude generally follows these instructions — genuinely, not grudgingly. The model is trying to help.
The keyword is generally.
The problem isn't that Claude ignores your CLAUDE.md. The problem is that language model behavior is probabilistic. Given a complex enough situation (a multi-step workflow, an unusual code pattern, a session that has accumulated enough context to dilute the instructions it started with), the probability of following any specific instruction drops below 1.0, to somewhere in the range of roughly 0.98 to 0.999.
For productivity features, this is fine. If Claude occasionally forgets your preferred naming convention, you catch it in code review. No harm done.
For safety-critical behaviors, this math breaks down immediately.
If you have 100 operations that each have a 0.2% chance of going wrong, there is about an 18% chance that at least one of them does. Scale to 1,000 operations and the failure probability climbs past 85%. And an AI coding assistant with your full user permissions doesn't fail recoverably. A force-push to main, a dropped table, an API key in a public repository: these don't get undone because the model had a bad day.
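That arithmetic is worth seeing written down. A minimal sketch, assuming each operation fails independently with probability p:

```python
# Chance that at least one of n independent operations goes wrong,
# given a per-operation failure probability p.
def failure_probability(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

print(round(failure_probability(0.002, 100), 3))   # 0.181 -- the ~18% above
print(round(failure_probability(0.002, 1000), 3))  # 0.865 -- failure is the norm
```

The takeaway: per-operation reliability that looks excellent compounds into session-level failure rates that don't.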
"The AI usually follows safety instructions" is not a production safety model.
What Hooks Actually Are
Hooks are shell commands that Claude Code executes deterministically at specific points in its lifecycle — before a tool runs, after a tool runs, when you submit a prompt, when the session ends.
The critical word is deterministically. The model cannot override them. They run regardless of what Claude decides, regardless of how long the session has been running, regardless of how the context window looks. A hook is not a suggestion Claude chooses to follow. It is a constraint that executes outside of Claude's decision-making entirely.
There are five hook types that matter for production use:
PreToolUse fires before Claude executes any tool call — before writing a file, before running a command, before making a git commit. A PreToolUse hook can return a block signal, which prevents the tool from executing entirely. Claude cannot proceed past a blocking PreToolUse hook. This is the enforcement mechanism.
PostToolUse fires after a tool executes. It can't stop the action that already happened, but it can automatically repair, log, notify, or trigger follow-on actions. Auto-format every file after Claude edits it. Run your linter on every modified source file. Scan for secrets in every written file.
UserPromptSubmit fires before Claude processes your input. Useful for injecting context — time, date, current git branch, recent errors — into every session automatically.
Notification fires when Claude wants to alert you. Pipe these to your notification system or just log them.
Stop fires when Claude finishes a response. Good for triggering downstream processes — running your test suite, updating a log, notifying a team channel.
The combination of PreToolUse blocking and PostToolUse scanning creates a two-layer defense: prevent dangerous actions before they happen, catch security issues after writes before they can propagate.
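Wiring these up is configuration, not prompting. As a rough sketch (the exact schema is worth checking against the current hooks documentation, and the script paths here are hypothetical), hook registration lives in Claude Code's settings file: a matcher selects which tools an entry applies to, and each command receives the tool call as JSON on stdin.

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "python3 .claude/hooks/block_dangerous.py" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "python3 .claude/hooks/scan_secrets.py" }
        ]
      }
    ]
  }
}
```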
The Five Hooks Every Production Team Should Have
Building the complete set of production hooks takes a few hours. Here's what actually matters:
The dangerous command blocker. A PreToolUse hook on Bash that scans the command for patterns that should never execute: rm -rf, DROP TABLE, TRUNCATE, --force on a git push to main or master. Return a block signal on match. This one hook prevents the category of irreversible infrastructure damage.
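Here's what that blocker can look like, sketched in Python rather than shell for readability. It assumes the PreToolUse hook receives the tool call as JSON on stdin, with the shell command under tool_input.command, and that exit code 2 signals a block; both match current Claude Code behavior but are worth verifying against the hooks documentation. The pattern list is illustrative, not exhaustive:

```python
import json
import re
import sys
from typing import TextIO

# Illustrative denylist -- extend it for your own infrastructure.
DANGEROUS = [
    r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f",              # rm -rf and variants
    r"\brm\s+-[a-zA-Z]*f[a-zA-Z]*r",              # rm -fr
    r"\bdrop\s+table\b",
    r"\btruncate\b",
    r"git\s+push\b.*--force.*\b(main|master)\b",  # force-push to production
    r"git\s+push\b.*\b(main|master)\b.*--force",
]

def is_dangerous(command: str) -> bool:
    return any(re.search(p, command, re.IGNORECASE) for p in DANGEROUS)

def main(stream: TextIO = sys.stdin) -> int:
    payload = json.load(stream)
    command = payload.get("tool_input", {}).get("command", "")
    if is_dangerous(command):
        # A non-zero "block" exit code stops the tool; stderr goes to Claude.
        print(f"Blocked: '{command}' matches a dangerous-command pattern.",
              file=sys.stderr)
        return 2
    return 0

# As a hook script, end with: sys.exit(main())
```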
The secret scanner. A PostToolUse hook on file writes that scans the written content for patterns that look like API keys, tokens, connection strings, and passwords. Regex patterns for common formats — AWS keys start with AKIA, JWT tokens have a characteristic structure, most API keys follow length and character class patterns. If a match is found, overwrite the file with a placeholder and alert. This is what prevents the $30,000 documentation incident.
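A sketch of that scanner, assuming the PostToolUse payload exposes the written file's path under tool_input.file_path (confirm against the current hooks schema). The regexes cover a few common formats; a production deployment should lean on a dedicated scanner like gitleaks or trufflehog rather than a hand-rolled list:

```python
import json
import re
import sys
from pathlib import Path
from typing import TextIO

# Illustrative patterns only -- real scanners cover far more formats.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key ID
    re.compile(r"eyJ[\w-]{10,}\.eyJ[\w-]{10,}"),  # JWT header.payload
    re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*[\"'][^\"']{8,}[\"']"),
]

def find_secrets(text: str) -> list[str]:
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(text)]

def redact(text: str) -> str:
    for p in SECRET_PATTERNS:
        text = p.sub("[REDACTED]", text)
    return text

def main(stream: TextIO = sys.stdin) -> int:
    payload = json.load(stream)
    path = payload.get("tool_input", {}).get("file_path", "")
    if not path or not Path(path).is_file():
        return 0
    content = Path(path).read_text()
    if find_secrets(content):
        # Overwrite the secret with a placeholder, then alert.
        Path(path).write_text(redact(content))
        print(f"Secrets redacted in {path}; review before committing.",
              file=sys.stderr)
        return 2
    return 0

# As a hook script, end with: sys.exit(main())
```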
The branch protector. A PreToolUse hook on git commands that denies any push or force-push targeting main, master, or your production branches. Let Claude push to feature branches all day. Block production entirely. This is a two-line shell script and it eliminates an entire category of production incident.
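Sketched in Python for consistency with the other examples (a two-line shell version with grep works just as well), again assuming the tool call arrives as JSON on stdin with the command under tool_input.command:

```python
import json
import sys
from typing import TextIO

PROTECTED = {"main", "master", "production"}

def is_protected_push(command: str) -> bool:
    parts = command.split()
    if parts[:2] != ["git", "push"]:
        return False
    # Catches "git push origin main" and refspec forms like "HEAD:main".
    return any(arg.split(":")[-1] in PROTECTED for arg in parts[2:])

def main(stream: TextIO = sys.stdin) -> int:
    payload = json.load(stream)
    command = payload.get("tool_input", {}).get("command", "")
    if is_protected_push(command):
        print("Pushes to protected branches are blocked; use a feature branch.",
              file=sys.stderr)
        return 2
    return 0

# As a hook script, end with: sys.exit(main())
```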
The auto-formatter. A PostToolUse hook on file edits that runs your project's formatter — Prettier, Black, gofmt, rustfmt — on every file Claude modifies. This is pure quality of life: you never see a PR where Claude wrote correct code in the wrong style. The formatter runs, the file is already clean, and no time is spent on style comments in code review.
The test trigger. A PostToolUse hook on source file edits that runs the tests affected by the modified file. Use your test runner's path-based filtering. You get test feedback after every Claude edit, not after a batch of fifteen edits where you no longer know which one broke what.
None of these require complex infrastructure. They're shell scripts, typically under twenty lines each. The barrier isn't technical — it's awareness that they exist and matter.
The Architecture of Reliable Automated Systems
Here's the pattern that runs through every reliable automated system, from bank fraud detection to aircraft control systems: you separate intent from enforcement. The system declares what it intends to do. A separate layer enforces what it's actually allowed to do. The two layers don't consult each other — the enforcement layer is outside the intent layer's control.
This is exactly what hooks do for agentic development. Claude declares its intent through tool calls — "I want to run this bash command," "I want to write to this file," "I want to make this commit." The hook layer decides what is actually permitted. Claude cannot bypass a blocking PreToolUse hook any more than a bank transaction can bypass fraud detection by being really confident it's legitimate.
Your CLAUDE.md is the intent layer. You're saying: "here are the behaviors I want." It works remarkably well for things that are advisory — code style, architecture preferences, documentation standards. For things that are safety-critical, you want the enforcement layer. Hooks are deterministic. CLAUDE.md is probabilistic. Use each for what it's designed for.
The developers who have thought carefully about production agentic workflows all converge on the same insight: the problem isn't that the AI is trying to do the wrong thing. The problem is that "trying to do the right thing" isn't good enough when the downside is irreversible. You need a system where the wrong thing is architecturally impossible, not just unlikely.
What the Ecosystem Already Knows
The Claude Code community has already built extensive hook libraries. The awesome-claude-code repository on GitHub catalogs hundreds of community-contributed hooks — dangerous command blockers, secret scanners, branch protectors, notification systems, test runners, code formatters, audit loggers. The pattern is so consistent that it's become a community best practice: you don't run Claude Code on production codebases without hooks.
The Hookify plugin takes this further: you describe behaviors you want to prevent in plain English, and it generates the hook configuration for you. "Never modify files in the migrations folder without explicit confirmation." "Block any command containing 'production' in the path." "Scan all written files for hardcoded credentials." It converts your safety requirements into deterministic enforcement automatically.
This community infrastructure exists because the lesson has been learned, repeatedly, by the early adopters who ran AI coding assistants without it. The $30,000 incident isn't unique — it's representative of a category of incident that happens when developers trust probabilistic AI behavior to enforce safety-critical invariants.
The industry is past that learning phase. Hooks aren't an advanced feature for power users. They're the baseline for responsible production use.
The Bottom Line
Every capable AI coding assistant runs with your permissions. That's unavoidable — it's what makes them useful. The question is whether you've thought carefully about the difference between what the AI can do and what it should be allowed to do.
Probabilistic instructions cover a lot of ground. They handle style, architecture, workflow preferences — everything where "usually right" is acceptable because "occasionally wrong" is recoverable. They are structurally inadequate for safety-critical invariants. An AI operating at developer speed, across hundreds of operations per session, will eventually surface the tail of any probability distribution you rely on.
Hooks close that gap. They're deterministic enforcement over probabilistic behavior. They run outside Claude's decision-making, cannot be overridden by context or instructions, and handle exactly the cases where "usually" isn't good enough.
Five hooks. A few hours of setup. The difference between an AI development workflow you'd run on a production codebase and one you'd only trust in a sandbox.
The $30,000 documentation incident was preventable. So is yours.
This is part of a series on the Claude Code ecosystem. Previously: The LLM Isn't the Bottleneck Anymore. The Ecosystem Is. and At 50% Context, Your AI Starts Cutting Corners. Here's the Fix.