A lean AGENTS.md is a context file written to guide coding agents, not to summarize the whole repository. It should tell Codex, Claude Code and similar tools which commands matter, which risks must not slip through, and when the task is ready for review.
In 2026, the official AGENTS.md site says the format is used by more than 60,000 open-source projects (AGENTS.md, "A simple, open format for guiding coding agents", 2026). That adoption explains the urgency: when the file becomes context clutter, the agent obediently follows instructions that make the task worse.
Practical summary
- Write only rules that change an agent decision.
- Put test commands near the right package.
- Convert repeated PR feedback into short rules.
- Validate the file with real tasks before expanding it.

Why did AGENTS.md become a context debate?
In 2026, the paper "Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?" found that adding context files raised average inference cost by more than 20% without improving overall task success (arXiv, "Evaluating AGENTS.md", 2026). The debate exists because agents follow instructions, even when those instructions create unnecessary work.
The point is not to abandon AGENTS.md. The point is to stop using it as an encyclopedia. The study found that agents follow file instructions, but repository overviews are not very useful when they repeat what file search, tests and package metadata already reveal.
In practice, a good AGENTS.md is closer to an operating contract than a product document. It says how to install dependencies, which test proves the change, which directory has special rules and which changes require human review.
This builds on context engineering for coding agents. That article focused on prompt budget. This one focuses on the file that is loaded at the beginning of the session and shapes every later decision.
Citation capsule: In 2026, AGENTS.md moved from prompt tip to engineering surface. The arXiv paper "Evaluating AGENTS.md" measured more than 20% higher cost with context files, so the file should record only rules that change execution, testing or review.
What belongs in the file, and what stays out?
In 2026, Codex customization documentation recommends keeping AGENTS.md small and starting only with instructions that matter (OpenAI Developers, "Customization", 2026). Add what prevents repeated mistakes; leave out what the agent can discover by reading the repository.
Include commands that are not obvious. If npm test fails without local Redis, say so. If the payments package uses another command, put that rule in the payments directory. If every PR needs a contract test, write the done criterion.
Leave out folder trees, long architecture summaries and generic quality advice. The agent can use rg, read package.json, open workflows and inspect tests. Repeating those signals in startup context adds noise.
When a task crosses several sessions, I use RemoteCode as a context extension for Claude Code and Codex in long agentic flows, especially when I want continuity without dumping the whole history into the main prompt. It is my own tool, so this is a contextual editorial reference.
A useful rule is simple: if an instruction does not change a command, limit, risk, patch style or acceptance criterion, it probably belongs in another document. AGENTS.md should point to that document only when the task needs it.
How do you write a minimal AGENTS.md?
In 2026, Codex documentation says it reads AGENTS.md from global scope, the project root and directories down to the current folder, with a combined default byte limit controlled by project_doc_max_bytes (OpenAI Developers, "Custom instructions with AGENTS.md", 2026). That favors short instructions close to the affected code.
Start with a small root file. Then move specific rules into directories. The payments service should not force a frontend agent to load webhook details. A design system package should not inherit a database migration rule.
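The root-to-folder loading described above can be sketched in Python. This is an illustration only, not Codex's implementation: the function name `instruction_chain`, the stop-when-full behavior and the `max_bytes` argument are assumptions made for the example. The real limit is whatever `project_doc_max_bytes` resolves to in your setup.

```python
from pathlib import Path


def instruction_chain(project_root: Path, cwd: Path, max_bytes: int) -> list[Path]:
    """Return AGENTS.md files from project_root down to cwd, broadest first.

    Hypothetical helper: files are added until the next one would push the
    combined size past max_bytes (an assumed stand-in for the byte limit).
    """
    root = project_root.resolve()
    current = cwd.resolve()
    # Raises ValueError if cwd is not inside project_root.
    relative = current.relative_to(root)

    # Build the list of directories from the root down to the working folder.
    directories = [root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    chain: list[Path] = []
    used = 0
    for directory in directories:
        candidate = directory / "AGENTS.md"
        if not candidate.is_file():
            continue
        size = candidate.stat().st_size
        if used + size > max_bytes:
            break  # budget exhausted: files closer to the code are skipped
        chain.append(candidate)
        used += size
    return chain
```

Under these assumptions, an overgrown root file spends the budget first, and the directory-level file closest to the change is the one that gets dropped. That is one more reason to keep the root file small.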
One sufficient example:
# AGENTS.md
## Environment
- Use `npm ci` to install dependencies.
- Before editing TypeScript, run `npm run typecheck`.
## Done criteria
- Run the closest test to the changed file.
- If an API contract changes, update the integration test.
- If a test cannot run, explain the blocker in the final summary.
## Limits
- Do not edit applied migrations without human review.
- Do not add production dependencies without explaining impact.
Notice the missing folder map. The file says how to work. It does not narrate the repository. If the agent needs architecture context, point to a short document instead of copying the whole document into startup context.

Citation capsule: A minimal AGENTS.md helps Codex and Claude Code when it records commands, done criteria and permission limits. Codex documentation exposes project_doc_max_bytes for combined project instructions, which makes directory-level scope safer than an overgrown root file.
Where do CLAUDE.md, MEMORY.md and skills fit?
In 2026, Claude Code documentation says MEMORY.md loads the first 200 lines or 25 KB at conversation start, while CLAUDE.md files are loaded in full and work best below 200 lines (Claude Code Docs, "How Claude remembers your project", 2026). That changes the context architecture.
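A small guard can turn those numbers into a warning during review. The sketch below is hypothetical: `size_warnings` is not part of any tool, the 200-line default mirrors the figure quoted above, and reading 25 KB as 25,000 bytes is an assumption.

```python
def size_warnings(text: str, max_lines: int = 200, max_bytes: int = 25_000) -> list[str]:
    """Return human-readable warnings when an instruction file looks oversized.

    The defaults are assumptions taken from the figures quoted in the article,
    not limits enforced by this function's caller or by any agent.
    """
    warnings: list[str] = []
    line_count = len(text.splitlines())
    byte_count = len(text.encode("utf-8"))
    if line_count > max_lines:
        warnings.append(f"{line_count} lines exceeds the {max_lines}-line target")
    if byte_count > max_bytes:
        warnings.append(f"{byte_count} bytes exceeds the {max_bytes}-byte target")
    return warnings
```

An empty list means the file is within both targets. A non-empty list is a prompt to ask which rules still earn their place, not an automatic failure.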
Use CLAUDE.md for persistent instructions Claude Code should always see. Use AGENTS.md when you want interoperability with Codex, Cursor, Gemini CLI and other agents. If the team uses several tools, keep one canonical source and bridge when needed.
Use memory for operational learning, not permanent policy. A debugging insight may start as memory. If it becomes repeated PR feedback, promote it to AGENTS.md or CLAUDE.md in the right directory.
Use skills when the instruction is large, rare or procedural. In 2026, Codex skills documentation says the initial skills list uses at most 2% of the model context window, or 8,000 characters when the window is unknown, and only loads the full SKILL.md after a skill is selected (OpenAI Developers, "Agent Skills", 2026). That is progressive disclosure applied to agents.
This separation pairs well with codebase RAG over MCP for coding agents. AGENTS.md defines how to work. MCP and skills deliver context on demand. Memory records learning that has not become stable policy yet.
How do you validate whether the file helps?
In 2026, the paper "On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents" analyzed 10 repositories and 124 pull requests; with AGENTS.md, it observed 28.64% lower median runtime and 16.58% fewer output tokens, with comparable completion behavior (arXiv, "On the Impact of AGENTS.md Files", 2026). Set against the cost increase reported by the other study, this opposite signal suggests that what the file contains matters more than whether it exists.
Validate with real repository tasks. Use small fixes, medium-risk refactors and changes that require tests. Run with and without AGENTS.md. Compare time, tokens, files read, tests run, regressions and final-summary quality.
The most important signal is negative: the file should not make the agent explore more than the task needs. If it reads broad documentation before a one-line change, there is bloat. If it runs irrelevant tests because of a generic rule, there is bloat.
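The with-and-without comparison can be reduced to a few lines of Python. This is a minimal sketch: the metric names (`seconds`, `output_tokens`) and the sample numbers are invented for illustration, and comparing medians is only one reasonable choice of summary.

```python
from statistics import median


def median_change_pct(baseline: list[float], treatment: list[float]) -> float:
    """Percent change of the treatment median relative to the baseline median."""
    base = median(baseline)
    if base == 0:
        raise ValueError("baseline median is zero; percent change is undefined")
    return (median(treatment) - base) / base * 100.0


def compare_runs(
    without_file: list[dict], with_file: list[dict], metrics: list[str]
) -> dict[str, float]:
    """Compare each metric across runs without and with AGENTS.md."""
    return {
        name: median_change_pct(
            [run[name] for run in without_file],
            [run[name] for run in with_file],
        )
        for name in metrics
    }


# Hypothetical measurements from three tasks run both ways.
runs_without = [
    {"seconds": 100, "output_tokens": 1000},
    {"seconds": 200, "output_tokens": 2000},
    {"seconds": 300, "output_tokens": 3000},
]
runs_with = [
    {"seconds": 80, "output_tokens": 1100},
    {"seconds": 150, "output_tokens": 2200},
    {"seconds": 400, "output_tokens": 2500},
]
report = compare_runs(runs_without, runs_with, ["seconds", "output_tokens"])
```

A negative value means the file reduced that metric. With only a handful of tasks the numbers are noisy, so read them next to the qualitative signals: files read, tests run and regressions.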
Use a short rubric for agent PRs:
| Observed signal | Interpretation | AGENTS.md action |
|---|---|---|
| The same test was missed again. | A useful rule is missing. | Add the command near the package. |
| The agent read too many documents. | Routing is weak. | Point only to priority sources. |
| The patch touched the wrong layer. | Module boundary is unclear. | Record the directory limit. |
| The final summary omitted risk. | Done criteria are vague. | Require evidence and residual risk. |
This validation connects with PR evals for coding agents in CI. AGENTS.md guides behavior. The eval measures whether that behavior produced enough proof for the reviewer.
Citation capsule: Two 2026 studies on AGENTS.md found different signals: one measured more than 20% higher cost without broad success gains, while another measured 28.64% lower median runtime across 124 PRs. The practical conclusion is to test the file in your own repository before expanding it.
How do you keep AGENTS.md alive without turning it into storage?
In 2026, Codex recommends treating AGENTS.md as a feedback loop: when the agent makes a wrong assumption, correct it and ask for a file update so future sessions inherit the rule (OpenAI Developers, "Customization", 2026). Maintenance should come from repeated errors, not documentation anxiety.
Review the file like code. Every new rule should answer an observed failure: forgotten test, wrong command, incompatible diff style, security risk or repeated PR feedback. If nobody can name the failure, the rule does not enter.
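The admission test above can be written as a small review gate. Everything here is hypothetical: the `ProposedRule` shape, the `observed_failure` field and the sample rules are illustrative names, not an existing tool or format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedRule:
    text: str
    # Short note or link naming the failure that motivated the rule.
    observed_failure: str = ""


def review_rules(
    proposed: list[ProposedRule],
) -> tuple[list[ProposedRule], list[ProposedRule]]:
    """Split proposed rules into accepted and rejected.

    A rule is accepted only if someone named the failure it answers.
    """
    accepted: list[ProposedRule] = []
    rejected: list[ProposedRule] = []
    for rule in proposed:
        if rule.observed_failure.strip():
            accepted.append(rule)
        else:
            rejected.append(rule)
    return accepted, rejected


# Hypothetical proposals from one review cycle.
proposals = [
    ProposedRule(
        text="Run the contract test after changing an API handler.",
        observed_failure="Contract test was skipped in two consecutive agent PRs.",
    ),
    ProposedRule(text="Write clean, maintainable code."),
]
accepted, rejected = review_rules(proposals)
```

In this example the contract-test rule is accepted and the generic quality advice is rejected, because nobody recorded a failure behind it.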
Also delete rules. When a command becomes the repository default, the special instruction loses value. When a package disappears, remove the block. When a skill or MCP tool handles a procedure better, let AGENTS.md point to the tool.

For teams using subagent fan-out, keep each rule in the smallest useful scope. A review agent needs risk criteria. A patch agent needs the test command. An inventory agent needs to know where to search. That extends the pattern from subagent fan-out for large code migrations.
Frequently asked questions about lean AGENTS.md
Should AGENTS.md explain the whole architecture?
No. In 2026, "Evaluating AGENTS.md" reported that adding context files raised average cost by more than 20% without broad success improvement. Explain exceptions, limits and commands. For full architecture, point to documents the agent can read on demand.
Can Codex and Claude Code share the same file?
Yes, with care. In 2026, the AGENTS.md site describes the format as compatible with a broad ecosystem of agents and used by more than 60,000 open-source projects. For Claude Code, you can bridge with CLAUDE.md, but avoid duplicated rules that drift between tools.
What is the ideal AGENTS.md size?
There is no universal size. In 2026, Claude Code recommends targeting under 200 lines for CLAUDE.md, while Codex exposes project_doc_max_bytes for combined project instructions. Use size as a warning: the larger the file, the stronger the evidence should be.
When should I create directory-level AGENTS.md files?
Create them when a subdirectory has a different command, risk or rule. In 2026, Codex documentation says closer files appear later in the instruction chain and can override broader guidance. That avoids loading backend rules into frontend work.
How do I know my AGENTS.md improved the agent?
Measure real tasks. In 2026, a study with 10 repositories and 124 pull requests associated AGENTS.md with 28.64% lower median runtime and 16.58% fewer output tokens. If your file does not reduce mistakes, useless reading or rework, it is not ready.
Closing
AGENTS.md is not a place to dump context. It is a small contract for repeated decisions: install, test, review, constrain tools and record exceptions. In coding agents, good context is not large context. It is context that changes the next action.
Start small, evaluate in CI and promote only rules born from real mistakes. Then connect the file to your code-agent harness for reliable PRs, PR evals and MCP tools that already deliver context on demand. The agent spends less attention on noise and more attention on the patch.
Sources consulted
- AGENTS.md, "A simple, open format for guiding coding agents", retrieved 2026-07-04, https://agents.md/
- arXiv, "Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?", retrieved 2026-07-04, https://arxiv.org/abs/2602.11988
- arXiv, "On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents", retrieved 2026-07-04, https://arxiv.org/abs/2601.20404
- OpenAI Developers, "Custom instructions with AGENTS.md", retrieved 2026-07-04, https://developers.openai.com/codex/guides/agents-md
- OpenAI Developers, "Customization", retrieved 2026-07-04, https://developers.openai.com/codex/concepts/customization
- OpenAI Developers, "Agent Skills", retrieved 2026-07-04, https://developers.openai.com/codex/skills
- Claude Code Docs, "How Claude remembers your project", retrieved 2026-07-04, https://code.claude.com/docs/en/memory
- OpenAI Developers, "Best practices", retrieved 2026-07-04, https://developers.openai.com/codex/learn/best-practices