The Synthesis

Ten laws.
Four operations.
One methodology.

Six parallel research agents. Two hundred plus sources. One consolidated view of what context engineering actually is, and how to do it well. This is the distilled output behind the nv:context skill.

Updated 2026.04.04 · 6 research agents · 200+ sources
01 · The definitive understanding

What context engineering actually is.

The consensus definition, distilled from Karpathy, Schmid, Chase, Osmani, and Anthropic. One sentence everyone in the field has independently converged on.

The systematic discipline of designing dynamic systems that provide the right information and tools, in the right format, at the right time, so an LLM can accomplish a task.
Consensus definition · Karpathy, Schmid, Chase, Osmani, Anthropic
It is not
Prompt engineering

That's a subset. Prompts are one layer inside a much larger information system.

It is not
Just RAG

Retrieval is one of four operations. Picking docs is the easy part of context.

It is not
A clever system prompt

The system prompt is one of seven components. Treating it as everything is the failure mode.

Most agent failures are not model failures. They are context failures.
Philipp Schmid · Google DeepMind
02 · The core paradox

Agents need more context. Performance degrades as context grows.

The single most counterintuitive finding in the entire research corpus. Every law that follows is a strategy for resolving it.

300 > 113K
A focused 300-token context outperforms an unfocused 113K-token context. FlowHunt · LongMemEval
40% > 90%
Using 40% of the context window outperforms using 90%. Dex Horthy
~150 max
Frontier LLMs follow roughly 150 to 200 instructions consistently. Then attention falls off. ETH Zurich
03 · The ten laws

Ten laws of context engineering.

Distilled from every source. Each one is a tested response to the core paradox. The skill itself condenses these into eight working rules. The synthesis is the fuller source.

01
Less Is More

LLM-generated config files reduce success by 3% and increase costs 20% or more. Human-written files only marginally help. Only include what agents cannot discover by reading your code.

The ETH Zurich Law
02
Commands Beat Prose

One executable command with full flags outperforms three paragraphs of description. One code snippet showing a pattern outperforms lengthy explanations.

Cross-source consensus
03
Context Is a Finite Resource

Every token in your config competes with the actual task. Frontier LLMs follow ~150 to 200 instructions consistently. Claude Code's system prompt already uses ~50. Target under 200 lines per root file.

The Attention Budget Law
04
Progressive Disclosure

Root file orients. Subdirectory files scope details. Skills load on demand. MCP retrieves dynamically. Never dump everything into one file.

The Architecture Law
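The architecture this law describes can be sketched as a repository layout. File names beyond AGENTS.md and SKILL.md are illustrative, not prescribed:

```text
repo/
├── AGENTS.md               # root file: orientation, commands, boundaries (<200 lines)
├── backend/
│   └── AGENTS.md           # subdirectory file: backend-scoped details only
├── .claude/
│   └── skills/
│       └── deploy/
│           └── SKILL.md    # skill: loaded on demand, when the task needs it
└── (MCP servers)           # not files at all: documentation retrieved dynamically
```

Each layer loads only when relevant, so the root file stays small enough to fit the attention budget.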
05
Stable Prefixes, Dynamic Suffixes

Keep system instructions and tool definitions static. Append dynamic content at the end. Cached tokens cost 10x less. A single changed token at the start invalidates the entire cache.

The KV-Cache Law · Manus
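A minimal sketch of the rule, assuming a naive string-concatenation prompt assembler. All names here are illustrative, not any provider's API:

```python
SYSTEM_PROMPT = "You are a coding agent. Obey the boundaries below."
TOOL_DEFS = "tools: read_file(path), run_tests()"

# Static prefix: byte-identical on every request, so a provider's KV cache
# can reuse it. Even a timestamp or request ID here would invalidate it.
STABLE_PREFIX = SYSTEM_PROMPT + "\n" + TOOL_DEFS + "\n"

def build_prompt(retrieved: str, task: str) -> str:
    # Dynamic content (retrieved files, history, the task) is appended last.
    return STABLE_PREFIX + retrieved + "\n" + task

a = build_prompt("file: utils.py ...", "fix the off-by-one")
b = build_prompt("file: main.py ...", "add a regression test")
# Both prompts share the same cacheable prefix.
assert a.startswith(STABLE_PREFIX) and b.startswith(STABLE_PREFIX)
```

The design choice is ordering, not content: the same tokens in a different order forfeit the 10x cache discount.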
06
The Three-Tier Boundary Framework

Every configuration must include three buckets: Always do, Ask first, and Never do. The single most common helpful constraint: "Never commit secrets."

Cross-source consensus
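A sketch of the three buckets as they might appear in a root config file. The specific rules are examples, not prescriptions:

```markdown
## Boundaries

### Always
- Run the test suite before declaring a task done
- Use the project formatter (`make fmt`)

### Ask first
- Adding a new dependency
- Modifying CI configuration

### Never
- Commit secrets, keys, or `.env` files
- Force-push to `main`
```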
07
Deterministic Tools for Deterministic Tasks

Don't use LLMs for linting, formatting, or style enforcement. Use hooks and scripts. Reserve agent config for things only an LLM can do.

The Hook Law
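In Claude Code, one way to apply this is a PostToolUse hook in `.claude/settings.json`, so formatting runs deterministically after every edit instead of burning instructions on it. The shape below follows Claude Code's hooks schema; the `ruff format` command is an illustrative choice:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "ruff format --quiet ." }
        ]
      }
    ]
  }
}
```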
08
File System as Extended Memory

Write state to files, read selectively. The todo.md pattern pushes objectives into recent attention. Compression should be reversible: keep URLs and paths even when dropping content.

The Manus Law
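The todo.md pattern can be as simple as a file the agent rewrites after each step, so the objective keeps re-entering recent attention. Contents here are illustrative:

```markdown
# Task: migrate auth module to the new session API

- [x] Inventory call sites of `legacy_session()`
- [x] Port the `login` handler
- [ ] Port the `logout` handler   <- current step
- [ ] Run integration tests (see docs/auth-migration.md)
```

Note the reversible compression: the path to the source document stays in the file even though its content does not.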
09
Isolation Over Sharing

Share memory by communicating, don't communicate by sharing memory. Sub-agents work best with focused, isolated contexts. Each one explores extensively but returns condensed summaries.

The Multi-Agent Law · Anthropic
10
Iterate on Observed Failures

Start minimal. Observe agent mistakes. Add instructions that prevent those specific mistakes. Prune regularly. Treat config like code.

The Living Document Law
04 · The four operations

Four universal operations.

Independently identified by three different teams. If you do anything to context, you are doing one of these four things.

Operation 01
Write

Persist information beyond a single turn.

Scratchpads, memories, files, todo.md

Operation 02
Select

Retrieve only the relevant information.

RAG, tool selection, memory retrieval, glob/grep

Operation 03
Compress

Reduce tokens while retaining meaning.

Summarization, observation masking, tool result clearing

Operation 04
Isolate

Partition context across boundaries.

Sub-agents, fresh context windows, scoped rules

Independently identified by Anthropic, LangChain, and Google ADK.

05 · The context stack

Seven components inside the window.

Philipp Schmid's framework. Every token your model sees belongs to exactly one of these seven slots. If you cannot name the slot, you cannot engineer it.

01
System Prompt · Instructions

Agent identity, behavior rules, boundaries. The persistent persona.

02
User Prompt

The current task or question. The thing the user actually wants done.

03
State · History

Conversation turns, tool outputs, intermediate results from earlier in the session.

04
Long-Term Memory

Cross-session facts, preferences, domain knowledge that survives a context reset.

05
Retrieved Information

RAG results, file contents, documentation pulled in for the specific task.

06
Available Tools

Tool descriptions and parameters. The actions the agent can take from here.

07
Structured Output

The expected response format or schema. Constraints on what comes back.

Framework by Philipp Schmid · Google DeepMind

06 · The file ecosystem

Seven config files. One universal baseline.

The current state of agent config files as of April 2026. Different tools, different formats, mostly the same purpose.

File · Tool support · Best for
AGENTS.md · 25+ tools, Linux Foundation · Universal baseline. The widest compatibility surface available.
CLAUDE.md · Claude Code, GitHub Copilot · Claude-specific features: @imports, skills, hooks integration.
.cursor/rules/*.mdc · Cursor · Glob-scoped rule activation per file pattern.
.github/copilot-instructions.md · GitHub Copilot · Copilot-specific instructions inside the repo.
GEMINI.md · Gemini CLI · Gemini-specific behavior and guardrails.
.windsurf/rules/*.md · Windsurf · Windsurf-specific rule files with cascade behavior.
CONVENTIONS.md · Aider · Aider's conventions file for repo-wide rules.

Pragmatic recommendation. Start with AGENTS.md (widest support) plus CLAUDE.md (deepest features). Keep overlap minimal. Generate other files only for tools you actually use.
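A minimal starting point consistent with that recommendation and the laws above. Every line is a placeholder to adapt, not a prescription:

```markdown
# AGENTS.md

## Commands
- Test: `npm test -- --watch=false`
- Lint: `npm run lint`

## Boundaries
- Always: run lint before committing
- Ask first: schema migrations
- Never: commit secrets

## Pointers
- Backend details: `backend/AGENTS.md`
```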

07 · The evidence

Numbers worth remembering.

A handful of statistics from the corpus that change how you think about context engineering once you have read them.

3%

Auto-generated config files reduce agent success by 3% and increase costs by 20% or more vs no config at all. Human-written files only marginally help.

ETH Zurich · agent evaluation study
19% slower

Experienced developers are 19% slower with AI tools when context is poorly managed, despite estimating afterward that they were 20% faster. A 39-percentage-point perception gap.

METR · controlled developer study
10x

Cached tokens cost 10x less than uncached tokens. A single changed token at the start of your prompt invalidates the entire KV cache. Stable prefixes pay for themselves immediately.

Manus · production agent at scale
~150 instructions

Frontier LLMs reliably follow 150 to 200 instructions before attention falls off. Claude Code's own system prompt already burns about 50 of those. You have less budget than you think.

Cross-source · observed in evals
L1–L2

The maturity level of most repositories on the L0 to L6 framework. The goal of nv:context is to move teams to L5 or L6: path-scoped rules, active maintenance, skills, MCP, and dynamic capability loading.

L0–L6 maturity framework
50+

Tool calls per task at Manus, across four framework rebuilds. Their five production patterns (KV-cache, tool masking, file system memory, error preservation, diversity injection) are battle-tested at scale.

Manus · production engineering blog