Introduction
Context engineering is the new frontier in building effective AI agents. Even as models grow more powerful, how an agentic system manages its context determines whether it succeeds or fails. The Manus team learned this lesson across four complete rewrites of their agent framework, distilling principles that transformed their approach to agentic AI.
The Strategic Choice: Context vs Fine-Tuning
When the Manus team started their project, they faced a crucial decision: train an end-to-end model or build an agent based on in-context learning. They chose context engineering for one fundamental reason: iteration speed.
In the early days of NLP, fine-tuning required weeks per iteration. Today, context engineering enables shipping improvements in hours rather than weeks, and it keeps the product orthogonal to the underlying models: a boat floating on the rising tide of model progress, rather than a pillar fixed to the seabed.
KV-Cache: The Most Important Metric
The KV-cache hit rate is the most critical metric for a production AI agent. It directly impacts both latency and costs, with dramatic effects: with Claude Sonnet, cached tokens cost 0.30 USD/MTok versus 3 USD/MTok for non-cached ones.
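To make the cost impact concrete, here is a minimal sketch of the blended input cost as a function of cache hit rate, using the cached and uncached prices quoted above (the function name is ours, for illustration):

```python
# Prices quoted above, in USD per million input tokens.
CACHED = 0.30
UNCACHED = 3.00

def blended_cost_per_mtok(hit_rate: float) -> float:
    """Average input cost when `hit_rate` of tokens are served from cache."""
    return hit_rate * CACHED + (1 - hit_rate) * UNCACHED

# A 90% hit rate cuts the effective input cost from 3.00 to 0.57 USD/MTok.
print(blended_cost_per_mtok(0.9))
```

Since agent contexts are dominated by a long shared prefix of prior turns, even modest improvements in hit rate compound across the dozens of calls in a single task.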
Strategies to Optimize KV-Cache
- Keep the prompt prefix stable: A single changed token invalidates the cache from that point onward
- Make context append-only: Avoid modifications to previous actions or observations
- Deterministic serialization: Ensure stable ordering of JSON keys
- Explicit breakpoints: Strategically mark cache points when necessary
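The deterministic-serialization point is easy to get wrong in practice: most JSON libraries preserve insertion order, so two logically identical observations can serialize differently and silently break the prefix. A minimal sketch (the helper name is ours):

```python
import json

def serialize_observation(obs: dict) -> str:
    """Serialize with sorted keys and fixed separators so the same
    observation always produces byte-identical text (cache-friendly)."""
    return json.dumps(obs, sort_keys=True, separators=(", ", ": "), ensure_ascii=False)

# Two dicts built in different insertion orders serialize identically,
# so the token prefix stays stable across runs.
a = serialize_observation({"tool": "browser", "status": "ok"})
b = serialize_observation({"status": "ok", "tool": "browser"})
assert a == b
```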
Mask Tools, Don't Remove Them
As the agent's capabilities expand, the action space grows complex. The temptation to load and unload tools dynamically is strong, but it carries significant risks for the KV-cache and for model consistency.
Instead of dynamically removing tools, Manus uses a state machine that masks logits during decoding. This approach maintains context stability while controlling action selection through three modes:
- Auto: The model chooses whether to call a function
- Required: The model must call a function
- Specified: The model must choose from a specific subset
File System as Extended Context
Even with 128K token context windows, real-world agents often hit limits. Observations can be huge, performance degrades with long contexts, and costs increase proportionally.
Manus treats the file system as the definitive context: unlimited, persistent, and directly operable. The model learns to use files not just as storage, but as structured, externalized memory. Compression strategies are always recoverable, maintaining references that allow information retrieval when needed.
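The recoverable-compression idea can be sketched as follows: a large observation is written to disk and replaced in context by a short reference plus a preview, which the agent can expand again on demand. The directory and function names here are ours, for illustration:

```python
from pathlib import Path

WORKSPACE = Path("workspace")  # hypothetical agent working directory

def externalize(name: str, content: str, preview_chars: int = 200) -> str:
    """Write a large observation to disk and return a short, restorable
    reference to keep in context instead of the full text."""
    WORKSPACE.mkdir(exist_ok=True)
    path = WORKSPACE / name
    path.write_text(content, encoding="utf-8")
    return f"[saved to {path} | {len(content)} chars] {content[:preview_chars]}..."

def restore(reference_path: str) -> str:
    """Re-read the full content when the agent needs it again."""
    return Path(reference_path).read_text(encoding="utf-8")

# A long page body is replaced in context by a compact reference:
ref = externalize("page_1.html", "<html>" + "x" * 5000 + "</html>")
print(ref[:80])
```

The key property is that compression loses nothing permanently: as long as the path survives in context, the full observation remains one tool call away.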
Manipulating Attention Through Rehearsal
A distinctive behavior of Manus is that it constantly creates and updates todo.md files during complex tasks. This isn't cosmetic: it's a deliberate mechanism for steering the model's attention.
With an average of 50 tool calls per task, Manus risks losing focus on its objectives. By constantly rewriting the to-do list, it pushes the global plan into the recent attention span, avoiding "lost in the middle" problems and maintaining goal alignment.
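The rehearsal mechanic reduces to re-rendering the plan at the tail of the context on every step. A minimal sketch (function and variable names are ours):

```python
def rehearse_plan(context: list[str], todo_items: list[tuple[str, bool]]) -> None:
    """Append a freshly rendered todo.md to the end of the context so the
    global plan sits inside the model's most recent attention span."""
    lines = ["# todo.md"]
    for item, done in todo_items:
        lines.append(f"- [{'x' if done else ' '}] {item}")
    context.append("\n".join(lines))

ctx: list[str] = []
rehearse_plan(ctx, [("collect sources", True), ("draft report", False)])
print(ctx[-1])
```

Because the list is appended rather than edited in place, the scheme stays compatible with the append-only, cache-friendly context discipline described earlier.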
Keep Errors in Context
Against the common instinct to hide errors, Manus maintains wrong paths in context. When the model sees a failed action and the resulting error observation, it implicitly updates its internal beliefs, reducing the likelihood of repeating the same mistake.
Error recovery is one of the clearest indicators of true agentic behavior, even though it remains underrepresented in academic benchmarks that focus on success under ideal conditions.
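In code, keeping errors in context simply means appending the failure observation instead of swallowing the exception and retrying silently. A hedged sketch (the step runner and error format are ours):

```python
def run_step(context: list[str], action: str, execute) -> None:
    """Execute a tool call and append the observation either way;
    failures stay in context instead of being silently discarded."""
    context.append(f"action: {action}")
    try:
        result = execute(action)
        context.append(f"observation: {result}")
    except Exception as exc:
        # Seeing the error text shifts the model's beliefs away from
        # repeating the same failing action on the next step.
        context.append(f"observation: ERROR: {exc}")

def failing_tool(action: str) -> str:
    raise FileNotFoundError("missing.txt")

ctx: list[str] = []
run_step(ctx, "cat missing.txt", failing_tool)
print(ctx[-1])  # the error is now part of the context, not hidden
```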
Avoiding the Few-Shot Trap
Language models are excellent imitators and tend to follow patterns in context. In repetitive tasks, this can lead to drift and over-generalizations. The solution is to introduce structured diversity: variations in serialization patterns, alternative formulations, and small changes in order or formatting.
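One cheap way to introduce that structured diversity is to rotate among a few equivalent observation templates, so the surface form varies while the content stays identical. The templates and names below are our own illustration, not Manus's format:

```python
import random

# Equivalent renderings of the same observation, rotated to avoid
# presenting the model with one rigid pattern to imitate.
TEMPLATES = [
    "Observation: {tool} returned {result}",
    "{tool} -> {result}",
    "Result of {tool}: {result}",
]

def render_observation(tool: str, result: str, rng: random.Random) -> str:
    """Vary the surface form across steps so repetitive tasks don't drift
    into blind pattern imitation."""
    return rng.choice(TEMPLATES).format(tool=tool, result=result)

rng = random.Random(0)
for i in range(3):
    print(render_observation("fetch_page", f"doc_{i}", rng))
```

Note the tension with the KV-cache advice: this variation belongs in the per-step observations, not in the shared prompt prefix, which must stay byte-stable.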
Conclusion
Context engineering is an emerging but already essential science for agentic systems. How you model context defines agent behavior: speed, recovery capability, and scalability. These lessons, learned through millions of real interactions, offer practical guidance for those developing AI agents in the real world.
FAQ
What is context engineering for AI agents?
Context engineering is the discipline that deals with designing and optimizing how AI agents manage information during task execution, directly influencing performance and costs.
Why is KV-cache so important for AI agents?
KV-cache dramatically reduces latency and costs: with Claude Sonnet, cached tokens cost 0.30 USD/MTok versus 3 USD/MTok for non-cached ones, a 10x difference.
How do you optimize KV-cache hit rate?
By keeping the prompt prefix stable, making context append-only, ensuring deterministic serialization, and marking explicit breakpoints when necessary.
Why not dynamically remove tools from AI agents?
Dynamically removing tools invalidates KV-cache and confuses the model when previous actions reference tools no longer defined in the current context.
How do you use the file system as extended context?
By treating the file system as structured, externalized memory where the agent can write and read information on demand, overcoming traditional context window limitations.
Why keep errors in AI agent context?
Errors provide valuable evidence that allows the model to update its internal beliefs and reduce the likelihood of repeating the same mistakes in the future.