News

Code Execution with MCP: Building More Efficient AI Agents

Article Highlights:
  • Code execution with MCP reduces token consumption by up to 98.7% compared to direct tool calls
  • Agents load only necessary tools on-demand, avoiding upfront definition overload
  • Data filtering and transformation happen in the execution environment, not in model context
  • State persistence enables agents to resume interrupted work and develop reusable skills
  • Privacy-preserving approach automatically tokenizes PII data flowing between tools
  • MCP is the industry's de facto standard for connecting intelligent agents to external systems
  • Loops, conditionals, and control flow in code are more efficient than chaining tool calls

Introduction

Code execution with MCP (Model Context Protocol) represents a critical breakthrough for modern AI agents. Instead of loading all tool definitions into context and passing every intermediate result through the model, agents write code to interact with MCP servers. This approach reduces token consumption by up to 98.7%, slashing costs and latency.

Since Anthropic launched MCP in November 2024, adoption has grown rapidly: thousands of community MCP servers, SDKs for all major languages, and recognition as the industry's de facto standard for connecting agents to tools and data.

What Is the Model Context Protocol?

The Model Context Protocol is an open universal standard connecting AI agents to external systems. Traditionally, each agent-to-tool integration required custom development, creating fragmentation and duplicated effort. MCP eliminates this by offering a standardized interface: developers implement MCP once in their agent and gain access to an entire ecosystem of ready-made integrations.

The Challenge: Inefficiency in Tool Loading

As MCP usage scales, agents must manage hundreds or thousands of tools distributed across dozens of servers. Two critical challenges emerge:

  • Tool definition overload: Most MCP clients load all definitions upfront in context. For agents with thousands of tools, this means processing hundreds of thousands of tokens before reading a single request.
  • Intermediate result token consumption: Every tool result must pass through the model. In real scenarios (e.g., download transcript from Google Drive and attach to Salesforce record), large documents transit the context twice.

Concrete example: a two-hour meeting transcript can generate an extra 50,000 tokens. Additionally, with complex data, models are more prone to errors when copying information between tool calls.

The Solution: Code Execution with MCP

Code execution transforms MCP servers into coded APIs rather than direct tool calls. The agent writes code to interact with servers, loading only necessary tools and processing data in the execution environment before returning results.

One practical approach: generate a file tree of all available tools from connected MCP servers. Each tool corresponds to a file, for example:

servers/
├── google-drive/
│   ├── getDocument.ts
│   └── ...
├── salesforce/
│   ├── updateRecord.ts
│   └── ...
└── ...

The agent discovers tools by exploring the filesystem: listing the ./servers/ directory to find available servers, then reading specific tool files to understand their interfaces. This reduces token usage from 150,000 to 2,000, a 98.7% reduction that cuts both cost and latency.
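As a minimal sketch of this discovery flow, consider the TypeScript below. The mock directory tree and the helper names (listServers, readToolDefinition) are illustrative assumptions, not part of the MCP specification; a real agent would explore a tree generated from its connected servers.

```typescript
// A mock ./servers/ tree is built in a temp directory so the sketch is
// self-contained; in a real agent this tree is generated from connected
// MCP servers.
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

const root = fs.mkdtempSync(path.join(os.tmpdir(), "servers-"));
fs.mkdirSync(path.join(root, "google-drive"));
fs.writeFileSync(
  path.join(root, "google-drive", "getDocument.ts"),
  "// getDocument(documentId: string): Promise<Document>"
);

// Step 1: list available servers (one directory per connected MCP server).
function listServers(dir: string): string[] {
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name);
}

// Step 2: read only the tool definition the current task actually needs.
function readToolDefinition(dir: string, server: string, tool: string): string {
  return fs.readFileSync(path.join(dir, server, `${tool}.ts`), "utf8");
}

const servers = listServers(root);
const def = readToolDefinition(root, "google-drive", "getDocument");
console.log(servers, def);
```

Only the definitions the agent actually reads enter its context; the rest of the tree costs nothing.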

"LLMs are adept at writing code and developers should take advantage of this strength to build agents that interact with MCP servers more efficiently."

Anthropic, Engineering Blog

Key Benefits of Code Execution with MCP

1. Progressive Tool Disclosure

Models excel at navigating filesystems. Presenting tools as code on a filesystem allows models to read definitions on-demand rather than upfront. Alternatively, a search_tools function finds relevant definitions and loads only what's needed. A detail-level parameter lets the agent select the information granularity (name only, name and description, or full definition with schemas), preserving context.
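A search_tools helper with a detail-level parameter could be sketched as follows; the registry contents, type names, and granularity labels are assumptions for illustration, not an MCP API.

```typescript
// Hypothetical search_tools helper: returns matches at the requested
// granularity, so full schemas enter the model's context only on demand.
type Detail = "name" | "description" | "full";

interface Tool {
  name: string;
  description: string;
  schema: string; // JSON schema, kept as a string for the sketch
}

const registry: Tool[] = [
  {
    name: "gdrive.getDocument",
    description: "Fetch a Google Drive document",
    schema: '{"documentId": "string"}',
  },
  {
    name: "salesforce.updateRecord",
    description: "Update a Salesforce record",
    schema: '{"recordId": "string", "fields": "object"}',
  },
];

function searchTools(query: string, detail: Detail): string[] {
  const q = query.toLowerCase();
  return registry
    .filter(
      (t) =>
        t.name.toLowerCase().includes(q) ||
        t.description.toLowerCase().includes(q)
    )
    .map((t) => {
      if (detail === "name") return t.name;
      if (detail === "description") return `${t.name}: ${t.description}`;
      return `${t.name}: ${t.description}\n${t.schema}`;
    });
}

const hits = searchTools("salesforce", "description");
console.log(hits);
```

Requesting only names or descriptions keeps schemas, often the bulkiest part of a definition, out of context until they are needed.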

2. Context-Efficient Tool Results

With large datasets, agents filter and transform results in code before returning them to the model. Example: fetching a 10,000-row spreadsheet. Without code execution: all 10,000 rows flow into context for manual filtering. With code execution: the agent filters in the execution environment, returning only relevant rows (e.g., pending orders). The agent sees 5 rows, not 10,000.
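The filtering step can be sketched like this, with getSheet standing in for the actual MCP spreadsheet tool call; the field names are assumptions for illustration.

```typescript
// getSheet simulates an MCP tool returning a 10,000-row spreadsheet
// inside the execution environment.
interface Order {
  id: number;
  status: string;
}

function getSheet(): Order[] {
  return Array.from({ length: 10000 }, (_, i) => ({
    id: i,
    status: i % 2000 === 0 ? "pending" : "shipped", // exactly 5 pending rows
  }));
}

// The 10,000 rows never leave the execution environment; only the
// filtered slice would be returned to the model.
const pending = getSheet().filter((order) => order.status === "pending");
console.log(pending.length); // 5
```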

3. More Powerful Control Flow

Loops, conditionals, and error handling use familiar code patterns instead of chaining individual tool calls. Example: notify Slack of deployment by periodically checking channel history. The approach is more efficient than alternating between tool calls and sleep commands in the agent loop, also reducing "time to first token" latency.
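The polling pattern can be sketched as ordinary code. Here fetchChannelHistory is a stand-in for the real Slack tool call (mocked to succeed on the third check), and a production loop would sleep between iterations rather than poll back-to-back.

```typescript
// Mock of the Slack channel-history tool: the deployment message
// appears on the third check.
function fetchChannelHistory(attempt: number): string[] {
  return attempt >= 3 ? ["deploy complete"] : [];
}

// Plain loop-and-conditional control flow, instead of alternating tool
// calls and sleep commands through the agent loop.
function waitForDeploy(maxAttempts: number): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const messages = fetchChannelHistory(attempt);
    if (messages.some((m) => m.includes("deploy complete"))) {
      return attempt; // success: report which check found it
    }
  }
  return -1; // deployment message never appeared
}

console.log(waitForDeploy(5)); // 3
```

The whole loop runs in the execution environment; the model is consulted once at the end instead of at every check.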

4. Privacy-Preserving Operations

Intermediate results stay in the execution environment by default. The agent sees only what it explicitly logs or returns. For sensitive workloads, the MCP client can automatically tokenize PII data (email, phone, names). Real data flows from Google Sheets to Salesforce without transiting the model, preventing accidental logging or sensitive data processing.
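A simplified sketch of how such tokenization might work inside the MCP client; the token format and helper names are assumptions.

```typescript
// Real PII is swapped for opaque tokens before results reach the model,
// and restored when data flows on to the next tool.
const vault = new Map<string, string>();
let counter = 0;

// Replace a sensitive value with a token the model can safely see.
function tokenize(value: string): string {
  const token = `[PII_${++counter}]`;
  vault.set(token, value);
  return token;
}

// Restore real values when data is handed to the next tool.
function detokenize(text: string): string {
  return text.replace(/\[PII_\d+\]/g, (token) => vault.get(token) ?? token);
}

const masked = `email=${tokenize("jane@example.com")}`;
console.log(masked); // email=[PII_1]
console.log(detokenize(masked)); // email=jane@example.com
```

The model only ever sees the masked form, so the real address can move from one tool to the next without risk of being logged or echoed.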

5. State Persistence and Reusable Skills

With filesystem access, agents maintain state across operations. They write intermediate results to files to resume work and track progress. Additionally, agents persist code as reusable functions: once working code is developed, they save it for future use. A SKILL.md file creates structured skills referenceable by models, enabling agents to build a toolbox of higher-level capabilities over time.
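Both ideas can be sketched in a few lines: persist progress to a file so a later run can resume, and save working code into a skills folder. The file names and layout here (progress.json, skills/) are illustrative assumptions.

```typescript
// Scratch directory standing in for the agent's working filesystem.
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

const workdir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-"));

// Persist intermediate state so an interrupted run can resume later.
fs.writeFileSync(
  path.join(workdir, "progress.json"),
  JSON.stringify({ lastProcessedRow: 500 })
);

// Persist working code as a reusable skill for future tasks.
const skillsDir = path.join(workdir, "skills");
fs.mkdirSync(skillsDir);
const skillPath = path.join(skillsDir, "filter-pending-orders.ts");
fs.writeFileSync(
  skillPath,
  "export const filterPending = (rows: { status: string }[]) =>\n" +
    "  rows.filter((r) => r.status === 'pending');"
);

// On a later run, the agent reads the state back and picks up where it left off.
const progress = JSON.parse(
  fs.readFileSync(path.join(workdir, "progress.json"), "utf8")
);
console.log(progress.lastProcessedRow); // 500
```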

Implementation Considerations and Trade-offs

Code execution introduces operational complexity. Running agent-generated code requires a secure execution environment with appropriate sandboxing, resource limits, and monitoring. These infrastructure requirements add operational overhead and security considerations that direct tool calls avoid. The benefits—reduced token consumption, lower latency, improved tool composition—must be weighed against these implementation costs.

Conclusion

MCP provides a foundational protocol for connecting agents to many tools and systems. However, when too many servers are connected, tool definitions and results consume excessive tokens, reducing agent efficiency. Code execution applies established software engineering patterns to agents, enabling them to use familiar programming constructs to interact with MCP servers far more efficiently. If you implement this approach, consider sharing your findings with the MCP community.

Frequently Asked Questions

What is the Model Context Protocol and how does it improve code execution in AI agents?

Model Context Protocol is an open standard connecting AI agents to external systems. Code execution with MCP transforms servers into coded APIs, allowing agents to load tools on-demand and process data locally, reducing token consumption by up to 98.7%.

How much token consumption is saved with code execution MCP?

In a real scenario, code execution reduces token consumption from 150,000 to 2,000 per operation, a 98.7% reduction that lowers both cost and latency.

How does progressive tool disclosure work in MCP?

Models explore the filesystem by listing available servers, then read only specific tool files they need. An optional search_tools function enables filtering by relevance and detail-level selection, preserving precious context.

What are the privacy benefits of code execution with MCP?

Code execution keeps intermediate results in the execution environment, enabling automatic PII tokenization. Sensitive data flows between tools without passing through the model, preventing accidental logging or data exposure.

How do agents maintain state with code execution MCP?

Agents write intermediate results to files to resume interrupted work and track progress. They can also persist code as reusable functions in "skills" folders, creating evolved higher-level capabilities over time.

What are the implementation costs of code execution MCP?

Agent-generated code execution requires a secure execution environment with sandboxing, resource limits, and monitoring. These infrastructure requirements add operational overhead and security considerations beyond direct tool calls.

What is the ideal use case for MCP with code execution?

Agents interacting with hundreds or thousands of tools distributed across dozens of servers. The approach is optimal when token savings and latency reduction offset operational complexity.

How does an agent filter large datasets with code execution MCP?

The agent loads the full dataset in the execution environment, filters/transforms it locally (e.g., pending orders), and returns only relevant results to the model, avoiding context inflation.
