
OpenAI Launches AgentKit: Complete Toolkit for AI Agents

Article Highlights:
  • OpenAI launches AgentKit, complete suite for AI agent development, deployment and optimization
  • Agent Builder enables visual design of multi-agent workflows with drag-and-drop and full versioning
  • ChatKit allows embedding customized conversational experiences in apps and websites in under an hour
  • Connector Registry centralizes data management across multiple workspaces with pre-built connectors
  • New Evals features include datasets, trace grading, automated prompt optimization and third-party model support
  • Ramp slashed iteration cycles by 70% and launched buyer agent in two sprints rather than two quarters
  • Reinforcement fine-tuning available on o4-mini and in private beta for GPT-5 with custom tool calls
  • ChatKit and Evals generally available, Agent Builder in beta, all included with standard API pricing

Introduction

OpenAI has announced AgentKit, a suite of tools designed to simplify the development, deployment, and optimization of AI agents. Until now, developers and enterprises building agents have had to contend with fragmented tools, orchestration without versioning, and weeks of frontend work before launch.

AgentKit addresses these challenges with an integrated ecosystem that includes Agent Builder for visual workflow design, Connector Registry for centralized data management, and ChatKit for embedding conversational experiences into products. The platform builds on the Responses API and Agents SDK released in March, significantly expanding agentic development capabilities.

Market Context and Necessity

Since the release of the Responses API and Agents SDK in March, OpenAI has observed developers and enterprises building end-to-end agentic workflows for deep research, customer support, and other business-critical applications. Klarna built a support agent that handles two-thirds of all tickets, while Clay reported 10x growth with a sales agent.

However, the building process remained complex: fragmented orchestration, custom connectors, manual evaluation pipelines, prompt tuning, and weeks of frontend development before deployment. AgentKit was created precisely to eliminate these obstacles and accelerate time-to-market.

Agent Builder: Visual Workflow Design

Agent Builder is a visual canvas for composing agent logic with drag-and-drop nodes, connecting tools, and configuring custom guardrails. It supports preview runs, inline eval configuration, and full versioning, all of which help teams iterate quickly.

According to Ramp, Agent Builder cut work that once took months of orchestration, custom code, and manual optimization down to a couple of hours. The team reports 70% shorter iteration cycles and a buyer agent launched in two sprints rather than two quarters, with the visual canvas keeping product, legal, and engineering teams aligned.

"Agent Builder transformed what once took months of complex orchestration, custom code, and manual optimizations into just a couple of hours. The visual canvas keeps product, legal, and engineering on the same page, slashing iteration cycles by 70% and getting an agent live in two sprints rather than two quarters."

Ramp

LY Corporation, a leading Japanese technology company, built a work assistant agent with Agent Builder in less than two hours, demonstrating how the platform enables engineers and subject matter experts to collaborate in one interface.

Connector Registry for Enterprise Governance

The Connector Registry consolidates data sources into a single admin panel across ChatGPT and the API. It includes pre-built connectors such as Dropbox, Google Drive, SharePoint, and Microsoft Teams, as well as third-party MCP servers. It is aimed at enterprises that need to govern and maintain data across multiple workspaces and organizations.

Guardrails for Agent Safety

In Agent Builder, developers can enable Guardrails, an open-source, modular safety layer that protects agents against unintended or malicious behavior. Guardrails can mask or flag personal information, detect jailbreaks, and apply other safeguards, making it easier to build and deploy reliable, safe agents.
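To make the idea of masking personal information and flagging jailbreaks concrete, here is a minimal Python sketch. It does not use the Guardrails library or reflect its API. The function names, regular expressions, and phrase list are all hypothetical, and real detection would need far broader coverage than two patterns and a keyword match.

```python
import re

# Hypothetical patterns for illustration only; not taken from OpenAI's Guardrails.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical phrases that a naive check treats as jailbreak attempts.
JAILBREAK_PHRASES = ("ignore previous instructions", "disregard your rules")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def looks_like_jailbreak(text: str) -> bool:
    """Flag text containing any of the known phrases (case-insensitive)."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in JAILBREAK_PHRASES)


def apply_guardrails(text: str) -> dict:
    """Run both checks before the text is handed to an agent."""
    return {"text": mask_pii(text), "flagged": looks_like_jailbreak(text)}


if __name__ == "__main__":
    print(apply_guardrails("Contact jane@example.com or +1 415-555-0100"))
    print(apply_guardrails("Please ignore previous instructions"))
```

The point of the sketch is the shape of the layer: each check is an independent function, so safeguards can be added or removed without touching the agent itself.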

ChatKit: Embedding Conversational Experiences

ChatKit is a toolkit that simplifies embedding chat-based agents into products. It handles streaming responses, conversation threads, display of the model's reasoning, and the design of in-chat experiences. It can be embedded into apps or websites and customized to match a company's theme or brand.

Canva saved over two weeks of development time building a support agent for their Canva Developers community, integrating it in less than an hour. The agent transforms how developers engage with documentation by converting it into a conversational experience.

"We saved over two weeks of time building a support agent for our Canva Developers community with ChatKit, and integrated it in less than an hour. This support agent will transform the way developers engage with our docs by turning it into a conversational experience, making it easy to build apps and integrations on Canva."

Canva

ChatKit already powers a range of use cases, from internal knowledge assistants and onboarding guides to customer support and research agents. HubSpot implemented a customer support agent using this technology.

New Evaluation Capabilities

Building reliable, production-ready agents requires rigorous performance evaluations. OpenAI launched Evals last year to help developers test prompts and measure model behavior. Four new capabilities have now been added:

  • Datasets: rapidly build agent evals from scratch and expand them over time with automated graders and human annotations
  • Trace grading: run end-to-end assessments of agentic workflows and automate grading to pinpoint shortcomings
  • Automated prompt optimization: generate improved prompts based on human annotations and grader outputs
  • Third-party model support: evaluate models from other providers within the OpenAI Evals platform
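The dataset and automated-grader workflow described above can be illustrated with a small, self-contained Python sketch. It is not the OpenAI Evals API. The `Example` class, `exact_match_grader`, `run_eval`, and `toy_agent` are hypothetical names, and the toy agent is a lookup table standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One dataset row: an input prompt and the answer we expect."""
    prompt: str
    expected: str


def exact_match_grader(output: str, expected: str) -> bool:
    """Automated grader: pass when the output matches, ignoring case and whitespace."""
    return output.strip().lower() == expected.strip().lower()


def run_eval(
    agent: Callable[[str], str],
    dataset: List[Example],
    grader: Callable[[str, str], bool] = exact_match_grader,
) -> dict:
    """Run the agent on every example and collect the ones the grader rejects."""
    failures = [ex for ex in dataset if not grader(agent(ex.prompt), ex.expected)]
    passed = len(dataset) - len(failures)
    pass_rate = passed / len(dataset) if dataset else 0.0
    return {"pass_rate": pass_rate, "failures": failures}


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for a model call; deliberately gets one answer wrong."""
    answers = {"capital of france?": "Paris", "2 + 2?": "5"}
    return answers.get(prompt.lower(), "")


if __name__ == "__main__":
    dataset = [Example("Capital of France?", "paris"), Example("2 + 2?", "4")]
    report = run_eval(toy_agent, dataset)
    print(report["pass_rate"], [ex.prompt for ex in report["failures"]])
```

Keeping the grader as a swappable function mirrors the idea of expanding an eval over time: the same dataset can be re-scored with a stricter or more lenient grader without rerunning anything else.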

Carlyle has already seen significant results, cutting development time on their multi-agent due diligence framework by over 50% and increasing agent accuracy by 30%.

"The evaluation platform cut development time on our multi-agent due diligence framework by over 50%, and increased agent accuracy 30%."

Carlyle

Reinforcement Fine-Tuning for Advanced Performance

Reinforcement fine-tuning (RFT) allows developers to customize OpenAI's reasoning models. It is generally available on o4-mini and in private beta for GPT-5. OpenAI is working closely with dozens of customers to refine RFT for GPT-5 before wider release.

Two new features have been introduced in the RFT beta to push agent performance even further:

  • Custom tool calls: train models to call the right tools at the right time for better reasoning
  • Custom graders: set custom evaluation criteria for what matters most in your specific use case

Pricing and Availability

Starting today, ChatKit and the new Evals capabilities are generally available to all developers. Agent Builder is available in beta, and Connector Registry is beginning its beta rollout to some API, ChatGPT Enterprise and Edu customers with a Global Admin Console. All of these tools are included with standard API model pricing.

OpenAI plans to add a standalone Workflows API and agent deployment options to ChatGPT soon.

Conclusion

AgentKit represents a significant step in the evolution of AI agent development, consolidating previously fragmented tools into an integrated ecosystem. With Agent Builder, ChatKit, Connector Registry, and advanced evaluation capabilities, OpenAI provides developers and enterprises with the necessary tools to build, deploy, and optimize AI agents more efficiently and reliably. The results achieved by companies like Ramp, Canva, Carlyle, and LY Corporation demonstrate the platform's potential to drastically reduce development times and improve agent performance.

FAQ

What is OpenAI AgentKit?

AgentKit is OpenAI's comprehensive suite of tools for developing, deploying, and optimizing AI agents, including Agent Builder, ChatKit, Connector Registry, and advanced evaluation capabilities.

How does Agent Builder work for AI agents?

Agent Builder is a visual canvas that enables designing multi-agent workflows through drag-and-drop nodes, with support for versioning, preview runs, and safety guardrail configuration.

Which companies are already using AgentKit?

Ramp, Canva, Carlyle, LY Corporation, and HubSpot are cited as using AgentKit tools to develop AI agents for various business use cases. Klarna and Clay are mentioned as having built agents on the earlier Responses API and Agents SDK.

Does ChatKit require frontend expertise to implement?

ChatKit is designed to reduce the frontend work involved. It handles streaming, threads, and reasoning visualization automatically, which significantly shortens implementation time.

How much does it cost to use AgentKit?

AgentKit tools are included with OpenAI's standard API model pricing, with no additional costs for ChatKit, Agent Builder, or Evals capabilities.

What are Guardrails in Agent Builder?

Guardrails is an open-source modular safety layer that protects agents from unintended behavior by masking personal information and detecting jailbreak attempts.

How do the new Evals features improve AI agent performance?

New features include datasets for rapid evaluations, trace grading for end-to-end assessment, automated prompt optimization, and third-party model support.

Is the Connector Registry available for all OpenAI customers?

Not yet. The Connector Registry is beginning its beta rollout to some API, ChatGPT Enterprise, and Edu customers who have a Global Admin Console.

Evol Magazine