News

OpenAI wins gold medal at Informatics Olympiad: how the system reached the podium

Article Highlights:
  • OpenAI wins gold at the Informatics Olympiad: first among AI entries, roughly sixth among human contestants
  • The system used an ensemble of general‑purpose models
  • Candidate programs were compiled and executed for verification
  • No internet access or RAG was used during the contest
  • Key constraints: 5 hours per contest day and a 50‑submission limit per problem
  • Big jump from 2024 driven by test‑time search and self‑verification
  • Relevant implications for code generation and developer tooling
  • A significant milestone that does not imply the replacement of developers

Introduction

OpenAI’s reasoning system won gold in the IOI 2025 AI track, ranking first among AI entries and roughly sixth among 330 human contestants. The result highlights significant progress in general‑purpose models solving algorithmic tasks under contest constraints.

Context

The International Olympiad in Informatics (IOI) is the world finals of high‑school competitive programming: two contest days, three hard algorithmic problems per day, 5 hours per day, no internet access, and automatic grading on hidden tests. IOI 2025 drew 330 contestants from 84 countries; the maximum total score was 600 and the gold‑medal cutoff was 438.30.

Method used by OpenAI

The team used an ensemble of general‑purpose reasoning models rather than a custom IOI model. The system generated candidate programs, compiled and ran them in a restricted terminal, and submitted the best attempts under the 50‑submission cap per problem and the 5‑hour daily limit. There was no internet access or RAG; the approach is essentially test‑time search plus self‑verification with real compilation and judging, sketched in code after the list below.

Operational notes

  • Ensemble of general reasoning models
  • Generation of candidate programs, verified by compilation and execution
  • Selection constrained by 50 submissions per problem
  • Execution under IOI-style contest environment with no external retrieval
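
For illustration, a minimal Python sketch of such a generate-compile-verify-submit loop follows. OpenAI has not published its implementation: generate_candidates and submit_to_judge are hypothetical stand‑ins, and the C++ toolchain, sample‑test checking, and scoring shown here are assumptions, not the actual system.

    import subprocess
    import tempfile
    from pathlib import Path

    MAX_SUBMISSIONS = 50  # IOI cap per problem

    def compile_candidate(source: str) -> Path | None:
        """Compile a C++ candidate; return the binary path, or None on failure."""
        work = Path(tempfile.mkdtemp())
        src = work / "candidate.cpp"
        src.write_text(source)
        binary = work / "candidate"
        result = subprocess.run(
            ["g++", "-O2", "-o", str(binary), str(src)],
            capture_output=True,
        )
        return binary if result.returncode == 0 else None

    def passes_samples(binary: Path, samples: list[tuple[str, str]]) -> bool:
        """Self-verification step: run the binary on the public sample tests."""
        for stdin_text, expected in samples:
            try:
                run = subprocess.run(
                    [str(binary)], input=stdin_text,
                    capture_output=True, text=True, timeout=2,
                )
            except subprocess.TimeoutExpired:
                return False
            if run.returncode != 0 or run.stdout.strip() != expected.strip():
                return False
        return True

    def solve(problem, samples):
        """Test-time search: generate candidates, verify locally, submit survivors."""
        submissions, best_score = 0, 0
        for source in generate_candidates(problem):  # hypothetical model call
            if submissions >= MAX_SUBMISSIONS:
                break
            binary = compile_candidate(source)
            if binary is None or not passes_samples(binary, samples):
                continue  # failed self-verification; do not spend a submission
            best_score = max(best_score, submit_to_judge(source))  # hypothetical
            submissions += 1
        return best_score

The real selection logic is presumably far more sophisticated, but the core discipline is the one the rules impose: verify a candidate locally before spending one of the 50 scarce submissions.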

The jump from 2024

In 2024, OpenAI entered with a dedicated system ("o1-ioi") that scored 213 in the live contest (around the 49th percentile). Later tests on the 2024 problems with a general model ("o3") achieved 395.64 under the 50‑submission rule, clearing that year's gold cutoff. For 2025, the ensemble reported performance equivalent to top human ranks and placed first among AI entries.

Why this matters

Better search strategies, tighter feedback loops, and improved candidate selection at test time let general models tackle unseen algorithmic problems within strict contest rules. The same pathway informs engineering tools that must produce reliable code under operational constraints.

Limitations and caveats

The IOI setting is controlled and measures specific contest skills; results do not imply general replacement of human developers. The approach relies on test‑time search and verification and does not by itself produce formal proofs of correctness beyond passing hidden tests.


FAQ

  • What result did OpenAI achieve at IOI 2025?
    OpenAI placed first among AI entries and roughly sixth among 330 human contestants, a gold‑level performance under IOI rules.
  • Did OpenAI use a contest‑specific model for IOI?
    No, the 2025 effort used an ensemble of general‑purpose reasoning models rather than a bespoke IOI model.
  • Was internet or RAG allowed during the contest?
    No, the AI track ran under the same constraints as human contestants: no internet and no external retrieval during the contest.
  • What changed from 2024 to 2025 technically?
    The shift was from more handcrafted pipelines to general reasoning models combined with stronger test‑time search and self‑verification, improving robustness and score.
  • How is this relevant for developer tooling?
    Improvements in search and verification at test time suggest pathways to produce more reliable code generation and automated engineering assistance under operational constraints.
  • Which IOI rules most affect system evaluation?
    The main constraints are 5 hours per contest day, a 50‑submission limit per problem, and evaluation on hidden tests without external aids.