OpenAI launches a pair of AI reasoning models, o3 and o4-mini

OpenAI introduced on Wednesday the launch of o3 and o4-mini, new AI reasoning fashions designed to pause and work by means of questions earlier than responding.

The corporate calls o3 its most superior reasoning mannequin ever, outperforming the corporate’s earlier fashions on assessments measuring math, coding, reasoning, science, and visible understanding capabilities. In the meantime, o4-mini gives what OpenAI says is a aggressive trade-off between worth, velocity, and efficiency — three components builders typically take into account when selecting an AI mannequin to energy their functions.

Not like earlier reasoning fashions, o3 and o4-mini can generate responses utilizing instruments in ChatGPT resembling internet looking, Python code execution, picture processing, and picture era. Beginning as we speak, the fashions, plus a variant of o4-mini known as “o4-mini-high” that spends extra time crafting solutions to enhance its reliability, can be found for subscribers to OpenAI’s Professional, Plus, and Group plans.

The brand new fashions are a part of OpenAI’s effort to beat out Google, Meta, xAI, Anthropic, and DeepSeek within the cutthroat international AI race. Whereas OpenAI was first to launch an AI reasoning mannequin, o1, opponents rapidly adopted with variations of their very own that match or exceed the efficiency of OpenAI’s lineup. In truth, reasoning fashions have begun to dominate the sphere as AI labs look to eke extra efficiency out of their programs.

O3 practically wasn’t launched in ChatGPT. OpenAI CEO Sam Altman signaled in February that the corporate meant to commit extra assets to a classy different that included o3’s expertise. However aggressive strain seemingly spurred OpenAI to reverse course in the long run.

OpenAI says that o3 achieves state-of-the-art efficiency on SWE-bench verified (with out customized scaffolding), a take a look at measuring coding skills, scoring 69.1%. The o4-mini mannequin achieves related efficiency, scoring 68.1%. OpenAI’s subsequent finest mannequin, o3-mini, scored 49.3% on the take a look at, whereas Claude 3.7 Sonnet scored 62.3%.

OpenAI claims that o3 and o4-mini are its first fashions that may “assume with photographs.” In observe, customers can add photographs to ChatGPT, resembling whiteboard sketches or diagrams from PDFs, and the fashions will analyze the photographs throughout their “chain-of-thought” part earlier than answering. Due to this newfound capacity, o3 and o4-mini can perceive blurry and low-quality photographs and may carry out duties resembling zooming or rotating photographs as they cause.

Past image-processing capabilities, o3 and o4-mini can run and execute Python code straight in your browser through ChatGPT’s Canvas characteristic, and search the online when requested about present occasions.

Along with ChatGPT, all three fashions — o3, o4-mini, and o4-mini-high — will probably be out there through OpenAI’s developer-facing endpoints, the Chat Completions API and Responses API, permitting engineers to construct functions with the corporate’s fashions at usage-based charges.

OpenAI is charging builders a comparatively low worth for o3, given its improved efficiency, at $10 per million enter tokens (roughly 750,000 phrases, longer than the Lord of the Rings sequence) and $40 per million output tokens. For o4-mini, OpenAI is charging the identical as o3-mini, $1.10 per million enter tokens and $4.40 per million output tokens.

Within the coming weeks, OpenAI says it plans to launch o3-pro, a model of o3 that makes use of extra computing assets to provide its solutions, solely for ChatGPT Professional subscribers.

OpenAI CEO Sam Altman has indicated o3 and o4-mini could also be its final stand-alone AI reasoning fashions in ChatGPT earlier than GPT-5, a mannequin that the corporate has stated will unify conventional fashions like GPT-4.1 with its reasoning fashions.