OpenAI launches new o3-mini model – here’s how free ChatGPT users can try it

bicycledays

On the final day of OpenAI’s 12 days of ‘shipmas,’ the company unveiled its newest models, o3 and o3-mini, which excel at reasoning and even outperform o1 on a series of benchmarks, including math and science. At launch, OpenAI CEO Sam Altman said o3-mini was slated to drop at the end of January, and today, the company made good on its promise.

o3-mini

On Friday, OpenAI launched its o3-mini model, the most cost-efficient model in OpenAI’s reasoning series, to the public. Until now, that series has comprised o1 and o1-mini. Like its predecessor, the model is especially strong in science, math, and coding, according to the company.

When o3-mini is selected, it will use medium reasoning effort, which balances speed and accuracy. While the original o1 model still has broader general knowledge than o3-mini, the new model’s main advantage is its faster speed and higher performance compared to o1-mini.

Benchmark performance

When comparing the performance of o3-mini to o1-mini, expert testers found that o3-mini delivered more accurate, reasoned-through, and clearer responses than o1-mini. According to OpenAI’s post, they preferred o3-mini responses 56% of the time and observed a 39% reduction in major errors.

Beyond human preference evaluations, in several STEM benchmarks, including Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and Competition Code (Codeforces), o3-mini with medium reasoning, which is what ChatGPT users get by default, outperformed o1-mini.

Also notable is that o3-mini with high reasoning effort came close to o1 performance in the benchmarks, sometimes even surpassing it, as seen in the AIME 2024 and Software Engineering (SWE-bench Verified) benchmarks. The o3-mini model with medium reasoning effort matched o1’s performance in the Codeforces benchmark.

Safety

OpenAI assessed o3-mini’s safety before public release through jailbreak and disallowed content evaluations. The company found that the model significantly surpasses GPT-4o on the evaluations. OpenAI posted the evaluation results and also released an o3-mini System Card, a 37-page PDF that includes the detailed results of the evaluations.

How to access

All subscribers to OpenAI’s paid tiers, including ChatGPT Plus, Team, and Pro, can access OpenAI o3-mini starting today. Plus and Team users now have three times the rate limit, going from 50 messages per day with o1-mini to 150 messages per day. ChatGPT Enterprise access is coming in a week.

The o3-mini model will replace o1-mini in the model picker, as it will be useful for the same tasks, except that the experience will now be improved with lower latency and higher rate limits. As a paid user, at the time of writing, I did not yet have access to o3-mini and am instead still seeing the o1-mini option.

If you do not have a subscription, no worries: You can see if o3-mini is worth the hype from your free account. All free ChatGPT users have to do is click on “Reason” in the message textbox or regenerate a response. OpenAI CEO Sam Altman confirmed free access in a post on X. Until now, all the reasoning models have been kept behind a paywall; OpenAI did not specify any limitations around the new model for Free users.
