When OpenAI unveiled its o3 βreasoningβ AI mannequin in December, the corporate partnered with the creators of ARC-AGI, a benchmark designed to check extremely succesful AI, to showcase o3βs capabilities. Months later, the outcomes have been revised, they usually now look barely much less spectacular than they did initially.
Final week, the Arc Prize Basis, which maintains and administers ARC-AGI, up to date its approximate computing prices for o3. The group initially estimated that the best-performing configuration of o3 it examined, o3 excessive, value round $3,000 to unravel a single ARC-AGI drawback. Now the Arc Prize Basis thinks that the price is way greater β probably round $30,000 per job.
The revision is notable as a result of it illustrates simply how costly at presentβs most refined AI fashions could find yourself being for sure duties, a minimum of early on. OpenAI has but to cost o3 β or launch it, even. However the Arc Prize Basis believes OpenAIβs o1-pro mannequin pricing is an inexpensive proxy.
For context, o1-pro is OpenAIβs costliest mannequin to this point.
βWe consider o1-pro is a better comparability of true o3 valueΒ β¦ because of quantity of test-time compute used,β Mike Knoop, one of many co-founders of the Arc Prize Basis, advised Trendster. βHowever that is nonetheless a proxy, and weβve saved o3 labeled as preview on our leaderboard to replicate the uncertainty till official pricing is introduced.β
A excessive value for o3 excessive wouldnβt be out of the query, given the quantity of computing assets the mannequin reportedly makes use of. Based on the Arc Prize Basis, o3 excessive used 172x extra computing than o3 low, the lowest-computing configuration of o3, to deal with ARC-AGI.
Furthermore, rumors have been flying for fairly a while about dear plans OpenAI is contemplating introducing for enterprise prospects. In early March, The Info reported that the corporate could also be planning to cost as much as $20,000 monthly for specialised AI βbrokers,β like a software program developer agent.
Some would possibly argue that even OpenAIβs priciest fashions will value properly below what a typical human contractor or staffer would command. However as AI researcher Toby Ord identified in a submit on X, the fashions might not be as environment friendly. For instance, o3 excessive wanted 1,024 makes an attempt at every job in ARC-AGI to realize its greatest rating.