Google DeepMind on Thursday shared a research preview of SIMA 2, the next generation of its generalist AI agent, which integrates the language and reasoning powers of Gemini, Google's large language model, to move beyond simply following instructions to understanding and interacting with its environment.
Like many of DeepMind's projects, including AlphaFold, the first version of SIMA was trained on hundreds of hours of video game data to learn how to play multiple 3D games like a human, even some games it wasn't trained on. SIMA 1, unveiled in March 2024, could follow basic instructions across a range of virtual environments, but it only had a 31% success rate at completing complex tasks, compared to 71% for humans.
"SIMA 2 is a step change and improvement in capabilities over SIMA 1," Joe Marino, senior research scientist at DeepMind, said in a press briefing. "It's a more general agent. It can complete complex tasks in previously unseen environments. And it's a self-improving agent. So it can actually self-improve based on its own experience, which is a step towards more general-purpose robots and AGI systems more generally."
SIMA 2 is powered by the Gemini 2.5 Flash-Lite model, and AGI refers to artificial general intelligence, which DeepMind defines as a system capable of a wide range of intellectual tasks, with the ability to learn new skills and generalize knowledge across different domains.
Working with so-called "embodied agents" is key to generalized intelligence, DeepMind's researchers say. Marino explained that an embodied agent interacts with a physical or virtual world through a body, observing inputs and taking actions much as a robot or human would, while a non-embodied agent might interact with your calendar, take notes, or execute code.
Jane Wang, a senior staff research scientist at DeepMind with a background in neuroscience, told Trendster that SIMA 2 goes far beyond gameplay.
"We're asking it to actually understand what's happening, understand what the user is asking it to do, and then be able to respond in a commonsense way, which is actually quite difficult," Wang said.
By integrating Gemini, SIMA 2 doubled its predecessor's performance, uniting Gemini's advanced language and reasoning abilities with the embodied skills developed through training.
Marino demoed SIMA 2 in "No Man's Sky," where the agent described its surroundings, a rocky planet surface, and determined its next steps by recognizing and interacting with a distress beacon. SIMA 2 also uses Gemini to reason internally. In another game, when asked to walk to the house that's the color of a ripe tomato, the agent showed its thinking (ripe tomatoes are red, therefore I should go to the red house), then found and approached it.
Being Gemini-powered also means SIMA 2 follows instructions based on emojis: "You instruct it 🪓🌲, and it'll go chop down a tree," Marino said.
Marino also demonstrated how SIMA 2 can navigate newly generated photorealistic worlds produced by Genie, DeepMind's world model, correctly identifying and interacting with objects like benches, trees, and butterflies.
Gemini also enables self-improvement without much human data, Marino added. Where SIMA 1 was trained solely on human gameplay, SIMA 2 uses it as a baseline to produce a strong initial model. When the team puts the agent into a new environment, it asks another Gemini model to create new tasks and a separate reward model to score the agent's attempts. Using these self-generated experiences as training data, the agent learns from its own mistakes and progressively performs better, essentially teaching itself new behaviors through trial and error as a human would, guided by AI-based feedback instead of humans.
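DeepMind hasn't published the details of this training loop, but the process Marino describes (one Gemini model proposing tasks, a separate reward model scoring the agent's attempts, and the agent then training on its own scored experience) can be sketched roughly as below. Every class, method, and parameter name here is hypothetical, invented purely to illustrate the structure, not DeepMind's actual code or API.

```python
# Hypothetical sketch of the self-improvement loop described above.
# None of these names are DeepMind's; they only illustrate the idea of
# a task-generator model, a reward model, and an agent that learns
# from its own scored attempts instead of human demonstrations.

from dataclasses import dataclass

@dataclass
class Episode:
    task: str          # instruction proposed by the task-generator model
    trajectory: list   # observations and actions the agent produced
    score: float       # reward-model judgment of how well the task went

def self_improvement_round(agent, task_model, reward_model, env, num_tasks=10):
    """Run one round of self-generated training in a new environment."""
    experience = []
    for _ in range(num_tasks):
        task = task_model.propose_task(env.describe())    # a Gemini model suggests a task
        trajectory = agent.attempt(env, task)             # the agent tries it in the game
        score = reward_model.score(task, trajectory)      # a separate model grades the attempt
        experience.append(Episode(task, trajectory, score))
    agent.train_on(experience)   # scored attempts become new training data
    return experience
```

Repeating rounds like this is what lets the agent improve without new human gameplay: the feedback signal comes from AI models rather than from people.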
DeepMind sees SIMA 2 as a step toward unlocking more general-purpose robots.
"If we think about what a system needs to do to perform tasks in the real world, like a robot, I think there are two components to it," Frederic Besse, senior staff research engineer at DeepMind, said during a press briefing. "First, there's a high-level understanding of the real world and what needs to be done, as well as some reasoning."
If you ask a humanoid robot in your house to go check how many cans of beans you have in the cupboard, the system needs to understand all the different concepts (what beans are, what a cupboard is) and navigate to that location. Besse says SIMA 2 touches more on that high-level behavior than it does on lower-level actions, which he describes as controlling things like physical joints and wheels.
The team declined to share a specific timeline for implementing SIMA 2 in physical robotics systems. Besse told Trendster that DeepMind's recently unveiled robotics foundation models, which can also reason about the physical world and create multi-step plans to complete a mission, were trained differently and separately from SIMA.
While there's also no timeline for releasing more than a preview of SIMA 2, Wang told Trendster the goal is to show the world what DeepMind has been working on and see what kinds of collaborations and potential uses are possible.