Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users “interact” with streaming video.
Available on the web in an “early demo,” the model generates and streams video frames every 40 milliseconds. Using basic controls, viewers can explore areas within a video, much as they would in a 3D-rendered video game.
“Given the current state of the world, an incoming action, and a history of states and actions, the model attempts to predict the next state of the world,” explains Odyssey in a blog post. “Powering this is a new world model, demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and outputting coherent video streams for five minutes or more.”
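To make that description concrete, here is a minimal sketch of the predict-the-next-state loop Odyssey describes. It is not Odyssey's implementation: the grid-position "state," the four named actions, and the functions `predict_next_state` and `rollout` are all hypothetical stand-ins, and a simple deterministic rule takes the place of the learned model.

```python
from typing import List, Tuple

State = Tuple[int, int]   # hypothetical stand-in for a world state (e.g. a video frame)
Action = str              # hypothetical control input

# Assumed action set for illustration only; the demo's real controls may differ.
MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}


def predict_next_state(state: State, action: Action,
                       history: List[Tuple[State, Action]]) -> State:
    """Toy stand-in for a learned world model.

    A real model would be a neural network conditioned on the history;
    this rule ignores the history and only makes the loop runnable.
    """
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def rollout(initial_state: State, actions: List[Action]) -> List[State]:
    """Autoregressive loop: each predicted state becomes the next input."""
    history: List[Tuple[State, Action]] = []
    state = initial_state
    states = [state]
    for action in actions:
        next_state = predict_next_state(state, action, history)
        history.append((state, action))  # record what the model saw and was asked to do
        state = next_state
        states.append(state)
    return states


if __name__ == "__main__":
    print(rollout((0, 0), ["forward", "forward", "right"]))
```

Because each output feeds the next prediction, small errors can compound over a long session, which is one plausible reason long, coherent streams are hard to produce.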
A number of startups and big tech companies are chasing after world models, including DeepMind, influential AI researcher Fei-Fei Li’s World Labs, Microsoft, and Decart. They believe that world models could one day be used to create interactive media, such as games and movies, and run realistic simulations like training environments for robots.
But creatives have mixed feelings about the tech. A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and combat attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television, and animation jobs could be disrupted by AI in the coming months.
For its part, Odyssey is pledging to collaborate with creative professionals, not replace them.
“Interactive video […] opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production,” writes the company in its blog post. “Over time, we believe everything that is video today (entertainment, ads, education, training, travel, and more) will evolve into interactive video, all powered by Odyssey.”
Odyssey’s demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry and distorted, and unstable in the sense that their layouts don’t always remain the same. Walk forward in one direction for a while or turn around, and the surroundings might suddenly look different.
But the company is promising to rapidly improve upon the model, which can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs at a cost of $1-$2 per “user-hour.”
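For a rough sense of scale, the figures quoted in this article can be converted with simple arithmetic. The sketch below assumes that a frame every 40 milliseconds corresponds to a steady frame rate, and that the per-"user-hour" cost scales linearly with session length. Neither assumption comes from Odyssey, and the function names are illustrative.

```python
def frames_per_second(interval_ms: float) -> float:
    """Convert a per-frame interval in milliseconds to frames per second."""
    return 1000.0 / interval_ms


def frame_interval_ms(fps: float) -> float:
    """Convert a frame rate to the time budget per frame in milliseconds."""
    return 1000.0 / fps


def session_cost(minutes: float, rate_per_user_hour: float) -> float:
    """Estimated cost of one session, assuming cost scales linearly with time."""
    return rate_per_user_hour * minutes / 60.0


if __name__ == "__main__":
    print(frames_per_second(40))             # a frame every 40 ms
    print(round(frame_interval_ms(30), 1))   # per-frame budget at 30 fps
    print(round(session_cost(5, 1.0), 3))    # 5-minute session at $1 per user-hour
    print(round(session_cost(5, 2.0), 3))    # 5-minute session at $2 per user-hour
```

Under these assumptions, a frame every 40 milliseconds works out to 25 frames per second, slightly below the 30 frames per second quoted as the maximum, and a five-minute session would cost roughly 8 to 17 cents.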
“Looking ahead, we’re researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state,” writes Odyssey in its post. “In parallel, we’re expanding the action space from motion to world interaction, learning open actions from large-scale video.”
Odyssey is taking a different approach than many AI labs in the world modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey thinks can serve as a basis for higher-quality models than those trained solely on publicly available data.
To date, Odyssey has raised $27 million from investors including EQT Ventures, GV, and Air Street Capital. Ed Catmull, one of the co-founders of Pixar and former president of Walt Disney Animation Studios, is on the startup’s board of directors.
Last December, Odyssey said it was working on software that allows creators to load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects so that they can be hand-edited.