Project Genie, Google DeepMind's experimental world model, turns image and text prompts into interactive, navigable environments. Co-leads Shlomi Fruchter and Jack Parker-Holder explain the key distinction: language models predict the next word; world models predict the next visual state based on what an agent does. Push a ball and it rolls. Walk into a room and lighting adjusts. No game engine — the model learns environment dynamics from data alone.
The researchers see three use cases: safe AI agent training (simulate before real-world deployment), interactive education (walk through ancient Rome in class), and game and film prototyping. Project Genie is available to Google AI Ultra subscribers in the US.
For developers, the agent training application is the key signal — world models are sandboxed environments where AI agents safely learn physical tasks before deployment.