Google DeepMind has released Genie 2, a large foundation model capable of generating a variety of 3D environments. Given a single image, it can produce an interactive virtual world, which facilitates the development of embodied agents and paves the way for new creative workflows. Whereas its predecessor, Genie 1, was limited to 2D, Genie 2 works in 3D. It uses autoregressive latent diffusion to generate frames sequentially in response to user actions.
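To make the idea of autoregressive generation concrete, here is a minimal toy sketch in Python. It is not Genie 2's implementation: the function names (`denoise_step`, `next_latent`, `rollout`), the use of a short list of numbers as a "latent frame", the single number standing in for a user action, and the hand-written denoiser are all assumptions chosen for illustration. It only shows the overall loop: each new latent frame starts as noise, is refined over a few denoising steps conditioned on the previous frame and the current action, and is then appended to the history that conditions the next frame.

```python
import random


def denoise_step(noisy, context, action, t, steps):
    """Toy stand-in for a learned denoiser (hypothetical, hand-written).

    Pulls the noisy latent toward a target derived from the previous
    latent frame and the current action. A real model would learn this.
    """
    target = [c + action for c in context]
    weight = 1.0 / (steps - t)  # reaches 1.0 on the final step
    return [n + weight * (tg - n) for n, tg in zip(noisy, target)]


def next_latent(history, action, steps=4, rng=None):
    """Generate one new latent frame from noise, conditioned on the past."""
    rng = rng or random.Random(0)
    context = history[-1]  # toy simplification: condition on the last frame only
    latent = [rng.gauss(0.0, 1.0) for _ in context]  # start from pure noise
    for t in range(steps):
        latent = denoise_step(latent, context, action, t, steps)
    return latent


def rollout(first_latent, actions, steps=4, seed=0):
    """Autoregressive loop: each generated frame is fed back as context."""
    rng = random.Random(seed)
    history = [list(first_latent)]
    for action in actions:
        history.append(next_latent(history, action, steps, rng))
    return history


if __name__ == "__main__":
    # Hypothetical encoded prompt image and three user actions.
    frames = rollout([0.0, 0.0], actions=[1.0, 1.0, -1.0])
    for i, frame in enumerate(frames):
        print(i, [round(x, 3) for x in frame])
```

In this toy version the denoiser is deterministic by the final step, so each frame ends up at the previous frame shifted by the action. In a real latent diffusion model the denoiser is a trained network, and a separate decoder would turn each latent back into an image.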
Google’s SIMA, an AI agent, can perform tasks inside Genie 2 environments by following natural-language instructions.
The model can generate new content on the fly.
Google is confident that it is making faster progress toward AGI than its competitors.