Google’s Genie world model can now simulate real streets with Street View

We’ve all pulled up Street View on Google Maps to show a friend what our childhood home looked like, or dropped that little person icon onto the streets of Paris to see if we booked a hotel in a cool neighborhood. Imagine being able to do this, but in a more immersive, interactive way that lets you actually simulate the street and its surroundings, and even do things like adjust the weather or see what it would look like in a “Day After Tomorrow” scenario.

That’s one of the goals of Google’s latest integration. Starting today, Google DeepMind is connecting Street View to Project Genie, the company’s general-purpose world model that can generate diverse, interactive environments. The new feature launched during the Google I/O developer conference.

“It’s really powerful for both the agent [and robotics] use case and for humans to play with, and that’s always been the thesis of Genie,” Jack Parker-Holder, a research scientist on DeepMind’s open-endedness team, told TechCrunch.

He gave the example of a new robot being deployed in London, which rarely sees the sun. Genie could, Parker-Holder says, simulate those rare occasions when the sun glints off the Victorian housing, so the rays don’t shock the robot when it happens.

“At the same time, you can say, ‘I’m going to New York City, but not this time of year,’” he continued. “‘It’s going to be snowy. I want to see what that block looks like in the snow.’”

Google has been collecting Street View data for 20 years via cars with cameras and people strapped with “tracker backpacks.” The tech giant has collected more than 280 billion images across 110 countries and seven continents.

“With Street View, we have imagery from a large amount of the world,” Parker-Holder said. “You can imagine how potentially powerful it is to combine this rich source of real-world information and data with an ability to simulate worlds.”

Google released its latest world model, Genie 3, as a research preview last August and opened up access to the tool to Google AI Ultra subscribers in the U.S. in January, allowing customers to create interactive game worlds from text prompts or images. The goal is to use Genie for educational experiences, gaming, and robotics training.

Genie 3 is already helping to power one of Waymo’s simulators to train its self-driving cars on “exceedingly rare events” like tornadoes or casual elephant encounters. Adding Street View data to that could help Waymo prepare to launch in more cities around the globe.

Waymo has its own simulator that it relied on to scale to 11 U.S. cities and test its AI driver in several more. The difference with Genie, says Parker-Holder, is that those simulations are all from the car’s point of view. Street View allows for not only simulating a world anchored to a real place, but also shifting the viewpoint to other kinds of agents, like a human or a robot.

Google is launching Street View in Genie to some Ultra users in the U.S. starting today, with access rolling out at scale over time. Global Ultra users will gain access over the next few weeks, per the company.

The researchers’ goal is to put this new capability into as many hands as possible, per Diego Rivas, a product manager at DeepMind. He cautioned that Street View in particular and Genie in general are still experiments, so there’s much to improve upon in terms of accuracy.

In the samples the Google team showed me, including an underwater simulation of a neighborhood I used to live in, the results are impressive and recognizable, but still video game quality rather than photorealistic. The models are also not yet physics-aware, meaning they don’t yet understand cause and effect. For instance, in a simulation of a girl running through a snowy Joshua Tree, she ran right through cacti and bushes.

Compare that to, say, Google’s image generator Nano Banana, which can now generate perfect text in infographics, or its video generator Veo, which understands that paper boats drift on water currents, smoke disperses into the air, and fabric drapes over forms.

Physics isn’t hard-coded into these models; they learn it intuitively over time through passive observation, as a living being would.

“I think for this type of model, it’s maybe six to 12 months behind video in terms of the accuracy and quality, so I think it’s something we’ll solve,” Parker-Holder said.

Jonathan Herbert, director of Google Maps, who started on the Street View team as an intern 12 years ago, said that Genie can’t yet create a faithful reconstruction of a street. He thinks the real breakthrough is the AI’s spatial continuity. If you turn 360 degrees, the AI correctly remembers and simulates the environment behind you. From that point on, the model can build a new environment on top of that.

“We have long thought about how we can build out the best and richest model of the world on top of Street View data,” Herbert said. “It’s definitely been an idea of ours to use Maps data in new ways and for new kinds of AI research for a pretty long time.”

