The Waymo World Model Hackernews Viewer

The Waymo World Model

872 points by xnx 15 hours ago | 529 comments

Comments

mattlondon 13 hours ago

Suddenly all this focus on world models by Deep mind starts to make sense. I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.

Google/Alphabet are so vertically integrated for AI when you think about it. Compare what they're doing - their own power generation , their own silicon, their own data centers, search Gmail YouTube Gemini workspace wallet, billions and billions of Android and Chromebook users, their ads everywhere, their browser everywhere, waymo, probably buy back Boston dynamics soon enough (they're recently partnered together), fusion research, drugs discovery.... and then look at ChatGPT's chatbot or grok's porn. Pales in comparison.

xnx 15 hours ago

> The Waymo World Model can convert those kinds of videos, or any taken with a regular camera, into a multimodal simulation—showing how the Waymo Driver would see that exact scene.

Subtle brag that Waymo could drive in camera-only mode if they chose to. They've stated as much previously, but that doesn't seem widely known.

yummypaint 9 hours ago

By leveraging Genie’s immense world knowledge, it can simulate exceedingly rare events—from a tornado to a casual encounter with an elephant—that are almost impossible to capture at scale in reality. The model’s architecture offers high controllability, allowing our engineers to modify simulations with simple language prompts, driving inputs, and scene layouts. Notably, the Waymo World Model generates high-fidelity, multi-sensor outputs that include both camera and lidar data.

How do you know the generated outputs are correct? Especially for unusual circumstances?

Say the scenario is a patch of road is densely covered with 5 mm ball bearings. I'm sure the model will happily spit out numbers, but are they reasonable? How do we know they are reasonable? Even if the prediction is ok, how do we fundamentally know that the prediction for 4 mm ball bearings won't be completely wrong?

There seems to be a lot of critical information missing.

caycep 10 hours ago

All this work is impressive, but I'd rather have better trains

ra7 14 hours ago

The novel aspect here seems to be 3D LiDAR output from 2D video using post-training. As far as I'm aware, no other video world models can do this.

IMO, access to DeepMind and Google infra is a hugely understated advantage Waymo has that no other competitor can replicate.

joshuamerrill 12 hours ago

It’s impressive to see simulation training for floods, tornadoes, and wildfires. But it’s also kind of baffling that a city full of Waymos all seemed to fail simultaneously in San Francisco when the power went out on Dec 22.

A power outage feels like a baseline scenario—orders of magnitude more common than the disasters in this demo. If the system can’t degrade gracefully when traffic lights go dark, what exactly is all that simulation buying us?

hazrmard 13 hours ago

cue the bell curve meme for learning autonomy:

                 ____.----.____
          ______/              \______
    _____/                            \_____
    ________________________________________

    (simulations)  (real world data)  (simulations)

Seems like it, no?

We started with physics-based simulators for training policies. Then put them in the real world using modular perception/prediction/planning systems. Once enough data was collected, we went back to making simulators. This time, they're physics "informed" deep learning models.

mellosouls 14 hours ago

Deepmind's Project Genie under the hood (pun intended). Deepmind & Waymo both Alphabet(Google) subsidiaries obv.

https://deepmind.google/blog/genie-3-a-new-frontier-for-worl...

Discussed here,eg.

Genie 3: A new frontier for world models (1510 points, 497 comments)

https://news.ycombinator.com/item?id=44798166

Project Genie: Experimenting with infinite, interactive worlds (673 points, 371 comments)

https://news.ycombinator.com/item?id=46812933

LowLevelBasket 38 minutes ago

Every time I'm in town I use a waymo. It's still a little weird to be a passenger with no driver

nelsondev 1 hour ago

Neat! What happens when the simulated data is hallucinated/incorrect?

In the example videos, the Golden Gate bridge with snow shows the bridge as 1 road, with total of 3 lanes. But in reality, it’s a split highway with divider, so 2 sides both have 3 lanes, 6 total lanes.

What happens when the car “learns” to drive on the simulated incorrect 3 lane example? For example will next time it goes on the real GG bridge hug to the rightmost lane?

AceJohnny2 12 hours ago

IIUC, there's a confusion of meaning for "World Model", between Waymo/Deepmind's which is something that can create a consistent world (for use to train Waymo's Driver), vs Yann LeCun/Advanced Machine Intelligence (AMI) which is something that can understand a world.

seydor 52 minutes ago

Why don't they call it 'the Matrix' and should i prepare for the plugs?

ok_dad 11 hours ago

I'd like to see Waymo have a few of their Drivers do some sim racing training and then compete in some live events. It wouldn't matter much to me if they were fast at all, I'd like to see them go into the rookie classes in various games and see how they avoid crashes from inexperienced players. I believe that it would be the ultimate "shitty drivers vs. AI" test.

phailhaus 12 hours ago

Finally I understand the use case for Genie 3. All the talk about "you can make any videogame or movie" seems to have been pure distraction from real uses like this: limited, time-boxed simulated footage.

nightpool 11 hours ago

Interesting, but it feels like it's going to cope very poorly with actually safety-critical situations. Having a world model that's trained on successful driving data feels like it's going to "launder" a lot of implicit assumptions that would cause a car to get into a crash in real life (e.g. there's probably no examples in the training data where the car is behind a stopped car, and the driver pulls over to another lane and another car comes from behind and crashes into the driver because it didn't check its blindspot). These types of subtle biases are going to make AI-simulated world models a poor fit for training safety systems where failure cannot be represented in the training data, since they basically give models "free reign" to do anything that couldn't be represented in world model training.

0xTJ 9 hours ago

Interesting, but I am very sceptical. I'd be interested in seeing actual verified results of how it handles a road with heavy snow, where the only lane references are the wheel tracks of other vehicles, and you can't tell where the road ends and the snow-filled ditch begins.

andrewdb 6 hours ago

So when will multiple Waymo cars communicate input data to one another to avoid the blind spots?

This would give the ability to see things other cars cannot see as well.

NullHypothesist 14 hours ago

I wonder if they can simulate the Beatles crossing the street at Abbey Road in the late '60s

SebastianSosa 9 hours ago

Very concerned with this direction of training “counterfactual events such as whether the Waymo Driver could have safely driven more confidently instead of yielding in a particular situation.” Seems dicey. This could lead in the direction to a less safe Waymo. Since the counterfactual will be generated, I suspect that that the generations will be biased towards survivor situations where most video footage in its training data will be from environments where people reacted well not those that ended in tragedy. Emboldening Waymo on generated best case data. THIS IS DANGEROUS!!!

RivieraKid 8 hours ago

The term "world model" seems almost meaningless. This is a world model in the same sense as ChatGPT is a world model. Both have some ability to model aspects of the real world.

joshfee 9 hours ago

It is great being able to generate a much larger universe of possibilities than what they can gather from real world data collection, but I'd be curious to learn how they check that the generated data is a superset of the possibility-space seen in the real world (e.g. confirm that their models closely match what is seen in the real world too)

ex-aws-dude 8 hours ago

I don't get how this solves the problem of edge cases with self driving

Even if you can generate simulated training data, don't you still have the problem where you don't even know what the edge cases you need to simulate are in the first place?

999900000999 13 hours ago

It doesn't look like they're going to open sources or anything, but I could imagine this would be great for city planning.

Or the most realistic game of SimCity you could imagine.

b_brief 8 hours ago

I would love to see more visibility into how this model’s simulation fidelity maps onto measurable safety improvements on public roads, especially in unusual edge conditions like partial sensor occlusion or atypical weather.

jrm4 12 hours ago

1. Still hard not to think that this is a huge waste of time as opposed to something that's a little more like a public transport train-ish thing, i.e. integrate with established infrastructure.

2. No seriously, is the filipino driver thing confirmed? It really feels like they're trying to bury that.

t1234s 11 hours ago

Could these world models be used to build some sort of endless GranTurismo type street racing game?

pcurve 11 hours ago

Dumb question - Why would Waymo disclose this much information to public and competitors?

fabmilo 12 hours ago

Very impressive work from Waymo. The driving with a tornado in the horizon example kind of struck my imagination, many people actually panic in such scenarios. I wonder though the compute requirements to run these simulations and producing so many data points.

heohk 5 hours ago

I'm curious how they simulate equipment failures, like a flat tire or something

mgaunard 14 hours ago

Still needs to be trained on the final boss: dense cities with narrow streets.

ActorNightly 12 hours ago

This is cool, but they are still not going about it the right way.

Its much easier to build everything into the compressed latent space of physical objects and how they move, and operate from there.

Everyone jumped on the end-2-end bandwagon, which then locks you into the input to your driving model being vision, which means that you have to have things like genie to generate vision data, which is wasteful.

londons_explore 9 hours ago

Do wayno models really use side cameras at only like 4 FPS?

Kapura 14 hours ago

Interesting that this should come out right as lawmakers are beginning to understand that Waymos have overseas operators making major decisions.

[*] https://futurism.com/advanced-transport/waymos-controlled-wo...

anigbrowl 10 hours ago

Seems relevant: Waymo exec admits remote operators in Philippines help guide US Robotaxis

https://news.ycombinator.com/item?id=46918043

tbmtbmtbmtbmtbm 8 hours ago

so insane that this is the direction things are going, instead of just reducing our reliance on cars

KeyBoardG 10 hours ago

and I literally just saw the other headline "Waymo says its robotaxis get help from remote workers in the Philippines"

nurettin 3 hours ago

> Simulation of the Waymo Driver evading a vehicle going in the wrong direction.

It really looks like waymo is the one going in the wrong direction and driving dangerously to evade traffic in this simulation.

PeterStuer 14 hours ago

Imagine driving in a Waymo 'out of a raging fire'.

Talk about edge cases.

But, what would you do? Trust the Waymo, or get out (or never get in) at the first sign of trouble?

LowLevelKernel 12 hours ago

Instructions to load it on WAYMAX simulator?

ge96 12 hours ago

What is the 5/3 tiles? Cameras?

threethirtytwo 10 hours ago

What if we put this mechanism of recording the world on people. We have mics listening to people talking to us and noises we hear.

Also we record body position actuation and self speech. As output then we put this on thousands of people to get as much data as Waymo gets.

I mean that’s what we need to imitate agi right? I guess the only thing missing is the memory mechanism. We train everything as if it’s an input and output function without accounting for memory.

LightBug1 9 hours ago

Have been seeing Waymo test vehicles regularly around central London recently, operating at speed.

For shits and giggles, I did stop randomly while crossing the road and acted like a jerk.

The Waymo did, in fact, stop.

Kudos, Waymo

01100011 12 hours ago

Nvidia has had this for years. What am I missing?

AndrewKemendo 13 hours ago

For whatever it’s worth World models is going to be the dominant computing structure of the future

I started working heavily on realizing them in 2016 and it is unquestionably (finally) the future of AI

tgrowazay 13 hours ago

This page crashes my browser.

Vivaldi 7.8.3931.63 on iOS 26.2.1 iPhone 16 pro

mempko 14 hours ago

One interesting thing from this paper is how big of a LiDaR shadow there is around the waymo car which suggests they rely on cameras for anything close (maybe they have radar too?). Seems LiDaR is only useful for distant objects.

jimt1234 14 hours ago

This might be relevant to the timing here: https://eletric-vehicles.com/waymo/waymo-exec-admits-remote-...

cratermoon 10 hours ago

Meanwhile. https://eletric-vehicles.com/waymo/waymo-exec-admits-remote-...

m0llusk 14 hours ago

Seems interesting, but why is it broken. Waymo repeatedly directed multiple automated vehicles into the private alley off of 5th near Brannan in SF even after being told none of them have any business there ever, period. If they can sense the weather and stuff then maybe they could put out a virtual sign or fence that notes what appears to be a road is neither a through way nor open to the public? I'm really bullish on automated driving long term, but now that vehicles are present for real we need to start to think about potentially getting serious about finding some way to get them to comply with the same laws that limit what people can do.

Vosporos 13 hours ago

The new frontier is manifestly the Phillipines.

devmor 14 hours ago

Wow, interesting timing for this PR blast considering the admission in the Senate Commerce Committee hearing. Not transparent at all!

turtlesdown11 14 hours ago

How many Filipinos, who do not have US drivers licenses, does it take to drive this new model?

add-sub-mul-div 15 hours ago

"Autonomous"

https://cybernews.com/news/waymo-overseas-human-agents-robot...

smcl 12 hours ago

The Waymo driving model: hire some guys in Philippines: https://futurism.com/advanced-transport/waymos-controlled-wo...

OGEnthusiast 15 hours ago

What's going to happen to all the millions of drivers who will lose their job overnight? In a country with 100 million guns, are we really sure we've thought this through?