nuReasoning is a reasoning-centric autonomous driving dataset from Motional and UCLA, containing roughly 20,000 real-world long-tail driving clips annotated with spatial, decision, and counterfactual reasoning. This post walks through loading and exploring it in FiftyOne, Voxel51's open-source multimodal data platform, so the reasoning behind every driving decision becomes something you can see and scrub through frame by frame.
Key takeaways
- nuReasoning is a large-scale, real-world autonomous-driving dataset built specifically for the hard cases — the rare, long-tail scenarios where a car has to reason, not just perceive. It pairs raw sensor data with human-verified spatial, decision, and counterfactual reasoning.
- FiftyOne is the open-source tool for annotating, visualizing, curating, and understanding multimodal datasets like nuReasoning — images, video, 3D, and metadata, side by side in one interactive view.
- Putting them together turns a folder of sensor logs and JSON into a clickable story: play all the camera views in sync, scrub the reasoning frame by frame, and pause on the critical moment to see the alternatives the car ruled out and why.
- The result is something you can't get from a benchmark score alone — an interpretable, visual window into machine decision-making on exactly the scenarios that matter most for safety.
What Is nuReasoning? The Dataset That Teaches Cars to Think
Most autonomous-driving datasets answer the question "what's in the scene?" — boxes, lanes, trajectories. nuReasoning answers a harder one: "what should the car do, and why?"
Built by Motional in collaboration with UCLA's Mobility Lab, nuReasoning is a reasoning-centric dataset and benchmark following in the lineage of nuScenes and nuPlan. It contains roughly 20,000 clips, each 20 seconds at 10 Hz, totaling about 105 hours of curated long-tail driving. These are the edge cases that break self-driving systems: vulnerable road users stepping into the street, work zones, stalled vehicles, traffic-control anomalies, emergency vehicles, even animals on the road.
The clips were mined from real driving logs across multiple U.S. cities, including Las Vegas, Pittsburgh, Los Angeles, and Boston.
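As a quick sanity check on those figures, the arithmetic below relates clip length, frame rate, and total hours. It uses only the numbers quoted above, which are rounded, so treat the results as approximate rather than as official dataset statistics.

```python
# Back-of-the-envelope check using only the figures quoted in this post.
CLIP_SECONDS = 20    # stated clip length
FRAME_RATE_HZ = 10   # stated sample rate


def frames_per_clip(seconds=CLIP_SECONDS, hz=FRAME_RATE_HZ):
    """Number of frames in one clip."""
    return seconds * hz


def clips_for_hours(hours, seconds=CLIP_SECONDS):
    """How many clips of the given length fit into `hours` of driving."""
    return round(hours * 3600 / seconds)


def hours_for_clips(clips, seconds=CLIP_SECONDS):
    """Total driving time, in hours, covered by `clips` clips."""
    return clips * seconds / 3600


print(frames_per_clip())                   # 200 frames per clip
print(clips_for_hours(105))                # 18900 clips in 105 hours
print(round(hours_for_clips(20_000), 1))   # 111.1 hours for exactly 20,000 clips
```

So "roughly 20,000 clips" and "about 105 hours" are consistent with each other once rounding is taken into account, and each clip carries 200 frames for the frame-level annotations to attach to.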
Every clip ships with synchronized multi-modal observations (multi-view cameras, LiDAR, HD maps, ego-vehicle state, traffic-light context, and 3D object annotations), plus the part that sets it apart: human-verified reasoning annotations in three flavors.
- Spatial reasoning — what's around the car and where, grounded in geometry.
- Decision reasoning — the driving action taken (longitudinal and lateral) and the trace of why.
- Counterfactual reasoning — the alternative actions considered, each labeled by risk (safe, suboptimal, unsafe) with the reason it was or wasn't chosen.
That counterfactual layer is the magic. It doesn't just record what the car did; it captures the options it weighed and rejected, which is exactly how a careful human driver thinks through a novel situation. The dataset's authors show that fine-tuning vision-language models on nuReasoning substantially improves driving-specific reasoning, and that adding reasoning supervision to vision-language-action training improves planning even when the text reasoning is switched off at inference. In other words, learning to reason makes the driving better, not just more explainable.
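To make the shape of a counterfactual annotation concrete, here is a minimal sketch of how one decision and its alternatives could be modeled in Python. The class names, field names, and example values are all hypothetical and invented for illustration; the actual nuReasoning schema may look quite different. Only the three risk labels come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Risk(str, Enum):
    # The three risk labels named in the dataset description.
    SAFE = "safe"
    SUBOPTIMAL = "suboptimal"
    UNSAFE = "unsafe"


@dataclass
class Alternative:
    # Hypothetical record for one action the car considered.
    action: str
    risk: Risk
    reason: str


@dataclass
class DecisionRecord:
    # Hypothetical record for the action taken and the options weighed.
    longitudinal: str
    lateral: str
    trace: str
    alternatives: List[Alternative] = field(default_factory=list)

    def with_risk(self, risk: Risk) -> List[Alternative]:
        """Return the alternatives carrying the given risk label."""
        return [alt for alt in self.alternatives if alt.risk is risk]


# Invented example values, not taken from the dataset.
record = DecisionRecord(
    longitudinal="decelerate",
    lateral="keep lane",
    trace="A pedestrian is stepping off the curb ahead.",
    alternatives=[
        Alternative("maintain speed", Risk.UNSAFE, "Would not stop in time."),
        Alternative("change lane left", Risk.SUBOPTIMAL, "Adjacent lane is occupied."),
    ],
)

for alt in record.with_risk(Risk.UNSAFE):
    print(f"{alt.action}: {alt.reason}")
```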
FiftyOne: The Visual Layer
FiftyOne, from Voxel51, is an open-source tool for building high-quality datasets and visual-AI models. It lets you visualize datasets, interpret models, evaluate performance, and surface data-quality issues far faster than scrolling through files by hand.
The core idea is simple but powerful: load your images, videos, 3D point clouds, labels, and metadata into a FiftyOne Dataset, and explore it all in an interactive App, browsing samples, filtering by any field, and inspecting labels and predictions side by side. It's Python-first, runs locally, and is extensible through plugins and custom panels, so you can shape it around whatever your data needs.
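As a rough illustration of that load-and-explore loop, the sketch below gathers video files from a folder, wraps them in a FiftyOne Dataset, and opens the App. The folder layout, the `find_videos` helper, and the dataset name are assumptions made for this example; FiftyOne is imported inside the function so the file-discovery helper can be used on its own.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov"}  # assumed container formats


def find_videos(root):
    """Return sorted paths of video files found anywhere under `root`."""
    return sorted(
        str(p) for p in Path(root).rglob("*")
        if p.suffix.lower() in VIDEO_SUFFIXES
    )


def load_and_browse(root, name="my-videos"):
    """Load the videos under `root` into a FiftyOne dataset and open the App."""
    import fiftyone as fo  # deferred so find_videos() works without FiftyOne

    dataset = fo.Dataset(name)  # `name` is a hypothetical dataset name
    dataset.add_samples([fo.Sample(filepath=p) for p in find_videos(root)])

    session = fo.launch_app(dataset)  # opens the interactive App
    session.wait()                    # block until the App is closed
    return dataset
```

Calling `load_and_browse("/path/to/videos")` would then open the App on whatever videos the helper finds.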
For a dataset as rich and multimodal as nuReasoning — many camera views, video, maps, and structured text annotations all tied to the same moments in time — that kind of unified, interactive view is exactly the right tool.
nuReasoning in FiftyOne: The Reasoning, Made Visible
A reasoning dataset is only as useful as your ability to see the reasoning. Raw nuReasoning clips are directories of images, LiDAR, pickled state, and JSON — not something you can intuitively explore. FiftyOne turns that into a living, navigable experience:
Synchronized multi-view playback. Each clip becomes a grouped video dataset: one group per clip, one slice per camera view, plus a rasterized bird's-eye-view map. You play all eight cameras and the map in sync and flip between them to see the full context the car had.
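Here is a minimal sketch of that one-group-per-clip layout using FiftyOne's grouped datasets. It assumes each clip has already been encoded into one video file per view, and the slice names (`cam_0` through `cam_7`, `bev_map`) and file naming are hypothetical, since this post does not name the cameras. The demo notebook may organize things differently.

```python
from pathlib import Path

# Hypothetical slice names: eight cameras plus the bird's-eye-view map.
CAMERA_SLICES = [f"cam_{i}" for i in range(8)]
BEV_SLICE = "bev_map"


def plan_clip_slices(clip_dir, ext=".mp4"):
    """Map each slice name to the video file assumed to hold that view."""
    clip_dir = Path(clip_dir)
    return {
        name: str(clip_dir / f"{name}{ext}")
        for name in CAMERA_SLICES + [BEV_SLICE]
    }


def build_grouped_dataset(clip_dirs, name="nureasoning-sketch"):
    """Create a grouped dataset: one group per clip, one slice per view."""
    import fiftyone as fo  # deferred so plan_clip_slices() works without FiftyOne

    dataset = fo.Dataset(name)
    samples = []
    for clip_dir in clip_dirs:
        group = fo.Group()  # one group ties together all views of a clip
        for slice_name, path in plan_clip_slices(clip_dir).items():
            samples.append(
                fo.Sample(filepath=path, group=group.element(slice_name))
            )
    dataset.add_samples(samples)
    return dataset
```

Keeping the path planning separate from the FiftyOne calls makes it easy to check that every clip directory resolves to the expected nine files before anything is loaded.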
Reasoning that updates as you scrub. The spatial, decision, and reasoning-trace annotations ride along as frame-level fields, so as you move through the clip, the sidebar shows the scene description, the driving decision, and the "why" — moment by moment.
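One way frame-level reasoning fields could be attached is sketched below. The annotation keys (`t`, `scene`, `decision`, `why`) and the resulting field names are hypothetical; the one firm detail is that FiftyOne frame numbers start at 1, so a timestamp has to be shifted by one frame when it is converted.

```python
def frame_number_at(seconds, fps=10):
    """Convert a clip-relative timestamp to a 1-based frame number."""
    return int(round(seconds * fps)) + 1


def attach_reasoning(sample, annotations, fps=10):
    """Write per-frame reasoning onto a FiftyOne video sample.

    `annotations` is assumed to be a list of dicts with hypothetical keys
    "t" (seconds), "scene", "decision", and "why".
    """
    import fiftyone as fo  # deferred so frame_number_at() works without FiftyOne

    for ann in annotations:
        frame = sample.frames[frame_number_at(ann["t"], fps)]
        frame["scene"] = ann["scene"]                                # free-text description
        frame["decision"] = fo.Classification(label=ann["decision"])  # driving action
        frame["why"] = ann["why"]                                    # reasoning trace
    sample.save()
```

At 10 Hz, a timestamp of 0.0 s lands on frame 1 and 19.9 s lands on frame 200, the last frame of a 20-second clip.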
The decision-frame reveal. Saved views let you jump straight to the critical keyframe. A custom reasoning panel then surfaces the counterfactual fork: the action taken alongside the alternatives that were rejected, color-coded by risk, each with its reason. That single beat — "here's what it did; here's the unsafe option it ruled out, and why" — is the whole story of why reasoning matters, made visible.
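A saved view for the decision frame could be set up roughly as follows. The boolean frame field `is_decision_frame`, the view name, and the color palette are assumptions for this sketch; `match_frames` and `save_view` are standard FiftyOne calls, but how the demo actually marks keyframes or colors its panel is not specified here.

```python
# Hypothetical palette for the three risk labels.
RISK_COLORS = {"safe": "green", "suboptimal": "orange", "unsafe": "red"}


def color_for(risk, default="gray"):
    """Pick a display color for a risk label, falling back to a neutral one."""
    return RISK_COLORS.get(risk, default)


def save_decision_view(dataset, flag_field="is_decision_frame",
                       view_name="decision-frames"):
    """Save a view that keeps only the frames flagged as decision frames."""
    from fiftyone import ViewField as F  # deferred so color_for() works alone

    view = dataset.match_frames(F(flag_field) == True)  # noqa: E712
    dataset.save_view(view_name, view)
    return view
```

Once saved, the view can be reopened later with `dataset.load_saved_view("decision-frames")` or picked from the saved views menu in the App.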
Perception grounded in geometry. Camera detections render as overlays and 3D object positions appear on the bird's-eye map, so the reasoning is anchored to the actual scene rather than floating as abstract text.
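For camera overlays, FiftyOne expects bounding boxes as `[x, y, width, height]` with every value relative to the image size, in the range 0 to 1. The sketch below converts pixel-space corner boxes into that form; the input format of `(label, (x_min, y_min, x_max, y_max))` pairs is an assumption made for illustration.

```python
def to_relative_box(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert pixel corner coordinates to a relative [x, y, w, h] box."""
    return [
        x_min / img_w,
        y_min / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    ]


def make_detections(pixel_boxes, img_w, img_h):
    """Build a FiftyOne Detections label from (label, pixel_box) pairs."""
    import fiftyone as fo  # deferred so to_relative_box() works without FiftyOne

    return fo.Detections(
        detections=[
            fo.Detection(
                label=label,
                bounding_box=to_relative_box(*box, img_w, img_h),
            )
            for label, box in pixel_boxes
        ]
    )


# A 480x270 px box at (480, 270) in a 1920x1080 frame:
print(to_relative_box(480, 270, 960, 540, 1920, 1080))  # [0.25, 0.25, 0.25, 0.25]
```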
The payoff is interpretability you can demo. Instead of citing a benchmark number, you can open a rare scenario, scrub to the decision point, and literally watch a model's safety-oriented reasoning play out — something nuScenes and nuPlan, focused on perception and planning, were never built to show. For researchers, educators, and teams building trust in autonomous systems, that's a uniquely compelling way to explore what reasoning-driven driving really looks like.
The demo notebook in the resources below walks through the full setup. Load a handful of clips, scrub to a decision frame, and see for yourself how the car thinks.
Next steps
Explore nuReasoning
Get started with FiftyOne
Try it yourself. Install FiftyOne 1.17+ in a virtual environment (pip install "fiftyone>=1.17"), request access to nuReasoning, and load a handful of clips as a grouped video dataset. Then open the App, scrub to a decision frame, and see how the car thinks.
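Before loading any clips, it can help to confirm that the environment actually has a new enough FiftyOne. This is a small sketch using only the standard library; the helper names are made up for this example, and the version parsing is deliberately simple, assuming a plain `major.minor.patch` version string.

```python
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(installed, minimum=(1, 17)):
    """Compare the major.minor part of a version string against `minimum`."""
    major_minor = tuple(int(part) for part in installed.split(".")[:2])
    return major_minor >= minimum


def fiftyone_ready():
    """Return True if FiftyOne is installed and at least version 1.17."""
    try:
        return meets_minimum(version("fiftyone"))
    except PackageNotFoundError:
        return False


print("FiftyOne 1.17+ available:", fiftyone_ready())
```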
nuReasoning is intended for research and development in reasoning-centric autonomous driving. It should not be used as the sole basis for real-world autonomous-driving deployment, nor to identify people or infer sensitive attributes.
FAQ
What is nuReasoning?
nuReasoning is a reasoning-centric autonomous driving dataset from Motional and UCLA containing roughly 20,000 real-world long-tail driving clips — edge cases like vulnerable road users, work zones, stalled vehicles, and emergency vehicles. Each clip is 20 seconds at 10 Hz and annotated with spatial, decision, and counterfactual reasoning verified by humans.
How is nuReasoning different from nuScenes and nuPlan?
nuScenes and nuPlan are focused on perception and planning. nuReasoning adds a reasoning layer: not just what the car detected or where it drove, but what it decided, why, and what alternatives it ruled out. The counterfactual annotations are unique to nuReasoning.
What sensor modalities does nuReasoning include?
Multi-view cameras, LiDAR, HD maps, ego-vehicle state, traffic-light context, and 3D object annotations — all synchronized per clip.
Where was nuReasoning collected?
Real driving logs from Las Vegas, Pittsburgh, Los Angeles, and Boston.
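How do I access nuReasoning?
Access is by request. Request access to nuReasoning as described under Next steps, then load a handful of clips into FiftyOne.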
What version of FiftyOne do I need?
FiftyOne 1.17 or later. Install via pip install "fiftyone>=1.17" in a virtual environment.
How does FiftyOne handle nuReasoning's multi-view camera data?
nuReasoning loads as a grouped video dataset: one group per clip, one slice per camera view, plus a rasterized bird's-eye-view map. All views play in sync in the FiftyOne App.