| nuScenes | |
|---|---|
| Name | nuScenes |
| Developer | Motional (originally nuTonomy, later part of Aptiv) |
| Released | 2019 |
| Latest release | 2019 (v1.0) |
| Programming language | Python |
| Platform | Linux |
| License | Custom non-commercial license (see Development and licensing) |
nuScenes
nuScenes is a large-scale multimodal autonomous driving dataset and benchmark released in 2019. It provides synchronized sensor streams, annotations, and evaluation tools intended for research in perception, tracking, prediction, and mapping. The dataset has been used by academic groups, industrial laboratories, and competition organizers to compare algorithms across standardized tasks.
nuScenes was introduced in 2019 by researchers at nuTonomy, by then part of Aptiv and later Motional, building on earlier autonomous driving benchmarks such as KITTI. Where KITTI offered only front-facing sensors, nuScenes was designed to provide full 360° coverage across camera, lidar, and radar modalities, and it was among the first large public driving datasets to include radar data. It sits alongside contemporaneous multimodal releases such as Argoverse, the Lyft Level 5 dataset, and the Waymo Open Dataset, and its evaluation tooling helped standardize benchmarking of multimodal sensor fusion.
The dataset contains 1,000 driving scenes of roughly 20 seconds each, collected in Boston and Singapore. Each scene provides synchronized streams from 6 cameras, 1 lidar, 5 radars, and GPS/IMU. Annotations cover 23 object classes with 3D bounding boxes and per-object attributes, with annotated keyframes sampled at 2 Hz. Metadata is organized as a relational schema of string tokens: scenes reference samples, samples reference sensor data and annotations, and calibrated sensor parameters (intrinsics and extrinsics) are stored per sensor.
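The token-based schema above can be illustrated with plain dictionaries. This is a simplified sketch, not the official nuscenes-devkit classes; the table and field names (`scene`, `sample`, `first_sample_token`, `next`) mirror the published schema, but the records here are made up.

```python
# Simplified sketch of the nuScenes relational schema: records reference
# each other by string tokens, and a scene's samples form a linked list.
# (Illustrative stand-in data; not the real devkit or dataset.)
tables = {
    "scene": {
        "scene-0001": {"name": "scene-0001", "first_sample_token": "s1"},
    },
    "sample": {
        "s1": {"scene_token": "scene-0001", "next": "s2", "timestamp": 0},
        "s2": {"scene_token": "scene-0001", "next": "", "timestamp": 500_000},
    },
    "sample_data": {
        "d1": {"sample_token": "s1", "channel": "CAM_FRONT"},
        "d2": {"sample_token": "s1", "channel": "LIDAR_TOP"},
    },
}

def get(table: str, token: str) -> dict:
    """Look up a record by table name and token."""
    return tables[table][token]

def iter_samples(scene_token: str):
    """Walk a scene's samples in temporal order via the 'next' tokens."""
    token = get("scene", scene_token)["first_sample_token"]
    while token:
        sample = get("sample", token)
        yield token, sample
        token = sample["next"]

samples = list(iter_samples("scene-0001"))
print([t for t, _ in samples])  # samples of the scene, in order
```

The official devkit exposes the same pattern through a lookup method and per-record `next`/`prev` tokens, which lets tools traverse a scene without loading the full sensor data.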
Recording campaigns targeted dense urban areas with complex layouts and varied traffic, weather, and lighting conditions. Lidar sweeps were time-stamped and synchronized with camera frames so that lidar points can be projected into images. Human annotators, reportedly working with the labeling provider Scale AI, produced 3D bounding boxes with persistent tracking identities across each scene, and quality-control passes were applied to reduce labeling errors.
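The timestamp synchronization described above amounts to associating each lidar sweep with the camera frame whose timestamp is closest. A minimal sketch, using microsecond timestamps as in the nuScenes metadata; `nearest_frame` is a hypothetical helper, not part of the official devkit:

```python
# Nearest-timestamp association between lidar sweeps and camera frames.
# Timestamps are integers in microseconds, as in nuScenes metadata.
from bisect import bisect_left

def nearest_frame(cam_ts: list, lidar_t: int) -> int:
    """Return the camera timestamp closest to a lidar sweep timestamp.

    cam_ts must be sorted ascending.
    """
    i = bisect_left(cam_ts, lidar_t)
    # The closest frame is either the one just before or just after.
    candidates = cam_ts[max(0, i - 1):i + 1]
    return min(candidates, key=lambda t: abs(t - lidar_t))

cam = [0, 83_000, 166_000, 250_000]    # ~12 Hz camera (made-up values)
lidar = [0, 50_000, 100_000, 150_000]  # 20 Hz lidar (made-up values)
pairs = [(t, nearest_frame(cam, t)) for t in lidar]
print(pairs)
```

Because camera and lidar run at different rates, several sweeps may map to the same frame; accurate point-to-image projection additionally requires compensating for ego-motion between the two timestamps.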
nuScenes defines benchmark tasks for 3D detection, multi-object tracking, and trajectory prediction. The primary detection metric is the nuScenes Detection Score (NDS), a composite that combines mean average precision (mAP, computed with center-distance matching rather than IoU) with five true-positive error terms: translation (ATE), scale (ASE), orientation (AOE), velocity (AVE), and attribute (AAE). Tracking is evaluated with AMOTA and AMOTP alongside CLEAR MOT metrics such as MOTA. Prediction benchmarks use displacement errors (minADE, minFDE) and miss rate, similar to the measures adopted by Argoverse.
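The composite detection score can be sketched as follows. The formula matches the published NDS definition, NDS = (5·mAP + Σ(1 − min(1, err))) / 10 over the five true-positive errors; the input numbers below are made up for illustration:

```python
# Sketch of the nuScenes Detection Score (NDS):
#   NDS = (5 * mAP + sum over five TP metrics of (1 - min(1, err))) / 10
# Each error (ATE, ASE, AOE, AVE, AAE) is assumed already normalized
# as in the official evaluation, so lower errors give higher scores.
def nds(mean_ap: float, tp_errors: dict) -> float:
    assert len(tp_errors) == 5, "expects ATE, ASE, AOE, AVE, AAE"
    tp_score = sum(1.0 - min(1.0, e) for e in tp_errors.values())
    return (5.0 * mean_ap + tp_score) / 10.0

# Made-up example errors; a perfect detector would score NDS = 1.0.
errors = {"ATE": 0.4, "ASE": 0.3, "AOE": 0.5, "AVE": 0.8, "AAE": 0.2}
score = nds(mean_ap=0.45, tp_errors=errors)
print(round(score, 3))
```

Weighting mAP at half the total keeps ranking driven by detection quality while still penalizing poor box geometry, velocity, and attribute estimates.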
nuScenes has been widely adopted in academia and industry for developing and benchmarking perception algorithms, particularly camera–lidar and camera–radar fusion models. Its leaderboards have hosted submissions from university groups and autonomous-vehicle companies alike, it has featured in challenge workshops at conferences such as CVPR and ICCV, and it remains a standard evaluation target in the 3D detection and tracking literature.
Critiques of nuScenes echo concerns raised about other driving datasets such as KITTI and Cityscapes: geographic bias, since all data comes from two cities (Boston and Singapore); a short temporal horizon, with 20-second scenes limiting long-term prediction and mapping research; and class imbalance, with common classes such as cars vastly outnumbering rare ones such as bicycles. Reviewers have also noted labeling noise, sensor configurations that differ from those of production vehicles, and non-commercial licensing terms that complicate industrial use.
nuScenes was developed by nuTonomy, which was acquired by Aptiv in 2017; the team later became part of Motional, Aptiv's autonomous-driving joint venture with Hyundai, which maintains the dataset and its devkit today. The dataset is distributed under a custom license whose terms permit academic and non-commercial use. Users seeking to integrate it into production systems or derivative datasets are advised to consult the official licensing terms before doing so.
Category:Datasets for machine learning