| ApolloScape | |
|---|---|
| Name | ApolloScape |
| Subject | Autonomous driving dataset |
| Released | 2016–2018 |
| Creators | Tongji University, Apollo Research, Baidu |
| Formats | Image, video, LiDAR, segmentation masks, depth maps, bounding boxes |
| License | Academic / research |
ApolloScape
ApolloScape is a large-scale computer vision dataset developed to support research in autonomous driving, scene understanding, and urban perception. It was produced by researchers associated with Tongji University, Baidu, and groups working on the Apollo autonomous driving program, and provides high-resolution images, dense annotations, and multimodal sensor data covering diverse urban environments. In scope it sits alongside contemporaneous datasets such as KITTI, Cityscapes, and Mapillary Vistas, while placing particular emphasis on scale and per-pixel labels for complex driving scenarios.
ApolloScape was introduced by teams at Tongji University, working with Baidu Research and contributors from open-source initiatives such as Apollo, as a comprehensive benchmark for perception modules in autonomous vehicles, robotics, and mapping. By supplying diverse scenes from cities and highways, it targets tasks frequently addressed in papers and challenges at venues such as CVPR, ICCV, ECCV, and NeurIPS workshops. The dataset complements earlier efforts such as the Oxford RobotCar Dataset, Berkeley DeepDrive, and nuScenes by providing dense semantic segmentation, instance-level annotations, and stereo/optical-flow resources suitable for benchmarking algorithms developed by labs such as MIT CSAIL, the Stanford AI Lab, and the University of Oxford.
ApolloScape contains thousands of high-resolution frames captured with RGB cameras and LiDAR sensors across urban settings in China, including recordings in and around Shanghai, under varied illumination, weather, and traffic conditions on roads also studied by projects at Tsinghua University and Zhejiang University. The corpus includes stereo pairs, sequential video clips, point clouds analogous to data used by teams at Waymo and Cruise, and panoramic sequences similar to those in Mapillary. Files and formats resemble standards established by the PASCAL VOC and COCO communities but extend annotation density to support the long-tail categories emphasized in research by Facebook AI Research, Google Research, and Microsoft Research.
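As a rough illustration of how such frame-plus-label data is typically consumed, the sketch below loads one RGB frame and its per-pixel label map. The directory names, file naming, and label encoding here are assumptions for illustration, not the official ApolloScape release layout:

```python
from pathlib import Path

import numpy as np
from PIL import Image

# Hypothetical layout for illustration; the actual ApolloScape release
# organizes frames by road, record, and camera, so adapt these paths.
ROOT = Path("apolloscape")

def load_frame(record: str, frame_id: str):
    """Load one RGB frame and its per-pixel semantic label map."""
    image = np.asarray(Image.open(ROOT / "ColorImage" / record / f"{frame_id}.jpg"))
    # Label maps are assumed to be single-channel PNGs of integer class IDs.
    labels = np.asarray(Image.open(ROOT / "Label" / record / f"{frame_id}.png"))
    return image, labels

image, labels = load_frame("Record001", "frame_000123")
print(image.shape, np.unique(labels))
```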
Annotations in ApolloScape provide pixel-level semantic segmentation, instance segmentation masks, 3D bounding boxes, lane markings, and depth maps, enabling comparisons with methods published by groups at Carnegie Mellon University, the University of California, Berkeley, and ETH Zurich. Label taxonomies cover vehicle classes akin to those used in Daimler project datasets, pedestrian and cyclist categories relevant to safety evaluations cited in National Highway Traffic Safety Administration studies, and road infrastructure elements that parallel schemas used by OpenStreetMap contributors. Annotation pipelines combined automated proposals with manual verification, similar to workflows behind LabelMe and Amazon Mechanical Turk-assisted efforts, with quality control practices inspired by publications from Google Brain and DeepMind.
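A minimal sketch of working with such annotations, assuming label maps are integer class-ID arrays and instance maps assign a nonzero ID per object (the class IDs shown are placeholders, not the ApolloScape taxonomy):

```python
import numpy as np

# Illustrative class-ID mapping only; the real ApolloScape taxonomy
# defines its own integer IDs for each category.
CLASS_NAMES = {0: "void", 1: "road", 2: "lane_marking", 3: "vehicle", 4: "pedestrian"}

def class_histogram(label_map: np.ndarray) -> dict:
    """Count annotated pixels per semantic class in one label map."""
    ids, counts = np.unique(label_map, return_counts=True)
    return {CLASS_NAMES.get(int(i), f"class_{i}"): int(c) for i, c in zip(ids, counts)}

def instance_masks(instance_map: np.ndarray):
    """Yield one boolean mask per object in an instance-ID map."""
    for inst_id in np.unique(instance_map):
        if inst_id == 0:  # assume 0 marks pixels belonging to no instance
            continue
        yield int(inst_id), instance_map == inst_id
```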
ApolloScape supports benchmark tasks such as semantic segmentation, instance segmentation, lane detection, optical flow estimation, depth prediction, and 3D object detection, all commonly evaluated in papers at venues including CVPR, ICCV, and ECCV. Metrics mirror community standards: mean Intersection over Union (mIoU) for segmentation, as in PASCAL VOC comparisons; Average Precision (AP) for detection, as in COCO benchmarks; End-Point Error (EPE) for optical flow, as in MPI Sintel evaluations; and root-mean-square error (RMSE) for depth estimation, in line with Make3D assessments. Leaderboards enabled cross-comparisons with methods from institutions such as Facebook AI Research, Google Research, and academic groups at the University of Michigan.
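These metrics have standard definitions; the NumPy sketch below shows them in their simplest form. It is illustrative only: official evaluation scripts additionally specify the exact class set, ignore labels, and matching rules.

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union, averaged over classes with nonempty union."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

def end_point_error(flow_pred: np.ndarray, flow_gt: np.ndarray) -> float:
    """EPE: mean Euclidean distance between predicted and true flow vectors (HxWx2)."""
    return float(np.linalg.norm(flow_pred - flow_gt, axis=-1).mean())

def depth_rmse(depth_pred: np.ndarray, depth_gt: np.ndarray) -> float:
    """Root-mean-square error over pixels with valid (positive) ground-truth depth."""
    valid = depth_gt > 0
    return float(np.sqrt(np.mean((depth_pred[valid] - depth_gt[valid]) ** 2)))
```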
Data acquisition employed instrumented vehicles equipped with synchronized camera rigs and LiDAR units, following protocols comparable to those used by KITTI teams and industry projects at Waymo and Cruise. Geolocation and temporal metadata align with mapping and SLAM efforts documented by researchers at ETH Zurich and EPFL. Processing steps included rectification, stereo calibration, point-cloud registration, and annotation projection techniques used in pipelines described by Stanford AI Lab and MIT CSAIL publications. Privacy-preserving measures and anonymization strategies followed precedents set by datasets managed by Mapillary and corporate research groups at Apple and Google.
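The annotation-projection step can be illustrated with a standard pinhole-camera projection of LiDAR points into an image. The transform and intrinsics below are generic placeholders, not ApolloScape's published calibration format:

```python
import numpy as np

def project_points(points_lidar: np.ndarray, T_cam_lidar: np.ndarray, K: np.ndarray):
    """Project Nx3 LiDAR points into pixel coordinates of a calibrated camera.

    T_cam_lidar: 4x4 rigid transform from the LiDAR frame to the camera frame.
    K: 3x3 camera intrinsic matrix.
    """
    n = points_lidar.shape[0]
    homog = np.hstack([points_lidar, np.ones((n, 1))])   # Nx4 homogeneous points
    cam = (T_cam_lidar @ homog.T).T[:, :3]               # Nx3 in the camera frame
    in_front = cam[:, 2] > 0                             # keep points ahead of the camera
    uvw = (K @ cam[in_front].T).T                        # apply intrinsics
    pixels = uvw[:, :2] / uvw[:, 2:3]                    # perspective divide
    return pixels, in_front
```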
Benchmark evaluations using ApolloScape have been cited in numerous academic articles and technical reports produced by teams at Tongji University, Baidu Research, and external groups such as Tsinghua University, Zhejiang University, and Shanghai Jiao Tong University. Results on segmentation and detection tasks illustrate performance trends discussed alongside models such as ResNet, Mask R-CNN, and DeepLab, as well as optical-flow networks influenced by FlowNet and RAFT. Comparative studies often reference outcomes on ApolloScape when arguing improvements over baselines established on the Cityscapes, KITTI, and nuScenes benchmarks.
Researchers from universities and industry labs including Tongji University, Baidu Research, MIT, Stanford University, and University of Oxford have used ApolloScape to develop and evaluate perception systems, publish papers at CVPR, ICCV, and ECCV, and inform autonomous driving stacks similar to Apollo and systems under development at Waymo, Cruise, and Zoox. The dataset influenced dataset design choices in later efforts by nuTonomy collaborators and inspired annotation practices referenced by projects at Microsoft Research and Facebook AI Research.
Category:Computer vision datasets