- Stanford Bunny: A Volumetric Method for Building Complex Models from Range Images [SIGGRAPH 1996]
- KITTI: Are we ready for autonomous driving? the KITTI vision benchmark suite [CVPR 2012]
- NYUV2: Indoor Segmentation and Support Inference from RGBD Images [ECCV 2012]
- FAUST: FAUST: Dataset and evaluation for 3D mesh registration [CVPR 2014]
- ICL-NUIM: A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM [ICRA 2014]
- Augmented ICL-NUIM: Robust Reconstruction of Indoor Scenes [CVPR 2015]
- ModelNet: 3D ShapeNets: A Deep Representation for Volumetric Shapes [cls; CVPR 2015]
- SUN RGB-D: SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite [det; CVPR 2015]
- SHREC15: SHREC’15 Track: Non-rigid 3D Shape Retrieval [Eurographics 2015]
- ShapeNetCore: ShapeNet: An Information-Rich 3D Model Repository [cls; arXiv 2015]
- ShapeNet Part: A Scalable Active Framework for Region Annotation in 3D Shape Collections [seg; SIGGRAPH Asia 2016]
- SceneNN: SceneNN: A Scene Meshes Dataset with aNNotations [3DV 2016]
- Oxford RobotCar: 1 Year, 1000km: The Oxford RobotCar Dataset [IJRR 2016]
- Redwood: A large dataset of object scans [arXiv 2016]
- S3DIS: 3D Semantic Parsing of Large-Scale Indoor Spaces [CVPR 2016], Joint 2D-3D-Semantic Data for Indoor Scene Understanding [seg; arXiv 2017]
- 3DMatch: 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions [CVPR 2017]
- SUNCG: Semantic Scene Completion from a Single Depth Image [CVPR 2017]
- ScanNet: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes [seg, det; CVPR 2017]
- Semantic3D: Semantic3D.net: A new Large-scale Point Cloud Classification Benchmark [arXiv 2017]
- SemanticKITTI: SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [seg; ICCV 2019]
- ScanObjectNN: Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data [ICCV 2019]
- PartNet: PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding [CVPR 2019]
- Completion3D: TopNet: Structural Point Cloud Decoder [completion; CVPR 2019]
- Argoverse: Argoverse: 3D Tracking and Forecasting with Rich Maps [CVPR 2019]
- Waymo Open Dataset: Scalability in Perception for Autonomous Driving: Waymo Open Dataset [CVPR 2020]
- nuScenes: nuScenes: A multimodal dataset for autonomous driving [det, tracking; CVPR 2020]
- SensatUrban: Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges [CVPR 2021], SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds [IJCV 2022]
- Waymo Open Motion Dataset: Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset [arXiv 2104]
- Panoptic nuScenes: Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking [arXiv 2109]
- BuildingNet: BuildingNet: Learning to Label 3D Buildings [ICCV 2021 Oral]
- ARKitScenes: ARKitScenes - A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data [NeurIPS 2021]
- CODA: CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving [arXiv 2203]
- STPLS3D: STPLS3D: A Large-Scale Synthetic and Real Aerial Photogrammetry 3D Point Cloud Dataset [arXiv 2203]
- TO-Scene: A Large-scale Dataset for Understanding 3D Tabletop Scenes [arXiv 2203]
- Omni3D: Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild [arXiv 2207]
- Rope3D: Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [CVPR 2022]
- DAIR-V2X: DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection [CVPR 2022]
- ONCE-3DLanes: ONCE-3DLanes: Building Monocular 3D Lane Detection [CVPR 2022]
- Ithaca365: Ithaca365: Dataset and Driving Perception under Repeated and Challenging Weather Conditions [CVPR 2022]
- OpenLane: PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark [ECCV 2022]
- HM3DSEM: Habitat-Matterport 3D Semantics Dataset [arXiv 2210]
- Objaverse: Objaverse: A Universe of Annotated 3D Objects [arXiv 2212]
- OmniObject3D: OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation [arXiv 2301]
- OpenOccupancy: OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception [arXiv 2303]
- V2V4Real: V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception [CVPR 2023]
- CVPR2023-Occupancy-Prediction-Challenge: https://github.com/CVPR2023-3D-Occupancy-Prediction/CVPR2023-3D-Occupancy-Prediction [CVPR2023 Challenge]
- WOMD-LiDAR: WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting [arXiv 2304]
- Occ3D: Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving [arXiv 2304]
- SSCBench: SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [arXiv 2306]
- UniG3D: UniG3D: A Unified 3D Object Generation Dataset [arXiv 2306]
- WaterScenes: WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset for Autonomous Driving on Water Surfaces [arXiv 2307]
The goal of Princeton's ModelNet project is to provide researchers in computer vision, computer graphics, robotics, and cognitive science with a comprehensive, clean collection of 3D CAD models. The homepage is https://modelnet.cs.princeton.edu, and the data was first released with the paper 3D ShapeNets: A Deep Representation for Volumetric Shapes [CVPR 2015].
From the full collection, 40 common categories and 10 common categories were selected to form two subsets, ModelNet40 and ModelNet10; both subsets are also available in orientation-aligned versions. ModelNet40 is the one used most often in experiments and is distributed in the following three forms:
| Dataset | modelnet40_normal_resampled.zip | modelnet40_ply_hdf5_2048.zip | ModelNet40.zip |
|---|---|---|---|
| File size | 1.71 GB | 435 MB | 2.04 GB |
| Content | point: x, y, z, normal_x, normal_y, normal_z; shape: 10k points | point: x, y, z, normal_x, normal_y, normal_z; shape: 2048 points | OFF format (see here) |
| Train / test split | 9843 / 2468 | 9840 / 2468 | 9844 / 2468 |
| Download | modelnet40_normal_resampled.zip | modelnet40_ply_hdf5_2048.zip | ModelNet40.zip |
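Whichever distribution is used, most pipelines preprocess each ModelNet shape the same way: sample a fixed number of points and rescale them into the unit sphere. A minimal NumPy sketch of that step (the function names are illustrative, not from any official toolkit):

```python
import numpy as np

def normalize_unit_sphere(points):
    """Center a point cloud on its centroid and scale it into the unit sphere."""
    points = points - points.mean(axis=0)           # centroid to origin
    scale = np.max(np.linalg.norm(points, axis=1))  # farthest point from origin
    return points / scale

def sample_points(points, n, rng=None):
    """Randomly sample n points (with replacement if the cloud has fewer)."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(points), n, replace=len(points) < n)
    return points[idx]

# Example: reduce a synthetic 10k-point shape to 1024 normalized points,
# roughly what training on modelnet40_normal_resampled looks like.
cloud = np.random.default_rng(0).normal(size=(10000, 3))
prepared = normalize_unit_sphere(sample_points(cloud, 1024, rng=0))
```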
ShapeNet is a richly annotated, large-scale dataset of 3D models, released with ShapeNet: An Information-Rich 3D Model Repository [arXiv 2015]. It is a joint effort of researchers at Princeton, Stanford, and TTIC; the official homepage is shapenet.org. ShapeNet comprises the ShapeNetCore and ShapeNetSem subsets.
ShapeNet Part selects 16 categories from ShapeNetCore and annotates them with per-point part labels, targeting point cloud part segmentation. It was published with A Scalable Active Framework for Region Annotation in 3D Shape Collections [SIGGRAPH Asia 2016]; the official homepage is ShapeNet Part. Several versions exist, downloadable as shapenetcore_partanno_v0.zip (1.08 GB) and shapenetcore_partanno_segmentation_benchmark_v0.zip (635 MB). The second one, the segmentation benchmark, is described below:
As the table below shows, ShapeNet Part covers 16 categories with 50 parts and 16846 samples in total. The class distribution is imbalanced: Table has 5263 samples while Earphone has only 69. Each sample contains a little over 2000 points on average, so this is a small dataset. The official split contains 12137 training, 1870 validation, and 2874 test samples, 16881 in total. [Note: this does not match the 16846 counted in the table below; the discrepancy turns out to be 35 samples duplicated across the train/validation/test splits.]
| Category | nparts/shape | nsamples | avg. npoints/shape |
|---|---|---|---|
| Airplane | 4 | 2690 | 2577 |
| Bag | 2 | 76 | 2749 |
| Cap | 2 | 55 | 2631 |
| Car | 4 | 898 | 2763 |
| Chair | 4 | 3746 | 2705 |
| Earphone | 3 | 69 | 2496 |
| Guitar | 3 | 787 | 2353 |
| Knife | 2 | 392 | 2156 |
| Lamp | 4 | 1546 | 2198 |
| Laptop | 2 | 445 | 2757 |
| Motorbike | 6 | 202 | 2735 |
| Mug | 2 | 184 | 2816 |
| Pistol | 3 | 275 | 2654 |
| Rocket | 3 | 66 | 2358 |
| Skateboard | 3 | 152 | 2529 |
| Table | 3 | 5263 | 2722 |
| Total | 50 | 16846 | 2616 |
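Reading one sample from the segmentation benchmark can be sketched as follows, assuming the shapenetcore_partanno_segmentation_benchmark_v0 layout in which each shape is a points/<id>.pts file (one "x y z" per line) paired with a points_label/<id>.seg file (one integer part label per line); the helper name is illustrative:

```python
import numpy as np

def load_shapenet_part_sample(pts_path, seg_path):
    """Load one ShapeNet Part shape: xyz points plus per-point part labels."""
    points = np.loadtxt(pts_path, dtype=np.float32)  # (N, 3) coordinates
    labels = np.loadtxt(seg_path, dtype=np.int64)    # (N,) part labels
    assert len(points) == len(labels), "points and labels must align"
    return points, labels
```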
S3DIS is a 3D indoor scene dataset, mainly used for point cloud semantic segmentation. Homepage: http://buildingparser.stanford.edu/dataset.html (currently unreachable for me, so background notes from the official page are omitted). The S3DIS papers are Joint 2D-3D-Semantic Data for Indoor Scene Understanding [arXiv 2017] and 3D Semantic Parsing of Large-Scale Indoor Spaces [CVPR 2016]. S3DIS was collected from 6 areas across 3 buildings: Area 1, Area 3, and Area 6 belong to building 1; Area 2 and Area 4 to building 2; Area 5 to building 3. Three download formats are in common use:
- Stanford3dDataset_v1.2_Aligned_Version.zip, used by e.g. RandLA-Net
- Stanford3dDataset_v1.2.zip, used by e.g. CloserLook3D
- indoor3d_sem_seg_hdf5_data.zip, used by e.g. PointNet
Stanford3dDataset_v1.2_Aligned_Version.zip and Stanford3dDataset_v1.2.zip both contain the complete scenes, with 6 values per point (x, y, z, r, g, b). indoor3d_sem_seg_hdf5_data.zip instead splits each scene into 1m x 1m blocks: the whole dataset becomes 23585 blocks of 4096 points each, with 9 values per point: x, y, z, r, g, b plus 3 normalized coordinates giving the point's position within its scene.
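The block-splitting step can be sketched as below. This is only an illustration of the idea, not the official conversion script; the exact conventions (corner vs. center offsets, color scaling) vary between codebases, and the function name is made up:

```python
import numpy as np

def room_to_block_features(room_xyzrgb, block_xy_min, block_size=1.0):
    """Build 9-dim per-point features for one 1m x 1m block of an S3DIS room.

    Sketch only. Here: dims 0-2 are xyz shifted so the block's xy corner is
    the origin, dims 3-5 are rgb rescaled from [0, 255] to [0, 1], and
    dims 6-8 are xyz divided by the room's maximum extent (normalized
    position within the scene).
    """
    xyz = room_xyzrgb[:, :3]
    room_max = xyz.max(axis=0)

    # Select the points whose xy footprint falls inside this block.
    in_block = (
        (xyz[:, 0] >= block_xy_min[0]) & (xyz[:, 0] < block_xy_min[0] + block_size)
        & (xyz[:, 1] >= block_xy_min[1]) & (xyz[:, 1] < block_xy_min[1] + block_size)
    )
    pts = room_xyzrgb[in_block]

    local = pts[:, :3] - np.array([block_xy_min[0], block_xy_min[1], 0.0])
    color = pts[:, 3:6] / 255.0
    normalized = pts[:, :3] / room_max
    return np.concatenate([local, color, normalized], axis=1)  # (M, 9)
```

In practice each block is then randomly sampled (or padded) to exactly 4096 points before being written to the HDF5 shards.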
The statistics below were computed from Stanford3dDataset_v1.2.zip and may differ slightly from the numbers in the papers. The 6 areas contain 272 rooms covering 11 scene types (in parentheses: number of rooms, typical room size in points): office (156, 0.87M), conference room (11, 1.42M), hallway (61, 1.22M), auditorium (2, 8.17M), open space (1, 1.97M), lobby (3, 2.42M), lounge (3, 1.46M), pantry (3, 0.58M), copy room (2, 0.52M), storage (19, 0.35M), and WC (11, 0.70M). The annotations cover 14 semantic categories, listed in the table below; these are imbalanced as well, e.g. wall has 1547 instances while sofa has only 55.
| Total | column | clutter | chair | window | beam | floor | wall | ceiling | door | bookcase | board | table | sofa | stairs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9833 | 254 | 3882 | 1363 | 168 | 159 | 284 | 1547 | 385 | 543 | 584 | 137 | 455 | 55 | 17 |
For further details, see the 3DMatch folder.