
MVTec Industrial 3D Object Detection Dataset (MVTec ITODD)
A Dataset for 3D Object Recognition in Industry

Abstract
The MVTec Industrial 3D Object Detection Dataset (MVTec ITODD) is a public dataset for 3D object detection and pose estimation with a strong focus on industrial settings and applications.
The dataset consists of
- 28 objects and 3500 labeled scenes containing instances of these objects
- Five sensors (two 3D sensors and three grayscale cameras) observing each scene
More information can be found in this PDF file.
Please note: License Terms
The data is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
In particular, the dataset may not be used for commercial purposes. If you are unsure whether your application violates the non-commercial use clause of the license, please contact us.
Attribution
If you use the dataset in scientific work, please cite:
Bertram Drost, Markus Ulrich, Paul Bergmann, Philipp Härtinger, and Carsten Steger. Introducing MVTec ITODD — A Dataset for 3D Object Recognition in Industry; in: IEEE International Conference on Computer Vision (ICCV), 2200-2208, October 2017.
Evaluate
The MVTec ITODD dataset is now one of the datasets of the BOP (Benchmark for 6D Object Pose Estimation) challenge.
Download the MVTec ITODD Dataset
Due to the size of the files, the download is split into multiple parts. All parts can be extracted into the same directory. The base package must be downloaded. Depending on which modalities your method operates on, the other packages can be downloaded as required.
Note that an evaluation on all data is preferred. For example, a method that uses 3D input data should be evaluated on both the high quality and the low quality 3D data, while a method that works on image data should be evaluated on all three cameras. However, you can also evaluate only on selected data, which will be mentioned in the result list and should be noted in any publication.
- Base Package (required): Contains the CAD models, calibration information, a few ground truth poses, and other information. This package must always be downloaded. Download Base Package (150 MB)
- 3D Data: Contains X, Y, and Z (i.e., range) images of each scene. Because the 3D data was acquired with stereo sensors, the download also contains the left camera image of each scene. The camera image has approximately the same viewpoint as the 3D / range data. Data of two sensors with different 3D quality are available:
- 2D Grayscale Images: Contains the images taken with the three grayscale cameras. Two images were taken with each camera: one with and one without a projected pattern, which can be used, for example, for stereo reconstruction. Additionally, each image is provided both in its original form, which has optical distortions, and in a rectified form.
- Download 2D Images without distortion, without projected pattern:
- Download 2D Images without distortion, with projected pattern:
- Download 2D Images with distortion, without projected pattern:
- Download 2D Images with distortion, with projected pattern:
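The X, Y, and Z images in the 3D data packages store one coordinate per pixel, so they can be combined into a point cloud. The following is a minimal sketch in Python with NumPy and is not part of the dataset tooling. The function name `xyz_images_to_points` is hypothetical. The way invalid pixels are marked (non-finite values or all-zero coordinates) is an assumption and should be checked against the information in the base package.

```python
import numpy as np


def xyz_images_to_points(x_img, y_img, z_img):
    """Stack per-pixel X, Y, and Z images into an (N, 3) array of 3D points.

    Assumption (not confirmed by the dataset description): a pixel without a
    valid 3D measurement is either non-finite or has X = Y = Z = 0.
    """
    x = np.asarray(x_img, dtype=np.float64)
    y = np.asarray(y_img, dtype=np.float64)
    z = np.asarray(z_img, dtype=np.float64)
    if not (x.shape == y.shape == z.shape):
        raise ValueError("X, Y, and Z images must have the same shape")

    # One row per pixel: (x, y, z).
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # Keep only pixels that pass the assumed validity convention.
    finite = np.isfinite(points).all(axis=1)
    nonzero = np.any(points != 0.0, axis=1)
    return points[finite & nonzero]


# Tiny synthetic 2x2 example; real images would be loaded from the 3D package.
x_demo = np.array([[1.0, 0.0], [np.nan, 2.0]])
y_demo = np.array([[1.0, 0.0], [0.0, 2.0]])
z_demo = np.array([[5.0, 0.0], [3.0, 4.0]])
demo_points = xyz_images_to_points(x_demo, y_demo, z_demo)
```

How the images are read from disk depends on their file format, which is not described here. Any image reader that returns 2D arrays of coordinates can feed this function.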
If you have any questions or comments about the dataset, feel free to contact us via the form.