MVTec Densely Segmented Supermarket Dataset (MVTec D2S)

The Densely Segmented Supermarket (D2S) dataset is a benchmark for instance-aware semantic segmentation in an industrial domain. It contains 21,000 high-resolution images with pixel-wise labels of all object instances. The objects comprise groceries and everyday products from 60 categories. The benchmark is designed to resemble the real-world setting of an automatic checkout, inventory, or warehouse system. The training images only contain objects of a single class on a homogeneous background, while the validation and test sets are much more complex and diverse. To further benchmark the robustness of instance segmentation methods, the scenes are acquired under different lighting conditions, rotations, and backgrounds.

We ensure that there are no ambiguities in the labels and that every instance is labeled comprehensively. The annotations are pixel-precise and allow using crops of single instances for artificial data augmentation. The dataset covers several challenges highly relevant in the field, such as a limited amount of training data and a high diversity in the test and validation sets.

More information can be found in the corresponding paper.

Please note: License Terms

The data is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

In particular, it is not allowed to use the dataset for commercial purposes. If you are unsure whether or not your application violates the non-commercial use clause of the license, please contact us.

Download the MVTec D2S Dataset

For ease of use, the data is provided in the same format as the well-known COCO dataset.


  • Images: The ‘images’-folder contains all images, including the artificially augmented ones as described in the paper.
  • Annotations: Contains the annotations for the different training and validation splits.
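Since the annotations follow the COCO format, they can be read with the standard `json` module and indexed by image. The sketch below uses a minimal in-memory stand-in for an annotation file (the dictionary contents are illustrative, not taken from the actual dataset); the real D2S annotation files share this top-level structure of `images`, `annotations`, and `categories`.

```python
import json

# Hypothetical, minimal stand-in for a COCO-style annotation file.
# Real D2S annotation files follow the same top-level structure.
coco = {
    "images": [{"id": 42, "file_name": "example_image.jpg",
                "width": 1920, "height": 1440}],
    "annotations": [{"id": 1, "image_id": 42, "category_id": 18,
                     "bbox": [258.15, 41.29, 348.26, 243.78]}],
    "categories": [{"id": 18, "name": "example_category"}],
}

# Round-trip through JSON, as one would when loading a file from disk.
coco = json.loads(json.dumps(coco))

# Index the annotations by image id to collect all instances per image.
anns_by_image = {}
for ann in coco["annotations"]:
    anns_by_image.setdefault(ann["image_id"], []).append(ann)

print(len(anns_by_image[42]))  # number of annotated instances in image 42
```

For real files, replace the in-memory dictionary with `json.load(open(path))`, or use the `pycocotools` COCO API directly.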

D2S Amodal:


If you use the D2S dataset in scientific work, please cite:

Patrick Follmann, Tobias Böttger, Philipp Härtinger, Rebecca König, Markus Ulrich: MVTec D2S: Densely Segmented Supermarket Dataset; in: European Conference on Computer Vision (ECCV), 569-585, 2018. Download the paper

If you use the D2S Amodal dataset in scientific work, please cite:

Patrick Follmann, Rebecca König, Philipp Härtinger, Michael Klostermann: Learning to See the Invisible: End-to-End Trainable Amodal Instance Segmentation; in: IEEE Winter Conference on Applications of Computer Vision (WACV), January 2019. Download the paper

Test Set Evaluation

The D2S test annotations are not public. Please run the inference of your model on the test images and store the results in a JSON file.

If you send us the JSON file with your results, we will evaluate them against the test-set annotations. For this purpose, please get in touch with us via the contact form below.

The results should be in the standard COCO JSON format, i.e. the file should contain a list of result entries (each with image_id, category_id, segmentation (mask given as RLE) or bbox, and score), structured as in the following examples:

  • For instance segmentation masks (region given as RLE):
    [{"image_id": 42,
      "category_id": 18,
      "segmentation": {"counts": "...", "size": [1440, 1920]},
      "score": 0.959136}, ...]
  • For bounding box detection (bbox given as [x, y, w, h]):
    [{"image_id": 42,
      "category_id": 18,
      "bbox": [258.15, 41.29, 348.26, 243.78],
      "score": 0.959136}, ...]

If you have any questions or comments about the dataset, feel free to contact us via this form.

Contact Form MVTec D2S

Please note MVTec's privacy policy.