Anomaly Detection


This chapter explains how to use anomaly detection based on deep learning.

With anomaly detection we want to detect whether or not an image contains anomalies. An anomaly means something deviating from the norm, something unknown.

A possible example for anomaly detection: Every pixel of the input image gets assigned a value that indicates how likely the pixel is to be an anomaly. The worm is not part of the worm-free apples the model has seen during training and therefore its pixels get a much higher score.

An anomaly detection model learns common features of images without anomalies. The trained model infers how likely it is that an input image contains only learned features or whether it contains something different. The latter is interpreted as an anomaly. The inference result is returned as a gray value image whose pixel values indicate how likely the corresponding pixels in the input image show an anomaly.
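The idea of a per-pixel score image can be illustrated with a deliberately simple stand-in. The following Python sketch is not the deep learning model and does not use any HALCON call: it assumes a toy scorer that learns the mean gray value of each pixel from anomaly-free images and scores a new image by its absolute deviation from that mean. All function names and the tiny images are hypothetical.

```python
from statistics import mean

def fit_pixel_means(normal_images):
    """Average each pixel position over the anomaly-free training images."""
    height, width = len(normal_images[0]), len(normal_images[0][0])
    return [
        [mean(img[r][c] for img in normal_images) for c in range(width)]
        for r in range(height)
    ]

def score_image(image, pixel_means):
    """Return a score image: absolute deviation of each pixel from the learned mean."""
    return [
        [abs(value - pixel_means[r][c]) for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]

# Three tiny 2x3 images without anomalies (gray values), used for "training".
normal_images = [
    [[100, 102, 101], [99, 100, 100]],
    [[101, 100, 99], [100, 101, 100]],
    [[99, 101, 100], [101, 99, 100]],
]
pixel_means = fit_pixel_means(normal_images)

# A test image whose lower-right pixel shows something unseen during training.
test_image = [[100, 101, 100], [100, 100, 30]]
scores = score_image(test_image, pixel_means)
```

In this toy case only the lower-right pixel receives a high score, which mirrors how the worm pixels stand out in the example above.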

General Workflow

In this section, we describe the general workflow for an anomaly detection task based on deep learning.

Preprocess the data

This part is about how to preprocess your data.

  1. The information content of your dataset needs to be converted. This is done by the procedure

    • read_dl_dataset_anomaly.

    It creates a dictionary DLDataset which serves as a database and stores all necessary information about your data. For more information about the data and the way it is transferred, see the section “Data” below and the chapter Deep Learning / Model.

  2. Split the dataset represented by the dictionary DLDataset. This can be done using the procedure

    • split_dl_dataset.

  3. The network imposes several requirements on the images. These requirements (for example the image size and gray value range) can be retrieved with

    • get_dl_model_param.

    For this you need to read the model first by using

    • read_dl_model.

  4. Now you can preprocess your dataset. For this, you can use the procedure

    • preprocess_dl_dataset.

    In case of custom preprocessing, this procedure offers guidance on the implementation.

    To use this procedure, specify the preprocessing parameters, e.g., the image size. Store all parameters with their values in a dictionary DLPreprocessParam, for which you can use the procedure

    • create_dl_preprocess_param.

    We recommend saving this dictionary DLPreprocessParam in order to have access to the preprocessing parameter values later during the inference phase.
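The splitting and parameter-saving steps can be sketched in plain Python. This is a conceptual illustration only, not the implementation of split_dl_dataset or create_dl_preprocess_param: the function split_samples, the keys 'split', 'image_file_name', 'image_width', and 'image_height', and the 70 percent ratio are assumptions made for this example. It relies on the fact stated in the section “Data” below that training data must consist only of images without anomalies.

```python
import json
import random

def split_samples(samples, train_percent=70, seed=42):
    """Assign a 'split' entry to every sample dictionary.

    Only 'ok' samples may be placed in the training split; all remaining
    samples (including every 'nok' sample) are kept for validation.
    """
    rng = random.Random(seed)
    ok_indices = [i for i, s in enumerate(samples) if s["image_label"] == "ok"]
    rng.shuffle(ok_indices)
    num_train = len(ok_indices) * train_percent // 100
    train_indices = set(ok_indices[:num_train])
    for i, sample in enumerate(samples):
        sample["split"] = "train" if i in train_indices else "validation"
    return samples

# Hypothetical dataset: 10 images without anomalies, 3 images with anomalies.
samples = [{"image_file_name": f"ok_{i}.png", "image_label": "ok"} for i in range(10)]
samples += [{"image_file_name": f"nok_{i}.png", "image_label": "nok"} for i in range(3)]
split_samples(samples)

# Keep the preprocessing parameters in serialized form so that the very same
# values can be restored later for inference (values are placeholders).
preprocess_param = {"image_width": 320, "image_height": 320}
serialized = json.dumps(preprocess_param)
restored = json.loads(serialized)
```

Using a fixed seed keeps the split reproducible, and restoring the serialized parameters guarantees that inference images are preprocessed exactly as the training images were.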

Training of the model

This part explains how to train a DL anomaly detection model.

  1. Set the training parameters and store them in the dictionary TrainParam. This can be done using the procedure

    • create_dl_train_param.

  2. Train the model. This can be done using the procedure

    • train_dl_model.

    The procedure expects:

    • the model handle DLModelHandle

    • the dictionary DLDataset containing the data information

    • the dictionary TrainParam containing the training parameters

Evaluation of the trained model

In this part, we evaluate the anomaly detection model.

  1. Set the model parameters which may influence the evaluation.

  2. The evaluation can be done conveniently using the procedure

    • evaluate_dl_model.

    This procedure expects a dictionary GenParamEval with the evaluation parameters.

  3. The dictionary EvaluationResults holds the desired evaluation measures.
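To make the notion of an evaluation measure concrete, the following Python sketch counts image-level outcomes for a given score threshold. It is an illustration under assumptions and does not reproduce evaluate_dl_model or the contents of EvaluationResults: the function confusion_counts, its result keys, and the example scores are hypothetical. An image is treated as 'nok' when its score reaches the threshold.

```python
def confusion_counts(labels, scores, threshold):
    """Count image-level outcomes when a score >= threshold is predicted as 'nok'."""
    counts = {"true_nok": 0, "false_nok": 0, "true_ok": 0, "false_ok": 0}
    for label, score in zip(labels, scores):
        predicted = "nok" if score >= threshold else "ok"
        if predicted == "nok":
            key = "true_nok" if label == "nok" else "false_nok"
        else:
            key = "true_ok" if label == "ok" else "false_ok"
        counts[key] += 1
    return counts

# Ground truth labels and hypothetical image-level anomaly scores.
labels = ["ok", "ok", "ok", "nok", "nok"]
scores = [0.10, 0.20, 0.35, 0.30, 0.80]

counts = confusion_counts(labels, scores, threshold=0.3)
```

With these example values, one anomaly-free image is flagged by mistake while both anomalous images are found, which shows the kind of trade-off a threshold choice involves.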

Inference on new images

This part covers the application of a DL anomaly detection model. For a trained model, perform the following steps:

  1. Request the requirements the model imposes on the images using the operator

    • get_dl_model_param

    or the procedure

    • create_dl_preprocess_param_from_model.

  2. Set the model parameters described in the section “Specific Parameters” below, using the operator

    • set_dl_model_param.

  3. Generate a data dictionary DLSample for each image. This can be done using the procedure

    • gen_dl_samples_from_images.

  4. Every image has to be preprocessed the same way as for the training. For this, you can use the procedure

    • preprocess_dl_samples.

    If you saved the dictionary DLPreprocessParam during the preprocessing step, you can directly use it as input to specify all parameter values.

  5. Apply the model using the operator

    • apply_dl_model.

  6. Retrieve the results from the dictionary DLResult.

Data

We distinguish between data used for training, evaluation, and inference on new images.

As a basic concept, the model handles data by dictionaries, meaning it receives the input data from a dictionary DLSample and returns a dictionary DLResult (for inference) or DLTrainResult (for training). More information on the data handling can be found in the chapter Deep Learning / Model.

Classes

In anomaly detection there are exactly two classes:

  • 'ok', meaning without anomaly

  • 'nok', meaning with anomaly

These classes apply to the whole image as well as to single pixels.

Data for training

This dataset consists only of images without anomalies and the corresponding information. They have to be provided in a way the model can process them. Concerning the image requirements, find more information in the section “Images” below.

The training data is used to train a model for your specific task. With the aid of this data the model can learn which features the images without anomalies have in common.

Data for evaluation

This dataset should include images without anomalies, but it can also contain images with anomalies. Every image within this set needs a ground truth label image_label specifying the class of the image (see the section “Classes” above). This indicates whether the image shows an anomaly ('nok') or not ('ok').

Evaluating the model performance on finding anomalies can also be done visually on pixel level if an image anomaly_file_name is included in the DLSample dictionary. In this image every pixel indicates the class ID, that is, whether the corresponding pixel in the input image shows an anomaly (pixel value > 0) or not (pixel value equal to 0).

Scheme of anomaly_file_name. For visibility, gray values are used to represent numbers. (1) Input image. (2) The corresponding anomaly_file_name providing the class annotations, 0: 'ok' (white and light gray), 2: 'nok' (dark gray).

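The pixel-level convention (value 0 means no anomaly, any value > 0 means anomaly) can be expressed as a short Python sketch. The function anomaly_mask and the small annotation image are hypothetical and serve only to illustrate the rule; they are not part of any HALCON procedure.

```python
def anomaly_mask(annotation_image):
    """Map class IDs to a binary mask: 1 where the pixel value is > 0, else 0."""
    return [[1 if value > 0 else 0 for value in row] for row in annotation_image]

# Hypothetical 2x3 annotation image: two pixels are annotated as anomalous.
annotation_image = [[0, 0, 0], [0, 2, 2]]

mask = anomaly_mask(annotation_image)
num_anomalous_pixels = sum(map(sum, mask))
```
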
Images

The model poses requirements on the images, such as the dimensions, the gray value range, and the type. The specific values depend on the model itself. See the documentation of read_dl_model for the specific values of different models. For a read model they can be queried with get_dl_model_param. In order to fulfill these requirements, you may have to preprocess your images. Standard preprocessing of an entire sample, including the image, is implemented in preprocess_dl_samples. In case of custom preprocessing, this procedure offers guidance on the implementation.

Model output

As training output, the operator train_dl_model_anomaly_dataset will return a dictionary DLTrainResult with the lowest error obtained during training and the epoch in which this error was achieved.

As inference and evaluation output, the model will return a dictionary DLResult for every sample. For anomaly detection, this dictionary includes the following extra entries:

Scheme of anomaly_image. For visualization purposes, gray values are used to represent numbers. (1) The anomaly_file_name providing the class annotations, 0: 'ok' (white and light gray), 2: 'nok' (dark gray). (2) The corresponding anomaly_image.

Specific Parameters

For an anomaly detection model, the model parameters as well as the hyperparameters are set using set_dl_model_param. The model parameters are explained in more detail in get_dl_model_param. As the training is done utilizing the full dataset at once and not batch-wise, certain parameters, e.g., 'batch_size_multiplier', have no influence.

The model returns scores but classifies neither pixel nor image as showing an anomaly or not. For this classification, thresholds need to be given, setting the minimum score for a pixel or image to be regarded as anomalous. You can estimate possible thresholds using the procedure compute_dl_anomaly_thresholds. Applying these thresholds can be done with the procedure threshold_dl_anomaly_results. As results, the procedure adds the following (threshold-dependent) entries to the dictionary DLResult of a sample:

anomaly_class

The predicted class of the entire image (for the given threshold).

anomaly_class_id

ID of the predicted class of the entire image (for the given threshold).

anomaly_region

Region consisting of all the pixels that are regarded as showing an anomaly (for the given threshold, see the illustration below).

Scheme of anomaly_region. For visualization purposes, gray values are used to represent numbers. (1) The anomaly_image with the obtained pixel scores. (2) The corresponding anomaly_region.
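How thresholds turn scores into a class and a region can be sketched in Python as follows. This is a simplified illustration and not the implementation of threshold_dl_anomaly_results: the function threshold_scores, the separate image-level score passed as image_score, and all numeric values are assumptions. A score that reaches the threshold is regarded as anomalous, matching the minimum-score rule described above.

```python
def threshold_scores(anomaly_image, image_score, pixel_threshold, image_threshold):
    """Derive a region and an image-level class from scores and thresholds.

    The region is returned as a list of (row, column) coordinates of all
    pixels whose score is at least pixel_threshold.
    """
    region = [
        (r, c)
        for r, row in enumerate(anomaly_image)
        for c, score in enumerate(row)
        if score >= pixel_threshold
    ]
    anomaly_class = "nok" if image_score >= image_threshold else "ok"
    return {"anomaly_class": anomaly_class, "anomaly_region": region}

# Hypothetical 2x3 score image and image-level score.
anomaly_image = [[0.05, 0.10, 0.10], [0.10, 0.70, 0.90]]
result = threshold_scores(
    anomaly_image, image_score=0.90, pixel_threshold=0.5, image_threshold=0.5
)
```

Raising the pixel threshold shrinks the region, while raising the image threshold makes the image-level decision more conservative, so both values should be chosen on evaluation data rather than guessed.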

List of Operators

train_dl_model_anomaly_dataset
Train a deep learning model for anomaly detection.