Classify Image

Use this tool to categorize images based on a pre-trained classifier.

A classifier is a deep learning model that has been trained to assign one class out of a given set of classes to an image. For each class, a confidence value is determined that indicates how likely it is that the image belongs to the respective class. The class with the highest confidence value is assigned to the image.

If a deep learning model file is available, you can immediately use it in this MERLIC tool. Otherwise, you first have to train a model.

Training the Deep Learning Model

You can use MVTec's Deep Learning Tool to train the deep learning model for this tool. The workflow is as follows:

  1. Define classes.
  2. Label the images accordingly.
  3. Train the model.

The training will result in a classifier file which can be used in this MERLIC tool to classify new images. The images will then be categorized into the previously defined classes.

While it is also possible to use MVTec HALCON to train the deep learning model, it is recommended to use the MVTec Deep Learning Tool. For more information on how to create a deep learning model, see the documentation of the MVTec Deep Learning Tool or watch the video tutorial for Deep Learning classification with the MVTec Deep Learning Tool.

Using a Classifier with "Out-of-Distribution Detection"

If you are using this MERLIC tool with a classifier that has been extended for "Out-of-Distribution Detection", the classifier can recognize samples that differ significantly from the training data and would therefore be classified incorrectly, for example, an object made of metal if the system was only trained on glass bottles. An additional tool result Out-of-Distribution Score is returned which indicates the degree of deviation from the trained classes. The tool also provides an additional tool parameter Out-of-Distribution Threshold which allows you to adjust the threshold for identifying Out-of-Distribution samples.

In the example graphics below, the classifier was trained to distinguish three classes. In the first image, the class with the highest confidence value is "apple". Therefore, the image is recognized as "apple".

In the second image, the object differs significantly from the training data. In addition to the confidence values for the three classes, the network also indicates that the image does not belong to any of the three trained classes.

To extend a trained classifier for "Out-of-Distribution Detection" (OOD), you have to use MVTec HALCON which supports performing the extension starting with HALCON version 24.11. For more information on "Out-of-Distribution Detection", see the documentation of MVTec HALCON.

Support of Artificial Intelligence Acceleration Interfaces (AI²)

MERLIC comes with Artificial Intelligence Acceleration Interfaces (AI²) for the NVIDIA® TensorRT™ SDK and the Intel® Distribution of OpenVINO™ toolkit. Thus, you can use AI accelerator hardware that is compatible with NVIDIA® TensorRT™ or the OpenVINO™ toolkit as processing unit to perform optimized inference on the respective hardware. This way, you can achieve significantly faster deep learning inference times. The respective hardware can be selected via the tool parameter "Processing Unit".

For more information on the installation and the prerequisites, see the AI² Interfaces for Tools with Deep Learning.

Parameters

Basic Parameters

Image:

This parameter represents the image that should be classified.

Classifier File:

This parameter defines the so-called classifier, the deep learning model (.hdl file format), that should be used for classifying the images.

If you want to use a specific AI accelerator hardware supported by MERLIC's AI² interfaces, you can use a deep learning model that was optimized for the respective device. This improves the loading time of the MVApp and reduces the amount of memory that is needed for the model.

We recommend using the MVTec Deep Learning Tool to train the model. When exporting the model, you can choose to optimize the model for an AI² interface. However, if you still want to use MVTec HALCON for the training, you have to consider the following restrictions for the preprocessing parameters.

This tool only supports deep learning models that were trained with the default values for the following preprocessing parameters:

  • NormalizationType = "none"
  • DomainHandling = "full_domain"

Show Heatmap:

This parameter defines whether the heatmap is shown. The heatmap indicates the image parts that are important for the decision of the classifier. The default setting is 1, meaning that the heatmap is shown. If the parameter is set to 0, the heatmap is not visible. The value of this parameter also determines whether the heatmap is part of the image that is returned in the result "Displayed Image".

The heatmap is not available if AI accelerator hardware is used as processing unit, even if the parameter "Show Heatmap" is set to 1. As soon as a CPU, GPU, or VPU that is compatible with the OpenVINO™ toolkit is selected at the parameter "Processing Unit", the heatmap is disabled. The same applies if an NVIDIA® GPU is selected.

Additional Parameters

Out-of-Distribution Threshold:

This parameter defines a threshold for identifying objects or samples that differ from the training data used for the classifier. If the "Out-of-Distribution Score" exceeds the defined threshold, the sample is predicted as "Out-of-Distribution". When choosing a classifier that supports "Out-of-Distribution Detection" (OOD), the threshold is initialized with the value that was determined for the classifier during the extension for OOD. If required, you can manually adjust the threshold at this parameter to a value in the range of 0.0 to 1.0.

This parameter can only be set when using a classifier that has been extended for "Out-of-Distribution Detection" (OOD). If a classifier without OOD is used, the parameter is grayed out and cannot be set.
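The decision rule described above can be sketched in a few lines of Python. This is purely illustrative; the function name and the example values are hypothetical and not part of any MERLIC API:

```python
# Illustrative sketch of the Out-of-Distribution decision rule.
# Function name and example values are hypothetical, not a MERLIC API.
def is_out_of_distribution(ood_score: float, threshold: float) -> bool:
    """A sample is predicted as Out-of-Distribution if its score exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in the range 0.0 to 1.0")
    return ood_score > threshold

print(is_out_of_distribution(0.9, 0.5))  # sample deviates strongly -> True
print(is_out_of_distribution(0.2, 0.5))  # sample close to training data -> False
```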

Processing Unit:

This parameter defines the device used for processing the images. The parameter is set to "auto" by default. In this mode, MERLIC tries to choose a suitable GPU as processing unit because it usually performs better than the CPU. However, this requires at least 4 GB of available memory on the respective GPU. If no suitable GPU is found, the CPU is used as fallback.

You can also choose the processing unit manually. Click on the parameter to select the device from the list of all available processing units. If you are choosing a GPU as processing unit, we recommend checking that enough memory is available for the used deep learning model. Otherwise, undesirable effects such as slower inference times might occur.

MERLIC also supports the use of AI accelerator hardware that is compatible with the NVIDIA® TensorRT™ SDK or the OpenVINO™ toolkit:

  • NVIDIA® GPUs
  • CPUs, Intel® GPUs, Intel® VPUs (MYRIAD and HDDL) with support of the OpenVINO™ toolkit

The respective devices are marked either with the prefix "TensorRT(TM)" or "OpenVINO(TM)". If you select a device that supports NVIDIA® TensorRT™ or the OpenVINO™ toolkit, the memory will be initialized on the device via the respective plug-in for the AI² interface.

As soon as an AI accelerator hardware has been selected as processing unit, the optimization of the deep learning model is started. After the optimization, all parameters that represent model parameters will be internally set to read-only. Thus, their values cannot be changed anymore as long as the selected AI accelerator is used as processing unit. To change the parameters, you first have to change the processing unit to a different one without any AI acceleration. After setting the parameters, you can set the processing unit back to the respective AI accelerator hardware.

CPUs with support of the OpenVINO™ toolkit can be used without any additional installation steps. They will be automatically available in the list of available processing units. If multiple processing units with the same name are available, an index number is assigned to their name. The same applies to GPUs that are supported by the NVIDIA® TensorRT™ SDK.

To use GPUs and VPUs with the support of the OpenVINO™ toolkit as processing unit, the Intel® Distribution of OpenVINO™ toolkit must be installed on your computer and MERLIC must be started in an OpenVINO™ toolkit environment. See the topic AI² Interfaces for Tools with Deep Learning for more detailed information on the prerequisites.

Besides the optimization via AI accelerator hardware, MERLIC supports further dynamic optimizations via the NVIDIA® CUDA® Deep Neural Network (cuDNN). This optimization can be enabled via the MERLIC preferences in the MERLIC Creator. For more information, see the topic MERLIC Preferences.

Precision:

This parameter defines the data type that is used internally for the optimization of the deep learning model for inference, i.e., it defines the precision to which the model is converted. It is set to "high" by default.

The following model precisions are supported in this tool:

  • high: The deep learning model is converted to a precision of "float32".
  • medium: The deep learning model is converted to a precision of "float16".

Most processing units support both precisions. However, some processing units may support only one of them. In this case, only the supported precision is available at the parameter as soon as the respective device has been selected at the parameter "Processing Unit". If the processing unit is selected automatically, i.e., if "Processing Unit" is set to "auto", only the precision "high" is available.
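The practical effect of the two precisions can be sketched with a round-trip through IEEE 754 half precision. This is a generic illustration of float32 vs. float16 rounding, not a MERLIC function:

```python
import struct

# Illustrative sketch: the effect of converting a value from "high"
# (float32) to "medium" (float16) precision. Not a MERLIC API.
def to_float16(value: float) -> float:
    """Round-trip a value through IEEE 754 half precision (binary16)."""
    return struct.unpack("e", struct.pack("e", value))[0]

weight = 0.1234567
print(f"original: {weight}")
print(f"float16:  {to_float16(weight)}")  # slightly rounded, ~3 decimal digits
```

The lower "medium" precision trades a small rounding error for faster inference and a smaller memory footprint on devices that support float16 natively.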

Number of Detected Classes:

This parameter defines the maximum number of classes that will be returned for the result "Detected Classes". The parameter is set to 1 by default. This means that only one class, i.e., the one with the highest confidence, will be returned in "Detected Classes". You can choose to query up to 5 classes. The resulting classes in "Detected Classes" will be sorted by their confidence values.

This parameter also defines the maximum number of values that are returned in the results "Confidences" and "Detected Class IDs" because their values refer to the classes in "Detected Classes". For more information about these results, see the respective sections below.

The selected parameter value does not influence the performance or run time, as it only controls the output of the result.
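The selection behavior described above can be sketched as a sort-and-truncate step. The class names, confidence values, and the helper function are made up for illustration and are not part of any MERLIC API:

```python
# Illustrative sketch of how "Number of Detected Classes" limits the output.
# Names, values, and the helper function are hypothetical, not a MERLIC API.
def top_k_classes(classes, confidences, k=1):
    """Return up to k (class, confidence) pairs sorted by descending confidence."""
    ranked = sorted(zip(classes, confidences), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

classes = ["apple", "banana", "orange"]
confidences = [0.2, 0.7, 0.1]
print(top_k_classes(classes, confidences, k=2))
# → [('banana', 0.7), ('apple', 0.2)]
```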

Results

Basic Results

Detected Classes:

This result returns the names of the classes to be distinguished. The number of classes that are returned depends on the parameter "Number of Detected Classes". The class names are returned as strings. In case of multiple classes, they are returned in a tuple sorted by the respective confidence values starting with the class with the highest confidence. The respective confidences and IDs of the classes are returned in the results "Confidences" and "Detected Class IDs".

Confidences:

This result indicates how likely it is that the image belongs to each of the distinguished classes. The maximum confidence is 1. The highest confidence wins, i.e., it determines the class. For example, if you have three possible classes [apple, banana, orange] and the resulting confidence values are [0.7, 0.2, 0.1], the currently processed image is classified as "apple".
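The winner-takes-all decision from the example above can be written in one line of Python. The class names and values are taken from the example, not from any MERLIC API:

```python
# Illustrative sketch: pick the class with the highest confidence.
# Class names and values are from the example above, not a MERLIC API.
classes = ["apple", "banana", "orange"]
confidences = [0.7, 0.2, 0.1]

# Pair each class with its confidence and select the maximum.
detected_class = max(zip(classes, confidences), key=lambda pair: pair[1])[0]
print(detected_class)  # → apple
```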

Tool State:

"Tool State" returns information about the state of the tool and can thus be used for error handling. For more information, see Tool State Result.

Additional Results

Heatmap:

This result represents the heatmap as an image which indicates distinguishing features that are crucial for the classification.

Displayed Image:

This result represents the overlay of the processing image and the heatmap. As the processing image shows through the heatmap, you can see more clearly which image parts were decisive for the classification and where they are located in the image. However, the heatmap is only shown if the parameter "Show Heatmap" is set to 1. If "Show Heatmap" is set to 0, the result returns only the processing image without the heatmap.

Detected Class IDs:

This result returns the IDs of all classes to be distinguished. The number of elements in the integer tuple depends on the parameter "Number of Detected Classes".

All Classes:

This result returns the names of all available classes that are defined in the selected classifier.

All Class IDs:

This result returns the IDs of all available classes that are defined in the selected classifier.

Training Image Width:

This result returns the image width that has been used for training the classifier.

Training Image Height:

This result returns the image height that has been used for training the classifier.

Used Processing Unit:

This result returns the processing unit that was used in the last iteration. You can use this result to check which processing unit was actually used, for example, if the parameter "Processing Unit" is set to "auto", or to verify that the correct one was used.

Precision Data Type:

This result returns the data type that was used internally for the optimization of the deep learning model for inference. You can use this result to check if the correct precision was used in case any problems occur.

If the parameter "Precision" is set to "high", the deep learning model should be converted to a precision of "float32". Therefore, this result is expected to return the data type "float32". If the parameter "Precision" is set to "medium", the deep learning model should be converted to a precision of "float16". In this case, the expected value for this result is the data type "float16". In case any problem occurred during an iteration of your MVApp, you could check if this result returns a different data type than expected and also have a look at the log file for more information. See the topic Logging for more information about the log files.

Out-of-Distribution Score:

This result returns a value indicating how much the sample differs from the trained classes. The higher this score, the more likely it is that the sample is "Out-of-Distribution". If the score exceeds the threshold defined in "Out-of-Distribution Threshold", the sample is predicted as "Out-of-Distribution" and a warning will be shown at the tool.

This result is only provided when using a classifier that has been extended for "Out-of-Distribution Detection" (OOD). If a classifier without OOD is used, the result is empty.

Processing Time:

This result returns the duration of the most recent execution of the tool in milliseconds. As an additional result, it is hidden by default but can be displayed via the button beside the tool results. For more information, see the section Processing Time in the tool reference overview.

Application Examples

This tool is used in the following MERLIC Vision App examples:

  • classify_pills.mvapp
  • classify_and_inspect_wood.mvapp