read_dl_model
— Read a deep learning model from a file.
read_dl_model( : : FileName : DLModelHandle)
The operator read_dl_model reads a deep learning model. Such models have to be in the HALCON format or in the ONNX format (see the reference below). Restrictions apply to the latter. As a result, the handle DLModelHandle is returned.
The model is loaded from the file FileName. The file is searched for in the directory $HALCONROOT/dl/ as well as in the current working directory. The default HALCON file extension for deep learning networks is '.hdl'.
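The two search locations can be illustrated with a small Python sketch. It is not part of HALCON: the helper names `candidate_paths` and `first_existing` are hypothetical, and the order in which the two directories are listed is an assumption, since the text does not state a search order.

```python
import os


def candidate_paths(file_name, halcon_root, working_dir):
    """List the two locations named above for a relative model file name.

    Assumption: the HALCON 'dl' directory is listed first; the actual
    search order is not stated in the text.
    """
    return [
        os.path.join(halcon_root, "dl", file_name),
        os.path.join(working_dir, file_name),
    ]


def first_existing(paths):
    """Return the first path that points to an existing file, else None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

A call such as `first_existing(candidate_paths('model.hdl', os.environ.get('HALCONROOT', ''), os.getcwd()))` shows which file such a lookup would pick on a given machine.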
Please note that the values of runtime-specific parameters are not written to file, see write_dl_model. As a consequence, when reading a model, these parameters are initialized with their default values, see get_dl_model_param.
For further explanations on deep learning models in HALCON, see the chapter Deep Learning / Model.
HALCON provides pretrained neural networks for classification and semantic segmentation. These neural networks are good starting points when training a custom network. They have been pretrained on a large image dataset. For anomaly detection, HALCON provides initial models.
The following network is provided for 3D Gripping Point Detection:
'pretrained_dl_3d_gripping_point.hdl'
The network expects up to 5 images of type real:
'image' : intensity (gray value) image
'x' : X-image (values need to increase from left to right)
'y' : Y-image (values need to increase from top to bottom)
'z' : Z-image (values need to increase from points close to the sensor to far points; this is for example the case if the data is given in the camera coordinate system)
'normals' : 2D mappings
Additionally, the network requires certain image properties (for all input images mentioned above). The corresponding values can be retrieved with get_dl_model_param. Here we list the default values:
'image_width' : 640
'image_height' : 480
The network architecture allows changes concerning the image dimensions.
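The orientation requirements for the 'x' and 'y' images can be checked coarsely before the data is passed on. The following Python sketch is only an illustration: images are modeled as nested lists of numbers, the function names are hypothetical, and only the border values of each row or column are compared, so invalid or missing pixels are not handled.

```python
def increases_left_to_right(image):
    """True if, in every row, the last value is larger than the first."""
    return all(row[0] < row[-1] for row in image)


def increases_top_to_bottom(image):
    """True if, in every column, the bottom value is larger than the top."""
    return all(top < bottom for top, bottom in zip(image[0], image[-1]))


# Tiny synthetic coordinate images (2 rows, 3 columns).
x_image = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
y_image = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
```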
The following networks are provided for anomaly detection:
'initial_dl_anomaly_medium.hdl'
This neural network is designed to be memory and runtime efficient.
The network expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values:
'image_width' : 480
'image_height' : 480
'image_num_channels' : 3
'image_range_min' : -2
'image_range_max' : 2
The network architecture allows changes concerning the image dimensions, but the sizes 'image_width' and 'image_height' have to be multiples of 32 pixels, resulting in a minimum of 32 pixels.
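The multiple-of-32 rule can be applied to a requested size as in the following Python sketch. The function name and the choice to round down rather than up are assumptions of the sketch, not HALCON behavior.

```python
def largest_valid_size(requested, multiple=32):
    """Round `requested` down to a multiple of `multiple`, but not below it."""
    return max(multiple, (requested // multiple) * multiple)
```

For example, a requested width of 500 pixels becomes 480, and any request below 32 becomes 32.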
'initial_dl_anomaly_large.hdl'
This neural network is assumed to be better suited for more complex anomaly detection tasks. This comes at the cost of being more time and memory demanding.
The network expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values:
'image_width' : 480
'image_height' : 480
'image_num_channels' : 3
'image_range_min' : -2
'image_range_max' : 2
The network architecture allows changes concerning the image dimensions, but the sizes 'image_width' and 'image_height' have to be multiples of 32 pixels, resulting in a minimum of 32 pixels.
The following networks are provided for Global Context Anomaly Detection:
'pretrained_dl_anomaly_global_context.hdl'
The network expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values:
'image_width' : 256
'image_height' : 256
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
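The parameters 'image_range_min' and 'image_range_max' describe the gray value range the network expects. As a rough illustration, the following Python sketch maps a single gray value linearly onto such a range. It assumes that the source image is a byte image with values from 0 to 255 and that a linear mapping is appropriate; both are assumptions of the sketch, and the function name is hypothetical.

```python
def rescale_gray_value(value, range_min, range_max, in_min=0.0, in_max=255.0):
    """Map `value` linearly from [in_min, in_max] onto [range_min, range_max]."""
    scale = (range_max - range_min) / (in_max - in_min)
    return range_min + (value - in_min) * scale
```

With the range -127.0 to 128.0, the mapping reduces to subtracting 127 from each byte value. With the range -2 to 2, the values are additionally compressed.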
The following pretrained neural networks are provided for classification and usable as backbones for detection:
'pretrained_dl_classifier_alexnet.hdl'
This neural network is designed for simple classification tasks. It is characterized by its convolution kernels in the first convolution layers, which are larger than those in other networks with comparable classification performance (e.g., 'pretrained_dl_classifier_compact.hdl'). This may be beneficial for feature extraction.
This classifier expects the images to be of the type real. Additionally, the network is designed for certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 29 pixels. There is no maximum image size, but large image sizes will increase the memory demand and the runtime significantly. Changing the image size will reinitialize the weights of the fully connected layers and therefore makes a retraining necessary.
Note that the runtime of this network can be improved by fusing the convolution and ReLU layers, see set_dl_model_param and the parameter 'fuse_conv_relu'.
'pretrained_dl_classifier_compact.hdl'
This neural network is designed to be more memory and runtime efficient.
The classifier expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
This network does not contain any fully connected layer. The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 15 pixels.
'pretrained_dl_classifier_enhanced.hdl'
This neural network has more hidden layers than 'pretrained_dl_classifier_compact.hdl' and is therefore assumed to be better suited for more complex classification tasks. This comes at the cost of being more time and memory demanding.
The classifier expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 47 pixels. There is no maximum image size, but large image sizes will increase the memory demand and the runtime significantly. Changing the image size will reinitialize the weights of the fully connected layers and therefore makes a retraining necessary.
'pretrained_dl_classifier_mobilenet_v2.hdl'
This classifier is a small and low-power model, which makes it more suitable for mobile and embedded vision applications.
The classifier expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 32 pixels. There is no maximum image size, but large image sizes will increase the memory demand and the runtime significantly.
On the GPU, the network architecture can benefit greatly from special optimizations, without which the network can be significantly slower.
'pretrained_dl_classifier_resnet18.hdl'
Like the neural network 'pretrained_dl_classifier_enhanced.hdl', this classifier is suited for more complex tasks. However, due to its special structure, it provides the advantage of making the training more stable and internally more robust. Compared to the neural network 'pretrained_dl_classifier_resnet50.hdl', it is less complex and has faster inference times.
The classifier expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 32 pixels. There is no maximum image size, but large image sizes will increase the memory demand and the runtime significantly. Despite the fully connected layer, a change of the image size does not lead to a reinitialization of the weights.
'pretrained_dl_classifier_resnet50.hdl'
Like the neural network 'pretrained_dl_classifier_enhanced.hdl', this classifier is suited for more complex tasks. However, due to its special structure, it provides the advantage of making the training more stable and internally more robust.
The classifier expects the images to be of the type real. Additionally, the network requires certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the classifier has been trained:
'image_width' : 224
'image_height' : 224
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions. 'image_width' and 'image_height' should not be less than 32 pixels. There is no maximum image size, but large image sizes will increase the memory demand and the runtime significantly. Despite the fully connected layer, a change of the image size does not lead to a reinitialization of the weights.
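The minimum sizes given for the classifiers (for example 15, 29, 32, or 47 pixels) can be checked before a model is resized. The following Python sketch uses hypothetical helper names. The second function estimates only the size of the input tensor, assuming 4 bytes per value; it says nothing about the total memory demand of a network.

```python
def check_image_size(width, height, minimum):
    """Return a list of messages for dimensions below `minimum`."""
    problems = []
    if width < minimum:
        problems.append(f"image_width {width} is below the minimum of {minimum}")
    if height < minimum:
        problems.append(f"image_height {height} is below the minimum of {minimum}")
    return problems


def input_tensor_bytes(width, height, num_channels, batch_size=1, bytes_per_value=4):
    """Size of one input batch in bytes (assumes 4-byte values by default)."""
    return width * height * num_channels * batch_size * bytes_per_value
```

Doubling both image dimensions quadruples the input size, which gives a first impression of why large images increase memory demand and runtime.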
The following pretrained neural networks are provided for semantic segmentation:
'pretrained_dl_edge_extractor.hdl'
This neural network is designed and pretrained for edge extraction. Consequently, this model is meant for two-class problems with one class for edges and one for background.
This network expects the images to be of the type real. Additionally, the network is designed for certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the model has been trained:
'image_width' : 512
'image_height' : 512
'image_num_channels' : 1
'image_range_min' : -127.0
'image_range_max' : 128.0
'num_classes' : 2
The network architecture allows changes concerning the image dimensions, but the sizes 'image_width' and 'image_height' have to be multiples of 16 pixels, resulting in a minimum of 16 pixels.
'pretrained_dl_segmentation_compact.hdl'
This neural network is designed to handle segmentation tasks with detailed structures. It uses only little memory and is runtime efficient.
The network architecture allows changes concerning the image dimensions, but requires a minimum 'image_width' and 'image_height' of 21 pixels.
'pretrained_dl_segmentation_enhanced.hdl'
This neural network has more hidden layers than 'pretrained_dl_segmentation_compact.hdl' and is therefore better suited for segmentation tasks including more complex scenes.
The network architecture allows changes concerning the image dimensions, but requires a minimum 'image_width' and 'image_height' of 47 pixels.
The following pretrained neural networks are provided for Deep OCR:
'pretrained_deep_ocr_detection.hdl'
This neural network is the default pretrained detection component of a Deep OCR model, but can be retrained, too. It is designed to detect words in images.
This network expects the images to be of the type real. Additionally, the network is designed for certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the model has been trained:
'image_width' : 1024
'image_height' : 1024
'image_num_channels' : 3
'image_range_min' : -127.0
'image_range_max' : 128.0
The network architecture allows changes concerning the image dimensions 'image_width' and 'image_height' .
'pretrained_deep_ocr_detection_compact.hdl'
This neural network is a more memory and runtime efficient pretrained network that can be used as detection component of a Deep OCR model. It is designed to detect words in images, and it can be retrained as well.
Regarding the input images and image dimensions, this network has the same requirements as the default model 'pretrained_deep_ocr_detection.hdl'.
'pretrained_deep_ocr_recognition.hdl'
This neural network is the default pretrained recognition component of a Deep OCR model, but can be retrained, too. It is designed to recognize words in images that are cropped to a single word.
This network expects the images to be of the type real. Additionally, the network is designed for certain image properties. The corresponding values can be retrieved with get_dl_model_param. Here we list the default values with which the model has been trained:
'image_width' : 120
'image_height' : 32
'image_num_channels' : 1
'image_range_min' : -1.0
'image_range_max' : 1.0
The network architecture allows changes concerning the image width 'image_width' . The image height 'image_height' cannot be changed. The parameter 'image_width' is very important: its value can be decreased or increased to adapt to the expected lengths of words, e.g., due to the average width per character. A bigger 'image_width' will consume more time and memory resources. The image width 'image_width' may be changed after training.
You can read in an ONNX model, but there are some points to consider.
When reading ONNX models with read_dl_model, some restrictions apply:
Version 1.8.1 of the ONNX specification is supported. This means only operators up to ONNX operator set version (OpSetVersion) 13 are supported. For operators with a higher OpSetVersion there is no guarantee that they can be supported. Further limitations are listed below.
Only 32 bit floating point tensors are supported.
Only models ending with a SoftMax layer are automatically recognized as classifiers. All other models are treated as generic models, i.e., models of 'type' = 'generic'. set_dl_model_param can be used to change the model type.
The input graph nodes (images) must be of shape dimension 4: Number of images (='batch_size' ), 'num_channels' , 'image_height' , and 'image_width' .
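The restrictions listed above can be collected into a simple pre-check. The following Python sketch works on plain values that are assumed to have been extracted from the ONNX file beforehand; it does not parse ONNX itself, and the function name and the data type strings are hypothetical.

```python
MAX_OPSET_VERSION = 13  # highest operator set version named in the text


def check_onnx_summary(opset_version, tensor_dtypes, input_shape):
    """Return a list of potential problems for reading the model."""
    problems = []
    if opset_version > MAX_OPSET_VERSION:
        problems.append(
            f"OpSetVersion {opset_version} is above {MAX_OPSET_VERSION}: "
            "support is not guaranteed"
        )
    unsupported = sorted(set(tensor_dtypes) - {"float32"})
    if unsupported:
        problems.append(f"unsupported tensor types: {', '.join(unsupported)}")
    if len(input_shape) != 4:
        problems.append(
            "input must have 4 dimensions: "
            "(batch_size, num_channels, image_height, image_width)"
        )
    return problems
```

An empty result only means that none of these three points is violated; the operator-specific restrictions still apply.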
After reading an ONNX model with read_dl_model, some network transformations are executed automatically:
Every non-global pooling layer with a resulting feature map of size 1x1 is converted to a global pooling layer. Doing so enables resizable input images. For more information about pooling layers and their possible modes of operation, see the “Solution Guide on Classification”.
Layer pairs consisting of a convolution layer without activation and a directly connected activation layer with ReLU activation are fused. This requires that the output of the convolution layer is used only as input for the activation layer. As a result, a convolution layer with activation mode ReLU is obtained. For more information about layers and their possible modes of operation, see the “Solution Guide on Classification”.
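Whether a pooling layer produces a feature map of size 1x1 depends on its input size, kernel, stride, and padding. The following Python sketch uses the common floor-based output size formula with a square kernel and symmetric padding. The exact criterion applied internally by read_dl_model is not described here, so the sketch is an illustration only, and its function names are hypothetical.

```python
def pooled_size(input_size, kernel, stride, padding=0):
    """Output length of a pooling layer along one axis (floor rounding)."""
    return (input_size + 2 * padding - kernel) // stride + 1


def yields_1x1_feature_map(height, width, kernel, stride, padding=0):
    """True if the pooling output has size 1x1 for the given input size."""
    return (
        pooled_size(height, kernel, stride, padding) == 1
        and pooled_size(width, kernel, stride, padding) == 1
    )
```

For example, a 7x7 kernel applied to a 7x7 feature map yields a 1x1 output, whereas a 2x2 kernel with stride 2 applied to a 14x14 feature map yields 7x7.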
ONNX models with the following operations can be read by read_dl_model:
'Add' : No restrictions.
'ArgMax' : The following restrictions apply:
attribute 'axis' : The value must be 1.
attribute 'keepdims' : The value must be 1.
attribute 'select_last_index' : The value must be 0.
'AveragePool' : The following restrictions apply:
attribute 'count_include_pad' : The value must be 0.
'BatchNormalization' : No restrictions.
'Clip' : The following restrictions apply:
attribute 'min' : The value must be 0.
attribute 'max' : The value must be greater than 0 and less than maximum float number.
'Concat' : No restrictions.
'Constant' : The following restrictions apply:
attribute 'sparse_value' : The attribute is not supported.
attribute 'value' : All entries in the tensor have to be identical.
attribute 'value_floats' : The attribute is not supported.
attribute 'value_ints' : The attribute is not supported.
attribute 'value_string' : The attribute is not supported.
attribute 'value_strings' : The attribute is not supported.
'Conv' : The following restrictions apply:
attribute 'pads' : Padding values greater than or equal to kernel size are not supported.
'ConvTranspose' : The following restrictions apply:
attribute 'dilations' : Only the value '(1, 1)' (no dilations) is supported.
attribute 'group' : Only the value 1 is supported (no grouped transposed convolution).
attribute 'kernel_shape' : Only symmetric kernel shapes are supported.
attribute 'output_padding' : See restrictions mentioned in create_dl_layer_transposed_convolution.
attribute 'output_shape' : The attribute is not supported.
attribute 'pads' : Padding values greater than or equal to kernel size are not supported.
attribute 'strides' : Only symmetric strides are supported.
'DepthToSpace' : The following restrictions apply:
attribute 'mode' : The value must be 'CRD' .
'Dropout' : No restrictions.
'Gemm' : The following restrictions apply:
attribute 'alpha' : The value must be 1.
attribute 'beta' : The value must be 1.
attribute 'transA' : The value must be 0.
'GlobalAveragePool' : No restrictions.
'GlobalMaxPool' : The following restrictions apply:
attribute 'dilations' : The value must be 1.
'LeakyRelu' : No restrictions.
'LogSoftmax' : The following restrictions apply:
attribute 'axis' : The value must be 1.
'LRN' : No restrictions. Hint: Attribute 'size' has no effect.
'MaxPool' : No restrictions.
'Mean' : No restrictions.
'Mul' : No restrictions.
'ReduceL2' : The following restrictions apply:
attribute 'noop_with_empty_axes' : The attribute is optional. The value must be 0.
attribute 'keepdims' : The attribute is optional. The value must be 1.
attribute 'axes' : The attribute is optional. If it is empty, all dimensions are reduced. In newer opset versions, 'axes' was moved from the attributes to the inputs.
'ReduceMax' : The following restrictions apply:
attribute 'axes' : The value must be 1.
attribute 'keepdims' : The value must be 1.
'ReduceSum' : The following restrictions apply:
attribute 'noop_with_empty_axes' : The attribute is optional. The value must be 0.
attribute 'keepdims' : The attribute is optional. The value must be 1.
attribute 'axes' : The attribute is optional. If it is empty, all dimensions are reduced. In newer opset versions, 'axes' was moved from the attributes to the inputs.
'Relu' : No restrictions.
'Resize' : The following restrictions apply:
attribute 'mode' : Only the values 'linear' or 'bilinear' are supported.
attribute 'coordinate_transformation_mode' : Only the values 'pytorch_half_pixel' and 'align_corners' are supported.
input tensor 'roi' : If values are set, they have no effect on the inference.
The attributes 'cubic_coeff_a' , 'exclude_outside' , 'extrapolation_value' , and 'nearest_mode' have no effect.
'Reshape' : The following restrictions apply:
attribute 'allowzero' : If the attribute is used, its value must be 0.
'Sigmoid' : No restrictions.
'Softmax' : The following restrictions apply:
attribute 'axis' : If the attribute is used, its value must be 1.
'Sub' : No restrictions.
'Sum' : No restrictions.
'Transpose' : No restrictions.
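Restrictions of the form "the value must be ..." lend themselves to a table-driven check. The following Python sketch covers only a few of the operations listed above and compares only attributes that are explicitly present in a node; ONNX default values for absent attributes are not resolved. The table name, the function name, and the node representation (a plain dictionary of attributes) are assumptions of the sketch.

```python
# Excerpt of the fixed attribute values listed above (not the complete list).
REQUIRED_ATTRIBUTES = {
    "ArgMax": {"axis": 1, "keepdims": 1, "select_last_index": 0},
    "AveragePool": {"count_include_pad": 0},
    "DepthToSpace": {"mode": "CRD"},
    "Gemm": {"alpha": 1, "beta": 1, "transA": 0},
    "LogSoftmax": {"axis": 1},
}


def attribute_violations(op_type, attributes):
    """Return messages for explicitly set attributes with unsupported values."""
    required = REQUIRED_ATTRIBUTES.get(op_type, {})
    return [
        f"{op_type}: attribute '{name}' must be {expected!r}, "
        f"got {attributes[name]!r}"
        for name, expected in required.items()
        if name in attributes and attributes[name] != expected
    ]
```

Operations that are not in the table produce no messages, so an empty result does not imply that an operation is supported.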
Moreover, the ONNX 'metadata_props' field is supported. It is written to the model parameter 'meta_data'.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
FileName (input_control) filename.read → (string)
Filename
Default: 'pretrained_dl_classifier_compact.hdl'
List of values: 'initial_dl_anomaly_large.hdl' , 'initial_dl_anomaly_medium.hdl' , 'pretrained_deep_ocr_detection.hdl' , 'pretrained_deep_ocr_detection_compact.hdl' , 'pretrained_deep_ocr_recognition.hdl' , 'pretrained_dl_3d_gripping_point.hdl' , 'pretrained_dl_anomaly_global_context.hdl' , 'pretrained_dl_classifier_alexnet.hdl' , 'pretrained_dl_classifier_compact.hdl' , 'pretrained_dl_classifier_enhanced.hdl' , 'pretrained_dl_classifier_mobilenet_v2.hdl' , 'pretrained_dl_classifier_resnet18.hdl' , 'pretrained_dl_classifier_resnet50.hdl' , 'pretrained_dl_edge_extractor.hdl' , 'pretrained_dl_segmentation_compact.hdl' , 'pretrained_dl_segmentation_enhanced.hdl'
File extension: .hdl, .onnx
DLModelHandle (output_control) dl_model → (handle)
Handle of the deep learning model.
If the parameters are valid, the operator read_dl_model returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
set_dl_model_param, get_dl_model_param, apply_dl_model, train_dl_model_batch, train_dl_model_anomaly_dataset
Open Neural Network Exchange (ONNX), https://onnx.ai/
Foundation. This operator uses dynamic licensing (see the “Installation Guide”). Which of the following modules is required depends on the specific usage of the operator:
3D Metrology, OCR/OCV, Matching, Deep Learning Inference