Parameters can apply to the whole model or be specific for a given component.
The following table gives an overview of which parameters can be set, which
can be retrieved, and to which part of the model they apply.
Note that the device can be set separately for an individual component; in
that case only the remaining part of the model (e.g., the other
component) is executed on the device of this handle.
Default:
Handle of the default device, thus the GPU with index
0 when querying a list using get_system
with 'cuda_devices'.
If no device is available, this is an empty tuple.
This parameter will set the device on which the detection component of
the Deep OCR model is executed. For a further description, see
'device'.
Default:
The same value as for 'device'.
Tuple containing the image dimensions ('detection_image_width',
'detection_image_height', number of channels) the detection
component will process.
The input image is scaled to 'detection_image_dimensions'
such that the original aspect ratio is preserved. The scaled image
is padded with gray value 0 if necessary. Therefore, changing
the 'detection_image_height' or 'detection_image_width'
can influence the results.
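The scaling and padding described above can be illustrated with a minimal Python sketch. It is not the internal implementation: the function name fit_and_pad is hypothetical, the rounding behavior is an assumption, and the sketch only reports how many padding pixels are needed, not where they are placed.

```python
def fit_and_pad(src_w, src_h, dst_w, dst_h):
    """Return (scaled_w, scaled_h, pad_w, pad_h) for an aspect-preserving fit.

    Hypothetical helper for illustration only; rounding is an assumption.
    """
    # The smaller of the two ratios makes the scaled image fit in both directions.
    scale = min(dst_w / src_w, dst_h / src_h)
    # Clamp so that rounding can never exceed the target size.
    scaled_w = min(dst_w, max(1, round(src_w * scale)))
    scaled_h = min(dst_h, max(1, round(src_h * scale)))
    # Whatever is left over has to be filled with gray value 0.
    return scaled_w, scaled_h, dst_w - scaled_w, dst_h - scaled_h
```

For a 2000 x 1000 input and a 1024 x 1024 target, the width limits the scale, so half of the target height is padding. A different target size changes the scale factor and therefore the size of the characters the network sees, which is why these parameters can influence the results.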
Height of the images the detection component will process.
This means, the network preserves the aspect ratio of the input image by
scaling it to a maximum of this height before processing it. Thus this size
can influence the results.
The model architecture requires that the height is a multiple of 32. If
this is not the case, the height is rounded up to the nearest integer
multiple of 32.
Tuple containing the image size ('detection_image_width',
'detection_image_height') the detection component will
process.
Width of the images the detection component will process.
This means, the network preserves the aspect ratio of the input image by
scaling it to a maximum of this width before processing it. Thus this size
can influence the results.
The model architecture requires that the width is a multiple of 32. If
this is not the case, the width is rounded up to the nearest integer
multiple of 32.
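The rounding rule stated for the height and the width can be written as a small Python sketch. The function name round_up_to_multiple is hypothetical and only illustrates the arithmetic.

```python
def round_up_to_multiple(value, multiple=32):
    """Round a positive integer up to the nearest integer multiple of `multiple`."""
    # Integer arithmetic avoids floating point issues; exact multiples stay unchanged.
    return ((value + multiple - 1) // multiple) * multiple
```

Under this rule a requested size of 1000 would become 1024, while 1024 itself is left as it is.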
The parameter 'detection_min_character_score' specifies the lower
threshold used for the character score map to estimate the dimensions of
the characters.
By adjusting the parameter, suggested instances can be split up or
neighboring instances can be merged.
The parameter 'detection_min_link_score' defines the minimum link
score required between two localized characters to recognize these
characters as a coherent word.
The parameter 'detection_min_word_area' defines the minimum size
that a localized word must have in order to be suggested.
This parameter can be used to filter suggestions that are too small.
The parameter 'detection_min_word_score' defines the minimum
score a localized instance must have to be suggested as a valid word.
With this parameter, uncertain words can be filtered out.
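The effect of the two word-level thresholds can be pictured as a simple filter. The following Python sketch is an illustration under assumptions: the WordCandidate structure and the function filter_candidates are hypothetical, and whether the comparison is inclusive is not stated in this description.

```python
from dataclasses import dataclass


@dataclass
class WordCandidate:
    """Hypothetical container for one localized word."""
    area: float   # size of the localized word in pixels
    score: float  # word score of the localized instance


def filter_candidates(candidates, min_word_area, min_word_score):
    """Keep only candidates that reach both thresholds (inclusive by assumption)."""
    return [
        c for c in candidates
        if c.area >= min_word_area and c.score >= min_word_score
    ]
```

Raising either threshold can only remove suggestions, never add new ones.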
The operator get_deep_ocr_param returns the handle of the Deep
OCR model component for word detection.
Using set_deep_ocr_param it is possible to specify either a handle,
a filename, or a special string. Only 'default' and 'compact' are
allowed as special strings. In case of 'default', the
default pretrained word detection component is loaded
(i.e., 'pretrained_deep_ocr_detection.hdl'). In case of
'compact', a more efficient word detection component
is loaded (i.e., 'pretrained_deep_ocr_detection_compact.hdl').
If the given value is a string, the model is loaded internally and the
batch size is set to 1.
This parameter allows setting a predefined orientation angle for the word
detection. To revert to the default behavior, which uses the internal
orientation estimation, set 'detection_orientation' to 'auto'.
The words are sorted line-wise based on the orientation of the localized
word instances. If the parameter 'detection_sort_by_line' is
set to 'false', the results will not be sorted.
The input image is automatically split into overlapping tile images of
size 'detection_image_size', which are processed separately by
the detection component. This allows processing images that are much
larger than the actual 'detection_image_size' without having to
zoom the input image.
Thus, if 'detection_tiling' is set to 'true', the input image will
not be zoomed before processing it.
This parameter defines how much neighboring tiles overlap when input
images are split (see 'detection_tiling'). The overlap
is given in pixels.
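To make the relation between tile size and overlap concrete, the following Python sketch computes tile start positions along one image axis. The actual tile layout is not specified here, so this is an assumed scheme: tiles advance by tile size minus overlap, and the last tile is shifted back so that it ends at the image border. The function name tile_starts is hypothetical.

```python
def tile_starts(image_len, tile_len, overlap):
    """Start coordinates of overlapping tiles along one axis (assumed layout)."""
    if not 0 <= overlap < tile_len:
        raise ValueError("overlap must be in [0, tile_len)")
    if image_len <= tile_len:
        # The image fits into a single tile.
        return [0]
    stride = tile_len - overlap
    # Regular tiles that do not yet reach the image border.
    starts = list(range(0, image_len - tile_len, stride))
    # Final tile ends exactly at the border; it may overlap more than `overlap`.
    starts.append(image_len - tile_len)
    return starts
```

With this scheme, an axis of 2500 pixels, a tile length of 1024, and an overlap of 60 pixels yields three tiles. A larger overlap lowers the risk of cutting a word at a tile border but increases the number of tiles to process.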
The character set that can be recognized by the Deep OCR model.
It contains all characters that are not mapped to the Blank character
of the internal alphabet (see parameters
'recognition_alphabet_mapping'
and 'recognition_alphabet_internal').
The alphabet can be changed or extended if needed. Changing the
alphabet with this parameter will edit the internal alphabet and
mapping in such a way that it tries to keep the length of the internal
alphabet unchanged.
After changing the alphabet, it is recommended to retrain the model on
application specific data (see the HDevelop example
deep_ocr_recognition_training_workflow.hdev).
Previously unknown characters will need more training data.
Note that if the length of the internal alphabet changes, the last
model layers have to be randomly initialized and thus the output of the
model will be random strings (see
'recognition_alphabet_internal'). In that case it is required to
retrain the model.
The full character set which the Deep OCR recognition component has
been trained on.
The first character of the internal alphabet is a special character.
In the pretrained model this character is specified as Blank
(U+2800) and is not to be confused with a space.
The Blank is never returned in a word output but can occur in the
reported character candidates. It is required and cannot be omitted.
If the internal alphabet is changed, the first character has to be
the Blank.
Furthermore, if 'recognition_alphabet' is used to change
the alphabet, the Blank symbol is added automatically to the
character set.
The length of this tuple corresponds to the depth of the last
convolution layer in the model. If the length changes, the last
convolution layer and all layers after it have to be resized and
potentially reinitialized randomly. After such a change, it is required
to retrain the model (see HDevelop example
deep_ocr_recognition_training_workflow.hdev).
It is recommended to use the parameter 'recognition_alphabet' to
change the alphabet, as it will automatically try to preserve the
length of the internal alphabet.
This tuple is a mapping that is applied by the model during the decoding
step of each word. The mapping overwrites a character of
'recognition_alphabet_internal' with
the character at the specified index in
'recognition_alphabet_internal'.
In the decoding step each prediction is mapped according to the index
specified in this tuple.
The tuple has to be of the same length as the tuple
'recognition_alphabet_internal'. Each integer index of the
mapping has to be within 0 and
|'recognition_alphabet_internal'|-1.
In some applications it can be helpful to map certain characters onto
other characters. For example, if only numeric words occur in an
application, it might be helpful to map the character "O" to the "0"
character without the need to retrain the model.
If an entry contains a 0, the corresponding character in
'recognition_alphabet_internal' will not be decoded in the
word.
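The mapping step can be illustrated with a small Python sketch. It is a simplification under assumptions: the tiny example alphabet is made up, the function apply_mapping is hypothetical, and the input is taken to be one predicted index per character, so any other part of the decoding is left out.

```python
BLANK = "\u2800"  # first entry of the internal alphabet, not a space


def apply_mapping(predicted_indices, alphabet_internal, mapping):
    """Map each predicted index through `mapping` and build the word."""
    if len(mapping) != len(alphabet_internal):
        raise ValueError("mapping and internal alphabet must have equal length")
    chars = []
    for idx in predicted_indices:
        target = mapping[idx]
        if target == 0:
            # Mapped to the Blank: the character is not decoded in the word.
            continue
        chars.append(alphabet_internal[target])
    return "".join(chars)


# Made-up internal alphabet for illustration: index 0 is the Blank.
alphabet_internal = [BLANK, "0", "1", "O"]
identity = [0, 1, 2, 3]       # every character decodes to itself
letter_o_to_zero = [0, 1, 2, 1]  # "O" (index 3) decodes as "0" (index 1)
suppress_o = [0, 1, 2, 0]     # "O" is dropped from the word
```

With the predictions [2, 3, 3], the identity mapping gives "1OO", the second mapping gives "100", and the third gives "1". The internal alphabet keeps its length in all three cases, so no retraining is needed.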
Number of images in a batch that is transferred to device memory and
processed in parallel in the recognition component. For further
details, please refer to the reference documentation of
apply_dl_model with respect to the parameter
'batch_size'. This parameter can be used to optimize the
runtime of apply_deep_ocr on a given DL device. If the
recognition component has to process multiple inputs (words),
processing multiple inputs in parallel can result in a faster execution
of apply_deep_ocr. Note, however, that a higher
'recognition_batch_size' will require more device memory.
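The trade-off can be pictured by grouping the detected words into batches. The following Python sketch is only an illustration; the function split_into_batches is hypothetical and how the words are actually grouped internally is an assumption.

```python
def split_into_batches(items, batch_size):
    """Group items into consecutive batches of at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

Five words with a batch size of 2 would need three passes through the recognition component, while a batch size of 8 would need only one, at the cost of holding more images in device memory at the same time.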
This parameter will set the device on which the recognition component of
the Deep OCR model is executed. For a further description, see
'device'.
Default:
The same value as for 'device'.
Tuple containing the image dimensions ('recognition_image_width',
'recognition_image_height', number of channels)
the recognition component will process.
This means, the network will first zoom the input image part to
'recognition_image_height' while maintaining the aspect ratio
of the input. If the width of the resulting image is smaller than
'recognition_image_width', the image part is padded with
gray value 0 on the right. If it is larger, the image is
zoomed to 'recognition_image_width'.
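The two cases described above can be sketched in Python as follows. The function name recognition_geometry is hypothetical and the rounding is an assumption; the sketch only computes the resulting geometry and does not touch image data.

```python
def recognition_geometry(src_w, src_h, rec_w, rec_h):
    """Return (content_width, pad_right) after preprocessing a word image part."""
    # Step 1: zoom to the target height while keeping the aspect ratio.
    scaled_w = max(1, round(src_w * rec_h / src_h))
    if scaled_w <= rec_w:
        # Narrow word: keep the aspect ratio and pad with gray value 0 on the right.
        return scaled_w, rec_w - scaled_w
    # Wide word: zoom to the target width, which changes the aspect ratio.
    return rec_w, 0
```

For a target of 120 x 32, a 100 x 50 word part keeps its aspect ratio and is padded, whereas a 400 x 50 word part is compressed horizontally. Very long words are therefore distorted more strongly than short ones.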
The operator get_deep_ocr_param returns the handle of the Deep
OCR model component for word recognition.
Using set_deep_ocr_param it is possible to specify either a
handle, a filename, or 'default'. In case of 'default', the
pretrained word recognition component is loaded (i.e.,
'pretrained_deep_ocr_recognition.hdl').
If the given value is a string, the model is loaded internally and the
batch size is set to 1.
If the parameters are valid, the operator get_deep_ocr_param
returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.