
select_characters (Operator)

Name

select_characters - Selects characters from a given region.

Signature

select_characters(Region : RegionCharacters : DotPrint, StrokeWidth, CharWidth, CharHeight, Punctuation, DiacriticMarks, PartitionMethod, PartitionLines, FragmentDistance, ConnectFragments, ClutterSizeMax, StopAfter : )

Herror T_select_characters(const Hobject Region, Hobject* RegionCharacters, const Htuple DotPrint, const Htuple StrokeWidth, const Htuple CharWidth, const Htuple CharHeight, const Htuple Punctuation, const Htuple DiacriticMarks, const Htuple PartitionMethod, const Htuple PartitionLines, const Htuple FragmentDistance, const Htuple ConnectFragments, const Htuple ClutterSizeMax, const Htuple StopAfter)

Herror select_characters(Hobject Region, Hobject* RegionCharacters, const HTuple& DotPrint, const HTuple& StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const HTuple& Punctuation, const HTuple& DiacriticMarks, const HTuple& PartitionMethod, const HTuple& PartitionLines, const HTuple& FragmentDistance, const HTuple& ConnectFragments, const HTuple& ClutterSizeMax, const HTuple& StopAfter)

HRegion HRegion::SelectCharacters(const HTuple& DotPrint, const HTuple& StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const HTuple& Punctuation, const HTuple& DiacriticMarks, const HTuple& PartitionMethod, const HTuple& PartitionLines, const HTuple& FragmentDistance, const HTuple& ConnectFragments, const HTuple& ClutterSizeMax, const HTuple& StopAfter) const

HRegionArray HRegionArray::SelectCharacters(const HTuple& DotPrint, const HTuple& StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const HTuple& Punctuation, const HTuple& DiacriticMarks, const HTuple& PartitionMethod, const HTuple& PartitionLines, const HTuple& FragmentDistance, const HTuple& ConnectFragments, const HTuple& ClutterSizeMax, const HTuple& StopAfter) const

void SelectCharacters(const HObject& Region, HObject* RegionCharacters, const HTuple& DotPrint, const HTuple& StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const HTuple& Punctuation, const HTuple& DiacriticMarks, const HTuple& PartitionMethod, const HTuple& PartitionLines, const HTuple& FragmentDistance, const HTuple& ConnectFragments, const HTuple& ClutterSizeMax, const HTuple& StopAfter)

HRegion HRegion::SelectCharacters(const HString& DotPrint, const HString& StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const HString& Punctuation, const HString& DiacriticMarks, const HString& PartitionMethod, const HString& PartitionLines, const HString& FragmentDistance, const HString& ConnectFragments, Hlong ClutterSizeMax, const HString& StopAfter) const

HRegion HRegion::SelectCharacters(const char* DotPrint, const char* StrokeWidth, const HTuple& CharWidth, const HTuple& CharHeight, const char* Punctuation, const char* DiacriticMarks, const char* PartitionMethod, const char* PartitionLines, const char* FragmentDistance, const char* ConnectFragments, Hlong ClutterSizeMax, const char* StopAfter) const

void HOperatorSetX.SelectCharacters(
[in] IHUntypedObjectX* Region, [out] IHUntypedObjectX* RegionCharacters, [in] VARIANT DotPrint, [in] VARIANT StrokeWidth, [in] VARIANT CharWidth, [in] VARIANT CharHeight, [in] VARIANT Punctuation, [in] VARIANT DiacriticMarks, [in] VARIANT PartitionMethod, [in] VARIANT PartitionLines, [in] VARIANT FragmentDistance, [in] VARIANT ConnectFragments, [in] VARIANT ClutterSizeMax, [in] VARIANT StopAfter)

IHRegionX* HRegionX.SelectCharacters(
[in] BSTR DotPrint, [in] BSTR StrokeWidth, [in] VARIANT CharWidth, [in] VARIANT CharHeight, [in] BSTR Punctuation, [in] BSTR DiacriticMarks, [in] BSTR PartitionMethod, [in] BSTR PartitionLines, [in] BSTR FragmentDistance, [in] BSTR ConnectFragments, [in] Hlong ClutterSizeMax, [in] BSTR StopAfter)

static void HOperatorSet.SelectCharacters(HObject region, out HObject regionCharacters, HTuple dotPrint, HTuple strokeWidth, HTuple charWidth, HTuple charHeight, HTuple punctuation, HTuple diacriticMarks, HTuple partitionMethod, HTuple partitionLines, HTuple fragmentDistance, HTuple connectFragments, HTuple clutterSizeMax, HTuple stopAfter)

HRegion HRegion.SelectCharacters(string dotPrint, string strokeWidth, HTuple charWidth, HTuple charHeight, string punctuation, string diacriticMarks, string partitionMethod, string partitionLines, string fragmentDistance, string connectFragments, int clutterSizeMax, string stopAfter)

Description

select_characters selects from a given Region the areas that might be characters and returns them in RegionCharacters. The selection uses features such as StrokeWidth, DotPrint, and the size of the characters. The input Region should be a single united region; otherwise, every region is processed separately. Therefore, do not call connection before calling select_characters, because fragments or dots would then not be connected to a character. If you have more than one region containing text, you can pass them without merging them; each is then processed on its own. The Region for select_characters typically comes from segment_characters, but any other segmentation operator can be used as well.

The selection process consists of four steps. All steps are influenced by the parameters StrokeWidth, CharHeight, and CharWidth. If you lose small objects such as dots, adapt the minimum CharWidth and the minimum CharHeight. Some parameters, however, affect the result of one step in particular, as described below. With the parameter StopAfter you can terminate the operator after a specified step.

In the first step, 'step1_select_candidates', CharWidth and CharHeight are used to select the candidates. The result of this step is also affected by ClutterSizeMax.

In the next step, 'step2_partition_characters', the parameters PartitionMethod and PartitionLines influence the result.

Step three, 'step3_connect_fragments', uses the parameters ConnectFragments and DotPrint. If dot-printed characters have to be detected and some dots are not connected to their character, there are two remedies: increase the FragmentDistance and/or decrease the StrokeWidth.

In the last step, 'step4_select_characters', the result is affected by the parameters DiacriticMarks and Punctuation.

DotPrint: Set this parameter to 'true' if dot-printed characters are to be read, otherwise to 'false'.

StrokeWidth: Specifies the stroke width of the text. It is used to calculate the internally used mask sizes for determining the characters. These mask sizes are also influenced by the parameter DotPrint, the average CharWidth, and the average CharHeight.

CharWidth: This can be a tuple with up to three values. The first value is the average width of a character, the second is the minimum width, and the third is the maximum width. If the minimum is not set or equals -1, the operator sets this value automatically depending on the average CharWidth. The same applies if the maximum is not set. Some examples:

[10] sets the average character width to 10; the minimum and maximum are calculated by the operator.

[10,-1,20] sets the average character width to 10, the minimum is calculated by the operator, and the maximum is set to 20.

[10,5,20] sets the average character width to 10, the minimum to 5, and the maximum to 20.

CharHeight: This can be a tuple with up to three values. The first value is the average height of a character, the second is the minimum height, and the third is the maximum height. If the minimum is not set or equals -1, the operator sets this value automatically depending on the average CharHeight. The same applies if the maximum is not set. Some examples:

[10] sets the average character height to 10; the minimum and maximum are calculated by the operator.

[10,-1,20] sets the average character height to 10, the minimum is calculated by the operator, and the maximum is set to 20.

[10,5,20] sets the average character height to 10, the minimum to 5, and the maximum to 20.
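
The tuple convention shared by CharWidth and CharHeight can be illustrated with a small Python sketch. It is not part of HALCON: SizeSpec and parse_size_spec are hypothetical names used only for illustration. The sketch merely records which bounds are given explicitly; how the operator derives the automatic bounds from the average is not stated here, so they are represented as None. Treating -1 as "automatic" for the maximum as well as for the minimum is an assumption.

```python
from typing import NamedTuple, Optional, Sequence


class SizeSpec(NamedTuple):
    """Hypothetical container for a CharWidth/CharHeight tuple."""
    average: int
    minimum: Optional[int]  # None: bound is left to the operator
    maximum: Optional[int]  # None: bound is left to the operator


def parse_size_spec(values: Sequence[int]) -> SizeSpec:
    """Interpret a tuple of one to three integers: [average, minimum, maximum]."""
    if not 1 <= len(values) <= 3:
        raise ValueError("expected one to three values")
    if values[0] < 1:
        raise ValueError("the average must be >= 1")
    # Missing entries are padded with -1, the marker for "compute automatically".
    # Assumption: -1 is accepted for the maximum as well as for the minimum.
    padded = list(values) + [-1] * (3 - len(values))
    minimum = None if padded[1] == -1 else padded[1]
    maximum = None if padded[2] == -1 else padded[2]
    return SizeSpec(padded[0], minimum, maximum)


print(parse_size_spec([10]))          # average only
print(parse_size_spec([10, -1, 20]))  # explicit maximum, automatic minimum
print(parse_size_spec([10, 5, 20]))   # all three values explicit
```

A bound that comes back as None corresponds to a value "calculated by the operator" in the examples above.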

Punctuation: Set this parameter to 'true' if the operator also has to detect punctuation marks (e.g. .,:'`"), otherwise they will be suppressed.

DiacriticMarks: Set this parameter to 'true' if the text in your application contains diacritic marks (e.g. â,é,ö), or to 'false' to suppress them.

PartitionMethod: If neighboring characters are printed close to each other, they may be partly merged. With this parameter you can specify the method used to partition such characters. The possible values are 'none', 'fixed_width', and 'variable_width'. 'none' means that no partitioning is performed. 'fixed_width' means that the partitioning assumes a constant character width: if the width of the extracted region is well above the average CharWidth, the region is split into parts that have the given average CharWidth. The partitioning starts at the left border of the region. 'variable_width' means that the characters are partitioned at the position where they have the thinnest connection. This method can be selected for characters that are printed with a variable-width font or if many consecutive characters are extracted as one symbol. It may be helpful to call text_line_slant and/or text_line_orientation before calling select_characters.
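
The two partitioning ideas can be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's actual implementation: fixed_width_cuts and thinnest_column are hypothetical helpers, split_factor is a made-up stand-in for "well above the average", and the "thinnest connection" is approximated by the column with the fewest foreground pixels.

```python
from typing import List, Sequence


def fixed_width_cuts(left: int, right: int, avg_width: int,
                     split_factor: float = 1.5) -> List[int]:
    """Columns at which a region spanning [left, right] would be cut.

    split_factor is a hypothetical threshold for "well above the average".
    """
    width = right - left + 1
    if width <= split_factor * avg_width:
        return []  # narrow enough to be treated as a single character
    # Cuts start at the left border and advance by the average width.
    return list(range(left + avg_width, right + 1, avg_width))


def thinnest_column(column_heights: Sequence[int]) -> int:
    """Index of the interior column with the fewest foreground pixels.

    Simplified model of the thinnest connection between merged characters.
    """
    if len(column_heights) < 3:
        raise ValueError("need at least one interior column")
    interior = range(1, len(column_heights) - 1)
    return min(interior, key=lambda c: column_heights[c])


print(fixed_width_cuts(0, 29, 10))            # [10, 20]
print(fixed_width_cuts(0, 11, 10))            # []
print(thinnest_column([8, 9, 7, 2, 8, 9]))    # 3
```

In the first call, a region 30 pixels wide with an average width of 10 is cut into three parts; in the second, a region 12 pixels wide is left untouched.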

PartitionLines: If some text lines or some characters of different text lines are connected, set this parameter to 'true'.

FragmentDistance: This parameter influences the connection of character fragments. If too much is connected, set the parameter to 'narrow' or 'medium'. If more fragments should be connected, set the parameter to 'medium' or 'wide'. The connection is also influenced by the maximum of CharWidth and CharHeight. See also ConnectFragments.

ConnectFragments: Set this parameter to 'true' if the extracted symbols are fragmented, i.e., if a symbol is not extracted as one region but broken up into several parts. See also FragmentDistance and the step 'step3_connect_fragments' of StopAfter.

ClutterSizeMax: If the extracted characters contain clutter, i.e., small regions near the actual symbols, increase this value. If parts of the symbols are missing, decrease this value.

StopAfter: Use this parameter if the operator does not produce the desired results. If it is set to one of the step names, the operator stops after executing that step and returns the corresponding intermediate result. To run all steps, set StopAfter to 'completion'.
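
The control flow implied by StopAfter can be sketched in Python as a staged pipeline. Only the step names are taken from this page; run_until, make_stage, and the placeholder stages are hypothetical and do no real image processing.

```python
from typing import Callable, Dict, List

# Step names as listed for StopAfter, in execution order.
STEPS: List[str] = [
    "step1_select_candidates",
    "step2_partition_characters",
    "step3_connect_fragments",
    "step4_select_characters",
]


def run_until(stop_after: str,
              stages: Dict[str, Callable[[list], list]],
              regions: list) -> list:
    """Run the stages in order and return the result of the requested step."""
    if stop_after != "completion" and stop_after not in STEPS:
        raise ValueError(f"unknown step: {stop_after}")
    for name in STEPS:
        regions = stages[name](regions)
        if name == stop_after:
            break  # return the intermediate result of this step
    return regions


def make_stage(name: str) -> Callable[[list], list]:
    """Placeholder stage that only records that it was executed."""
    def stage(regions: list) -> list:
        return regions + [name]
    return stage


stages = {name: make_stage(name) for name in STEPS}
print(run_until("step2_partition_characters", stages, []))
# ['step1_select_candidates', 'step2_partition_characters']
```

Inspecting the intermediate result of each step in turn is a practical way to find out which parameter group needs to be adjusted.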

Parallelization

Parameters

Region (input_object)  region(-array) → object / HRegion / HRegionX / Hobject

Region of text lines in which to select the characters.

RegionCharacters (output_object)  region(-array) → object / HRegion / HRegionX / Hobject *

Selected characters.

DotPrint (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Should dot-printed characters be detected?

Default value: 'false'

List of values: 'false', 'true'

StrokeWidth (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Stroke width of a character.

Default value: 'medium'

List of values: 'bold', 'light', 'medium', 'ultra_light'

CharWidth (input_control)  integer-array → HTuple / VARIANT / Htuple (integer / int / long / Hlong)

Width of a character.

Default value: 25

Typical range of values: 1 ≤ CharWidth

Restriction: CharWidth >= 1

CharHeight (input_control)  integer-array → HTuple / VARIANT / Htuple (integer / int / long / Hlong)

Height of a character.

Default value: 25

Typical range of values: 1 ≤ CharHeight

Restriction: CharHeight >= 1

Punctuation (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Should punctuation marks be detected?

Default value: 'false'

List of values: 'false', 'true'

DiacriticMarks (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Does the text contain diacritic marks?

Default value: 'false'

List of values: 'false', 'true'

PartitionMethod (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Method to partition neighboring characters.

Default value: 'none'

List of values: 'fixed_width', 'none', 'variable_width'

PartitionLines (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Should lines be partitioned?

Default value: 'false'

List of values: 'false', 'true'

FragmentDistance (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Distance of fragments.

Default value: 'medium'

List of values: 'medium', 'narrow', 'wide'

ConnectFragments (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Connect fragments?

Default value: 'false'

List of values: 'false', 'true'

ClutterSizeMax (input_control)  integer → HTuple / VARIANT / Htuple (integer / int / long / Hlong)

Maximum size of clutter.

Default value: 0

Typical range of values: 0 ≤ ClutterSizeMax

Restriction: 0 <= ClutterSizeMax

StopAfter (input_control)  string → HTuple / VARIANT / Htuple (string / HString / char* / BSTR)

Stop execution after this step.

Default value: 'completion'

List of values: 'completion', 'step1_select_candidates', 'step2_partition_characters', 'step3_connect_fragments', 'step4_select_characters'

Example (HDevelop)

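* J is the index of the current image and is assumed to be set beforehand,
* e.g., by an enclosing loop.
read_image (Image, 'dot_print_rotated/dot_print_rotated_'+J$'02d')
* Estimate the orientation of the text line and rotate the image accordingly.
text_line_orientation (Image, Image, 50, rad(-30), rad(30), OrientationAngle)
rotate_image (Image, ImageRotate, -OrientationAngle/rad(180)*180, 'constant')
* Segment the text foreground.
segment_characters (ImageRotate, ImageRotate, ImageForeground, \
                    RegionForeground, 'local_auto_shape', 0, 0, 'medium', \
                    25, 25, 0, 10, UsedThreshold)
* Select dot-printed characters. The flags of select_characters are passed
* as 'true'/'false' strings, as listed under Parameters.
select_characters (RegionForeground, RegionCharacters, 'true', 'ultra_light', \
                   [60,1,100], [60,1,100], 'false', 'false', 'none', 'true', \
                   'wide', 'true', 0, 'completion')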

Result

If the input parameters are set correctly, the operator select_characters returns the value 2 (H_MSG_TRUE). Otherwise, an exception is raised.

Possible Predecessors

segment_characters, text_line_slant

Possible Successors

do_ocr_single, do_ocr_multi

Alternatives

connection

Module

Foundation

