August 31, 2017 | Developers Corner
A recommended practice when setting up classification tasks and OCR: regularly review your growing set of training data with the HDevelop OCR Training File Browser, both while developing the segmentation and at later development steps. It is quick and convenient. Although it does not solve the classification task itself, it helps you catch segmentation mistakes and similar errors with almost no extra effort. It also points out possible classification issues.
You are setting up a classification or an OCR solution, and you have collected and labeled sample images, possibly with considerable effort. Even a few wrongly labeled or badly segmented samples can skew your classification result. To prevent this, make use of the HDevelop OCR Training File Browser:
Do so after every code change to the segmentation, after a manual segmentation, after any adjustment of the classification parameters, and so on.
Simply write each image and its segmented region to a training file, open the OCR Training File Browser (Visualization ⇒ Tools ⇒ OCR Training File Browser), and load the training file and, as soon as one exists, the classifier.
write_ocr_trainf(regions, image, ClassName, TrainingFileName) // at the first occurrence
append_ocr_trainf(regions, image, ClassName, TrainingFileName) // afterwards
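The write-then-append pattern can be sketched as a small HDevelop loop. This is a minimal sketch, not a complete solution: it uses the HALCON operators write_ocr_trainf, append_ocr_trainf, and read_ocr_trainf_names, while the image names, labels, training file name, and the fixed threshold are hypothetical placeholders for your own data and segmentation.

```hdevelop
* Hypothetical inputs: one sample image and one class label per entry.
ImageFiles := ['sample_01', 'sample_02', 'sample_03']
Labels := ['A', 'B', 'A']
TrainingFileName := 'samples.trf'

for Index := 0 to |ImageFiles| - 1 by 1
    read_image (Image, ImageFiles[Index])
    * Placeholder segmentation (assumes dark characters on a bright
    * background); replace with the segmentation under development.
    threshold (Image, Region, 0, 128)
    if (Index == 0)
        * First sample: create the training file.
        write_ocr_trainf (Region, Image, Labels[Index], TrainingFileName)
    else
        * Later samples: append to the same training file.
        append_ocr_trainf (Region, Image, Labels[Index], TrainingFileName)
    endif
endfor

* Quick sanity check before opening the browser: which classes does
* the training file contain, and how many samples of each?
read_ocr_trainf_names (TrainingFileName, CharacterNames, CharacterCount)
```

Rerunning the loop after each segmentation change regenerates the training file, which can then be loaded in the OCR Training File Browser for visual review.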