HALCON Reference Manual 10.0.2

get_prep_info_ocr_class_svm (Operator)

Name

get_prep_info_ocr_class_svm - Compute the information content of the preprocessed feature vectors of an SVM-based OCR classifier.

Signature

get_prep_info_ocr_class_svm( : : OCRHandle, TrainingFile, Preprocessing : InformationCont, CumInformationCont)

Herror T_get_prep_info_ocr_class_svm(const Htuple OCRHandle, const Htuple TrainingFile, const Htuple Preprocessing, Htuple* InformationCont, Htuple* CumInformationCont)

Herror get_prep_info_ocr_class_svm(const HTuple& OCRHandle, const HTuple& TrainingFile, const HTuple& Preprocessing, HTuple* InformationCont, HTuple* CumInformationCont)

HTuple HOCRSvm::GetPrepInfoOcrClassSvm(const HTuple& TrainingFile, const HTuple& Preprocessing, HTuple* CumInformationCont) const

void HOperatorSetX.GetPrepInfoOcrClassSvm(
[in] VARIANT OCRHandle, [in] VARIANT TrainingFile, [in] VARIANT Preprocessing, [out] VARIANT* InformationCont, [out] VARIANT* CumInformationCont)

VARIANT HOCRSvmX.GetPrepInfoOcrClassSvm(
[in] VARIANT TrainingFile, [in] BSTR Preprocessing, [out] VARIANT* CumInformationCont)

static void HOperatorSet.GetPrepInfoOcrClassSvm(HTuple OCRHandle, HTuple trainingFile, HTuple preprocessing, out HTuple informationCont, out HTuple cumInformationCont)

HTuple HOCRSvm.GetPrepInfoOcrClassSvm(HTuple trainingFile, string preprocessing, out HTuple cumInformationCont)

HTuple HOCRSvm.GetPrepInfoOcrClassSvm(string trainingFile, string preprocessing, out HTuple cumInformationCont)

Description

get_prep_info_ocr_class_svm computes the information content of the training vectors that have been transformed with the preprocessing given by Preprocessing. Preprocessing can be set to 'principal_components' or 'canonical_variates'. The OCR classifier OCRHandle must have been created with create_ocr_class_svm. The preprocessing methods are described with create_class_svm. The information content is derived from the variations of the transformed components of the feature vector, i.e., it is computed solely from the training data, independent of any error rate on the training data. The information content is computed for all relevant components of the transformed feature vectors (NumFeatures for 'principal_components' and min(NumClasses - 1, NumFeatures) for 'canonical_variates', see create_class_svm), and is returned in InformationCont as a number between 0 and 1. To convert the information content into a percentage, multiply it by 100.
The cumulative information content of the first n components is returned in the n-th component of CumInformationCont, i.e., CumInformationCont contains the sums of the first n elements of InformationCont. To use get_prep_info_ocr_class_svm, a sufficient number of samples must be stored in the training files given by TrainingFile (see write_ocr_trainf).
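
The relationship between the two output tuples can be illustrated with a few lines of plain Python. This is only a sketch of the arithmetic described above, not a call into HALCON: the function name cumulative_information and the sample values are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def cumulative_information(information_cont):
    """Return the running sums of the per-component information content.

    Element n-1 of the result is the sum of the first n input values,
    which mirrors how CumInformationCont relates to InformationCont.
    """
    return list(accumulate(information_cont))


# Hypothetical per-component values (assumption, not HALCON output).
information_cont = [0.5, 0.25, 0.125, 0.125]

cum_information_cont = cumulative_information(information_cont)
print(cum_information_cont)                      # [0.5, 0.75, 0.875, 1.0]

# Converting the information content to percentages: multiply by 100.
print([100.0 * c for c in information_cont])     # [50.0, 25.0, 12.5, 12.5]
```

With these sample values, the first two components together carry 75% of the information, and all four components together carry 100%.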

InformationCont and CumInformationCont can be used to decide how many components of the transformed feature vectors contain relevant information. An often used criterion is to require that the transformed data must represent x% (e.g., 90%) of the total data. The required number of components is the position of the first value of CumInformationCont that lies above x%. The number thus obtained can be used as the value for NumComponents in a new call to create_ocr_class_svm. The call to get_prep_info_ocr_class_svm already requires the creation of a classifier, and hence the setting of NumComponents in create_ocr_class_svm to an initial value. However, when get_prep_info_ocr_class_svm is called, it is typically not yet known how many components are relevant, and hence how to set NumComponents in this call. Therefore, the following two-step approach should typically be used to select NumComponents: In a first step, a classifier with the maximum number for NumComponents is created (NumFeatures for 'principal_components' and min(NumClasses - 1, NumFeatures) for 'canonical_variates'). Then, the training samples are saved in a training file using write_ocr_trainf.
Subsequently, get_prep_info_ocr_class_svm is used to determine the information content of the components, and from this NumComponents. After this, a new classifier with the desired number of components is created, and the classifier is trained with trainf_ocr_class_svm.
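
The two numbers involved in this approach, the initial maximum for NumComponents and the final value selected from CumInformationCont, can be sketched in plain Python as follows. The helper names max_num_components and select_num_components are hypothetical, the sample values are made up, and treating a value equal to the threshold as sufficient is an assumption of this sketch. No HALCON operator is called here.

```python
def max_num_components(preprocessing, num_features, num_classes):
    """Maximum NumComponents to use for the initial classifier."""
    if preprocessing == "principal_components":
        return num_features
    if preprocessing == "canonical_variates":
        return min(num_classes - 1, num_features)
    raise ValueError("unsupported preprocessing: " + preprocessing)


def select_num_components(cum_information_cont, threshold=0.9):
    """Return the 1-based position of the first cumulative value that
    reaches the threshold (assumption: 'reaches' means >= threshold).

    If no value reaches the threshold, all components are kept.
    """
    for n, cum in enumerate(cum_information_cont, start=1):
        if cum >= threshold:
            return n
    return len(cum_information_cont)


# Hypothetical cumulative information content (not HALCON output).
cum_information_cont = [0.5, 0.75, 0.875, 1.0]

print(max_num_components("canonical_variates", 81, 10))   # 9
print(select_num_components(cum_information_cont, 0.8))   # 3
```

In this sketch, a threshold of 80% is first reached by the third component, so 3 would be passed as NumComponents when creating the final classifier.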

Parallelization

Parameters

OCRHandle (input_control)  ocr_svm (integer)

Handle of the OCR classifier.

TrainingFile (input_control)  filename.read(-array) (string)

Name(s) of the training file(s).

Default value: 'ocr.trf'

File extension: .trf

Preprocessing (input_control)  string (string)

Type of preprocessing used to transform the feature vectors.

Default value: 'principal_components'

List of values: 'principal_components', 'canonical_variates'

InformationCont (output_control)  real-array (real)

Relative information content of the transformed feature vectors.

CumInformationCont (output_control)  real-array (real)

Cumulative information content of the transformed feature vectors.

Example (HDevelop)

* Create the initial OCR classifier.
read_ocr_trainf_names ('ocr.trf', CharacterNames, CharacterCount)
create_ocr_class_svm (8, 10, 'constant', 'default', CharacterNames, \
                      'rbf', 0.01, 0.01, 'one-versus-one', \
                      'principal_components', 81, OCRHandle)
* Get the information content of the transformed feature vectors.
get_prep_info_ocr_class_svm (OCRHandle, 'ocr.trf', 'principal_components', \
                             InformationCont, CumInformationCont)
* Determine the number of transformed components.
* NumComp = [...]
clear_ocr_class_svm (OCRHandle)
* Create the final OCR classifier.
create_ocr_class_svm (8, 10, 'constant', 'default', CharacterNames, \
                      'rbf', 0.01, 0.01, 'one-versus-one', \
                      'principal_components', NumComp, OCRHandle)
* Train the final classifier.
trainf_ocr_class_svm (OCRHandle, 'ocr.trf', 0.001, 'default')
write_ocr_class_svm (OCRHandle, 'ocr.osc')
clear_ocr_class_svm (OCRHandle)

Result

If the parameters are valid, the operator get_prep_info_ocr_class_svm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.

get_prep_info_ocr_class_svm may return the error 9211 (Matrix is not positive definite) if Preprocessing = 'canonical_variates' is used. This typically indicates that not enough training samples have been stored for each class.

Possible Predecessors

create_ocr_class_svm, write_ocr_trainf, append_ocr_trainf, write_ocr_trainf_image

Possible Successors

clear_ocr_class_svm, create_ocr_class_svm

Module

OCR/OCV


HALCON Reference Manual 10.0.2 Copyright © 1996-2011 MVTec Software GmbH