set_feature_lengths_class_train_data — Define subfeatures in training data.
set_feature_lengths_class_train_data defines subfeatures in the
training data in ClassTrainDataHandle. The subfeatures are defined in
SubFeatureLength by a set of lengths that groups the previously
added columns consecutively into subfeatures. It is not possible
to group columns that are not adjacent.
The sum over all entries in SubFeatureLength
must be equal to the number of dimensions set in
create_class_train_data with the parameter NumDim.
Optionally, names for all subfeatures can be defined in Names.
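The rule that the lengths must add up to the number of dimensions, and that each subfeature covers a run of consecutive columns, can be sketched outside of HALCON. The following Python snippet is only an illustration and is not part of the HALCON API: the helper subfeature_spans, its arguments, and the placeholder names it generates are hypothetical, and columns are assumed to be numbered from 0.

```python
def subfeature_spans(num_dim, sub_feature_length, names=None):
    """Return (name, start, end) per subfeature; columns are 0-based, end is exclusive."""
    total = sum(sub_feature_length)
    if total != num_dim:
        raise ValueError(f"lengths sum to {total}, expected {num_dim}")
    if names is None:
        # Placeholder names chosen for this sketch only (an assumption).
        names = [f"subfeature_{i}" for i in range(len(sub_feature_length))]
    if len(names) != len(sub_feature_length):
        raise ValueError("one name per subfeature is required")

    spans = []
    start = 0
    for name, length in zip(names, sub_feature_length):
        spans.append((name, start, start + length))
        start += length  # the next group begins right after this one
    return spans


print(subfeature_spans(5, [3, 2], ["Good Feature", "Bad Feature"]))
```

With five dimensions and the lengths [3, 2], the first subfeature covers columns 0 to 2 and the second covers columns 3 to 4. Because each group starts where the previous one ends, columns that are not adjacent can never end up in the same subfeature.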
An exemplary situation in which this operator is helpful is described here:
Two different data sources are available. Both data sources
provide a vector of a certain length. The first data source provides
data of length n and the second of length m. In order
to automatically decide which of the data sources is more valuable for
a certain classification problem, training data can be created that contains
both data sources. E.g., if
create_class_train_data was called with
NumDim = n+m = w, then
set_feature_lengths_class_train_data can be called with [n,m] in
SubFeatureLength and [Name1, Name2] in Names to
describe this situation for a later usage of operators like
select_feature_set_knn.
Then the classification problem has to be specified via calls of
add_sample_class_train_data, by giving a vector of the first
data source and a vector of the second data source as the combined
feature vector of length w.
The result of the call of
select_feature_set_knn would then be
either [Name1] if the first is more relevant,
[Name2] if the second is more relevant
or [Name1, Name2] if both are necessary.
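The bookkeeping behind this situation can be sketched in a few lines of Python. This is an illustration only, not HALCON code: combine_sources and select_columns are hypothetical helpers, the values n = 3 and m = 2 are chosen arbitrarily, and the sketch makes no claim about how the feature selection itself is performed.

```python
def combine_sources(first, second):
    """Concatenate two source vectors into one feature vector of length n+m."""
    return list(first) + list(second)


def select_columns(sample, lengths, names, selected):
    """Keep only the columns of the subfeatures listed in `selected`."""
    kept = []
    start = 0
    for name, length in zip(names, lengths):
        if name in selected:
            kept.extend(sample[start:start + length])
        start += length  # advance to the first column of the next subfeature
    return kept


# Hypothetical data: the first source has n = 3 values, the second m = 2.
sample = combine_sources([1, 1, 1], [2, 1])
reduced = select_columns(sample, [3, 2], ["Name1", "Name2"], ["Name1"])
print(sample, reduced)
```

If only Name1 were reported as relevant, the reduced vector would contain just the n columns of the first data source. If both names were reported, the full vector of length w would be kept.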
This operator modifies the state of the following input parameter: ClassTrainDataHandle.
During execution of this operator, access to the value of this parameter must be synchronized if it is used across multiple threads.
ClassTrainDataHandle (input_control, state is modified) class_train_data
Handle of the training data that should be partitioned into subfeatures.
SubFeatureLength (input_control)
Length of the subfeatures.
Names (input_control)
Names of the subfeatures.
* Find out which of the two features distinguishes two classes
NameFeature1 := 'Good Feature'
NameFeature2 := 'Bad Feature'
LengthFeature1 := 3
LengthFeature2 := 2
* Create training data
create_class_train_data (LengthFeature1+LengthFeature2,\
                         ClassTrainDataHandle)
* Define the features which are in the training data
set_feature_lengths_class_train_data (ClassTrainDataHandle, [LengthFeature1,\
                                      LengthFeature2], [NameFeature1, NameFeature2])
* Add training data
*                                                         |Feat1| |Feat2|
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  2,1  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  2,1  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  3,4  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  3,4  ], 1)
* Add more data
* ...
* Select the better feature
select_feature_set_knn (ClassTrainDataHandle, 'greedy', [], [], KNNHandle,\
                        SelectedFeature, Score)
classify_class_knn (KNNHandle, [1,1,1], Result, Rating)
classify_class_knn (KNNHandle, [2,2,2], Result, Rating)
* Use the classifier
* ...
If the parameters are valid, the operator
set_feature_lengths_class_train_data
returns the value TRUE. If necessary, an exception is raised.