create_dl_layer_batch_normalization (Operator)

Name

create_dl_layer_batch_normalization - Create a batch normalization layer.

Signature

create_dl_layer_batch_normalization( : : DLLayerInput, LayerName, Momentum, Epsilon, Activation, GenParamName, GenParamValue : DLLayerBatchNorm)

Description

The operator create_dl_layer_batch_normalization creates a batch normalization layer whose handle is returned in DLLayerBatchNorm.

Batch normalization is used to improve the performance and stability of a neural network during training. The mean and variance of each input activation are calculated for each batch, and the input values are transformed to have zero mean and unit variance. In addition, a linear scale and shift transformation is learned.

During training, to take all samples into account, the batch-wise mean and variance values are combined, using Momentum, into a running mean and a running variance that are updated at every training iteration. Besides a numeric value (see the list of values under Parameters), the following options can be set for Momentum to affect how the mean and variance values are updated:

'auto': Combines mean and variance values by a cumulative moving average.
'freeze': Stops the adjustment of the mean and variance, so their values stay fixed. This is usually done before fine-tuning a model.
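
The way batch statistics might be folded into running statistics can be sketched in plain Python, independent of HALCON. The exact update formula is not given in this text, so the sketch assumes the common exponential form running = Momentum * running + (1 - Momentum) * batch for a numeric Momentum. The function name update_running_stat and the treatment of 'auto' as an equal-weight average over all iterations seen so far are illustrative assumptions, not the operator's confirmed implementation.

```python
def update_running_stat(running, batch_value, momentum, iteration):
    """Fold one batch statistic into a running statistic.

    Assumed, illustrative semantics (not confirmed by the reference text):
    - numeric momentum: exponential moving average
    - 'auto': cumulative moving average over all iterations so far
    - 'freeze': the running value stays fixed
    `iteration` is the 1-based index of the current training iteration.
    """
    if momentum == 'freeze':
        return running
    if momentum == 'auto':
        # Cumulative moving average: every batch seen so far gets equal weight.
        return running + (batch_value - running) / iteration
    # Exponential moving average: older batches fade out geometrically.
    return momentum * running + (1.0 - momentum) * batch_value


batch_means = [2.0, 4.0, 6.0]

ema = cma = frozen = 0.0
for i, batch_mean in enumerate(batch_means, start=1):
    ema = update_running_stat(ema, batch_mean, 0.9, i)
    cma = update_running_stat(cma, batch_mean, 'auto', i)
    frozen = update_running_stat(frozen, batch_mean, 'freeze', i)

print(ema, cma, frozen)
```

Under these assumptions, a momentum close to 1 makes the running value react slowly to new batches, whereas the cumulative average converges to the plain mean of all batch values.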

Epsilon is a small offset that is added to the variance to control numerical stability. Usually, its default value should be adequate.
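
The role of Epsilon can be illustrated with a small sketch of the normalization step described above. It is plain Python, independent of HALCON. The function name batch_normalize and the parameters scale and shift (standing in for the learned linear transformation) are hypothetical names chosen for illustration.

```python
import math


def batch_normalize(values, epsilon=0.0001, scale=1.0, shift=0.0):
    """Normalize one activation over a batch, then scale and shift it.

    epsilon is added to the variance before taking the square root, so a
    batch with (near) zero variance does not cause a division by zero.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [scale * (v - mean) * inv_std + shift for v in values]


normalized = batch_normalize([1.0, 2.0, 3.0, 4.0])
constant = batch_normalize([5.0, 5.0, 5.0])  # zero variance, result stays finite
print(normalized)
print(constant)
```

Without the offset, the second call would divide by zero because all values in the batch are identical.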

The parameter DLLayerInput determines the feeding input layer.

The parameter LayerName sets an individual layer name. Note that when a model is created using create_dl_model, each layer of the created network must have a unique name.

The parameter Activation determines whether an activation is performed after the batch normalization in order to optimize the runtime performance:

'relu': A ReLU activation is performed after the batch normalization. It is possible to specify an upper bound for the ReLU operation (see create_dl_layer_activation) via the generic parameter 'upper_bound'.
'none': No activation operation is performed.
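
The effect of the optional activation, including the 'upper_bound' generic parameter, might look like the following plain Python sketch. The function name apply_activation is hypothetical, and the sketch assumes that a bounded ReLU clamps values to the range [0, upper_bound], with None playing the role of an unset (empty) bound.

```python
def apply_activation(values, activation='none', upper_bound=None):
    """Apply the optional activation to already normalized values."""
    if activation == 'none':
        return list(values)
    if activation == 'relu':
        # Plain ReLU: negative values are set to zero.
        rectified = [max(0.0, v) for v in values]
        if upper_bound is not None:
            # Assumed semantics: values above the bound are clipped to it.
            rectified = [min(v, upper_bound) for v in rectified]
        return rectified
    raise ValueError(f"unsupported activation: {activation!r}")


normalized = [-1.5, 0.5, 2.0, 7.0]
unchanged = apply_activation(normalized, 'none')
rectified = apply_activation(normalized, 'relu')
bounded = apply_activation(normalized, 'relu', upper_bound=6.0)
print(unchanged, rectified, bounded)
```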

The following generic parameters GenParamName and the corresponding values GenParamValue are supported:

'bias_filler':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

List of values: 'xavier', 'msra', 'const'.

Default: 'xavier'

'bias_filler_const_val':

Constant value if 'bias_filler' has been set to 'const'.

Default: 0

'bias_filler_variance_norm':

List of values: 'norm_out', 'norm_in', 'norm_average', or a constant value (in combination with 'bias_filler' set to 'msra').

Default: 'norm_out'

'bias_term':

If set to 'false', the created batch normalization layer has no bias term.

Default: 'true'

'is_inference_output':

Determines whether apply_dl_model will include the output of this layer in the dictionary DLResultBatch even without specifying this layer in Outputs ('true') or not ('false').

Default: 'false'

'learning_rate_multiplier':

Multiplier for the learning rate of this layer that is used during training. If 'learning_rate_multiplier' is set to 0, the layer is skipped during training.

Default: 1.0

'num_trainable_params':

Number of trainable parameters (weights and biases) of the layer.

'upper_bound':

Float value defining an upper bound for a rectified linear unit. If the layer is part of a model that has been created using create_dl_model, the upper bound can be unset. To do so, use set_dl_model_layer_param and set an empty tuple for 'upper_bound'.

Default: []

'weight_filler':

List of values: 'xavier', 'msra', 'const'.

Default: 'xavier'

'weight_filler_const_val':

Constant value if 'weight_filler' has been set to 'const'. See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

Default: 0.5

'weight_filler_variance_norm':

List of values: 'norm_in', 'norm_out', 'norm_average', or a constant value (in combination with 'weight_filler' set to 'msra').

Default: 'norm_in'
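
How 'learning_rate_multiplier' could enter a weight update is sketched below in plain Python. The sketch assumes a plain gradient descent step and a hypothetical function name sgd_step. The solver actually used during training is not specified in this text and may differ.

```python
def sgd_step(weights, gradients, base_learning_rate, learning_rate_multiplier=1.0):
    """One gradient descent step with a per-layer learning rate multiplier."""
    # The multiplier scales the global learning rate for this layer only.
    effective_rate = base_learning_rate * learning_rate_multiplier
    return [w - effective_rate * g for w, g in zip(weights, gradients)]


weights = [1.0, -2.0]
gradients = [0.5, 0.25]

updated = sgd_step(weights, gradients, 0.1)
# A multiplier of 0 leaves the weights untouched, i.e. the layer is not trained.
skipped = sgd_step(weights, gradients, 0.1, learning_rate_multiplier=0.0)
print(updated, skipped)
```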

Certain parameters of layers created using the operator create_dl_layer_batch_normalization can be set and retrieved using further operators. The following tables give an overview of which parameters can be set using set_dl_model_layer_param and which ones can be retrieved using get_dl_model_layer_param or get_dl_layer_param. Note that the operators set_dl_model_layer_param and get_dl_model_layer_param require a model created by create_dl_model.

Layer Parameters	set	get
'activation_mode' (`Activation`)
'epsilon' (`Epsilon`)
'input_layer' (`DLLayerInput`)
'momentum' (`Momentum`)
'name' (`LayerName`)
'output_layer' (`DLLayerBatchNorm`)
'shape'
'type'

Generic Layer Parameters	set	get
'bias_filler'
'bias_filler_const_val'
'bias_filler_variance_norm'
'bias_term'
'is_inference_output'
'learning_rate_multiplier'
'learning_rate_multiplier_bias'
'num_trainable_params'
'upper_bound'
'weight_filler'
'weight_filler_const_val'
'weight_filler_variance_norm'

Execution Information

Multithreading type: reentrant (runs in parallel with non-exclusive operators).
Multithreading scope: global (may be called from any thread).
Processed without parallelization.

Parameters

DLLayerInput (input_control) dl_layer → (handle)

Feeding layer.

LayerName (input_control) string → (string)

Name of the layer.

Momentum (input_control) string → (string / real)

Momentum.

Default value: 'auto'

List of values: 0.9, 0.99, 0.999, 'auto', 'freeze'

Epsilon (input_control) number → (real)

Variance offset.

Default value: 0.0001

Activation (input_control) string → (string)

Optional activation function.

Default value: 'none'

List of values: 'none', 'relu'

GenParamName (input_control) attribute.name(-array) → (string)

Generic input parameter names.

Default value: []

List of values: 'bias_filler', 'bias_filler_const_val', 'bias_filler_variance_norm', 'bias_term', 'is_inference_output', 'learning_rate_multiplier', 'learning_rate_multiplier_bias', 'upper_bound', 'weight_filler', 'weight_filler_const_val', 'weight_filler_variance_norm'

GenParamValue (input_control) attribute.value(-array) → (string / integer / real)

Generic input parameter values.

Default value: []

Suggested values: 'xavier', 'msra', 'const', 'nearest_neighbor', 'bilinear', 'norm_in', 'norm_out', 'norm_average', 'true', 'false', 1.0, 0.9, 0.0

DLLayerBatchNorm (output_control) dl_layer → (handle)

Batch normalization layer.

Example (HDevelop)

create_dl_layer_input ('input', [224,224,3], [], [], DLLayerInput)
* In practice, one typically sets ['bias_term'], ['false'] for a convolution 
* that is directly followed by a batch normalization layer.
create_dl_layer_convolution (DLLayerInput, 'conv1', 3, 1, 1, 64, 1, 'none', \
                             'none', ['bias_term'], ['false'], \
                             DLLayerConvolution)
create_dl_layer_batch_normalization (DLLayerConvolution, 'bn1', 'auto', \
                                     0.0001, 'none', [], [], DLLayerBatchNorm)

Possible Predecessors

create_dl_layer_convolution

Possible Successors

create_dl_layer_activation, create_dl_layer_convolution

References

Sergey Ioffe and Christian Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," Proceedings of the 32nd International Conference on Machine Learning (ICML) 2015, Lille, France, 6-11 July 2015, pp. 448-456.

Module

Deep Learning Training
