create_dl_layer_batch_normalization (Operator)

Name

create_dl_layer_batch_normalization — Create a batch normalization layer.

Signature

create_dl_layer_batch_normalization( : : DLLayerInput, LayerName, Momentum, Epsilon, Activation, GenParamName, GenParamValue : DLLayerBatchNorm)

Herror T_create_dl_layer_batch_normalization(const Htuple DLLayerInput, const Htuple LayerName, const Htuple Momentum, const Htuple Epsilon, const Htuple Activation, const Htuple GenParamName, const Htuple GenParamValue, Htuple* DLLayerBatchNorm)

void CreateDlLayerBatchNormalization(const HTuple& DLLayerInput, const HTuple& LayerName, const HTuple& Momentum, const HTuple& Epsilon, const HTuple& Activation, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* DLLayerBatchNorm)

HDlLayer HDlLayer::CreateDlLayerBatchNormalization(const HString& LayerName, const HTuple& Momentum, double Epsilon, const HString& Activation, const HTuple& GenParamName, const HTuple& GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerBatchNormalization(const HString& LayerName, const HString& Momentum, double Epsilon, const HString& Activation, const HString& GenParamName, const HString& GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerBatchNormalization(const char* LayerName, const char* Momentum, double Epsilon, const char* Activation, const char* GenParamName, const char* GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerBatchNormalization(const wchar_t* LayerName, const wchar_t* Momentum, double Epsilon, const wchar_t* Activation, const wchar_t* GenParamName, const wchar_t* GenParamValue) const   ( Windows only)

static void HOperatorSet.CreateDlLayerBatchNormalization(HTuple DLLayerInput, HTuple layerName, HTuple momentum, HTuple epsilon, HTuple activation, HTuple genParamName, HTuple genParamValue, out HTuple DLLayerBatchNorm)

HDlLayer HDlLayer.CreateDlLayerBatchNormalization(string layerName, HTuple momentum, double epsilon, string activation, HTuple genParamName, HTuple genParamValue)

HDlLayer HDlLayer.CreateDlLayerBatchNormalization(string layerName, string momentum, double epsilon, string activation, string genParamName, string genParamValue)

def create_dl_layer_batch_normalization(dllayer_input: HHandle, layer_name: str, momentum: Union[float, str], epsilon: float, activation: str, gen_param_name: MaybeSequence[str], gen_param_value: MaybeSequence[Union[int, float, str]]) -> HHandle

Description

The operator create_dl_layer_batch_normalization creates a batch normalization layer whose handle is returned in DLLayerBatchNorm. Batch normalization is used to improve the performance and stability of a neural network during training. For each batch, the mean and variance of every input activation are calculated and the input values are transformed to have zero mean and unit variance. Moreover, a linear scale and shift transformation is learned. During training, to take all samples into account, the batch-wise calculated mean and variance values are combined with a Momentum into a running mean and a running variance. With t denoting the iteration index, the update can be written as

  mean_running(t) = Momentum * mean_running(t-1) + (1 - Momentum) * mean_batch(t)
  var_running(t)  = Momentum * var_running(t-1)  + (1 - Momentum) * var_batch(t)

To affect the mean and variance values you can set the following options for Momentum:

Given number:

For example: 0.9. This is the default and recommended option.

Restriction: 0 ≤ Momentum < 1

'auto':

Combines mean and variance values by a cumulative moving average. This is only recommended if the parameters of all previous layers in the network are frozen, i.e., have a learning rate of 0.

'freeze':

Stops the adjustment of the mean and variance and their values stay fixed. In this case, the mean and variance are used during training for normalizing a batch, analogously to how the batch normalization operates during inference. The parameters of the linear scale and shift transformation, however, remain learnable.
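The three Momentum options above can be sketched in plain Python (an illustrative model of the update rules, not the HALCON implementation; the batch means below are arbitrary example data):

```python
def update_running_mean(running, batch_mean, momentum, t):
    """One update step of the running mean.

    momentum: a float in [0, 1), 'auto', or 'freeze'.
    t: 1-based iteration index (only used for 'auto').
    """
    if momentum == 'freeze':
        # statistics stay fixed
        return running
    if momentum == 'auto':
        # cumulative moving average over all batches seen so far
        return running + (batch_mean - running) / t
    # numeric momentum: exponential moving average
    return momentum * running + (1.0 - momentum) * batch_mean

batch_means = [1.0, 3.0, 2.0, 4.0]

running = 0.0
for t, m in enumerate(batch_means, start=1):
    running = update_running_mean(running, m, 0.9, t)
print(running)  # exponential moving average, still dominated by the start value

running = 0.0
for t, m in enumerate(batch_means, start=1):
    running = update_running_mean(running, m, 'auto', t)
print(running)  # cumulative average of all batch means: 2.5
```

The running variance follows the same update rule with the batch variances in place of the batch means.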

Epsilon is a small offset added to the variance in order to control the numerical stability. Usually its default value is adequate.
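The role of Epsilon can be seen in the standard batch normalization formula (a minimal sketch, assuming the textbook formulation rather than HALCON internals): the batch is shifted to zero mean and scaled to unit variance, with Epsilon keeping the division stable when the batch variance is near zero, before the learned scale (gamma) and shift (beta) are applied.

```python
import math

def batch_normalize(x, gamma=1.0, beta=0.0, epsilon=0.0001):
    # mean and (biased) variance of the current batch
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    # normalize, then apply the learned linear transformation
    return [gamma * (v - mean) / math.sqrt(var + epsilon) + beta for v in x]

y = batch_normalize([2.0, 4.0, 6.0, 8.0])
print(sum(y) / len(y))  # approximately zero mean
```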

The parameter DLLayerInput determines the feeding input layer.

The parameter LayerName sets an individual layer name. Note that when creating a model using create_dl_model, each layer of the created network must have a unique name.

The parameter Activation determines whether an activation is performed after the batch normalization in order to optimize the runtime performance.

It is not possible to specify a leaky ReLU or a sigmoid activation function. Use create_dl_layer_activation instead.

The following generic parameters GenParamName and the corresponding values GenParamValue are supported:

'bias_filler':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

List of values: 'xavier', 'msra', 'const'.

Default: 'const'

'bias_filler_const_val':

Constant value.

Restriction: 'bias_filler' must be set to 'const'.

Default: 0

'bias_filler_variance_norm':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

List of values: 'norm_out', 'norm_in', 'norm_average', or a constant value (in combination with 'bias_filler' = 'msra').

Default: 'norm_out'

'bias_term':

Determines whether the created batch normalization layer has a bias term ('true') or not ('false').

Default: 'true'

'is_inference_output':

Determines whether apply_dl_model will include the output of this layer in the dictionary DLResultBatch even without specifying this layer in Outputs ('true') or not ('false').

Default: 'false'

'learning_rate_multiplier':

Multiplier for the learning rate of this layer that is used during training. If 'learning_rate_multiplier' is set to 0.0, the layer is skipped during training.

Default: 1.0

'learning_rate_multiplier_bias':

Multiplier for the learning rate of the bias term. The total bias learning rate is the product of 'learning_rate_multiplier_bias' and 'learning_rate_multiplier'.

Default: 1.0

'upper_bound':

Float value defining an upper bound for a rectified linear unit. If the layer is part of a model that has been created using create_dl_model, the upper bound can be unset. To do so, use set_dl_model_layer_param and set an empty tuple for 'upper_bound'.

Default: []

'weight_filler':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

List of values: 'xavier', 'msra', 'const'.

Default: 'const'

'weight_filler_const_val':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

Default: 1.0

'weight_filler_variance_norm':

See create_dl_layer_convolution for a detailed explanation of this parameter and its values.

List of values: 'norm_in', 'norm_out', 'norm_average', or a constant value (in combination with 'weight_filler' = 'msra').

Default: 'norm_in'
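As a quick illustration of how the learning rate multipliers combine (the base learning rate and multiplier values below are hypothetical example numbers):

```python
# The model's global learning rate is scaled per layer by
# 'learning_rate_multiplier', and the bias term additionally by
# 'learning_rate_multiplier_bias'.
base_lr = 0.001              # global learning rate of the model
lr_multiplier = 0.5          # 'learning_rate_multiplier'
lr_multiplier_bias = 2.0     # 'learning_rate_multiplier_bias'

weight_lr = base_lr * lr_multiplier                     # 0.0005
bias_lr = base_lr * lr_multiplier * lr_multiplier_bias  # 0.001
print(weight_lr, bias_lr)
```

Setting lr_multiplier to 0.0 makes both products zero, which is why the layer is then skipped during training.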

Certain parameters of layers created using this operator create_dl_layer_batch_normalization can be set and retrieved using further operators. The following tables give an overview of which parameters can be set using set_dl_model_layer_param and which ones can be retrieved using get_dl_model_layer_param or get_dl_layer_param. Note that the operators set_dl_model_layer_param and get_dl_model_layer_param require a model created by create_dl_model.

Layer Parameters                     set   get
'activation_mode' (Activation)              x
'epsilon' (Epsilon)                         x
'input_layer' (DLLayerInput)                x
'momentum' (Momentum)                 x     x
'name' (LayerName)                    x     x
'output_layer' (DLLayerBatchNorm)           x
'shape'                                     x
'type'                                      x

Generic Layer Parameters             set   get
'bias_filler'                         x     x
'bias_filler_const_val'               x     x
'bias_filler_variance_norm'           x     x
'bias_term'                                 x
'is_inference_output'                 x     x
'learning_rate_multiplier'            x     x
'learning_rate_multiplier_bias'       x     x
'num_trainable_params'                      x
'upper_bound'                         x     x
'weight_filler'                       x     x
'weight_filler_const_val'             x     x
'weight_filler_variance_norm'         x     x

Execution Information

Parameters

DLLayerInput (input_control)  dl_layer → (handle)

Feeding layer.

LayerName (input_control)  string → (string)

Name of the output layer.

Momentum (input_control)  string → (string / real)

Momentum.

Default: 0.9

List of values: 0.9, 0.99, 0.999, 'auto', 'freeze'

Epsilon (input_control)  number → (real)

Variance offset.

Default: 0.0001

Activation (input_control)  string → (string)

Optional activation function.

Default: 'none'

List of values: 'none', 'relu'

GenParamName (input_control)  attribute.name(-array) → (string)

Generic input parameter names.

Default: []

List of values: 'bias_filler', 'bias_filler_const_val', 'bias_filler_variance_norm', 'bias_term', 'is_inference_output', 'learning_rate_multiplier', 'learning_rate_multiplier_bias', 'upper_bound', 'weight_filler', 'weight_filler_const_val', 'weight_filler_variance_norm'

GenParamValue (input_control)  attribute.value(-array) → (string / integer / real)

Generic input parameter values.

Default: []

Suggested values: 'xavier', 'msra', 'const', 'nearest_neighbor', 'bilinear', 'norm_in', 'norm_out', 'norm_average', 'true', 'false', 1.0, 0.9, 0.0

DLLayerBatchNorm (output_control)  dl_layer → (handle)

Batch normalization layer.

Example (HDevelop)

create_dl_layer_input ('input', [224,224,3], [], [], DLLayerInput)
* In practice, one typically sets ['bias_term'], ['false'] for a convolution
* that is directly followed by a batch normalization layer.
create_dl_layer_convolution (DLLayerInput, 'conv1', 3, 1, 1, 64, 1, \
                             'none', 'none', ['bias_term'], ['false'], \
                             DLLayerConvolution)
create_dl_layer_batch_normalization (DLLayerConvolution, 'bn1', 0.9, \
                                     0.0001, 'none', [], [], \
                                     DLLayerBatchNorm)
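The comment in the example above can be verified with a small sketch (plain Python illustrating the math, not the HALCON API): the normalization step subtracts the batch mean, so a constant bias added by the preceding convolution cancels out exactly, which is why 'bias_term' is typically set to 'false' there.

```python
import math

def normalize(x, epsilon=0.0001):
    # zero-mean / unit-variance transform of one batch
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [(v - mean) / math.sqrt(var + epsilon) for v in x]

activations = [0.5, -1.0, 2.0, 0.0]
bias = 3.7  # arbitrary constant bias

a = normalize(activations)
b = normalize([v + bias for v in activations])
# the bias shifts the batch mean by the same amount and leaves the
# variance unchanged, so the normalized outputs are identical
print(all(abs(p - q) < 1e-9 for p, q in zip(a, b)))  # True
```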

Possible Predecessors

create_dl_layer_convolution

Possible Successors

create_dl_layer_activation, create_dl_layer_convolution

References

Sergey Ioffe and Christian Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," Proceedings of the 32nd International Conference on Machine Learning (ICML) 2015, Lille, France, 6-11 July 2015, pp. 448-456

Module

Deep Learning Training