MXNet's Model API

MXNet's API

The model API in mxnet is not really an API by itself; it is just a wrapper over ndarray that makes models easier to use.

Training a model

To train a model, follow two steps: first define the network with symbols, then call the mx.model.FeedForward.create method to create the model. The following code creates a two-layer neural network.
# configure a two-layer neural network
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(fc1, name='relu1', act_type='relu')
fc2 = mx.symbol.FullyConnected(act1, name='fc2', num_hidden=64)
softmax = mx.symbol.SoftmaxOutput(fc2, name='sm')
# create a model
model = mx.model.FeedForward.create(
    softmax, X=data_set, num_epoch=num_epoch, learning_rate=0.01)
You can also build and fit a model in a scikit-learn-like style:
# create a model using sklearn-style two step way
model = mx.model.FeedForward(softmax, num_epoch=num_epoch, learning_rate=0.01)
model.fit(X=data_set)
For more functionality, see the Model API Reference.

Saving a model

# save a model to mymodel-symbol.json and mymodel-0100.params
prefix = 'mymodel'
iteration = 100
model.save(prefix, iteration)
# load model back
model_loaded = mx.model.FeedForward.load(prefix, iteration)
In practice, one script trains the model and saves it under a prefix plus a sequence number (e.g., mymodel-0100.params), and a second script loads the model back and runs prediction.

Periodic checkpointing

Periodic checkpointing during training is often necessary. To enable it, simply add a do_checkpoint(path) callback. Training will then automatically checkpoint to the specified path at the end of every iteration.
prefix = 'models/chkpt'
model = mx.model.FeedForward.create(
    softmax, X=data_set,
    iter_end_callback=mx.callback.do_checkpoint(prefix), ...)
You can later load the checkpointed model back with FeedForward.load.
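As a minimal sketch (assuming checkpoints were written with the prefix above, and that an epoch-10 checkpoint exists):

# resume from the checkpoint written at the end of epoch 10 (hypothetical epoch number)
model_loaded = mx.model.FeedForward.load('models/chkpt', 10)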

Using multiple devices

Simply set ctx to the list of devices (CPU, GPU) you want to train on.
devices = [mx.gpu(i) for i in range(num_device)]
model = mx.model.FeedForward.create(softmax, X=dataset, ctx=devices, ...)
Training will then run in parallel on the GPUs you specified.

Model API

The MXNet model module

mxnet.model.BatchEndParam

alias of BatchEndParams

mxnet.model.save_checkpoint(prefix, epoch, symbol, arg_params, aux_params)

Checkpoint the model data into file.

Parameters:
  • prefix (str) – Prefix of model name.
  • epoch (int) – The epoch number of the model.
  • symbol (Symbol) – The input symbol
  • arg_params (dict of str to NDArray) – Model parameter, dict of name to NDArray of net’s weights.
  • aux_params (dict of str to NDArray) – Model parameter, dict of name to NDArray of net’s auxiliary states.

Notes

  • prefix-symbol.json will be saved for symbol.
  • prefix-epoch.params will be saved for parameters.

Two notes on the parameters: prefix may include a directory path, and an epoch here means one full pass over the entire training set (forward and backward passes), a much coarser unit than an ordinary iteration.

A model has only one symbol file, but params files accumulate, typically one per epoch. Since later epochs usually hold better parameters, you can keep just the last params file and delete the rest.

mxnet.model.load_checkpoint(prefix, epoch)

Load model checkpoint from file.

Parameters:
  • prefix (str) – Prefix of model name.
  • epoch (int) – Epoch number of model we would like to load.
Returns:

  • symbol (Symbol) – The symbol configuration of computation network.
  • arg_params (dict of str to NDArray) – Model parameter, dict of name to NDArray of net’s weights.
  • aux_params (dict of str to NDArray) – Model parameter, dict of name to NDArray of net’s auxiliary states.
When loading, epoch is normally the number of the latest checkpoint you saved.
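A minimal round-trip sketch of these two functions, assuming a trained model as in the earlier examples (treating model.symbol, model.arg_params, and model.aux_params as the attributes a trained FeedForward exposes):

import mxnet as mx

# write mymodel-symbol.json and mymodel-0100.params
mx.model.save_checkpoint('mymodel', 100, model.symbol, model.arg_params, model.aux_params)
# read them back as a (symbol, arg_params, aux_params) triple
sym, arg_params, aux_params = mx.model.load_checkpoint('mymodel', 100)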

class mxnet.model.FeedForward(symbol, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, numpy_batch_size=128, arg_params=None, aux_params=None, allow_extra_params=False, begin_epoch=0, **kwargs)

Model class of MXNet for training and predicting feedforward nets. This class is designed for a single-data single output supervised network.

Parameters:
  • symbol (Symbol) – The symbol configuration of computation network.
  • ctx (Context or list of Context, optional) – The device context of training and prediction. To use multi GPU training, pass in a list of gpu contexts.
  • num_epoch (int, optional) – Training parameter, number of training epochs.
  • epoch_size (int, optional) – Number of batches in an epoch. By default it is set to ceil(num_train_examples / batch_size)
  • optimizer (str or Optimizer, optional) – Training parameter, name or optimizer object for training.
  • initializer (initializer function, optional) – Training parameter, the initialization scheme used.
  • numpy_batch_size (int, optional) – The batch size of training data. Only needed when input array is numpy.
  • arg_params (dict of str to NDArray, optional) – Model parameter, dict of name to NDArray of net’s weights.
  • aux_params (dict of str to NDArray, optional) – Model parameter, dict of name to NDArray of net’s auxiliary states.
  • allow_extra_params (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol to be passed in via aux_params and arg_params. If this is True, no error is thrown when aux_params and arg_params contain more parameters than needed.
  • begin_epoch (int, optional) – The beginning training epoch.
  • kwargs (dict) – The additional keyword arguments passed to optimizer.
Two points worth noting: begin_epoch marks where training resumes, so the epochs after it are (re)trained, and any extra keyword arguments are forwarded to the optimizer.
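A hedged constructor sketch, reusing the softmax symbol from the first example; note how the extra learning_rate keyword falls into **kwargs and is forwarded to the 'sgd' optimizer:

model = mx.model.FeedForward(
    symbol=softmax,      # the computation network
    ctx=mx.cpu(),        # or a list such as [mx.gpu(0), mx.gpu(1)]
    num_epoch=10,
    optimizer='sgd',
    learning_rate=0.1)   # forwarded to the optimizer via **kwargs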

predict(X, num_batch=None, return_data=False, reset=True)

Run the prediction; always uses only one device.

Parameters:
  • X (mxnet.DataIter) – The data to run prediction on.
  • num_batch (int or None) – The number of batches to run. Goes through all batches if None.

Returns:y – The predicted value of the output.
Return type:numpy.ndarray or a list of numpy.ndarray if the network has multiple outputs.
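For instance (val_iter is a hypothetical mx.io.DataIter over validation data):

y_pred = model.predict(val_iter)   # numpy.ndarray of network outputs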

score(X, eval_metric='acc', num_batch=None, batch_end_callback=None, reset=True)

Run the model on X and calculate the score with eval_metric.

Parameters:
  • X (mxnet.DataIter) – The data to evaluate on.
  • eval_metric (metric.EvalMetric) – The metric for calculating the score.
  • num_batch (int or None) – The number of batches to run. Goes through all batches if None.

Returns:s – the final score
Return type:float
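For instance, reusing the hypothetical val_iter from above:

acc = model.score(val_iter, eval_metric='acc')   # a single float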

fit(X, y=None, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, monitor=None, eval_batch_end_callback=None)

Fit the model.

Parameters:
  • X (DataIter, or numpy.ndarray/NDArray) – Training data. If X is an DataIter, the name or, if not available, position, of its outputs should match the corresponding variable names defined in the symbolic graph.
  • y (numpy.ndarray/NDArray, optional) – Training set label. If X is numpy.ndarray/NDArray, y is required to be set. While y can be 1D or 2D (with 2nd dimension as 1), its 1st dimension must be the same as X, i.e. the number of data points and labels should be equal.
  • eval_data (DataIter or numpy.ndarray/list/NDArray pair) – If eval_data is numpy.ndarray/list/NDArray pair, it should be (valid_data, valid_label).
  • eval_metric (metric.EvalMetric or str or callable) – The evaluation metric, name of evaluation metric. Or a customize evaluation function that returns the statistics based on minibatch.
  • epoch_end_callback (callable(epoch, symbol, arg_params, aux_states)) – A callback that is invoked at end of each epoch. This can be used to checkpoint model each epoch.
  • batch_end_callback (callable(epoch)) – A callback that is invoked at the end of each batch, for printing purposes.
  • kvstore (KVStore or str, optional) – The KVStore or a string kvstore type: ‘local’, ‘dist_sync’, ‘dist_async’. Defaults to ‘local’; often no need to change for a single machine.
  • logger (logging logger, optional) – When not specified, default logger will be used.
  • work_load_list (float or int, optional) – The list of work load for different devices, in the same order as ctx
In short: pass a DataIter (or numpy/NDArray arrays plus y) as the training data, give validation data as a (valid_data, valid_label) pair via eval_data, use epoch_end_callback for per-epoch checkpointing and batch_end_callback for progress printing, and leave kvstore at 'local' on a single machine.
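A hedged sketch putting the main arguments together (train_iter, val_iter, and batch_size are assumptions, not names from the library):

import logging
logging.basicConfig(level=logging.INFO)   # make epoch/batch progress visible

model.fit(
    X=train_iter,                         # training DataIter
    eval_data=val_iter,                   # validation DataIter
    eval_metric='acc',
    epoch_end_callback=mx.callback.do_checkpoint('models/chkpt'),
    batch_end_callback=mx.callback.Speedometer(batch_size, 50))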

save(prefix, epoch=None)

Checkpoint the model into file. You can also use pickle to do the job if you only work in Python. The advantage of load/save is that the file is language agnostic: a file saved with save can be loaded by mxnet's other language bindings. You also get the benefit of being able to load/save directly from cloud storage (S3, HDFS).

Parameters:prefix (str) – Prefix of model name.

Notes

  • prefix-symbol.json will be saved for symbol.
  • prefix-epoch.params will be saved for parameters.
static load(prefix, epoch, ctx=None, **kwargs)

Load model checkpoint from file.

Parameters:
  • prefix (str) – Prefix of model name.
  • epoch (int) – epoch number of model we would like to load.
  • ctx (Context or list of Context, optional) – The device context of training and prediction.
  • kwargs (dict) – other parameters for model, including num_epoch, optimizer and numpy_batch_size
Returns:

model – The loaded model that can be used for prediction.

Return type:

FeedForward

Saving and loading are straightforward, so I won't go into detail.
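A minimal sketch of the pair:

model.save('mymodel', 100)   # writes mymodel-symbol.json and mymodel-0100.params
model2 = mx.model.FeedForward.load('mymodel', 100, ctx=mx.cpu())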

static create(symbol, X, y=None, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, eval_batch_end_callback=None, **kwargs)

Functional style to create a model. This function will be more consistent with functional languages such as R, where mutation is not allowed.

Parameters:
  • symbol (Symbol) – The symbol configuration of computation network.
  • X (DataIter) – Training data
  • y (numpy.ndarray, optional) – If X is numpy.ndarray, y is required to be set
  • ctx (Context or list of Context, optional) – The device context of training and prediction. To use multi GPU training, pass in a list of gpu contexts.
  • num_epoch (int, optional) – Training parameter, number of training epochs.
  • epoch_size (int, optional) – Number of batches in an epoch. By default it is set to ceil(num_train_examples / batch_size)
  • optimizer (str or Optimizer, optional) – Training parameter, name or optimizer object for training.
  • initializer (initializer function, optional) – Training parameter, the initialization scheme used.
  • eval_data (DataIter or numpy.ndarray pair) – If eval_set is numpy.ndarray pair, it should be (valid_data, valid_label)
  • eval_metric (metric.EvalMetric or str or callable) – The evaluation metric, name of evaluation metric. Or a customize evaluation function that returns the statistics based on minibatch.
  • epoch_end_callback (callable(epoch, symbol, arg_params, aux_states)) – A callback that is invoked at end of each epoch. This can be used to checkpoint model each epoch.
  • batch_end_callback (callable(epoch)) – A callback that is invoked at the end of each batch, for printing purposes.
  • kvstore (KVStore or str, optional) – The KVStore or a string kvstore type: ‘local’, ‘dist_sync’, ‘dist_async’. Defaults to ‘local’; often no need to change for a single machine.
  • logger (logging logger, optional) – When not specified, default logger will be used.
  • work_load_list (list of float or int, optional) – The list of work load for different devices, in the same order as ctx
Creating a model this way works much the same as the constructor described above.

The APIs below are less commonly used.


Initializer API reference

class mxnet.initializer.Initializer

Base class for Initializer.

__call__(name, arr)

Override () function to do Initialization

Parameters:
  • name (str) – name of corresponding ndarray
  • arr (NDArray) – ndarray to be Initialized
class mxnet.initializer.Load(param, default_init=None, verbose=False)

Initialize by loading pretrained param from file or dict

Parameters:
  • param (str or dict of str->NDArray) – param file or dict mapping name to NDArray.
  • default_init (Initializer) – default initializer when name is not found in param.
  • verbose (bool) – log source when initializing.
class mxnet.initializer.Mixed(patterns, initializers)

Initialize with mixed Initializer

Parameters:
  • patterns (list of str) – list of regular expression patterns to match parameter names.
  • initializers (list of Initializer) – list of Initializer corresponding to patterns
class mxnet.initializer.Uniform(scale=0.07)

Initialize the weight with uniform [-scale, scale]

Parameters:scale (float, optional) – The scale of uniform distribution
class mxnet.initializer.Normal(sigma=0.01)

Initialize the weight with normal(0, sigma)

Parameters:sigma (float, optional) – Standard deviation for gaussian distribution.
class mxnet.initializer.Orthogonal(scale=1.414, rand_type='uniform')

Initialize weight as an orthogonal matrix

Parameters:
  • scale (float optional) – scaling factor of weight
  • rand_type (string optional) – use “uniform” or “normal” random number to initialize weight
Reference: Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, arXiv preprint arXiv:1312.6120.
class mxnet.initializer.Xavier(rnd_type='uniform', factor_type='avg', magnitude=3)

Initialize the weight with Xavier or similar initialization scheme.

Parameters:
  • rnd_type (str, optional) – Use `gaussian` or `uniform` to init
  • factor_type (str, optional) – Use `avg`, `in`, or `out` to init
  • magnitude (float, optional) – scale of random number range
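A hedged usage sketch, passing a Xavier initializer when building the model from the first section:

init = mx.initializer.Xavier(rnd_type='gaussian', factor_type='in', magnitude=2)
model = mx.model.FeedForward(softmax, num_epoch=10, initializer=init)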



Evaluation Metric API

Online evaluation metric module.

mxnet.metric.check_label_shapes(labels, preds, shape=0)

Check to see if the two arrays are the same size.

class mxnet.metric.EvalMetric(name, num=None)

Base class of all evaluation metrics.

update(labels, preds)

Update the internal evaluation.

Parameters:
  • labels (list of NDArray) – The labels of the data.
  • preds (list of NDArray) – Predicted values.
reset()

Clear the internal statistics to initial state.

get()

Get the current evaluation result.

Returns:
  • name (str) – Name of the metric.
  • value (float) – Value of the evaluation.
get_name_value()

Get zipped name and value pairs

class mxnet.metric.CompositeEvalMetric(**kwargs)

Manage multiple evaluation metrics.

add(metric)

Add a child metric.

get_metric(index)

Get a child metric.

class mxnet.metric.Accuracy

Calculate accuracy

class mxnet.metric.TopKAccuracy(**kwargs)

Calculate top k predictions accuracy

class mxnet.metric.F1

Calculate the F1 score of a binary classification problem.

class mxnet.metric.MAE

Calculate Mean Absolute Error loss

class mxnet.metric.MSE

Calculate Mean Squared Error loss

class mxnet.metric.RMSE

Calculate Root Mean Squared Error loss

class mxnet.metric.CrossEntropy

Calculate Cross Entropy loss

class mxnet.metric.Torch

Dummy metric for torch criterions

class mxnet.metric.CustomMetric(feval, name=None, allow_extra_outputs=False)

Custom evaluation metric that takes a NDArray function.

Parameters:
  • feval (callable(label, pred)) – Customized evaluation function.
  • name (str, optional) – The name of the metric
  • allow_extra_outputs (bool) – If true, the prediction outputs can have extra outputs. This is useful in RNN, where the states are also produced in outputs for forwarding.
mxnet.metric.np(numpy_feval, name=None, allow_extra_outputs=False)

Create a customized metric from numpy function.

Parameters:
  • numpy_feval (callable(label, pred)) – Customized evaluation function.
  • name (str, optional) – The name of the metric.
  • allow_extra_outputs (bool) – If true, the prediction outputs can have extra outputs. This is useful in RNN, where the states are also produced in outputs for forwarding.
mxnet.metric.create(metric, **kwargs)

Create an evaluation metric.

Parameters:metric (str or callable) – The name of the metric, or a function providing statistics given pred, label NDArray
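A hedged sketch of both routes: creating a built-in metric by name, and wrapping a numpy function with mxnet.metric.np (rmse here is an illustrative helper, not part of the library):

import numpy as np

acc = mx.metric.create('acc')        # built-in metric, created by name

def rmse(label, pred):               # custom numpy-based metric
    return np.sqrt(np.mean((label - pred) ** 2))

rmse_metric = mx.metric.np(rmse, name='rmse')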

Optimizer API

Common Optimization algorithms with regularizations.

class mxnet.optimizer.Optimizer(rescale_grad=1.0, param_idx2name=None, wd=0.0, clip_gradient=None, learning_rate=0.01, lr_scheduler=None, sym=None)

Base class of all optimizers.

static register(klass)

Register optimizers to the optimizer factory

static create_optimizer(name, rescale_grad=1, **kwargs)

Create an optimizer with specified name.

Parameters:
  • name (str) – Name of required optimizer. Should be the name of a subclass of Optimizer. Case insensitive.
  • rescale_grad (float) – Rescaling factor on gradient.
  • kwargs (dict) – Parameters for optimizer
Returns:

opt – The result optimizer.

Return type:

Optimizer

create_state(index, weight)

Create additional optimizer state such as momentum. Override in implementations.

update(index, weight, grad, state)

Update the parameters. Override in implementations.

set_lr_scale(args_lrscale)

set lr scale is deprecated. Use set_lr_mult instead.

set_lr_mult(args_lr_mult)

Set individual learning rate multiplier for parameters

Parameters:args_lr_mult (dict of string/int to float) – set the lr multiplier for name/index to float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.
set_wd_mult(args_wd_mult)

Set individual weight decay multiplier for parameters. By default the wd multiplier is 0 for all params whose name doesn’t end with _weight, if param_idx2name is provided.

Parameters:args_wd_mult (dict of string/int to float) – set the wd multiplier for name/index to float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.
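For example (fc1_weight is a hypothetical parameter name taken from the network in the first section):

opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9)
opt.set_lr_mult({'fc1_weight': 0.1})   # train fc1's weights 10x slower
opt.set_wd_mult({'fc1_weight': 0.0})   # and exempt them from weight decay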
mxnet.optimizer.register(klass)

Register optimizers to the optimizer factory

class mxnet.optimizer.SGD(momentum=0.0, **kwargs)

A very simple SGD optimizer with momentum and weight regularization.

Parameters:
  • learning_rate (float, optional) – learning_rate of SGD
  • momentum (float, optional) – momentum value
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
  • param_idx2name (dict of string/int to float, optional) – special treat weight decay in parameter ends with bias, gamma, and beta
create_state(index, weight)

Create additional optimizer state such as momentum.

Parameters:weight (NDArray) – The weight data
update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.NAG(**kwargs)

SGD with nesterov It is implemented according to https://github.com/torch/optim/blob/master/sgd.lua

update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.SGLD(**kwargs)

Stochastic Langevin Dynamics Updater to sample from a distribution.

Parameters:
  • learning_rate (float, optional) – learning_rate of SGD
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
  • param_idx2name (dict of string/int to float, optional) – special treat weight decay in parameter ends with bias, gamma, and beta
create_state(index, weight)

Create additional optimizer state such as momentum.

Parameters:weight (NDArray) – The weight data
update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.ccSGD(momentum=0.0, **kwargs)

A very simple SGD optimizer with momentum and weight regularization. Implemented in C++.

Parameters:
  • learning_rate (float, optional) – learning_rate of SGD
  • momentum (float, optional) – momentum value
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.Adam(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, decay_factor=0.99999999, **kwargs)

Adam optimizer as described in [King2014].

[King2014] Diederik Kingma, Jimmy Ba, Adam: A Method for Stochastic Optimization, http://arxiv.org/abs/1412.6980

the code in this class was adapted from https://github.com/mila-udem/blocks/blob/master/blocks/algorithms/__init__.py#L765

Parameters:
  • learning_rate (float, optional) – Step size. Default value is set to 0.002.
  • beta1 (float, optional) – Exponential decay rate for the first moment estimates. Default value is set to 0.9.
  • beta2 (float, optional) – Exponential decay rate for the second moment estimates. Default value is set to 0.999.
  • epsilon (float, optional) – Default value is set to 1e-8.
  • decay_factor (float, optional) – Default value is set to 1 - 1e-8.
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
create_state(index, weight)

Create additional optimizer state: mean, variance

Parameters:weight (NDArray) – The weight data
update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.AdaGrad(eps=1e-07, **kwargs)

AdaGrad optimizer of Duchi et al., 2011,

This code follows the version in http://arxiv.org/pdf/1212.5701v1.pdf Eq(5) by Matthew D. Zeiler, 2012. AdaGrad will help the network to converge faster in some cases.

Parameters:
  • learning_rate (float, optional) – Step size. Default value is set to 0.05.
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • eps (float, optional) – A small float number to make the updating processing stable Default value is set to 1e-7.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
class mxnet.optimizer.RMSProp(gamma1=0.95, gamma2=0.9, **kwargs)

RMSProp optimizer of Tieleman & Hinton, 2012,

This code follows the version in http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.

Parameters:
  • learning_rate (float, optional) – Step size. Default value is set to 0.002.
  • gamma1 (float, optional) – decay factor of moving average for gradient, gradient^2. Default value is set to 0.95.
  • gamma2 (float, optional) – “momentum” factor. Default value if set to 0.9.
  • wd (float, optional) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
create_state(index, weight)

Create additional optimizer state: mean, variance

Parameters:weight (NDArray) – The weight data

update(index, weight, grad, state)

Update the parameters.

Parameters:
  • index (int) – A unique integer key used to index the parameters
  • weight (NDArray) – weight ndarray
  • grad (NDArray) – grad ndarray
  • state (NDArray or other objects returned by init_state) – The auxiliary state used in optimization.
class mxnet.optimizer.AdaDelta(rho=0.9, epsilon=1e-05, **kwargs)

AdaDelta optimizer as described in Zeiler, M. D. (2012). ADADELTA: An adaptive learning rate method.

http://arxiv.org/abs/1212.5701

Parameters:
  • rho (float) – Decay rate for both squared gradients and delta x
  • epsilon (float) – The constant as described in the paper
  • wd (float) – L2 regularization coefficient added to all the weights
  • rescale_grad (float, optional) – rescaling factor of gradient.
  • clip_gradient (float, optional) – clip gradient in range [-clip_gradient, clip_gradient]
class mxnet.optimizer.Test(**kwargs)

For test use

create_state(index, weight)

Create a state to duplicate weight

update(index, weight, grad, state)

performs w += rescale_grad * grad

mxnet.optimizer.create(name, rescale_grad=1, **kwargs)

Create an optimizer with specified name.

Parameters:
  • name (str) – Name of required optimizer. Should be the name of a subclass of Optimizer. Case insensitive.
  • rescale_grad (float) – Rescaling factor on gradient.
  • kwargs (dict) – Parameters for optimizer
Returns:

opt – The result optimizer.

Return type:

Optimizer
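A hedged sketch: create an optimizer by name and hand it to a model (softmax is the network from the first section):

adam = mx.optimizer.create('adam', learning_rate=0.001)
model = mx.model.FeedForward(softmax, num_epoch=10, optimizer=adam)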

mxnet.optimizer.get_updater(optimizer)

Return a closure of the updater needed for kvstore

Parameters:optimizer (Optimizer) – The optimizer
Returns:updater – The closure of the updater
Return type:function
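A low-level sketch of the closure, mainly useful when driving a kvstore by hand:

opt = mx.optimizer.SGD(learning_rate=0.1)
updater = mx.optimizer.get_updater(opt)
# updater(index, grad, weight) then applies one in-place update step per call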


