On May 1, 2019, PyTorch 1.1.0 was officially released: https://github.com/pytorch/pytorch/releases/tag/v1.1.0
The main highlights:
1. TensorBoard (currently experimental)
2. JIT upgrades
· [JIT] Attributes in ScriptModules
· [JIT] Dictionary and List Support in TorchScript
· [JIT] User-defined classes in TorchScript (experimental)
3. DistributedDataParallel new functionality and tutorials
TensorBoard (currently experimental)
- PyTorch now supports TensorBoard logging with a simple `from torch.utils.tensorboard import SummaryWriter` command.
- Histograms, embeddings, scalars, images, text, graphs, and more can be visualized across training runs.
- TensorBoard support is currently experimental. You can browse the docs here.
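A minimal logging sketch (the tag name and loss values here are illustrative, not from the release notes):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # writes event files to ./runs/ by default
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value for illustration
    writer.add_scalar("train/loss", loss, step)
writer.close()
```

Running `tensorboard --logdir=runs` then serves the dashboard.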
JIT
- Attributes in ScriptModules
  - Attributes can be assigned on a `ScriptModule` by wrapping them with `torch.jit.Attribute` and specifying the type.
  - They will be serialized along with any parameters/buffers when you call `torch.jit.save()`, so they are a great way to store arbitrary state in your model.
  - See the docs for more info.
- Example:

```python
from typing import Dict, List

import torch


class Foo(torch.jit.ScriptModule):
    def __init__(self, a_dict):
        super(Foo, self).__init__(False)
        self.words = torch.jit.Attribute([], List[str])
        self.some_dict = torch.jit.Attribute(a_dict, Dict[str, int])

    @torch.jit.script_method
    def forward(self, input: str) -> int:
        self.words.append(input)
        return self.some_dict[input]
```
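Assuming the class above, the attributes travel with the serialized module; a small usage sketch (the filename is arbitrary):

```python
m = Foo({"hello": 1})
m("hello")                   # returns 1; "hello" is appended to m.words
torch.jit.save(m, "foo.pt")  # words/some_dict are saved with params/buffers
```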
- Dictionary and List Support in TorchScript
  - TorchScript now has robust support for list and dictionary types. They behave much like Python lists and dictionaries, supporting most built-in methods, as well as simple comprehensions and `for...in` constructs (see the sketch below).
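A minimal sketch of list support in a scripted function (`running_sums` is a hypothetical name; `torch.jit.annotate` pins down the type of the empty list):

```python
from typing import List

import torch


@torch.jit.script
def running_sums(xs: List[int]) -> List[int]:
    out = torch.jit.annotate(List[int], [])  # empty list typed as List[int]
    total = 0
    for x in xs:           # for...in over a list
        total = total + x
        out.append(total)  # built-in list methods work
    return out

print(running_sums([1, 2, 3]))  # [1, 3, 6]
```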
- User-defined classes in TorchScript (experimental)
  - For more complex stateful operations, TorchScript now supports annotating a class with `@torch.jit.script`. Classes used this way can be JIT-compiled and loaded in C++ like other TorchScript modules.
  - See the docs for more info.
- Example:

```python
import torch


@torch.jit.script
class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def sum(self):
        return self.first + self.second
```
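Such a class can then be used from other TorchScript code. A small sketch (in 1.1, unannotated arguments default to `Tensor`, so we pass tensors):

```python
@torch.jit.script
def use_pair(a, b):
    p = Pair(a, b)  # construct the scripted class inside TorchScript
    return p.sum()

print(use_pair(torch.ones(2), torch.ones(2)))  # tensor([2., 2.])
```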
DistributedDataParallel new functionality and tutorials
- `nn.parallel.DistributedDataParallel`: can now wrap multi-GPU modules, which enables use cases such as model parallel (tutorial) on one server and data parallel (tutorial) across servers (sketched below). (19271).
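A rough sketch of the new pattern, assuming a two-GPU machine and that the usual process-group environment variables (`MASTER_ADDR`, etc.) are already set; the model itself is illustrative:

```python
import torch
import torch.distributed as dist
import torch.nn as nn


class TwoGPUModel(nn.Module):
    """A module split across two local GPUs (model parallel)."""
    def __init__(self):
        super(TwoGPUModel, self).__init__()
        self.part1 = nn.Linear(10, 10).to("cuda:0")
        self.part2 = nn.Linear(10, 5).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        return self.part2(x.to("cuda:1"))


dist.init_process_group(backend="nccl", init_method="env://")
# device_ids is omitted because the module already spans multiple devices
model = nn.parallel.DistributedDataParallel(TwoGPUModel())
```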
Breaking Changes
- `Tensor.set_`: the `device` of a Tensor can no longer be changed via `Tensor.set_`. This would most commonly happen when setting up a Tensor with the default CUDA device and later swapping in a `Storage` on a different CUDA device. Instead, set up the Tensor on the correct device from the beginning. (18832).
- Pay attention to the order change of `lr_scheduler.step()` (see the sketch after this list). (7889).
- `torch.unique`: changed the default value of `sorted` to `True`. (15379).
- [JIT] Rename `isTensor` api -> `isCompleteTensor`. (#18437)
- [JIT] Remove GraphExecutor's python bindings. (#19141)
- [C++]: many methods on `Type` no longer exist; use the functional or Tensor method equivalent. (17991).
- [C++]: the `Backend` constructor of `TensorOptions` no longer exists. (18137).
- [C++, Distributed]: c10d's `ProcessGroup::getGroupRank` has been removed. (19147).
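On the `lr_scheduler.step()` change: starting with 1.1, the scheduler should be stepped after the optimizer, not before. A minimal sketch (the optimizer, scheduler, and loop bounds are illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30)

for epoch in range(100):
    # ... run the training step(s) for this epoch ...
    optimizer.step()   # update the weights first
    scheduler.step()   # then adjust the learning rate (new order in 1.1)
```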
【New Features】
This release adds many new operators and methods.
Operators
- `torch.tril_indices`, `torch.triu_indices`: added operators with the same behavior as NumPy. (14904, 15203).
- `torch.combinations`, `torch.cartesian_prod`: added new `itertools`-like operators. (9393).
- `torch.repeat_interleave`: new operator similar to `numpy.repeat`. (18395).
- `torch.from_file`: new operator similar to `Storage.from_file`, but returning a tensor. (18688).
- `torch.unique_consecutive`: new operator with semantics similar to `std::unique` in C++ (see the sketch after this list). (19060).
- `torch.tril`, `torch.triu`, `torch.trtrs`: now support batching. (15257, 18025).
- `torch.gather`: add support for the `sparse_grad` option. (17182).
- `torch.std`, `torch.max_values`, `torch.min_values`, `torch.logsumexp` can now operate over multiple dimensions at once. (14535, 15892, 16475).
- `torch.cdist`: added operator equivalent to `scipy.spatial.distance.cdist`. (16168, 17173).
- `torch.__config__.show()`: reports detailed versions of all libraries. (18579).
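Two of the new operators in action (the input values are illustrative):

```python
import torch

# repeat_interleave behaves like numpy.repeat
print(torch.repeat_interleave(torch.tensor([1, 2, 3]), 2))
# tensor([1, 1, 2, 2, 3, 3])

# unique_consecutive only collapses adjacent duplicates, like std::unique
print(torch.unique_consecutive(torch.tensor([1, 1, 2, 2, 3, 1])))
# tensor([1, 2, 3, 1])
```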
NN
- `nn.MultiheadAttention`: new module implementing multi-head attention from "Attention Is All You Need". (18334).
- `nn.functional.interpolate`: added support for `bicubic`. (9849).
- `nn.SyncBatchNorm`: support synchronous Batch Normalization. (14267).
- `nn.Conv`: added support for circular padding via `padding_mode='circular'`. (17240).
- `nn.EmbeddingBag`: now supports trainable `per_sample_weights`. (18799).
- `nn.EmbeddingBag`: add support for the `from_pretrained` method, as in `nn.Embedding`. (15273).
- RNNs: automatically handle unsorted variable-length sequences via `enforce_sorted`. (15225).
- `nn.Identity`: new module for easier model surgery (see the sketch after this list). (19249).
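A sketch of the model-surgery use case for `nn.Identity` (assumes torchvision is installed; `resnet18` is just an example backbone):

```python
import torch
import torch.nn as nn
import torchvision

# swap out the classifier head so the network returns pooled features
model = torchvision.models.resnet18(pretrained=True)
model.fc = nn.Identity()

features = model(torch.randn(1, 3, 224, 224))  # shape: (1, 512)
```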
Tensors / dtypes
- `torch.bool`: added support for the `torch.bool` dtype and Tensors with that dtype (1-byte storage). NumPy conversion is supported, but operations are currently limited. (16810).
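For example (a small sketch; op coverage for `torch.bool` is still limited in 1.1):

```python
import numpy as np
import torch

mask = torch.tensor([True, False, True], dtype=torch.bool)  # 1 byte per element
back = torch.from_numpy(np.array([True, False]))            # NumPy conversion works
```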
Optim
- `optim.lr_scheduler.CyclicLR`: support for Cyclical Learning Rate and Momentum (see the sketch after this list). (18001).
- `optim.lr_scheduler.CosineAnnealingWarmRestarts`: new scheduler implementing Stochastic Gradient Descent with Warm Restarts. (17226).
- Support multiple simultaneous LR schedulers. (14010)
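A minimal `CyclicLR` sketch (the learning-rate bounds and step size are illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.001, max_lr=0.1, step_size_up=2000)

for batch in range(10):
    # ... forward/backward for one batch ...
    optimizer.step()
    scheduler.step()  # CyclicLR is stepped per batch, not per epoch
```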
Distributions
- `torch.distributions`: now support multiple inheritance. (16772).
Samplers
- `quasirandom.SobolEngine`: new sampler (see the sketch below). (10505).
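A quick sketch of drawing quasi-random samples:

```python
from torch.quasirandom import SobolEngine

engine = SobolEngine(dimension=2)
samples = engine.draw(4)  # 4 low-discrepancy points in [0, 1)^2
```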
DistributedDataParallel
- `nn.parallel.DistributedDataParallel`: now supports modules with unused parameters (e.g. control flow, as in adaptive softmax). (18251, 18953).
TorchScript and Tracer
- Allow early returns from if-statements (see the sketch after this list). (#154463)
- Add an `@ignore` annotation, which statically tells the TorchScript compiler to ignore the Python function. (#16055)
- Simple `for...in` loops on lists. (#16726)
- Ellipses (`...`) in Tensor indexing. (#17763)
- `None` in Tensor indexing. (#18615)
- Support for basic list comprehensions. (#17267)
- Add implicit unwrapping of optionals on `if foo is not None`. (#15587)
- Tensors, ints, and floats will once again be implicitly cast to bool if used in a conditional. (#18755)
- Implement `to()`, `cpu()`, and `cuda()` on ScriptModules. (#15340, #15904)
- Add support for various methods on lists: `clear()`, `pop()`, `reverse()`, `copy()`, `extend()`, `index()`, `count()`, `insert()`, `remove()`.
- Add support for `sort()` on lists of specialized type (`Tensor`, `int`, `float`, `bool`). (#19572)
- Add support for various methods on strings: `index()`, `slice()`, `len()`.
- Support `Tensor.to()` in TorchScript. (#15976)
- Support for `torch.tensor()` in TorchScript. (#14913, #19445)
- Support for `torch.manual_seed()` in TorchScript. (#19510)
- Support for `nn.LSTM` in TorchScript. (#15744)
- Support for `nn.init` in TorchScript. (#19640)
- Add `hash()` builtin. (#18258)
- Add `min()` and `max()` builtins for numerical types. (#15680)
- Add `isinstance()` builtin, which performs a static type check. (#15076)
- Add `train()`/`eval()`/`is_training()` to the C++ ScriptModule API. (#16044)
- Allow List arguments to Python functions called from TorchScript. (#15721)
- Allow using `std::vector` and `std::unordered_map` as arguments to custom operators. (#17587)
- Tracer: now allows passing static dicts and lists as trace inputs. (#18092, #19580)
- Allow generic containers as ScriptModule inputs. (#16482)
- Allow `nn.Sequential` in ModuleList. (#16882)
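A small sketch combining two of these additions, early returns and the numerical `min()` builtin (function names are illustrative):

```python
import torch


@torch.jit.script
def relu_or_zero(x: torch.Tensor, apply: bool) -> torch.Tensor:
    if not apply:
        return torch.zeros_like(x)  # early return from an if-statement
    return x.clamp(min=0.0)


@torch.jit.script
def smaller(a: int, b: int) -> int:
    return min(a, b)  # min() builtin for numerical types


print(relu_or_zero(torch.randn(3), True))
print(smaller(2, 5))  # 2
```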
Experimental Features
- [Quantization] (API unstable): added limited support for quantized datatypes via the `torch.qint8` dtype and the `torch.quantize_linear` conversion function (see the sketch after this list). (18230).
- [MKLDNN tensor] (API unstable): added limited (opaque) support for `MKLDNN` tensors via `Tensor.to_mkldnn()`; operators are currently limited to ResNext101 operators. (17748).
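A hedged sketch of what these unstable APIs look like; the scale/zero-point values are arbitrary, and `quantize_linear` was later renamed in subsequent releases:

```python
import torch

x = torch.randn(2, 2)
q = torch.quantize_linear(x, 0.1, 0, torch.qint8)  # scale, zero_point, dtype
y = torch.randn(1, 3, 224, 224).to_mkldnn()        # opaque MKLDNN layout
```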
The changelog also has sections on 【Improvements】,【Bug Fixes】,【Deprecations】,【Performance】,【Documentation】, and 【ONNX】.
Below are some of the more serious bugs that have been fixed.
- `torch.prod`: correct erroneous calculation on large tensors. (15653).
- `torch.mean` (and other reductions): fix incorrect calculation on CUDA on large inputs. (16023).
- `nn.Conv`: correctly handle non-contiguous inputs on the MKLDNN convolution codepath. (16300).
- `Tensor.eq_`: fix erroneous calculation. (15475).
- `torch.mean`: fix fp16 output calculation. (14878).
- `nn.PoissonNLLLoss`: properly handle `reduction=None`. (17358).
- [JIT] Fix bug where custom ops could get optimized out if their outputs weren't used. (#18711).
- [JIT] Fix bug where the model serializer would accidentally reorder statements. (#17557).
Below are a few of the more notable 【Performance】 improvements.
- `nn.BatchNorm` CPU inference speed increased up to ~19x. (19152).
- `nn.AdaptiveAvgPool`: speed up the common case of size=1 output by ~30x. (17011).
- `nn.EmbeddingBag` CPU performance increased by ~4x. (19329).
- `Tensor.copy_`: sped up larger tensor copy ~2-3x, small regression in small tensor copy. (18618).
- `torch.nonzero`: is now ~2x faster than NumPy on CPU. (15190)
- Improve caching allocator for Pascal and newer GPUs; 10-20% better memory utilization on Mask-RCNN. (17120).
- Reduction functions: speed up some large Tensor cases by 50-80%. (17428).
- [JIT] Graph fuser: better fusion for backwards graphs in the presence of broadcasting. (#14957)
- [JIT] Graph fuser: `batch_norm` fusion for inference. (#15146)
- [JIT] Graph fuser: `layer_norm` fusion for inference. (#18266)