On May 1, 2019, PyTorch 1.1.0 was officially released: https://github.com/pytorch/pytorch/releases/tag/v1.1.0
The main highlights:
1. TensorBoard (currently experimental)
2. JIT upgrades
· [JIT] Attributes in ScriptModules
· [JIT] Dictionary and List Support in TorchScript
· [JIT] User-defined classes in TorchScript (experimental)
3. DistributedDataParallel new functionality and tutorials
TensorBoard (currently experimental)
- PyTorch now supports TensorBoard logging with a simple `from torch.utils.tensorboard import SummaryWriter` command.
- Histograms, embeddings, scalars, images, text, graphs, and more can be visualized across training runs.
- TensorBoard support is currently experimental. You can browse the docs here.
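A minimal logging sketch (the tag name and loss values here are illustrative, not from the release notes):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # writes event files to ./runs/ by default
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value for illustration
    writer.add_scalar("train/loss", loss, step)
writer.close()
```

Running `tensorboard --logdir=runs` then serves the dashboard.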
JIT
- Attributes in ScriptModules
  - Attributes can be assigned on a `ScriptModule` by wrapping them with `torch.jit.Attribute` and specifying the type.
  - They will be serialized along with any parameters/buffers when you call `torch.jit.save()`, so they are a great way to store arbitrary state in your model.
  - See the docs for more info.
- Example:

```python
from typing import Dict, List

import torch


class Foo(torch.jit.ScriptModule):
    def __init__(self, a_dict):
        super(Foo, self).__init__(False)
        self.words = torch.jit.Attribute([], List[str])
        self.some_dict = torch.jit.Attribute(a_dict, Dict[str, int])

    @torch.jit.script_method
    def forward(self, input: str) -> int:
        self.words.append(input)
        return self.some_dict[input]
```
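Assuming the class above, the attributes travel with the serialized module; a small usage sketch (the filename is arbitrary):

```python
m = Foo({"hello": 1})
m("hello")                   # returns 1; "hello" is appended to m.words
torch.jit.save(m, "foo.pt")  # words/some_dict are saved with params/buffers
```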
- Dictionary and List Support in TorchScript
  - TorchScript now has robust support for list and dictionary types. They behave much like Python lists and dictionaries, supporting most built-in methods, as well as simple comprehensions and `for...in` constructs (see the sketch below).
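A minimal sketch of list support in a scripted function (`running_sums` is a hypothetical name; `torch.jit.annotate` pins down the type of the empty list):

```python
from typing import List

import torch


@torch.jit.script
def running_sums(xs: List[int]) -> List[int]:
    out = torch.jit.annotate(List[int], [])  # empty list typed as List[int]
    total = 0
    for x in xs:           # for...in over a list
        total = total + x
        out.append(total)  # built-in list methods work
    return out

print(running_sums([1, 2, 3]))  # [1, 3, 6]
```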
- User-defined classes in TorchScript (experimental)
  - For more complex stateful operations, TorchScript now supports annotating a class with `@torch.jit.script`. Classes used this way can be JIT-compiled and loaded in C++ like other TorchScript modules.
  - See the docs for more info.
- Example:

```python
import torch


@torch.jit.script
class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def sum(self):
        return self.first + self.second
```
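Such a class can then be used from other TorchScript code. A small sketch (in 1.1, unannotated arguments default to `Tensor`, so we pass tensors):

```python
@torch.jit.script
def use_pair(a, b):
    p = Pair(a, b)  # construct the scripted class inside TorchScript
    return p.sum()

print(use_pair(torch.ones(2), torch.ones(2)))  # tensor([2., 2.])
```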
DistributedDataParallel new functionality and tutorials
- `nn.parallel.DistributedDataParallel`: can now wrap multi-GPU modules, which enables use cases such as model parallel (tutorial) on one server and data parallel (tutorial) across servers (sketched below). (19271).
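A rough sketch of the new pattern, assuming a two-GPU machine and that the usual process-group environment variables (`MASTER_ADDR`, etc.) are already set; the model itself is illustrative:

```python
import torch
import torch.distributed as dist
import torch.nn as nn


class TwoGPUModel(nn.Module):
    """A module split across two local GPUs (model parallel)."""
    def __init__(self):
        super(TwoGPUModel, self).__init__()
        self.part1 = nn.Linear(10, 10).to("cuda:0")
        self.part2 = nn.Linear(10, 5).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        return self.part2(x.to("cuda:1"))


dist.init_process_group(backend="nccl", init_method="env://")
# device_ids is omitted because the module already spans multiple devices
model = nn.parallel.DistributedDataParallel(TwoGPUModel())
```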
Breaking Changes
- `Tensor.set_`: the `device` of a Tensor can no longer be changed via `Tensor.set_`. This would most commonly happen when setting up a Tensor with the default CUDA device and later swapping in a `Storage` on a different CUDA device. Instead, set up the Tensor on the correct device from the beginning. (18832).
- Pay attention to the order change of `lr_scheduler.step()` (see the sketch after this list). (7889).
- `torch.unique`: changed the default value of `sorted` to `True`. (15379).
- [JIT] Rename `isTensor` api -> `isCompleteTensor`. (#18437)
- [JIT] Remove GraphExecutor's python bindings. (#19141)
- [C++]: many methods on `Type` no longer exist; use the functional or Tensor method equivalent. (17991).
- [C++]: the `Backend` constructor of `TensorOptions` no longer exists. (18137).
- [C++, Distributed]: c10d's `ProcessGroup::getGroupRank` has been removed. (19147).
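On the `lr_scheduler.step()` change: starting with 1.1, the scheduler should be stepped after the optimizer, not before. A minimal sketch (the optimizer, scheduler, and loop bounds are illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30)

for epoch in range(100):
    # ... run the training step(s) for this epoch ...
    optimizer.step()   # update the weights first
    scheduler.step()   # then adjust the learning rate (new order in 1.1)
```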
【New Features】
This release adds many new operators and methods.
Operators
- `torch.tril_indices`, `torch.triu_indices`: added operators with the same behavior as NumPy. (14904, 15203).
- `torch.combinations`, `torch.cartesian_prod`: added new `itertools`-like operators. (9393).
- `torch.repeat_interleave`: new operator similar to `numpy.repeat`. (18395).
- `torch.from_file`: new operator similar to `Storage.from_file`, but returning a tensor. (18688).
- `torch.unique_consecutive`: new operator with semantics similar to `std::unique` in C++ (see the sketch after this list). (19060).
- `torch.tril`, `torch.triu`, `torch.trtrs`: now support batching. (15257, 18025).
- `torch.gather`: add support for the `sparse_grad` option. (17182).
- `torch.std`, `torch.max_values`, `torch.min_values`, `torch.logsumexp` can now operate over multiple dimensions at once. (14535, 15892, 16475).
- `torch.cdist`: added operator equivalent to `scipy.spatial.distance.cdist`. (16168, 17173).
- `torch.__config__.show()`: reports detailed versions of all libraries. (18579).
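Two of the new operators in action (the input values are illustrative):

```python
import torch

# repeat_interleave behaves like numpy.repeat
print(torch.repeat_interleave(torch.tensor([1, 2, 3]), 2))
# tensor([1, 1, 2, 2, 3, 3])

# unique_consecutive only collapses adjacent duplicates, like std::unique
print(torch.unique_consecutive(torch.tensor([1, 1, 2, 2, 3, 1])))
# tensor([1, 2, 3, 1])
```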
NN
- `nn.MultiheadAttention`: new module implementing multi-head attention from "Attention Is All You Need". (18334).
- `nn.functional.interpolate`: added support for `bicubic`. (9849).
- `nn.SyncBatchNorm`: support synchronous Batch Normalization. (14267).
- `nn.Conv`: added support for circular padding via `padding_mode='circular'`. (17240).
- `nn.EmbeddingBag`: now supports trainable `per_sample_weights`. (18799).
- `nn.EmbeddingBag`: add support for the `from_pretrained` method, as in `nn.Embedding`. (15273).
- RNNs: automatically handle unsorted variable-length sequences via `enforce_sorted`. (15225).
- `nn.Identity`: new module for easier model surgery (see the sketch after this list). (19249).
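A sketch of the model-surgery use case for `nn.Identity` (assumes torchvision is installed; `resnet18` is just an example backbone):

```python
import torch
import torch.nn as nn
import torchvision

# swap out the classifier head so the network returns pooled features
model = torchvision.models.resnet18(pretrained=True)
model.fc = nn.Identity()

features = model(torch.randn(1, 3, 224, 224))  # shape: (1, 512)
```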
Tensors / dtypes
- `torch.bool`: added support for the `torch.bool` dtype and Tensors with that dtype (1-byte storage). NumPy conversion is supported, but operations are currently limited. (16810).
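For example (a small sketch; op coverage for `torch.bool` is still limited in 1.1):

```python
import numpy as np
import torch

mask = torch.tensor([True, False, True], dtype=torch.bool)  # 1 byte per element
back = torch.from_numpy(np.array([True, False]))            # NumPy conversion works
```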
Optim
- `optim.lr_scheduler.CyclicLR`: support for Cyclical Learning Rate and Momentum (see the sketch after this list). (18001).
- `optim.lr_scheduler.CosineAnnealingWarmRestarts`: new scheduler implementing Stochastic Gradient Descent with Warm Restarts. (17226).
- Support multiple simultaneous LR schedulers. (14010)
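A minimal `CyclicLR` sketch (the learning-rate bounds and step size are illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.001, max_lr=0.1, step_size_up=2000)

for batch in range(10):
    # ... forward/backward for one batch ...
    optimizer.step()
    scheduler.step()  # CyclicLR is stepped per batch, not per epoch
```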
Distributions
- `torch.distributions`: now support multiple inheritance. (16772).
Samplers
- `quasirandom.SobolEngine`: new sampler (see the sketch below). (10505).
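A quick sketch of drawing quasi-random samples:

```python
from torch.quasirandom import SobolEngine

engine = SobolEngine(dimension=2)
samples = engine.draw(4)  # 4 low-discrepancy points in [0, 1)^2
```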
DistributedDataParallel
- `nn.parallel.DistributedDataParallel`: now supports modules with unused parameters (e.g. control flow, as in adaptive softmax). (18251, 18953).
TorchScript and Tracer
- Allow early returns from if-statements (see the sketch after this list). (#154463)
- Add an `@ignore` annotation, which statically tells the TorchScript compiler to ignore the Python function. (#16055)
- Simple `for...in` loops on lists. (#16726)
- Ellipses (`...`) in Tensor indexing. (#17763)
- `None` in Tensor indexing. (#18615)
- Support for basic list comprehensions. (#17267)
- Add implicit unwrapping of optionals on `if foo is not None`. (#15587)
- Tensors, ints, and floats will once again be implicitly cast to bool if used in a conditional. (#18755)
- Implement `to()`, `cpu()`, and `cuda()` on ScriptModules. (#15340, #15904)
- Add support for various methods on lists: `clear()`, `pop()`, `reverse()`, `copy()`, `extend()`, `index()`, `count()`, `insert()`, `remove()`.
- Add support for `sort()` on lists of specialized type (`Tensor`, `int`, `float`, `bool`). (#19572)
- Add support for various methods on strings: `index()`, `slice()`, `len()`.
- Support `Tensor.to()` in TorchScript. (#15976)
- Support for `torch.tensor()` in TorchScript. (#14913, #19445)
- Support for `torch.manual_seed()` in TorchScript. (#19510)
- Support for `nn.LSTM` in TorchScript. (#15744)
- Support for `nn.init` in TorchScript. (#19640)
- Add `hash()` builtin. (#18258)
- Add `min()` and `max()` builtins for numerical types. (#15680)
- Add `isinstance()` builtin, which performs a static type check. (#15076)
- Add `train()`/`eval()`/`is_training()` to the C++ ScriptModule API. (#16044)
- Allow List arguments to Python functions called from TorchScript. (#15721)
- Allow using `std::vector` and `std::unordered_map` as arguments to custom operators. (#17587)
- Tracer: now allows passing static dicts and lists as trace inputs. (#18092, #19580)
- Allow generic containers as ScriptModule inputs. (#16482)
- Allow `nn.Sequential` in ModuleList. (#16882)
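A small sketch combining two of these additions, early returns and the numerical `min()` builtin (function names are illustrative):

```python
import torch


@torch.jit.script
def relu_or_zero(x: torch.Tensor, apply: bool) -> torch.Tensor:
    if not apply:
        return torch.zeros_like(x)  # early return from an if-statement
    return x.clamp(min=0.0)


@torch.jit.script
def smaller(a: int, b: int) -> int:
    return min(a, b)  # min() builtin for numerical types


print(relu_or_zero(torch.randn(3), True))
print(smaller(2, 5))  # 2
```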
Experimental Features
- [Quantization] (API unstable): added limited support for quantized datatypes via the `torch.qint8` dtype and the `torch.quantize_linear` conversion function (see the sketch after this list). (18230).
- [MKLDNN tensor] (API unstable): added limited (opaque) support for `MKLDNN` tensors via `Tensor.to_mkldnn()`; operators are currently limited to ResNext101 operators. (17748).
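A hedged sketch of what these unstable APIs look like; the scale/zero-point values are arbitrary, and `quantize_linear` was later renamed in subsequent releases:

```python
import torch

x = torch.randn(2, 2)
q = torch.quantize_linear(x, 0.1, 0, torch.qint8)  # scale, zero_point, dtype
y = torch.randn(1, 3, 224, 224).to_mkldnn()        # opaque MKLDNN layout
```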
The changelog also has sections on 【Improvements】,【Bug Fixes】,【Deprecations】,【Performance】,【Documentation】, and 【ONNX】.
Below are some of the more serious bugs that have been fixed.
- `torch.prod`: correct erroneous calculation on large tensors. (15653).
- `torch.mean` (and other reductions): fix incorrect calculation on CUDA on large inputs. (16023).
- `nn.Conv`: correctly handle non-contiguous inputs on the MKLDNN convolution codepath. (16300).
- `Tensor.eq_`: fix erroneous calculation. (15475).
- `torch.mean`: fix fp16 output calculation. (14878).
- `nn.PoissonNLLLoss`: properly handle `reduction=None`. (17358).
- [JIT] Fix bug where custom ops could get optimized out if their outputs weren't used. (#18711).
- [JIT] Fix bug where the model serializer would accidentally reorder statements. (#17557).
Below are a few of the more notable 【Performance】 improvements.
- `nn.BatchNorm` CPU inference speed increased up to ~19x. (19152).
- `nn.AdaptiveAvgPool`: speed up the common case of size=1 output by ~30x. (17011).
- `nn.EmbeddingBag` CPU performance increased by ~4x. (19329).
- `Tensor.copy_`: sped up larger tensor copy ~2-3x, small regression in small tensor copy. (18618).
- `torch.nonzero`: is now ~2x faster than NumPy on CPU. (15190)
- Improve caching allocator for Pascal and newer GPUs; 10-20% better memory utilization on Mask-RCNN. (17120).
- Reduction functions: speed up some large Tensor cases by 50-80%. (17428).
- [JIT] Graph fuser: better fusion for backwards graphs in the presence of broadcasting. (#14957)
- [JIT] Graph fuser: `batch_norm` fusion for inference. (#15146)
- [JIT] Graph fuser: `layer_norm` fusion for inference. (#18266)