
On May 1, 2019, PyTorch 1.1.0 was officially released: https://github.com/pytorch/pytorch/releases/tag/v1.1.0
The main highlights:
1. TensorBoard (currently experimental)
2. JIT upgrades
· [JIT] Attributes in ScriptModules
· [JIT] Dictionary and List Support in TorchScript
· [JIT] User-defined classes in TorchScript (experimental)
3. DistributedDataParallel new functionality and tutorials
TensorBoard (currently experimental)
- PyTorch now supports TensorBoard logging with a simple `from torch.utils.tensorboard import SummaryWriter` command.
- Histograms, embeddings, scalars, images, text, graphs, and more can be visualized across training runs.
- TensorBoard support is currently experimental. You can browse the docs here.
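For instance, here is a minimal sketch of logging a made-up scalar and a histogram (assumes the `tensorboard` package is installed); run `tensorboard --logdir=runs` afterwards to view them:

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # logs to ./runs/ by default

for step in range(100):
    # A made-up "loss" value, just to have something to plot.
    writer.add_scalar("train/loss", torch.rand(1).item(), step)

# Histograms work the same way.
writer.add_histogram("weights", torch.randn(1000), global_step=0)
writer.close()
```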
JIT
- Attributes in ScriptModules
- Attributes can be assigned on a `ScriptModule` by wrapping them with `torch.jit.Attribute` and specifying the type.
- They will be serialized along with any parameters/buffers when you call `torch.jit.save()`, so they are a great way to store arbitrary state in your model.
- See the docs for more info.
- Example:
```python
import torch
from typing import List, Dict

class Foo(torch.jit.ScriptModule):
    def __init__(self, a_dict):
        super(Foo, self).__init__(False)
        # Typed attributes are serialized with the module by torch.jit.save().
        self.words = torch.jit.Attribute([], List[str])
        self.some_dict = torch.jit.Attribute(a_dict, Dict[str, int])

    @torch.jit.script_method
    def forward(self, input: str) -> int:
        self.words.append(input)
        return self.some_dict[input]
```
- Dictionary and List Support in TorchScript
- TorchScript now has robust support for list and dictionary types. They behave much like Python lists and dictionaries, supporting most built-in methods, as well as simple comprehensions and `for...in` constructs.
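As a quick sketch of the kind of function that now compiles (the function below is illustrative, not taken from the release notes, and assumes dict literals are accepted per the dictionary support described above):

```python
import torch
from typing import List, Dict

@torch.jit.script
def summarize(xs: List[int]) -> Dict[str, int]:
    # Simple comprehensions and for...in loops are supported.
    doubled = [2 * x for x in xs]
    total = 0
    for d in doubled:
        total = total + d
    return {"count": len(doubled), "total": total}
```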
- User-defined classes in TorchScript (experimental)
- For more complex stateful operations, TorchScript now supports annotating a class with `@torch.jit.script`. Classes used this way can be JIT-compiled and loaded in C++ like other TorchScript modules.
- See the docs for more info.
- Example:
```python
import torch

@torch.jit.script
class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def sum(self):
        return self.first + self.second
```
DistributedDataParallel new functionality and tutorials
`nn.parallel.DistributedDataParallel`: can now wrap multi-GPU modules, which enables use cases such as model parallel (tutorial) on one server and data parallel (tutorial) across servers. (19271).
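A hedged sketch of what this enables: a module split across two GPUs can now itself be wrapped. The `TwoGPUModel` class and device choices are illustrative; the process group is assumed to be initialized already, and `device_ids` is left unset because the wrapped module is multi-GPU:

```python
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    """Hypothetical model-parallel module split across two devices."""
    def __init__(self, dev0, dev1):
        super(TwoGPUModel, self).__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.net0 = nn.Linear(10, 10).to(dev0)
        self.net1 = nn.Linear(10, 5).to(dev1)

    def forward(self, x):
        x = torch.relu(self.net0(x.to(self.dev0)))
        return self.net1(x.to(self.dev1))

# Assumes torch.distributed.init_process_group(...) has already run in
# each worker process, each process owning its own pair of GPUs.
model = TwoGPUModel("cuda:0", "cuda:1")
ddp_model = nn.parallel.DistributedDataParallel(model)  # no device_ids for multi-GPU modules
```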
Breaking Changes
- `Tensor.set_`: the `device` of a Tensor can no longer be changed via `Tensor.set_`. This would most commonly happen when setting up a Tensor with the default CUDA device and later swapping in a `Storage` on a different CUDA device. Instead, set up the Tensor on the correct device from the beginning. (18832).
- Pay attention to the order change of `lr_scheduler.step()` (see the sketch after this list). (7889).
- `torch.unique`: changed the default value of `sorted` to `True`. (15379).
- [JIT] Rename `isTensor` api -> `isCompleteTensor`. #18437
- [JIT] Remove GraphExecutor's python bindings. #19141
- [C++]: many methods on `Type` no longer exist; use the functional or Tensor method equivalent. (17991).
- [C++]: the `Backend` constructor of `TensorOptions` no longer exists. (18137).
- [C++, Distributed]: c10d `ProcessGroup::getGroupRank` has been removed. (19147).
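The `lr_scheduler.step()` item above is worth spelling out: as of 1.1.0, schedulers should be stepped after `optimizer.step()` rather than before. A minimal sketch with a placeholder model and loss:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10)

for epoch in range(30):
    x, y = torch.randn(4, 10), torch.randn(4, 2)  # placeholder batch
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # as of 1.1.0, call after optimizer.step(), not before
```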
【New Features】
This release adds a large number of new operators and callable methods.
Operators
- `torch.tril_indices`, `torch.triu_indices`: added operators with the same behavior as NumPy. (14904, 15203).
- `torch.combinations`, `torch.cartesian_prod`: added new itertools-like operators (see the sketch after this list). (9393).
- `torch.repeat_interleave`: new operator similar to `numpy.repeat`. (18395).
- `torch.from_file`: new operator similar to `Storage.from_file`, but returning a tensor. (18688).
- `torch.unique_consecutive`: new operator with semantics similar to `std::unique` in C++. (19060).
- `torch.tril`, `torch.triu`, `torch.trtrs`: now support batching. (15257, 18025).
- `torch.gather`: add support for `sparse_grad` option. (17182).
- `torch.std`, `torch.max_values`, `torch.min_values`, `torch.logsumexp` can now operate over multiple dimensions at once. (14535, 15892, 16475).
- `torch.cdist`: added operator equivalent to `scipy.spatial.distance.cdist`. (16168, 17173).
- `torch.__config__.show()`: reports detailed version of all libraries. (18579).
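A few of the new operators in action (values in comments show the expected results):

```python
import torch

# torch.repeat_interleave, similar to numpy.repeat:
x = torch.tensor([1, 2, 3])
torch.repeat_interleave(x, 2)   # tensor([1, 1, 2, 2, 3, 3])

# torch.cartesian_prod, like itertools.product:
a = torch.tensor([1, 2])
b = torch.tensor([3, 4])
torch.cartesian_prod(a, b)      # tensor([[1, 3], [1, 4], [2, 3], [2, 4]])

# torch.combinations, like itertools.combinations:
torch.combinations(torch.tensor([1, 2, 3]), r=2)  # tensor([[1, 2], [1, 3], [2, 3]])

# torch.__config__.show() prints the build/library configuration.
print(torch.__config__.show())
```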
NN
- `nn.MultiheadedAttention`: new module implementing MultiheadedAttention from "Attention Is All You Need". (18334).
- `nn.functional.interpolate`: added support for `bicubic`. (9849).
- `nn.SyncBatchNorm`: support synchronous Batch Normalization. (14267).
- `nn.Conv`: added support for Circular Padding via `mode='circular'`. (17240).
- `nn.EmbeddingBag`: now supports trainable `per_sample_weights`. (18799).
- `nn.EmbeddingBag`: add support for `from_pretrained` method, as in `nn.Embedding`. (15273).
- RNNs: automatically handle unsorted variable-length sequences via `enforce_sorted`. (15225).
- `nn.Identity`: new module for easier model surgery (see the sketch after this list). (19249).
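As an example of the model surgery `nn.Identity` enables, here is a sketch that turns a ResNet-18 into a feature extractor; torchvision and the pretrained weights are assumptions for illustration, not part of the release notes:

```python
import torch
import torch.nn as nn
from torchvision import models  # used only for illustration

model = models.resnet18(pretrained=True)
# Replace the final classification layer with a pass-through,
# turning the network into a 512-d feature extractor.
model.fc = nn.Identity()

features = model(torch.randn(1, 3, 224, 224))
print(features.shape)  # torch.Size([1, 512])
```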
Tensors / dtypes
- `torch.bool`: added support for the `torch.bool` dtype and Tensors with that dtype (1-byte storage). NumPy conversion is supported, but operations are currently limited. (16810).
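A quick sketch of the new dtype, including the NumPy round-trip mentioned above:

```python
import torch

mask = torch.tensor([True, False, True], dtype=torch.bool)
print(mask.dtype)             # torch.bool

arr = mask.numpy()            # NumPy conversion is supported
back = torch.from_numpy(arr)  # comes back as a torch.bool tensor
```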
Optim
- `optim.lr_scheduler.CyclicLR`: support for Cyclical Learning Rate and Momentum (usage sketched after this list). (18001).
- `optim.lr_scheduler.CosineAnnealingWarmRestarts`: new scheduler implementing Stochastic Gradient Descent with Warm Restarts. (17226).
- Support multiple simultaneous LR schedulers. (14010)
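A minimal sketch of `CyclicLR` (the bounds and optimizer settings are arbitrary; note that this scheduler is stepped per batch):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = optim.lr_scheduler.CyclicLR(optimizer, base_lr=0.001, max_lr=0.01)

for batch in range(100):
    # A real loop would compute a loss and call loss.backward() here.
    optimizer.step()
    scheduler.step()  # cycles the learning rate between base_lr and max_lr
```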
Distributions
- `torch.distributions`: now supports multiple inheritance. (16772).
Samplers
- `quasirandom.SobolEngine`: new sampler (usage sketched below). (10505).
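Usage sketch:

```python
import torch
from torch.quasirandom import SobolEngine

engine = SobolEngine(dimension=3)
samples = engine.draw(5)  # 5 quasirandom points in the 3-d unit cube
print(samples.shape)      # torch.Size([5, 3])
```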
DistributedDataParallel
- `nn.parallel.DistributedDataParallel`: now supports modules with unused parameters (e.g. control flow, like adaptive softmax, etc.). (18251, 18953).
TorchScript and Tracer
- Allow early returns from if-statements. (#15463)
- Add an `@ignore` annotation, which statically tells the TorchScript compiler to ignore the Python function. (#16055)
- Simple `for...in` loops on lists. (#16726)
- Ellipses (`...`) in Tensor indexing. (#17763)
- `None` in Tensor indexing. (#18615)
- Support for basic list comprehensions. (#17267)
- Add implicit unwrapping of optionals on `if foo is not None`. (#15587)
- Tensors, ints, and floats will once again be implicitly cast to bool if used in a conditional. (#18755)
- Implement `to()`, `cpu()`, and `cuda()` on ScriptModules. (#15340, #15904)
- Add support for various methods on lists: `clear()`, `pop()`, `reverse()`, `copy()`, `extend()`, `index()`, `count()`, `insert()`, `remove()`.
- Add support for `sort()` on lists of specialized type (Tensors, `int`, `float`, `bool`). (#19572)
- Add support for various methods on strings: `index()`, `slice()`, `len()`.
- Support `Tensor.to()` in TorchScript. (#15976)
- Support for `torch.tensor()` in TorchScript. (#14913, #19445)
- Support for `torch.manual_seed()` in TorchScript. (#19510)
- Support for `nn.LSTM` in TorchScript. (#15744)
- Support for `nn.init` in TorchScript. (#19640)
- Add `hash()` builtin. (#18258)
- Add `min()` and `max()` builtins for numerical types. (#15680)
- Add `isinstance()` builtin, which performs a static type check. (#15076)
- Add `train()`/`eval()`/`is_training()` to C++ ScriptModule API. (#16044)
- Allow List arguments to Python functions called from TorchScript. (#15721)
- Allow using `std::vector` and `std::unordered_map` as arguments to custom operators. (#17587)
- Tracer: now allows passing static dicts and lists as trace inputs. (#18092, #19580)
- Allow generic containers as ScriptModule inputs. (#16482)
- Allow `nn.Sequential` in ModuleList. (#16882)
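A short illustrative sketch combining a few of the items above (early return from an if-statement, list methods, and the `min()`/`max()` builtins); the function itself is made up:

```python
import torch
from typing import List

@torch.jit.script
def clamp_each(xs: List[int], lo: int, hi: int) -> List[int]:
    # Early return from an if-statement now compiles.
    if len(xs) == 0:
        return xs
    out = torch.jit.annotate(List[int], [])
    for x in xs:
        # min()/max() builtins for numerical types.
        out.append(max(lo, min(x, hi)))
    return out
```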
Experimental Features
- [Quantization] (API unstable): added limited support for quantized datatypes via the `torch.qint8` dtype and the `torch.quantize_linear` conversion function. (18230).
- [MKLDNN tensor] (API unstable): added limited (opaque) support for `MKLDNN` tensors via `Tensor.to_mkldnn()`; operators are currently limited to ResNext101 operators. (17748).
In addition, the release notes also cover 【Improvements】, 【Bug Fixes】, 【Deprecations】, 【Performance】, 【Documentation】, and 【ONNX】.
Below are some of the more serious bugs that have been fixed.
- `torch.prod`: correct erroneous calculation on large tensors. (15653).
- `torch.mean` (and other reductions): fix incorrect calculation on CUDA on large inputs. (16023).
- `nn.Conv`: correctly handle non-contiguous inputs on MKLDNN convolution codepath. (16300).
- `Tensor.eq_`: fix erroneous calculation. (15475).
- `torch.mean`: fix fp16 output calculation. (14878).
- `nn.PoissonNLLLoss`: properly handle `reduction=None`. (17358).
- [JIT] Fix bug where custom ops could get optimized out if their outputs weren't used. (#18711).
- [JIT] Fix bug where the model serializer would accidentally reorder statements. (#17557).
Below are a few of the more notable 【Performance】 improvements.
- `nn.BatchNorm`: CPU inference speed increased up to ~19x. (19152).
- `nn.AdaptiveAvgPool`: speed up common-case of size=1 output by ~30x. (17011).
- `nn.EmbeddingBag`: CPU performance increased by ~4x. (19329).
- `Tensor.copy_`: sped up larger tensor copy ~2-3x, small regression in small tensor copy. (18618).
- `torch.nonzero`: is now ~2x faster than NumPy on CPU. (15190)
- Improve caching allocator for Pascal and newer GPUs; 10-20% better memory utilization on Mask-RCNN. (17120).
- Reduction functions: speed up some large Tensor cases by 50-80%. (17428).
- [JIT] Graph fuser: better fusion for backwards graphs in the presence of broadcasting. (#14957)
- [JIT] Graph fuser: `batch_norm` fusion for inference. (#15146)
- [JIT] Graph fuser: `layer_norm` fusion for inference. (#18266)