We expect solutions that are centered in the flat region of the loss to generalize better.
We release a GitHub [repo](https://github.com/izmailovpavel/torch_swa_examples) with examples using the PyTorch implementation of SWA for training DNNs; these examples can be used, for instance, to reproduce the CIFAR-100 results reported there.
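Concretely, the `torch.optim.swa_utils` API that these examples build on follows the pattern below. This is a minimal, self-contained sketch: the toy model, synthetic data, and hyperparameters are illustrative placeholders, not the repo's actual training setup.

```python
import torch
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

# Toy model and synthetic data (the repo uses real CIFAR architectures and loaders).
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64),
    torch.nn.BatchNorm1d(64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)
loader = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(10)]
loss_fn = torch.nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

swa_model = AveragedModel(model)               # maintains the running average of weights
swa_scheduler = SWALR(optimizer, swa_lr=0.05)  # anneals to a constant SWA learning rate
swa_start = 75                                 # epoch at which averaging begins (illustrative)

for epoch in range(100):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss_fn(model(inputs), targets).backward()
        optimizer.step()
    if epoch >= swa_start:
        swa_model.update_parameters(model)     # fold current weights into the average
        swa_scheduler.step()
    else:
        scheduler.step()

# Recompute BatchNorm running statistics for the averaged model before evaluation.
update_bn(loader, swa_model)
```

After training, `swa_model` is evaluated in place of `model`. The `update_bn` pass is needed because the averaged weights are never used in forward passes during training, so the BatchNorm running statistics stored in the model do not match them.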
In a follow-up [paper](https://arxiv.org/abs/1806.05594), SWA was applied to semi-supervised learning.
In another follow-up [paper](http://www.gatsby.ucl.ac.uk/~balaji/udl-camera-ready/UDL-24.pdf), SWA was shown to improve the performance of the policy gradient methods A2C and DDPG on several Atari games and MuJoCo environments [3]. This application is also an instance of SWA being used with Adam; recall that SWA is not specific to SGD and can benefit essentially any optimizer.
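Using SWA with Adam only requires constructing the averaged model around whatever optimizer you already train with. The sketch below shows the optimizer swap; the network, objective, and hyperparameters are placeholders, not the actual A2C/DDPG setup from the paper.

```python
import torch
from torch.optim.swa_utils import AveragedModel, SWALR

model = torch.nn.Linear(32, 2)  # stand-in for a policy/value network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam instead of SGD
swa_model = AveragedModel(model)
swa_scheduler = SWALR(optimizer, swa_lr=1e-4)  # SWALR is optimizer-agnostic

swa_start = 750  # begin averaging late in training (illustrative)
for step in range(1000):
    loss = model(torch.randn(8, 32)).pow(2).mean()  # placeholder objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step >= swa_start:
        swa_model.update_parameters(model)  # average the Adam-trained weights
        swa_scheduler.step()

# swa_model now holds the weight average and is used for evaluation.
```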