Update 2020-08-18-pytorch-1.6-now-includes-stochastic-weight-averaging.md

andresruizfacebook · web-flow · commit 13bddecd4862 · 2020-08-20T07:49:31.000-07:00
diff --git a/_posts/2020-08-18-pytorch-1.6-now-includes-stochastic-weight-averaging.md b/_posts/2020-08-18-pytorch-1.6-now-includes-stochastic-weight-averaging.md
@@ -144,28 +144,12 @@ We expect solutions that are centered in the flat region of the loss to generali
 We release a GitHub [repo](https://github.com/izmailovpavel/torch_swa_examples) with examples using the PyTorch implementation of SWA for training DNNs. For example, these examples can be used to achieve the following results on CIFAR-100:
 
 
-<table width="700" border="1" cellspacing="5" cellpadding="5">
-  <tbody>
-    <tr>
-      <td>&nbsp;</td>
-      <td>VGG-16</td>
-      <td>ResNet-164</td>
-      <td>WideResNet-28x10</td>
-    </tr>
-    <tr>
-      <td>SGD</td>
-      <td>72.8 ± 0.3</td>
-      <td>78.4 ± 0.3</td>
-      <td>81.0 ± 0.3</td>
-    </tr>
-    <tr>
-      <td>SWA</td>
-      <td>74.4 ± 0.3</td>
-      <td>79.8 ± 0.4</td>
-      <td>82.5 ± 0.2</td>
-    </tr>
-  </tbody>
-</table>
+ {:.table.table-striped.table-bordered}
+ |  | VGG-16 | ResNet-164 | WideResNet-28x10 | 
+| ------------- | ------------- |  ------------- |  ------------- |
+| SGD | 72.8 ± 0.3 | 78.4 ± 0.3 | 81.0 ± 0.3 | 
+| SWA | 74.4 ± 0.3 | 79.8 ± 0.4 | 82.5 ± 0.2 |
+
 
 ## Semi-Supervised Learning
 
@@ -180,45 +164,17 @@ In a follow-up [paper](https://arxiv.org/abs/1806.05594) SWA was applied to semi
 
 In another follow-up [paper](http://www.gatsby.ucl.ac.uk/~balaji/udl-camera-ready/UDL-24.pdf) SWA was shown to improve the performance of policy gradient methods A2C and DDPG on several Atari games and MuJoCo environments [3]. This application is also an instance of where SWA is used with Adam. Recall that SWA is not specific to SGD and can benefit essentially any optimizer.
 
-<table width="700" border="1" cellspacing="5" cellpadding="5">
-  <tbody>
-    <tr>
-      <td>Environment Name</td>
-      <td>A2C</td>
-      <td>A2C + SWA</td>
-    </tr>
-    <tr>
-      <td>Breakout</td>
-      <td>522 ± 34</td>
-      <td>703 ± 60</td>
-    </tr>
-    <tr>
-      <td>Qbert</td>
-      <td>18777 ± 778</td>
-      <td>21272 ± 655</td>
-    </tr>
-    <tr>
-      <td>SpaceInvaders</td>
-      <td>7727 ± 1121</td>
-      <td>21676 ± 8897</td>
-    </tr>
-    <tr>
-      <td>Seaquest</td>
-      <td>1779 ± 4</td>
-      <td>1795 ± 4</td>
-    </tr>
-    <tr>
-      <td>BeamRider</td>
-      <td>9999 ± 402</td>
-      <td>11321 ± 1065</td>
-    </tr>
-    <tr>
-      <td>CrazyClimber</td>
-      <td>147030 ± 10239</td>
-      <td>139752 ± 11618</td>
-    </tr>
-  </tbody>
-</table>
+
+{:.table.table-striped.table-bordered}
+ | Environment Name | A2C | A2C + SWA | 
+| ------------- | ------------- |  ------------- |  
+| Breakout | 522 ± 34 | 703 ± 60 |
+| Qbert | 18777 ± 778 | 21272 ± 655 |
+| SpaceInvaders | 7727 ± 1121 | 21676 ± 8897 |
+| Seaquest | 1779 ± 4 | 1795 ± 4 |
+| BeamRider | 9999 ± 402 | 11321 ± 1065 |
+| CrazyClimber | 147030 ± 10239 | 139752 ± 11618 |
+
 
 ## Low Precision Training