612 | 612 | " \n", |
613 | 613 | " <div id=\"reveal-div\" style=\"margin:20px auto; width: 300px; display:none\"></div>\n", |
614 | 614 | " \n", |
615 | | - " <div style=\"margin:auto; width = 400px\" >\n", |
| 615 | + " <div style=\"margin:auto; width: 400px\" >\n", |
616 | 616 | "\n", |
617 | | - " <div style=\"float: right; margin: 15px\"> \n", |
| 617 | + " <div style=\"margin: auto;width: 50px\"> \n", |
618 | 618 | " <p style=\"margin: 0px;\"> Rewards </p>\n", |
619 | | - " <p style=\"font-size:30pt; margin: 0px;\" id=\"rewards\"> 0 </p>\n", |
| 619 | + " <p style=\"font-size:30pt; margin: 5px;\" id=\"rewards\"> 0 </p>\n", |
620 | 620 | " </div> \n", |
621 | 621 | "\n", |
622 | | - " <div style=\"float: right; margin: 15px\"> \n", |
| 622 | + " <div style=\"margin: auto; width: 50px\"> \n", |
623 | 623 | " <p style=\"margin: 0px;\"> Pulls </p>\n", |
624 | | - " <p id=\"pulls\" style=\"margin: 0px;font-size:30pt\"> 0 </p>\n", |
| 624 | + " <p id=\"pulls\" style=\"margin: 5px;font-size:30pt\"> 0 </p>\n", |
625 | 625 | " </div> \n", |
626 | 626 | " \n", |
627 | | - " <div style=\"float: right; margin: 15px\" > \n", |
| 627 | + " <div style=\"margin: auto; width: 50px\" > \n", |
628 | 628 | " <p style=\"margin: 0px;\"> Reward/Pull Ratio </p>\n", |
629 | | - " <p id=\"ratio\" style=\"margin: 0px;font-size:30pt\"> 0 </p>\n", |
| 629 | + " <p id=\"ratio\" style=\"margin: 5px;font-size:30pt\"> 0 </p>\n", |
630 | 630 | " </div> \n", |
631 | 631 | " \n", |
632 | 632 | " </div>\n", |
633 | 633 | "\n", |
634 | | - " <p style=\"margin: 20px auto; width:550px\" >\n", |
635 | | - "\n", |
636 | | - "\n", |
637 | | - " Deviations of the observed ratio from the highest probability is a measure of performance. For example, \n", |
638 | | - " in the long run, optimally we can attain the reward/pull ratio of the maximum bandit probability. \n", |
639 | | - " Long-term realized ratios <em>less</em> than the maximum represent inefficiencies. (Realized ratios <em>larger<em> \n", |
640 | | - " than the maximum probability is \n", |
641 | | - " due to randomness, and will eventually fall below). \n", |
642 | | - " </p>\n", |
643 | | - "\n", |
644 | 634 | "<script src=\"https://gist.github.com/CamDavidsonPilon/9a987a5f65f612035554/raw/7ea3996e5bb0a92904ed9cbea6af293ab3949028/d3bandits.js\"></script>\n" |
645 | 635 | ], |
646 | 636 | "output_type": "pyout", |
647 | | - "prompt_number": 3, |
| 637 | + "prompt_number": 104, |
648 | 638 | "text": [ |
649 | | - "<IPython.core.display.HTML at 0x835b4a8>" |
| 639 | + "<IPython.core.display.HTML at 0x1663abe0>" |
650 | 640 | ] |
651 | 641 | } |
652 | 642 | ], |
653 | | - "prompt_number": 3 |
| 643 | + "prompt_number": 104 |
654 | 644 | }, |
655 | 645 | { |
656 | 646 | "cell_type": "markdown", |
657 | 647 | "metadata": {}, |
658 | 648 | "source": [ |
| 649 | + "Deviations of the observed ratio from the highest probability is a measure of performance. For example,in the long run, optimally we can attain the reward/pull ratio of the maximum bandit probability. Long-term realized ratios less than the maximum represent inefficiencies. (Realized ratios larger than the maximum probability is due to randomness, and will eventually fall below). \n", |
| 650 | + "\n", |
659 | 651 | "### A Measure of *Good*\n", |
660 | 652 | "\n", |
661 | | - "We need a metric to calculate how well we are doing. Recall the absolute *best* we can do is to always pick the bandit with the largest probability of winning. Denote this best bandit's probability of $w^*$. Our score should be relative to how well we would have done had we chosen the best bandit from the beginning. This motivates the *total regret* of a strategy, defined:\n", |
| 653 | + "We need a metric to calculate how well we are doing. Recall the absolute *best* we can do is to always pick the bandit with the largest probability of winning. Denote this best bandit's probability of $w_{opt}$. Our score should be relative to how well we would have done had we chosen the best bandit from the beginning. This motivates the *total regret* of a strategy, defined:\n", |
662 | 654 | "\n", |
663 | 655 | "\\begin{align}\n", |
664 | | - "R_T & = \\sum_{i=1}^{T} \\left( w^* - w_{B(i)} \\right)\\\\\\\\\n", |
| 656 | + "R_T & = \\sum_{i=1}^{T} \\left( w_{opt} - w_{B(i)} \\right)\\\\\\\\\n", |
665 | 657 | "& = Tw^* - \\sum_{i=1}^{T} \\; w_{B(i)} \n", |
666 | 658 | "\\end{align}\n", |
667 | 659 | "\n", |
668 | 660 | "\n", |
669 | | - "where $w_{B(i)}$ is the probability of a prize of the chosen bandit in the $i$ round. A total regret of 0 means the strategy is matching the best possible score. This is likely not possible, as initially our algorithm will often make the wrong choice. Ideally, a strategy's total regret should flatten as it learns the best bandit. (Mathematically we achieve $w_{B(i)}=w^*$ often)\n", |
| 661 | + "where $w_{B(i)}$ is the probability of a prize of the chosen bandit in the $i$ round. A total regret of 0 means the strategy is matching the best possible score. This is likely not possible, as initially our algorithm will often make the wrong choice. Ideally, a strategy's total regret should flatten as it learns the best bandit. (Mathematically we achieve $w_{B(i)}=w_{opt}$ often)\n", |
670 | 662 | "\n", |
671 | 663 | "\n", |
672 | 664 | "Below we plot the total regret of this simulation, including the scores of some other strategies:\n", |
|
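The total-regret definition in the new cell is easy to compute directly. A sketch, assuming a hypothetical helper `total_regret` (not part of the notebook) that takes the hidden win probabilities and the sequence of chosen bandit indices:

```python
import numpy as np

def total_regret(hidden_prob, choices):
    """Cumulative total regret R_T = T * w_opt - sum_i w_B(i),
    where choices[i] is the index of the bandit pulled in round i."""
    hidden_prob = np.asarray(hidden_prob, dtype=float)
    w_opt = hidden_prob.max()
    w_chosen = hidden_prob[np.asarray(choices)]
    return np.cumsum(w_opt - w_chosen)

# Hypothetical probabilities and a strategy that explores briefly,
# then settles on the optimal bandit (index 0 here).
hidden_prob = [0.85, 0.60, 0.75]
choices = [1, 2, 0, 2, 0, 0, 0, 0, 0, 0]
print(total_regret(hidden_prob, choices))
# Regret grows only on suboptimal pulls, then flattens -- the behaviour the
# cell describes for a strategy that has learned the best bandit.
```

Plotted over many rounds, a good strategy's regret curve flattens while a poor one's keeps climbing, which is what the regret plots referenced at the end of the cell compare.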