
Commit 8cfe7ab

some formatting issues
1 parent: cc2ab1a

4 files changed: 11 additions and 13 deletions


Chapter1_Introduction/Chapter1_Introduction.ipynb

Lines changed: 4 additions & 4 deletions
@@ -46,7 +46,7 @@
 "metadata": {},
 "source": [
 "\n",
-"###The Bayesian state of mind\n",
+"### The Bayesian state of mind\n",
 "\n",
 "\n",
 "Bayesian inference differs from more traditional statistical inference by preserving *uncertainty*. At first, this sounds like a bad statistical technique. Isn't statistics all about deriving *certainty* from randomness? To reconcile this, we need to start thinking like Bayesians. \n",
@@ -89,7 +89,7 @@
 "metadata": {},
 "source": [
 "\n",
-"###Bayesian Inference in Practice\n",
+"### Bayesian Inference in Practice\n",
 "\n",
 " If frequentist and Bayesian inference were programming functions, with inputs being statistical problems, then the two would be different in what they return to the user. The frequentist inference function would return a number, representing an estimate (typically a summary statistic like the sample average etc.), whereas the Bayesian function would return *probabilities*.\n",
 "\n",
@@ -246,7 +246,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#####Example: Bug, or just sweet, unintended feature?\n",
+"##### Example: Bug, or just sweet, unintended feature?\n",
 "\n",
 "\n",
 "Let $A$ denote the event that our code has **no bugs** in it. Let $X$ denote the event that the code passes all debugging tests. For now, we will leave the prior probability of no bugs as a variable, i.e. $P(A) = p$. \n",
@@ -1027,4 +1027,4 @@
 "metadata": {}
 }
 ]
-}
+}

Chapter2_MorePyMC/MorePyMC.ipynb

Lines changed: 2 additions & 2 deletions
@@ -1573,7 +1573,7 @@
 "metadata": {},
 "source": [
 "\n",
-"#####Example: Challenger Space Shuttle Disaster <span id=\"challenger\"/>\n",
+"##### Example: Challenger Space Shuttle Disaster <span id=\"challenger\"/>\n",
 "\n",
 "On January 28, 1986, the twenty-fifth flight of the U.S. space shuttle program ended in disaster when one of the rocket boosters of the Shuttle Challenger exploded shortly after lift-off, killing all seven crew members. The presidential commission on the accident concluded that it was caused by the failure of an O-ring in a field joint on the rocket booster, and that this failure was due to a faulty design that made the O-ring unacceptably sensitive to a number of factors including outside temperature. Of the previous 24 flights, data were available on failures of O-rings on 23, (one was lost at sea), and these data were discussed on the evening preceding the Challenger launch, but unfortunately only the data corresponding to the 7 flights on which there was a damage incident were considered important and these were thought to show no obvious trend. The data are shown below (see [1]):\n",
 "\n",
@@ -2593,4 +2593,4 @@
 "metadata": {}
 }
 ]
-}
+}

Chapter3_MCMC/IntroMCMC.ipynb

Lines changed: 2 additions & 2 deletions
@@ -316,7 +316,7 @@
 "In the above algorithm's pseudocode, notice that only the current position matters (new positions are investigated only near the current position). We can describe this property as *memorylessness*, i.e. the algorithm does not care *how* it arrived at its current position, only that it is there. \n",
 "\n",
 "### Other approximation solutions to the posterior\n",
-"Besides MCMC, there are other procedures available for determining the posterior distributions. A [Laplace approximation](http://en.wikipedia.org/wiki/Laplace's_method) is an approximation of the posterior using simple functions. A more advanced method is [Variational Bayes](http://en.wikipedia.org/wiki/Variational_Bayesian_methods). All three methods, Laplace Approximations, Variational Bayes, and classical MCMC have their pros and cons. We will only focus on MCMC in this book. That being said, my friend Imri Sofar likes to classify MCMC algorithms as either \"they suck\", or \"they really suck\". He classifies the particular flavour of MCMC used by PyMC as just *sucks* ;)"
+"Besides MCMC, there are other procedures available for determining the posterior distributions. A Laplace approximation is an approximation of the posterior using simple functions. A more advanced method is [Variational Bayes](http://en.wikipedia.org/wiki/Variational_Bayesian_methods). All three methods, Laplace Approximations, Variational Bayes, and classical MCMC have their pros and cons. We will only focus on MCMC in this book. That being said, my friend Imri Sofar likes to classify MCMC algorithms as either \"they suck\", or \"they really suck\". He classifies the particular flavour of MCMC used by PyMC as just *sucks* ;)"
 ]
 },
 {
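The memorylessness described in the hunk above is easiest to see in code: each proposal depends only on the current position, not on the path taken to reach it. A minimal random-walk Metropolis sketch for a one-dimensional target; the standard-normal target, step size, and starting point are illustrative assumptions, and this is a generic sketch rather than PyMC's sampler:

```python
import numpy as np

rng = np.random.default_rng(42)

def log_post(x):
    # Illustrative target: a standard normal log-density (up to a constant).
    return -0.5 * x ** 2

def metropolis(n_steps, x0=0.0, step=1.0):
    x = x0
    trace = []
    for _ in range(n_steps):
        proposal = x + step * rng.normal()      # depends only on the current x
        log_accept = log_post(proposal) - log_post(x)
        if np.log(rng.uniform()) < log_accept:  # accept or reject the move
            x = proposal                        # how we arrived here never matters
        trace.append(x)
    return np.array(trace)

samples = metropolis(10_000)
print(samples.mean(), samples.std())  # roughly 0 and 1 for this target
```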
@@ -1338,4 +1338,4 @@
 "metadata": {}
 }
 ]
-}
+}

Chapter5_LossFunctions/LossFunctions.ipynb

Lines changed: 3 additions & 5 deletions
@@ -33,7 +33,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Loss Functions\n",
+"## Loss Functions\n",
 "\n",
 "We introduce what statisticians and decision theorists call *loss functions*. A loss function is a function of the true parameter, and an estimate of that parameter\n",
 "\n",
@@ -126,7 +126,6 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n",
 "##### Example: Optimizing for the *Showcase* on *The Price is Right*\n",
 "\n",
 "Bless you if you are ever chosen as a contestant on the Price is Right, for here we will show you how to optimize your final price on the *Showcase*. For those who forget the rules:\n",
@@ -389,7 +388,6 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n",
 "### Minimizing our losses\n",
 "\n",
 "It would be wise to choose the estimate that minimizes our expected loss. This corresponds to the minimum point on each of the curves above. More formally, we would like to minimize our expected loss by finding the solution to\n",
@@ -524,7 +522,7 @@
 "\n",
 "Maybe it is clear now why the first-introduced loss functions are used most often in the mathematics of Bayesian inference: no complicated optimizations are necessary. Luckily, we have machines to do the complications for us. \n",
 "\n",
-"### Machine Learning via Bayesian Methods\n",
+"## Machine Learning via Bayesian Methods\n",
 "\n",
 "Whereas frequentist methods strive to achieve the best precision about all possible parameters, machine learning cares to achieve the best *prediction* among all possible parameters. Of course, one way to achieve accurate predictions is to aim for accurate predictions, but often your prediction measure and what frequentist methods are optimizing for are very different. \n",
 "\n",
@@ -548,7 +546,7 @@
 "\n",
 "Suppose the future return of a stock price is very small, say 0.01 (or 1%). We have a model that predicts the stock's future price, and our profit and loss is directly tied to us acting on the prediction. How should we measure the loss associated with the model's predictions, and subsequent future predictions? A squared-error loss is agnostic to the signage and would penalize a prediction of -0.01 equally as bad a prediction of 0.03:\n",
 "\n",
-"$$ \(0.01 - (-0.01) \)^2 = (0.01 - 0.03)^2 = 0.004$$\n",
+"$$ (0.01 - (-0.01))^2 = (0.01 - 0.03)^2 = 0.004$$\n",
 "\n",
 "If you had made a bet based on your model's prediction, you would have earned money with a prediction of 0.03, and lost money with a prediction of -0.01, yet our loss did not capture this. We need a better loss that takes into account the *sign* of the prediction and true value. We design a new loss that is better for financial applications below:"
 ]
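The hunk above motivates a loss that punishes getting the sign wrong far more than being numerically off. A minimal sketch of such an asymmetric loss; the functional form and the penalty constant are illustrative assumptions, not the notebook's final design:

```python
import numpy as np

def sign_aware_loss(true_return, predicted_return, wrong_sign_penalty=100.0):
    """Squared error, scaled up sharply when the predicted sign is wrong."""
    err = (true_return - predicted_return) ** 2
    if np.sign(true_return) != np.sign(predicted_return):
        return wrong_sign_penalty * err
    return err

true_return = 0.01
print(sign_aware_loss(true_return, 0.03))   # small loss: right sign, we made money
print(sign_aware_loss(true_return, -0.01))  # large loss: wrong sign, we lost money
```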
