
Commit 24317fb

chp1 edits
1 parent a995564 commit 24317fb


1 file changed (+11 −11 lines)


Chapter1_Introduction/Chapter1_Introduction.ipynb

Lines changed: 11 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -107,7 +107,7 @@
107107
"\n",
108108
"Notice in the paragraph above, I assigned the belief (probability) measure to an *individual*, not to Nature. This is very interesting, as this definition leaves room for conflicting beliefs between individuals. Again, this is appropriate for what naturally occurs: different individuals have different beliefs of events occuring, because they possess different *information* about the world.\n",
109109
"\n",
110-
"Think about how we can extend this definition of probability to events that are not *really* random. That is, think about how we can extend this to anything that is fixed, but we are unsure about: \n",
110+
"Think about how we can extend this definition of probability to events that are not *really* random. That is, we can extend this to anything that is fixed, but we are unsure about: \n",
111111
"\n",
112112
"- Your code either has a bug in it or not, but we do not know for certain which is true. Though we have a belief about the presence or absence of a bug. \n",
113113
"\n",
@@ -119,17 +119,17 @@
119119
"\n",
120120
"To align ourselves with traditional probability notation, we denote our belief about event $A$ as $P(A)$.\n",
121121
"\n",
122-
"John Maynard Keynes, a great economist and thinker, said \"When the facts change, I change my mind. What do you do, sir?\" This quote reflects the way a Bayesian updates his or her beliefs after seeing evidence. Even -especially- if the evidence is counter to what was initially believed, it cannot be ignored. We denote our updated belief as $P(A |X )$, interpreted as the probability of $A$ given the evidence $X$. We call it the *posterior probability* so as to contrast the pre-evidence *prior probability*. Consider the posterior probabilities (read: posterior belief) of the above examples, after observing evidence $X$.:\n",
122+
"John Maynard Keynes, a great economist and thinker, said \"When the facts change, I change my mind. What do you do, sir?\" This quote reflects the way a Bayesian updates his or her beliefs after seeing evidence. Even --especially-- if the evidence is counter to what was initially believed, the evidence cannot be ignored. We denote our updated belief as $P(A |X )$, interpreted as the probability of $A$ given the evidence $X$. We call the updated belief the *posterior probability* so as to contrast it with the *prior probability*. For example, consider the posterior probabilities (read: posterior belief) of the above examples, after observing some evidence $X$.:\n",
123123
"\n",
124124
"1\\. $P(A): \\;\\;$ This big, complex code likely has a bug in it. $P(A | X): \\;\\;$ The code passed all $X$ tests; there still might be a bug, but its presence is less likely now.\n",
125125
"\n",
126-
"2\\. $P(A):\\;\\;$ The patient could have any number of diseases. $P(A | X):\\;\\;$ Performing a urine test generated evidence $X$, ruling out some of the possible diseases from consideration.\n",
126+
"2\\. $P(A):\\;\\;$ The patient could have any number of diseases. $P(A | X):\\;\\;$ Performing a blood test generated evidence $X$, ruling out some of the possible diseases from consideration.\n",
127127
"\n",
128128
"3\\. $P(A):\\;\\;$ That girl in your class probably doesn't have a crush on you. $P(A | X): \\;\\;$ She sent you an SMS message about some statistics homework. Maybe she does like me... \n",
129129
"\n",
130130
"It's clear that in each example we did not completely discard the prior belief after seeing new evidence, but we *re-weighted the prior* to incorporate the new evidence (i.e. we put more weight, or confidence, on some beliefs versus others). \n",
131131
"\n",
132-
"By introducing prior uncertainity about events, we are already admitting that any guess we make is potentially very wrong. After observing data, evidence, or other information, and we update our beliefs, our guess becomes *less wrong*. This is the opposite side of the prediction coin, where typically we try to be *more right*.\n"
132+
"By introducing prior uncertainity about events, we are already admitting that any guess we make is potentially very wrong. After observing data, evidence, or other information, and we update our beliefs, our guess becomes *less wrong*. This is the alternative side of the prediction coin, where typically we try to be *more right*.\n"
133133
]
134134
},
135135
{
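The prior-to-posterior update described in the bug example above can be made concrete with a few lines of Python. The following is an illustrative sketch only, not part of the notebook or of this commit, and every number in it (the prior and the two likelihoods) is an assumed value chosen purely for demonstration.

```python
# Bayes' rule applied to the "does my code have a bug?" example.
# All probabilities below are illustrative assumptions, not values from the text.

p_bug = 0.8                 # P(A): prior belief that the big, complex code has a bug
p_pass_given_bug = 0.3      # P(X | A): all tests pass even though a bug is present
p_pass_given_no_bug = 0.99  # P(X | not A): all tests pass when there is no bug

# P(X): total probability of observing the evidence "all tests passed"
p_pass = p_pass_given_bug * p_bug + p_pass_given_no_bug * (1 - p_bug)

# Bayes' rule: P(A | X) = P(X | A) * P(A) / P(X)
p_bug_given_pass = p_pass_given_bug * p_bug / p_pass

print(f"P(A)   = {p_bug:.2f}")             # prior:     0.80
print(f"P(A|X) = {p_bug_given_pass:.2f}")  # posterior: ~0.55, less likely but not ruled out
```

Note that the prior is not discarded after seeing the evidence; it is re-weighted, which is exactly the behaviour described in the cell above.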
@@ -148,23 +148,23 @@
148148
"\n",
149149
"\n",
150150
"\n",
151-
"This is very different from the answer the frequentist function returned. Notice that the Bayesian function accepted an additional argument: *\"Often my code has bugs\"*. This parameter, the *prior*, is that intuition in your head that says \"wait- something looks different with this situation\", or conversely \"yes, this is what I expected\". In our example, the programmer often sees debugging tests fail, but this time we didn't, which signals an alert in our head. By including the prior parameter, we are telling the Bayesian function to include our personal intuition. Technically this parameter in the Bayesian function is optional, but we will see excluding it has its own consequences. \n",
151+
"This is very different from the answer the frequentist function returned. Notice that the Bayesian function accepted an additional argument: *\"Often my code has bugs\"*. This parameter is the *prior*. By including the prior parameter, we are telling the Bayesian function to include our personal belief about the situation. Technically this parameter in the Bayesian function is optional, but we will see excluding it has its own consequences. \n",
152152
"\n",
153153
"\n",
154-
"As we acquire more and more instances of evidence, our prior belief is *washed out* by the new evidence. This is to be expected. For example, if your prior belief is something ridiculous, like \"I expect the sun to explode today\", and each day you are proved wrong, you would hope that any inference would correct you, or at least align your beliefs. \n",
154+
"As we acquire more and more instances of evidence, our prior belief is *washed out* by the new evidence. This is to be expected. For example, if your prior belief is something ridiculous, like \"I expect the sun to explode today\", and each day you are proved wrong, you would hope that any inference would correct you, or at least align your beliefs better. \n",
155155
"\n",
156156
"\n",
157-
"Denote $N$ as the number of instances of evidence we possess. As we gather an *infinite* amount of evidence, say as $N \\rightarrow \\infty$, our Bayesian results align with frequentist results. Hence for large $N$, statistical inference is more or less objective. On the other hand, for small $N$, inference is much more *unstable*: frequentist estimates have more variance and larger confidence intervals. This is where Bayesian analysis excels. By introducing a prior, and returning a distribution (instead of an scalar estimate), we *preserve the uncertainity* to reflect the instability of stasticial inference of a small $N$ dataset. \n",
157+
"Denote $N$ as the number of instances of evidence we possess. As we gather an *infinite* amount of evidence, say as $N \\rightarrow \\infty$, our Bayesian results align with frequentist results. Hence for large $N$, statistical inference is more or less objective. On the other hand, for small $N$, inference is much more *unstable*: frequentist estimates have more variance and larger confidence intervals. This is where Bayesian analysis excels. By introducing a prior, and returning a distribution (instead of a scalar estimate), we *preserve the uncertainity* to reflect the instability of stasticial inference of a small $N$ dataset. \n",
158158
"\n",
159-
"One may think that for large $N$, one can be indifferent between the two techniques, and might lean towards the computational-simpler, frequentist methods. An analysist should consider the following quote by Andrew Gelman (2005)[1], before making such a decision:\n",
159+
"One may think that for large $N$, one can be indifferent between the two techniques, and might lean towards the computational-simpler, frequentist methods. An analysist in this position should consider the following quote by Andrew Gelman (2005)[1], before making such a decision:\n",
160160
"\n",
161161
"> Sample sizes are never large. If $N$ is too small to get a sufficiently-precise estimate, you need to get more data (or make more assumptions). But once $N$ is \"large enough,\" you can start subdividing the data to learn more (for example, in a public opinion poll, once you have a good estimate for the entire country, you can estimate among men and women, northerners and southerners, different age groups, etc etc). $N$ is never enough because if it were \"enough\" you'd already be on to the next problem for which you need more data.\n",
162162
"\n",
163163
"\n",
164164
"#### A note on *Big Data*\n",
165-
"Paradoxically, big data's prediction problems are actually solved by relatively simple models [2]. Thus we can argue that big data's prediction difficulty does not lie in the algorithm used, but instead on the computational difficulties of storage and execution on big data. (One should also consider Gelman's qoute from above and ask \"Do I really have a big data prediction problem?\" )\n",
165+
"Paradoxically, big data's predictive analytic problems are actually solved by relatively simple models [2]. Thus we can argue that big data's prediction difficulty does not lie in the algorithm used, but instead on the computational difficulties of storage and execution on big data. (One should also consider Gelman's qoute from above and ask \"Do I really have a big data prediction problem?\" )\n",
166166
"\n",
167-
"The much more difficult prediction problems involve *medium data* and, especially troublesome, *really small data*. Using a similar argument as Gelman's above, if big data problems are *big enough* to be readily solved, then we should be more interested in the *not-big enough* datasets. "
167+
"The much more difficult analytic problems involve *medium data* and, especially troublesome, *really small data*. Using a similar argument as Gelman's above, if big data problems are *big enough* to be readily solved, then we should be more interested in the *not-quite-big enough* datasets. "
168168
]
169169
},
170170
{
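The "washing out" of the prior as $N$ grows, discussed in the cells above, can also be illustrated with a short simulation. This is a sketch under assumptions of my own choosing (a Beta-Bernoulli conjugate model and a deliberately poor prior); it is not code from the notebook or from this commit.

```python
# Watching a prior get "washed out" as evidence accumulates.
# The Beta-Bernoulli model and all numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7                  # the unknown frequency we are trying to learn

# A badly chosen prior: Beta(20, 2) insists that the frequency is near 0.9.
alpha, beta = 20.0, 2.0

for n in [0, 10, 100, 10_000]:
    successes = int((rng.random(n) < true_p).sum())   # n Bernoulli(true_p) draws
    posterior_mean = (alpha + successes) / (alpha + beta + n)
    print(f"N = {n:6d}   posterior mean = {posterior_mean:.3f}")

# With N = 0 the estimate is the prior mean (about 0.91); as N grows the
# posterior mean moves toward 0.7 no matter what the prior said.
```

For small $N$ the prior dominates and the uncertainty stays large; for large $N$ the data dominate, which is the sense in which Bayesian and frequentist answers agree as $N \rightarrow \infty$.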
@@ -896,4 +896,4 @@
896896
"metadata": {}
897897
}
898898
]
899-
}
899+
}
