Skip to content

Commit bfdd2b9

Browse files
Merge pull request CamDavidsonPilon#149 from williamscott/master
Minor editing changes on Chapter 2
2 parents 9e4c4ef + 76ea3b9 commit bfdd2b9

File tree

1 file changed

+4
-4
lines changed

1 file changed

+4
-4
lines changed

Chapter2_MorePyMC/MorePyMC.ipynb

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -470,7 +470,7 @@
470470
"\n",
471471
"1. We started by thinking \"what is the best random variable to describe this count data?\" A Poisson random variable is a good candidate because it can represent count data. So we model the number of sms's received as sampled from a Poisson distribution.\n",
472472
"\n",
473-
"2. Next, we think, \"Ok, assuming sms's are Poisson-distributed, what do I need for the Poisson distribution?\" Well, the Poisson distribution has a parameters $\\lambda$. \n",
473+
"2. Next, we think, \"Ok, assuming sms's are Poisson-distributed, what do I need for the Poisson distribution?\" Well, the Poisson distribution has a parameter $\\lambda$. \n",
474474
"\n",
475475
"3. Do we know $\\lambda$? No. In fact, we have a suspicion that there are *two* $\\lambda$ values, one for the earlier behaviour and one for the latter behaviour. We don't know when the behaviour switches though, but call the switchpoint $\\tau$.\n",
476476
"\n",
@@ -670,7 +670,7 @@
670670
"\n",
671671
"As this is a hacker book, we'll continue with the web-dev example. For the moment, we will focus on the analysis of site A only. Assume that there is some true $0 \\lt p_A \\lt 1$ probability that users who, upon shown site A, eventually purchase from the site. This is the true effectiveness of site A. Currently, this quantity is unknown to us. \n",
672672
"\n",
673-
"Suppose site A was shown to $N$ people, and $n$ people purchased from the site. One might conclude hastly that $p_A = \\frac{n}{N}$. Unfortunately, the *observed frequency* $\\frac{n}{N}$ does not necessarily equal $p_A$ -- there is a difference between the *observed frequency* and the *true frequency* of an event. The true frequency can be interpreted as the probability of an event occurring. For example, the true frequency of rolling a 1 on a 6-sided die is $\\frac{1}{6}$. Knowing the true frequency of events like:\n",
673+
"Suppose site A was shown to $N$ people, and $n$ people purchased from the site. One might conclude hastily that $p_A = \\frac{n}{N}$. Unfortunately, the *observed frequency* $\\frac{n}{N}$ does not necessarily equal $p_A$ -- there is a difference between the *observed frequency* and the *true frequency* of an event. The true frequency can be interpreted as the probability of an event occurring. For example, the true frequency of rolling a 1 on a 6-sided die is $\\frac{1}{6}$. Knowing the true frequency of events like:\n",
674674
"\n",
675675
"- fraction of users who make purchases, \n",
676676
"- frequency of social attributes, \n",
@@ -854,7 +854,7 @@
854854
"\n",
855855
"### *A* and *B* Together\n",
856856
"\n",
857-
"A similar anaylsis can be done for site B's response data to determine the analgous $p_B$. But what we are really interested in is the *difference* between $p_A$ and $p_B$. Let's infer $p_A$, $p_B$, *and* $\\text{delta} = p_A - p_B$, all at once. We can do this using PyMC's deterministic variables. (We'll assume for this exercise that $p_B = 0.04$, so $\\text{delta} = 0.01$, $N_B = 750$ (signifcantly less than $N_A$) and we will simulate site B's data like we did for site A's data )"
857+
"A similar analysis can be done for site B's response data to determine the analgous $p_B$. But what we are really interested in is the *difference* between $p_A$ and $p_B$. Let's infer $p_A$, $p_B$, *and* $\\text{delta} = p_A - p_B$, all at once. We can do this using PyMC's deterministic variables. (We'll assume for this exercise that $p_B = 0.04$, so $\\text{delta} = 0.01$, $N_B = 750$ (signifcantly less than $N_A$) and we will simulate site B's data like we did for site A's data )"
858858
]
859859
},
860860
{
@@ -1074,7 +1074,7 @@
10741074
"\n",
10751075
"Try playing with the parameters `true_p_A`, `true_p_B`, `N_A`, and `N_B`, to see what the posterior of $\\text{delta}$ looks like. Notice in all this, the difference in sample sizes between site A and site B was never mentioned: it naturally fits into Bayesian analysis.\n",
10761076
"\n",
1077-
"I hope the readers feel this style of A/B testing is more natural than hypothesis testing, which the latter has probably confused more than helped practitioners. Later in this book, we will see two extensions of this model: the first to help dynamically adjust for bad sites, and the second will improve the speed of this computation by reducing the analysis to a single equation. "
1077+
"I hope the readers feel this style of A/B testing is more natural than hypothesis testing, which has probably confused more than helped practitioners. Later in this book, we will see two extensions of this model: the first to help dynamically adjust for bad sites, and the second will improve the speed of this computation by reducing the analysis to a single equation. "
10781078
]
10791079
},
10801080
{

0 commit comments

Comments
 (0)