|
589 | 589 | "plt.bar( tau-1, data[tau-1], color = \"r\", label = \"user behaviour changed\" )\n", |
590 | 590 | "plt.xlabel( \"Time (days)\")\n", |
591 | 591 | "plt.ylabel(\"count of text-msgs received\")\n", |
592 | | - "plt.title(\"Artifical dataset\")\n", |
| 592 | + "plt.title(\"Artificial dataset\")\n", |
593 | 593 | "plt.xlim( 0, 80 );\n", |
594 | 594 | "plt.legend();" |
595 | 595 | ], |
|
617 | 617 | "It is okay that our fictional dataset does not look like our observed dataset: the probability is incredibly small it indeed would. PyMC's engine is designed to find good parameters, $\\lambda_i, \\tau$, that maximize this probability. \n", |
618 | 618 | "\n", |
619 | 619 | "\n", |
620 | | - "The ability to generate artifical dataset is an interesting side effect of our modeling, and we will see that this ability is a very important method of Bayesian inference. We produce a few more datasets below:" |
| 620 | + "The ability to generate artificial dataset is an interesting side effect of our modeling, and we will see that this ability is a very important method of Bayesian inference. We produce a few more datasets below:" |
621 | 621 | ] |
622 | 622 | }, |
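For concreteness, one such artificial dataset can be drawn with plain NumPy; a minimal sketch, assuming Exponential priors on the two $\lambda$'s and a uniform switchpoint $\tau$ as in this chapter (the hyperparameter and 80-day span here are illustrative choices):

```python
import numpy as np

# Sketch: simulate one artificial dataset from the priors.
# alpha and n_days are illustrative choices, not fixed by the text.
n_days = 80
alpha = 1.0 / 20.0  # rate hyperparameter for the Exponential priors

tau = np.random.randint(0, n_days)  # switchpoint, uniform over days
lambda_1, lambda_2 = np.random.exponential(scale=1.0 / alpha, size=2)

# Poisson counts: rate lambda_1 before tau, lambda_2 after.
data = np.r_[np.random.poisson(lambda_1, tau),
             np.random.poisson(lambda_2, n_days - tau)]
print(tau, lambda_1, lambda_2, data[:10])
```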
623 | 623 | { |
|
662 | 662 | "source": [ |
663 | 663 | "#####Example: Bayesian A/B testing\n", |
664 | 664 | "\n", |
665 | | - "A/B testing is an experimental design pattern for determining the difference of effectiveness between two different treatments. For example, a pharmacutical company is interested in the effectiveness of drug A vs drug B. The company will test drug A on some fraction of their trials, and drug B on the other fraction (the fraction is often 1/2, but we will relax this assumption). After performing enough trials, the in-house statisticians compare which drug yielded better results.\n", |
| 665 | + "A/B testing is an experimental design pattern for determining the difference of effectiveness between two different treatments. For example, a pharmaceutical company is interested in the effectiveness of drug A vs drug B. The company will test drug A on some fraction of their trials, and drug B on the other fraction (the fraction is often 1/2, but we will relax this assumption). After performing enough trials, the in-house statisticians compare which drug yielded better results.\n", |
666 | 666 | "\n", |
667 | 667 | "Similarly, front-end web developers are interested in which design of their website yields more sales or some other metric of interest. They will route some fraction of visitors to site A, and the other fraction to site B, and record if the visit yielded a sale of not. The data is recorded (in real-time), and analyzed afterwards. \n", |
668 | 668 | "\n", |
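A minimal sketch of the kind of model this leads to, in PyMC 2 style (the observation arrays below are simulated stand-ins, not real trial data; the true rates and sample sizes are illustrative assumptions):

```python
import numpy as np
import pymc as pm

# Assumed/simulated data: 0/1 sale indicators per visit.
observations_A = np.random.binomial(1, 0.05, size=1500)
observations_B = np.random.binomial(1, 0.04, size=750)

p_A = pm.Uniform("p_A", 0, 1)  # unknown conversion rate of site A
p_B = pm.Uniform("p_B", 0, 1)  # unknown conversion rate of site B

# delta is the quantity of interest: the difference in rates.
@pm.deterministic
def delta(p_A=p_A, p_B=p_B):
    return p_A - p_B

obs_A = pm.Bernoulli("obs_A", p_A, value=observations_A, observed=True)
obs_B = pm.Bernoulli("obs_B", p_B, value=observations_B, observed=True)

mcmc = pm.MCMC([p_A, p_B, delta, obs_A, obs_B])
mcmc.sample(20000, 1000)  # 20000 samples, first 1000 discarded as burn-in
```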
|
983 | 983 | "cell_type": "markdown", |
984 | 984 | "metadata": {}, |
985 | 985 | "source": [ |
986 | | - "If this is probability is too high for comfortable decision-making, we can perform more trials on site B (as site B has less samples to begin with, each additional data point for site B contributes more inferencial \"power\" than each additional data point for site A). \n", |
| 986 | + "If this is probability is too high for comfortable decision-making, we can perform more trials on site B (as site B has less samples to begin with, each additional data point for site B contributes more inferential \"power\" than each additional data point for site A). \n", |
987 | 987 | "\n", |
988 | | - "Try playing with the parameters `true_p_A`, `true_p_B`, `N_A`, and `N_B`, to see what the posterior of $\\text{delta}$ looks like. Notice in all this, the differnence in sample sizes between site A and site B was never mentioned: it naturally fits into Bayesian analysis.\n", |
| 988 | + "Try playing with the parameters `true_p_A`, `true_p_B`, `N_A`, and `N_B`, to see what the posterior of $\\text{delta}$ looks like. Notice in all this, the difference in sample sizes between site A and site B was never mentioned: it naturally fits into Bayesian analysis.\n", |
989 | 989 | "\n", |
990 | | - "I hope the readers feels this style of A/B testing is more natural than hypothesis testing, which the latter has probably confused more than helped practioners. Later in this book, we will see two extensions of this model: the first to help dynamically adjust for bad sites, and the second will improve the speed of this computation by reducing the analysis to a single equation. " |
| 990 | + "I hope the readers feel this style of A/B testing is more natural than hypothesis testing, which the latter has probably confused more than helped practitioners. Later in this book, we will see two extensions of this model: the first to help dynamically adjust for bad sites, and the second will improve the speed of this computation by reducing the analysis to a single equation. " |
991 | 991 | ] |
992 | 992 | }, |
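A sketch of how such a posterior probability is read off the samples, assuming the `mcmc` object from the A/B model sketch above: it is just the fraction of posterior samples satisfying the event.

```python
delta_samples = mcmc.trace("delta")[:]

# Posterior probability that site A is worse than site B:
print("P(delta < 0) =", (delta_samples < 0).mean())
# ...and that site A is better:
print("P(delta > 0) =", (delta_samples > 0).mean())
```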
993 | 993 | { |
|
1006 | 1006 | "\n", |
1007 | 1007 | "$$P( X = k ) = {{N}\\choose{k}} p^k(1-p)^{N-k}$$\n", |
1008 | 1008 | "\n", |
1009 | | - "If $X$ is a binomial random variable with parameters $p$ and $N$, denoted $X \\sim \\text{Bin}(N,p)$, then $X$ is the number of events that occured in the $N$ trials (obviously $0 \\le X \\le N$). The larger $p$ is (while still remaining between 0 and 1), the more events are likely to occur. The expected value of a binomial is equal to $Np$. Below we plot the mass probability distribution for varying parameters. \n" |
| 1009 | + "If $X$ is a binomial random variable with parameters $p$ and $N$, denoted $X \\sim \\text{Bin}(N,p)$, then $X$ is the number of events that occurred in the $N$ trials (obviously $0 \\le X \\le N$). The larger $p$ is (while still remaining between 0 and 1), the more events are likely to occur. The expected value of a binomial is equal to $Np$. Below we plot the mass probability distribution for varying parameters. \n" |
1010 | 1010 | ] |
1011 | 1011 | }, |
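A short sketch of such a plot using `scipy.stats.binom` (the $(N, p)$ pairs are illustrative choices):

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Plot the Binomial probability mass function for a few (N, p) pairs.
binomial = stats.binom
for N, p in [(10, 0.4), (10, 0.9)]:  # illustrative parameter choices
    k = np.arange(N + 1)
    plt.bar(k, binomial.pmf(k, N, p), alpha=0.6,
            label="$N$: %d, $p$: %.1f" % (N, p))
plt.xlabel("$k$")
plt.ylabel("$P(X = k)$")
plt.legend()
```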
1012 | 1012 | { |
|
1637 | 1637 | "source": [ |
1638 | 1638 | "Adding a constant term $\\alpha$ amounts to shifting the curve left or right (hence why it is called a *bias*. )\n", |
1639 | 1639 | "\n", |
1640 | | - "Let's start modeling this in PyMC. The $\\beta, \\alpha$ paramters have no reason to be positive, bounded or relatively large, so they are best modeled by a *Normal random variable*, introduced next." |
| 1640 | + "Let's start modeling this in PyMC. The $\\beta, \\alpha$ parameters have no reason to be positive, bounded or relatively large, so they are best modeled by a *Normal random variable*, introduced next." |
1641 | 1641 | ] |
1642 | 1642 | }, |
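A quick sketch of that shift, assuming the logistic form $(1 + e^{\beta x + \alpha})^{-1}$ discussed above (the $\alpha$ values are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

def logistic(x, beta, alpha=0):
    """Logistic function with slope parameter beta and bias alpha."""
    return 1.0 / (1.0 + np.exp(np.dot(beta, x) + alpha))

x = np.linspace(-4, 4, 100)
# Same beta, different alpha: the curve shifts left or right.
for alpha in [-2, 0, 2]:  # illustrative bias values
    plt.plot(x, logistic(x, 1, alpha), label=r"$\alpha = %d$" % alpha)
plt.legend()
```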
1643 | 1643 | { |
|
1646 | 1646 | "source": [ |
1647 | 1647 | "### Normal distributions\n", |
1648 | 1648 | "\n", |
1649 | | - "A Normal random variable, denoted $X \\sim N(\\mu, 1/\\tau)$, has a distribution with two parameters: the mean, $\\mu$, and the *precision*, $\\tau$. Those familar with the Normal distribution already have probably seen $\\sigma^2$ instead of $\\tau$. They are in fact reciprocals of each other. The change was motivated by simpler mathematical analysis and is an artifact of older Bayesian methods. Just remember: The smaller $\\tau$, the larger the spread of the distribution (i.e. we are more uncertain); the larger $\\tau$, the tighter the distribution (i.e. we are more certain). Regardless, $\\tau$ is always positive. \n", |
| 1649 | + "A Normal random variable, denoted $X \\sim N(\\mu, 1/\\tau)$, has a distribution with two parameters: the mean, $\\mu$, and the *precision*, $\\tau$. Those familiar with the Normal distribution already have probably seen $\\sigma^2$ instead of $\\tau$. They are in fact reciprocals of each other. The change was motivated by simpler mathematical analysis and is an artifact of older Bayesian methods. Just remember: The smaller $\\tau$, the larger the spread of the distribution (i.e. we are more uncertain); the larger $\\tau$, the tighter the distribution (i.e. we are more certain). Regardless, $\\tau$ is always positive. \n", |
1650 | 1650 | "\n", |
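A quick sketch of this spread-precision relationship (the precision values are illustrative; note that `scipy.stats.norm` is parameterized by the standard deviation, i.e. $1/\sqrt{\tau}$):

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

x = np.linspace(-8, 8, 200)
mu = 0
for tau in [0.15, 1.0, 4.0]:  # illustrative precision values
    # scipy parameterizes by standard deviation, i.e. 1/sqrt(tau)
    plt.plot(x, stats.norm.pdf(x, loc=mu, scale=1.0 / np.sqrt(tau)),
             label=r"$\tau = %.2f$" % tau)
plt.legend()
```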
1651 | 1651 | "The probability density function of a $N( \\mu, 1/\\tau)$ random variable is:\n", |
1652 | 1652 | "\n", |
|