diff --git a/Exercise1/Figures/cost_function.png b/Exercise1/Figures/cost_function.png deleted file mode 100755 index b5f175f2..00000000 Binary files a/Exercise1/Figures/cost_function.png and /dev/null differ diff --git a/Exercise1/Figures/dataset1.png b/Exercise1/Figures/dataset1.png deleted file mode 100755 index 8bded89d..00000000 Binary files a/Exercise1/Figures/dataset1.png and /dev/null differ diff --git a/Exercise1/Figures/learning_rate.png b/Exercise1/Figures/learning_rate.png deleted file mode 100755 index 8701bb97..00000000 Binary files a/Exercise1/Figures/learning_rate.png and /dev/null differ diff --git a/Exercise1/Figures/regression_result.png b/Exercise1/Figures/regression_result.png deleted file mode 100755 index 622a6ecb..00000000 Binary files a/Exercise1/Figures/regression_result.png and /dev/null differ diff --git a/Exercise1/exercise1.ipynb b/Exercise1/exercise1.ipynb deleted file mode 100755 index 0d245b5c..00000000 --- a/Exercise1/exercise1.ipynb +++ /dev/null @@ -1,1307 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Programming Exercise 1: Linear Regression\n", - "\n", - "## Introduction\n", - "\n", - "In this exercise, you will implement linear regression and get to see it work on data. Before starting on this programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.\n", - "\n", - "All the information you need for solving this assignment is in this notebook, and all the code you will be implementing will take place within this notebook. The assignment can be promptly submitted to the coursera grader directly from this notebook (code and instructions are included below).\n", - "\n", - "Before we begin with the exercises, we need to import all libraries required for this programming exercise. 
Throughout the course, we will be using [`numpy`](http://www.numpy.org/) for all arrays and matrix operations, and [`matplotlib`](https://matplotlib.org/) for plotting.\n", - "\n", - "You can find instructions on how to install required libraries in the README file in the [github repository](https://github.com/dibgerge/ml-coursera-python-assignments)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# used for manipulating directory paths\n", - "import os\n", - "\n", - "# Scientific and vector computation for python\n", - "import numpy as np\n", - "\n", - "# Plotting library\n", - "from matplotlib import pyplot\n", - "from mpl_toolkits.mplot3d import Axes3D # needed to plot 3-D surfaces\n", - "\n", - "# library written for this exercise providing additional functions for assignment submission, and others\n", - "import utils \n", - "\n", - "# define the submission/grader object for this exercise\n", - "grader = utils.Grader()\n", - "\n", - "# tells matplotlib to embed plots within the notebook\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Submission and Grading\n", - "\n", - "After completing each part of the assignment, be sure to submit your solutions to the grader.\n", - "\n", - "For this programming exercise, you are only required to complete the first part of the exercise to implement linear regression with one variable. The second part of the exercise, which is optional, covers linear regression with multiple variables. 
The following is a breakdown of how each part of this exercise is scored.\n", - "\n", - "**Required Exercises**\n", - "\n", - "| Section | Part |Submitted Function | Points \n", - "|---------|:- |:- | :-: \n", - "| 1 | [Warm up exercise](#section1) | [`warmUpExercise`](#warmUpExercise) | 10 \n", - "| 2 | [Compute cost for one variable](#section2) | [`computeCost`](#computeCost) | 40 \n", - "| 3 | [Gradient descent for one variable](#section3) | [`gradientDescent`](#gradientDescent) | 50 \n", - "| | Total Points | | 100 \n", - "\n", - "**Optional Exercises**\n", - "\n", - "| Section | Part | Submitted Function | Points |\n", - "|:-------:|:- |:-: | :-: |\n", - "| 4 | [Feature normalization](#section4) | [`featureNormalize`](#featureNormalize) | 0 |\n", - "| 5 | [Compute cost for multiple variables](#section5) | [`computeCostMulti`](#computeCostMulti) | 0 |\n", - "| 6 | [Gradient descent for multiple variables](#section5) | [`gradientDescentMulti`](#gradientDescentMulti) |0 |\n", - "| 7 | [Normal Equations](#section7) | [`normalEqn`](#normalEqn) | 0 |\n", - "\n", - "You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "
\n", - "At the end of each section in this notebook, we have a cell which contains code for submitting the solutions thus far to the grader. Execute the cell to see your score up to the current section. For all your work to be submitted properly, you must execute those cells at least once. They must also be re-executed every time the submitted function is updated.\n", - "
\n", - "\n", - "\n", - "## Debugging\n", - "\n", - "Here are some things to keep in mind throughout this exercise:\n", - "\n", - "- Python array indices start from zero, not one (contrary to OCTAVE/MATLAB). \n", - "\n", - "- There is an important distinction between python arrays (called `list` or `tuple`) and `numpy` arrays. You should use `numpy` arrays in all your computations. Vector/matrix operations work only with `numpy` arrays. Python lists do not support vector operations (you need to use for loops).\n", - "\n", - "- If you are seeing many errors at runtime, inspect your matrix operations to make sure that you are adding and multiplying matrices of compatible dimensions. Printing the dimensions of `numpy` arrays using the `shape` property will help you debug.\n", - "\n", - "- By default, `numpy` interprets math operators to be element-wise operators. If you want to do matrix multiplication, you need to use the `dot` function in `numpy`. For example, if `A` and `B` are two `numpy` matrices, then the matrix operation AB is `np.dot(A, B)`. Note that for 2-dimensional matrices or vectors (1-dimensional), this is also equivalent to `A@B` (requires python >= 3.5)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "## 1 Simple python and `numpy` function\n", - "\n", - "The first part of this assignment gives you practice with python and `numpy` syntax and the homework submission process. In the next cell, you will find the outline of a `python` function. 
Modify it to return a 5 x 5 identity matrix by filling in the following code:\n", - "\n", - "```python\n", - "A = np.eye(5)\n", - "```\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def warmUpExercise():\n", - " \"\"\"\n", - " Example function in Python which computes the identity matrix.\n", - " \n", - " Returns\n", - " -------\n", - " A : array_like\n", - " The 5x5 identity matrix.\n", - " \n", - " Instructions\n", - " ------------\n", - " Return the 5x5 identity matrix.\n", - " \"\"\" \n", - " # ======== YOUR CODE HERE ======\n", - " A = [] # modify this line\n", - " \n", - " # ==============================\n", - " return A" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The previous cell only defines the function `warmUpExercise`. We can now run it by executing the following cell to see its output. You should see output similar to the following:\n", - "\n", - "```python\n", - "array([[ 1., 0., 0., 0., 0.],\n", - " [ 0., 1., 0., 0., 0.],\n", - " [ 0., 0., 1., 0., 0.],\n", - " [ 0., 0., 0., 1., 0.],\n", - " [ 0., 0., 0., 0., 1.]])\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "warmUpExercise()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.1 Submitting solutions\n", - "\n", - "After completing a part of the exercise, you can submit your solutions for grading by first adding the function you modified to the grader object, and then sending your function to Coursera for grading. \n", - "\n", - "The grader will prompt you for your login e-mail and submission token. You can obtain a submission token from the web page for the assignment. 
You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "Execute the next cell to grade your solution to the first part of this exercise.\n", - "\n", - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# appends the implemented function in part 1 to the grader object\n", - "grader[1] = warmUpExercise\n", - "\n", - "# send the added functions to coursera grader for getting a grade on this part\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Linear regression with one variable\n", - "\n", - "Now you will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities. You would like to use this data to help you select which city to expand to next. \n", - "\n", - "The file `Data/ex1data1.txt` contains the dataset for our linear regression problem. The first column is the population of a city (in 10,000s) and the second column is the profit of a food truck in that city (in $10,000s). A negative value for profit indicates a loss. \n", - "\n", - "We provide you with the code needed to load this data. 
The dataset is loaded from the data file into the variables `X` and `y`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Read comma separated data\n", - "data = np.loadtxt(os.path.join('Data', 'ex1data1.txt'), delimiter=',')\n", - "X, y = data[:, 0], data[:, 1]\n", - "\n", - "m = y.size # number of training examples" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.1 Plotting the Data\n", - "\n", - "Before starting on any task, it is often useful to understand the data by visualizing it. For this dataset, you can use a scatter plot to visualize the data, since it has only two properties to plot (profit and population). Many other problems that you will encounter in real life are multi-dimensional and cannot be plotted on a 2-d plot. There are many plotting libraries in python (see this [blog post](https://blog.modeanalytics.com/python-data-visualization-libraries/) for a good summary of the most popular ones). \n", - "\n", - "In this course, we will be exclusively using `matplotlib` to do all our plotting. `matplotlib` is one of the most popular scientific plotting libraries in python and has extensive tools and functions to make beautiful plots. `pyplot` is a module within `matplotlib` which provides a simplified interface to `matplotlib`'s most common plotting tasks, mimicking MATLAB's plotting interface.\n", - "\n", - "
\n", - "You might have noticed that we have imported the `pyplot` module at the beginning of this exercise using the command `from matplotlib import pyplot`. This is rather uncommon, and if you look at python code elsewhere or in the `matplotlib` tutorials, you will see that the module is named `plt`. This is done by renaming the module with the import command `import matplotlib.pyplot as plt`. We will not be using the short name of the `pyplot` module in this class's exercises, but you should be aware of this deviation from the norm.\n", - "
\n", - "\n", - "\n", - "In the following part, your first job is to complete the `plotData` function below. Modify the function and fill in the following code:\n", - "\n", - "```python\n", - " pyplot.plot(x, y, 'ro', ms=10, mec='k')\n", - " pyplot.ylabel('Profit in $10,000')\n", - " pyplot.xlabel('Population of City in 10,000s')\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def plotData(x, y):\n", - " \"\"\"\n", - " Plots the data points x and y into a new figure. Plots the data \n", - " points and gives the figure axes labels of population and profit.\n", - " \n", - " Parameters\n", - " ----------\n", - " x : array_like\n", - " Data point values for x-axis.\n", - "\n", - " y : array_like\n", - " Data point values for y-axis. Note x and y should have the same size.\n", - " \n", - " Instructions\n", - " ------------\n", - " Plot the training data into a figure using the \"figure\" and \"plot\"\n", - " functions. Set the axes labels using the \"xlabel\" and \"ylabel\" functions.\n", - " Assume the population and revenue data have been passed in as the x\n", - " and y arguments of this function. \n", - " \n", - " Hint\n", - " ----\n", - " You can use the 'ro' option with plot to have the markers\n", - " appear as red circles. Furthermore, you can make the markers larger by\n", - " using plot(..., 'ro', ms=10), where `ms` refers to marker size. You \n", - " can also set the marker edge color using the `mec` property.\n", - " \"\"\"\n", - " fig = pyplot.figure() # open a new figure\n", - " \n", - " # ====================== YOUR CODE HERE ======================= \n", - " \n", - "\n", - " # =============================================================\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now run the defined function with the loaded data to visualize the data. 
The end result should look like the following figure:\n", - "\n", - "![](Figures/dataset1.png)\n", - "\n", - "Execute the next cell to visualize the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plotData(X, y)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To quickly learn more about the `matplotlib` plot function and what arguments you can provide to it, you can type `?pyplot.plot` in a cell within the jupyter notebook. This opens a separate page showing the documentation for the requested function. You can also search online for plotting documentation. \n", - "\n", - "To set the markers to red circles, we used the option `'or'` within the `plot` function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "?pyplot.plot" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.2 Gradient Descent\n", - "\n", - "In this part, you will fit the linear regression parameters $\\theta$ to our dataset using gradient descent.\n", - "\n", - "#### 2.2.1 Update Equations\n", - "\n", - "The objective of linear regression is to minimize the cost function\n", - "\n", - "$$ J(\\theta) = \\frac{1}{2m} \\sum_{i=1}^m \\left( h_{\\theta}(x^{(i)}) - y^{(i)}\\right)^2$$\n", - "\n", - "where the hypothesis $h_\\theta(x)$ is given by the linear model\n", - "$$ h_\\theta(x) = \\theta^Tx = \\theta_0 + \\theta_1 x_1$$\n", - "\n", - "Recall that the parameters of your model are the $\\theta_j$ values. These are\n", - "the values you will adjust to minimize cost $J(\\theta)$. One way to do this is to\n", - "use the batch gradient descent algorithm. 
In batch gradient descent, each\n", - "iteration performs the update\n", - "\n", - "$$ \\theta_j = \\theta_j - \\alpha \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta(x^{(i)}) - y^{(i)}\\right)x_j^{(i)} \\qquad \\text{simultaneously update } \\theta_j \\text{ for all } j$$\n", - "\n", - "With each step of gradient descent, your parameters $\\theta_j$ come closer to the optimal values that will achieve the lowest cost J($\\theta$).\n", - "\n", - "
\n", - "**Implementation Note:** We store each example as a row in the $X$ matrix in Python `numpy`. To take into account the intercept term ($\\theta_0$), we add an additional first column to $X$ and set it to all ones. This allows us to treat $\\theta_0$ as simply another 'feature'.\n", - "
\n", - "\n", - "\n", - "#### 2.2.2 Implementation\n", - "\n", - "We have already set up the data for linear regression. In the following cell, we add another dimension to our data to accommodate the $\\theta_0$ intercept term. Do NOT execute this cell more than once." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Add a column of ones to X. The numpy function stack joins arrays along a given axis. \n", - "# The first axis (axis=0) refers to rows (training examples) \n", - "# and second axis (axis=1) refers to columns (features).\n", - "X = np.stack([np.ones(m), X], axis=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 2.2.3 Computing the cost $J(\\theta)$\n", - "\n", - "As you perform gradient descent to minimize the cost function $J(\\theta)$, it is helpful to monitor the convergence by computing the cost. In this section, you will implement a function to calculate $J(\\theta)$ so you can check the convergence of your gradient descent implementation. \n", - "\n", - "Your next task is to complete the code for the function `computeCost` which computes $J(\\theta)$. As you are doing this, remember that the variables $X$ and $y$ are not scalar values. $X$ is a matrix whose rows represent the examples from the training set and $y$ is a vector in which each element represents the value at a given row of $X$.\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def computeCost(X, y, theta):\n", - " \"\"\"\n", - " Compute cost for linear regression. Computes the cost of using theta as the\n", - " parameter for linear regression to fit the data points in X and y.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The input dataset of shape (m x n+1), where m is the number of examples,\n", - " and n is the number of features. 
We assume a vector of one's already \n", - " appended to the features so we have n+1 columns.\n", - " \n", - " y : array_like\n", - " The values of the function at each data point. This is a vector of\n", - " shape (m, ).\n", - " \n", - " theta : array_like\n", - " The parameters for the regression function. This is a vector of \n", - " shape (n+1, ).\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The value of the regression cost function.\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost of a particular choice of theta. \n", - " You should set J to the cost.\n", - " \"\"\"\n", - " \n", - " # initialize some useful values\n", - " m = y.size # number of training examples\n", - " \n", - " # You need to return the following variables correctly\n", - " J = 0\n", - " \n", - " # ====================== YOUR CODE HERE =====================\n", - "\n", - " \n", - " # ===========================================================\n", - " return J" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you have completed the function, the next step will run `computeCost` two times using two different initializations of $\\theta$. You will see the cost printed to the screen." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "J = computeCost(X, y, theta=np.array([0.0, 0.0]))\n", - "print('With theta = [0, 0] \\nCost computed = %.2f' % J)\n", - "print('Expected cost value (approximately) 32.07\\n')\n", - "\n", - "# further testing of the cost function\n", - "J = computeCost(X, y, theta=np.array([-1, 2]))\n", - "print('With theta = [-1, 2]\\nCost computed = %.2f' % J)\n", - "print('Expected cost value (approximately) 54.24')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions by executing the following cell.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[2] = computeCost\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 2.2.4 Gradient descent\n", - "\n", - "Next, you will complete a function which implements gradient descent.\n", - "The loop structure has been written for you, and you only need to supply the updates to $\\theta$ within each iteration. \n", - "\n", - "As you program, make sure you understand what you are trying to optimize and what is being updated. Keep in mind that the cost $J(\\theta)$ is parameterized by the vector $\\theta$, not $X$ and $y$. That is, we minimize the value of $J(\\theta)$ by changing the values of the vector $\\theta$, not by changing $X$ or $y$. [Refer to the equations in this notebook](#section2) and to the video lectures if you are uncertain. A good way to verify that gradient descent is working correctly is to look at the value of $J(\\theta)$ and check that it is decreasing with each step. \n", - "\n", - "The starter code for the function `gradientDescent` calls `computeCost` on every iteration and saves the cost to a `python` list. 
Assuming you have implemented gradient descent and `computeCost` correctly, your value of $J(\\theta)$ should never increase, and should converge to a steady value by the end of the algorithm.\n", - "\n", - "
\n", - "**Vectors and matrices in `numpy`** - Important implementation notes\n", - "\n", - "A vector in `numpy` is a one-dimensional array, for example `np.array([1, 2, 3])` is a vector. A matrix in `numpy` is a two-dimensional array, for example `np.array([[1, 2, 3], [4, 5, 6]])`. However, the following is still considered a matrix `np.array([[1, 2, 3]])` since it has two dimensions, even if it has a shape of 1x3 (which looks like a vector).\n", - "\n", - "Given the above, the function `np.dot` which we will use for all matrix/vector multiplication has the following properties:\n", - "- It always performs inner products on vectors. If `x=np.array([1, 2, 3])`, then `np.dot(x, x)` is a scalar.\n", - "- For matrix-vector multiplication, if $X$ is a $m\\times n$ matrix and $y$ is a vector of length $m$, then the operation `np.dot(y, X)` considers $y$ as a $1 \\times m$ vector. On the other hand, if $y$ is a vector of length $n$, then the operation `np.dot(X, y)` considers $y$ as a $n \\times 1$ vector.\n", - "- A vector can be promoted to a matrix using `y[None]` or `y[np.newaxis]`. That is, if `y = np.array([1, 2, 3])` is a vector of size 3, then `y[None, :]` is a matrix of shape $1 \\times 3$. We can use `y[:, None]` to obtain a shape of $3 \\times 1$.\n", - "
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def gradientDescent(X, y, theta, alpha, num_iters):\n", - " \"\"\"\n", - " Performs gradient descent to learn `theta`. Updates theta by taking `num_iters`\n", - " gradient steps with learning rate `alpha`.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The input dataset of shape (m x n+1).\n", - " \n", - " y : array_like\n", - " Value at given features. A vector of shape (m, ).\n", - " \n", - " theta : array_like\n", - " Initial values for the linear regression parameters. \n", - " A vector of shape (n+1, ).\n", - " \n", - " alpha : float\n", - " The learning rate.\n", - " \n", - " num_iters : int\n", - " The number of iterations for gradient descent. \n", - " \n", - " Returns\n", - " -------\n", - " theta : array_like\n", - " The learned linear regression parameters. A vector of shape (n+1, ).\n", - " \n", - " J_history : list\n", - " A python list for the values of the cost function after each iteration.\n", - " \n", - " Instructions\n", - " ------------\n", - " Perform a single gradient step on the parameter vector theta.\n", - "\n", - " While debugging, it can be useful to print out the values of \n", - " the cost function (computeCost) and gradient here.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.shape[0] # number of training examples\n", - " \n", - " # make a copy of theta, to avoid changing the original array, since numpy arrays\n", - " # are passed by reference to functions\n", - " theta = theta.copy()\n", - " \n", - " J_history = [] # Use a python list to save cost in every iteration\n", - " \n", - " for i in range(num_iters):\n", - " # ==================== YOUR CODE HERE =================================\n", - " \n", - "\n", - " # =====================================================================\n", - " \n", - " # save the cost J in every iteration\n", - " 
J_history.append(computeCost(X, y, theta))\n", - " \n", - " return theta, J_history" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After you are finished call the implemented `gradientDescent` function and print the computed $\\theta$. We initialize the $\\theta$ parameters to 0 and the learning rate $\\alpha$ to 0.01. Execute the following cell to check your code." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# initialize fitting parameters\n", - "theta = np.zeros(2)\n", - "\n", - "# some gradient descent settings\n", - "iterations = 1500\n", - "alpha = 0.01\n", - "\n", - "theta, J_history = gradientDescent(X ,y, theta, alpha, iterations)\n", - "print('Theta found by gradient descent: {:.4f}, {:.4f}'.format(*theta))\n", - "print('Expected theta values (approximately): [-3.6303, 1.1664]')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We will use your final parameters to plot the linear fit. The results should look like the following figure.\n", - "\n", - "![](Figures/regression_result.png)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# plot the linear fit\n", - "plotData(X[:, 1], y)\n", - "pyplot.plot(X[:, 1], np.dot(X, theta), '-')\n", - "pyplot.legend(['Training data', 'Linear regression']);" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Your final values for $\\theta$ will also be used to make predictions on profits in areas of 35,000 and 70,000 people.\n", - "\n", - "
\n", - "Note the way that the following lines use matrix multiplication, rather than explicit summation or looping, to calculate the predictions. This is an example of code vectorization in `numpy`.\n", - "
\n", - "\n", - "
\n", - "Note that the first argument to the `numpy` function `dot` is a python list. `numpy` internally converts **valid** python lists to numpy arrays when explicitly provided as arguments to `numpy` functions.\n", - "
\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Predict values for population sizes of 35,000 and 70,000\n", - "predict1 = np.dot([1, 3.5], theta)\n", - "print('For population = 35,000, we predict a profit of {:.2f}\\n'.format(predict1*10000))\n", - "\n", - "predict2 = np.dot([1, 7], theta)\n", - "print('For population = 70,000, we predict a profit of {:.2f}\\n'.format(predict2*10000))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions by executing the next cell.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[3] = gradientDescent\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.4 Visualizing $J(\\theta)$\n", - "\n", - "To understand the cost function $J(\\theta)$ better, you will now plot the cost over a 2-dimensional grid of $\\theta_0$ and $\\theta_1$ values. You will not need to code anything new for this part, but you should understand how the code you have written already is creating these images.\n", - "\n", - "In the next cell, the code is set up to calculate $J(\\theta)$ over a grid of values using the `computeCost` function that you wrote. After executing the following cell, you will have a 2-D array of $J(\\theta)$ values. Then, those values are used to produce surface and contour plots of $J(\\theta)$ using the matplotlib `plot_surface` and `contourf` functions. The plots should look something like the following:\n", - "\n", - "![](Figures/cost_function.png)\n", - "\n", - "The purpose of these graphs is to show you how $J(\\theta)$ varies with changes in $\\theta_0$ and $\\theta_1$. The cost function $J(\\theta)$ is bowl-shaped and has a global minimum. (This is easier to see in the contour plot than in the 3D surface plot). 
This minimum is the optimal point for $\\theta_0$ and $\\theta_1$, and each step of gradient descent moves closer to this point." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# grid over which we will calculate J\n", - "theta0_vals = np.linspace(-10, 10, 100)\n", - "theta1_vals = np.linspace(-1, 4, 100)\n", - "\n", - "# initialize J_vals to a matrix of 0's\n", - "J_vals = np.zeros((theta0_vals.shape[0], theta1_vals.shape[0]))\n", - "\n", - "# Fill out J_vals\n", - "for i, theta0 in enumerate(theta0_vals):\n", - " for j, theta1 in enumerate(theta1_vals):\n", - " J_vals[i, j] = computeCost(X, y, [theta0, theta1])\n", - " \n", - "# Because of the way meshgrids work in the surf command, we need to\n", - "# transpose J_vals before calling surf, or else the axes will be flipped\n", - "J_vals = J_vals.T\n", - "\n", - "# surface plot\n", - "fig = pyplot.figure(figsize=(12, 5))\n", - "ax = fig.add_subplot(121, projection='3d')\n", - "ax.plot_surface(theta0_vals, theta1_vals, J_vals, cmap='viridis')\n", - "pyplot.xlabel('theta0')\n", - "pyplot.ylabel('theta1')\n", - "pyplot.title('Surface')\n", - "\n", - "# contour plot\n", - "# Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000\n", - "ax = pyplot.subplot(122)\n", - "pyplot.contour(theta0_vals, theta1_vals, J_vals, linewidths=2, cmap='viridis', levels=np.logspace(-2, 3, 20))\n", - "pyplot.xlabel('theta0')\n", - "pyplot.ylabel('theta1')\n", - "pyplot.plot(theta[0], theta[1], 'ro', ms=10, lw=2)\n", - "pyplot.title('Contour, showing minimum')\n", - "pass" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Optional Exercises\n", - "\n", - "If you have successfully completed the material above, congratulations! You now understand linear regression and should be able to start using it on your own datasets.\n", - "\n", - "For the rest of this programming exercise, we have included the following optional exercises. 
These exercises will help you gain a deeper understanding of the material, and if you are able to do so, we encourage you to complete them as well. You can still submit your solutions to these exercises to check if your answers are correct.\n", - "\n", - "## 3 Linear regression with multiple variables\n", - "\n", - "In this part, you will implement linear regression with multiple variables to predict the prices of houses. Suppose you are selling your house and you want to know what a good market price would be. One way to do this is to first collect information on recent houses sold and make a model of housing prices.\n", - "\n", - "The file `Data/ex1data2.txt` contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price\n", - "of the house. \n", - "\n", - "\n", - "### 3.1 Feature Normalization\n", - "\n", - "We start by loading and displaying some values from this dataset. By looking at the values, note that house sizes are about 1000 times the number of bedrooms. When features differ by orders of magnitude, first performing feature scaling can make gradient descent converge much more quickly." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load data\n", - "data = np.loadtxt(os.path.join('Data', 'ex1data2.txt'), delimiter=',')\n", - "X = data[:, :2]\n", - "y = data[:, 2]\n", - "m = y.size\n", - "\n", - "# print out some data points\n", - "print('{:>8s}{:>8s}{:>10s}'.format('X[:,0]', 'X[:, 1]', 'y'))\n", - "print('-'*26)\n", - "for i in range(10):\n", - " print('{:8.0f}{:8.0f}{:10.0f}'.format(X[i, 0], X[i, 1], y[i]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Your task here is to complete the code in `featureNormalize` function:\n", - "- Subtract the mean value of each feature from the dataset.\n", - "- After subtracting the mean, additionally scale (divide) the feature values by their respective “standard deviations.”\n", - "\n", - "The standard deviation is a way of measuring how much variation there is in the range of values of a particular feature (most data points will lie within ±2 standard deviations of the mean); this is an alternative to taking the range of values (max-min). In `numpy`, you can use the `std` function to compute the standard deviation. \n", - "\n", - "For example, the quantity `X[:, 0]` contains all the values of $x_1$ (house sizes) in the training set, so `np.std(X[:, 0])` computes the standard deviation of the house sizes.\n", - "At the time that the function `featureNormalize` is called, the extra column of 1’s corresponding to $x_0 = 1$ has not yet been added to $X$. \n", - "\n", - "You will do this for all the features and your code should work with datasets of all sizes (any number of features / examples). Note that each column of the matrix $X$ corresponds to one feature.\n", - "\n", - "
\n", - "**Implementation Note:** When normalizing the features, it is important\n", - "to store the values used for normalization: the mean value and the standard deviation used for the computations. After learning the parameters\n", - "from the model, we often want to predict the prices of houses we have not\n", - "seen before. Given a new x value (house size and number of bedrooms), we must first normalize x using the mean and standard deviation that we had previously computed from the training set.\n", - "
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def featureNormalize(X):\n", - " \"\"\"\n", - " Normalizes the features in X. Returns a normalized version of X where\n", - " the mean value of each feature is 0 and the standard deviation\n", - " is 1. This is often a good preprocessing step to do when working with\n", - " learning algorithms.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The dataset of shape (m x n).\n", - " \n", - " Returns\n", - " -------\n", - " X_norm : array_like\n", - " The normalized dataset of shape (m x n).\n", - " \n", - " Instructions\n", - " ------------\n", - " First, for each feature dimension, compute the mean of the feature\n", - " and subtract it from the dataset, storing the mean value in mu. \n", - " Next, compute the standard deviation of each feature and divide\n", - " each feature by its standard deviation, storing the standard deviation \n", - " in sigma. \n", - " \n", - " Note that X is a matrix where each column is a feature and each row is\n", - " an example. You need to perform the normalization separately for each feature. \n", - " \n", - " Hint\n", - " ----\n", - " You might find the 'np.mean' and 'np.std' functions useful.\n", - " \"\"\"\n", - " # You need to set these values correctly\n", - " X_norm = X.copy()\n", - " mu = np.zeros(X.shape[1])\n", - " sigma = np.zeros(X.shape[1])\n", - "\n", - " # =========================== YOUR CODE HERE =====================\n", - "\n", - " \n", - " # ================================================================\n", - " return X_norm, mu, sigma" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Execute the next cell to run the implemented `featureNormalize` function." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# call featureNormalize on the loaded data\n", - "X_norm, mu, sigma = featureNormalize(X)\n", - "\n", - "print('Computed mean:', mu)\n", - "print('Computed standard deviation:', sigma)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[4] = featureNormalize\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After the `featureNormalize` function is tested, we now add the intercept term to `X_norm`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Add intercept term to X\n", - "X = np.concatenate([np.ones((m, 1)), X_norm], axis=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 3.2 Gradient Descent\n", - "\n", - "Previously, you implemented gradient descent on a univariate regression problem. The only difference now is that there is one more feature in the matrix $X$. The hypothesis function and the batch gradient descent update\n", - "rule remain unchanged. \n", - "\n", - "You should complete the code for the functions `computeCostMulti` and `gradientDescentMulti` to implement the cost function and gradient descent for linear regression with multiple variables. If your code in the previous part (single variable) already supports multiple variables, you can use it here too.\n", - "Make sure your code supports any number of features and is well-vectorized.\n", - "You can use the `shape` property of `numpy` arrays to find out how many features are present in the dataset.\n", - "\n", - "
\n", - "**Implementation Note:** In the multivariate case, the cost function can\n", - "also be written in the following vectorized form:\n", - "\n", - "$$ J(\\theta) = \\frac{1}{2m}(X\\theta - \\vec{y})^T(X\\theta - \\vec{y}) $$\n", - "\n", - "where \n", - "\n", - "$$ X = \\begin{pmatrix}\n", - " - (x^{(1)})^T - \\\\\n", - " - (x^{(2)})^T - \\\\\n", - " \\vdots \\\\\n", - " - (x^{(m)})^T - \\\\ \\\\\n", - " \\end{pmatrix} \\qquad \\vec{y} = \\begin{bmatrix} y^{(1)} \\\\ y^{(2)} \\\\ \\vdots \\\\ y^{(m)} \\\\\\end{bmatrix}$$\n", - "\n", - "The vectorized version is efficient when you are working with numerical computing tools like `numpy`. If you are an expert with matrix operations, you can prove to yourself that the two forms are equivalent.\n", - "
\n", - "\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def computeCostMulti(X, y, theta):\n", - " \"\"\"\n", - " Compute cost for linear regression with multiple variables.\n", - " Computes the cost of using theta as the parameter for linear regression to fit the data points in X and y.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The dataset of shape (m x n+1).\n", - " \n", - " y : array_like\n", - " A vector of shape (m, ) for the values at a given data point.\n", - " \n", - " theta : array_like\n", - " The linear regression parameters. A vector of shape (n+1, )\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The value of the cost function. \n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost of a particular choice of theta. You should set J to the cost.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.shape[0] # number of training examples\n", - " \n", - " # You need to return the following variable correctly\n", - " J = 0\n", - " \n", - " # ======================= YOUR CODE HERE ===========================\n", - "\n", - " \n", - " # ==================================================================\n", - " return J\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[5] = computeCostMulti\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def gradientDescentMulti(X, y, theta, alpha, num_iters):\n", - " \"\"\"\n", - " Performs gradient descent to learn theta.\n", - " Updates theta by taking num_iters gradient steps with learning rate alpha.\n", - 
" \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The dataset of shape (m x n+1).\n", - " \n", - " y : array_like\n", - " A vector of shape (m, ) for the values at a given data point.\n", - " \n", - " theta : array_like\n", - " The linear regression parameters. A vector of shape (n+1, )\n", - " \n", - " alpha : float\n", - " The learning rate for gradient descent. \n", - " \n", - " num_iters : int\n", - " The number of iterations to run gradient descent. \n", - " \n", - " Returns\n", - " -------\n", - " theta : array_like\n", - " The learned linear regression parameters. A vector of shape (n+1, ).\n", - " \n", - " J_history : list\n", - " A python list for the values of the cost function after each iteration.\n", - " \n", - " Instructions\n", - " ------------\n", - " Peform a single gradient step on the parameter vector theta.\n", - "\n", - " While debugging, it can be useful to print out the values of \n", - " the cost function (computeCost) and gradient here.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.shape[0] # number of training examples\n", - " \n", - " # make a copy of theta, which will be updated by gradient descent\n", - " theta = theta.copy()\n", - " \n", - " J_history = []\n", - " \n", - " for i in range(num_iters):\n", - " # ======================= YOUR CODE HERE ==========================\n", - "\n", - " \n", - " # =================================================================\n", - " \n", - " # save the cost J in every iteration\n", - " J_history.append(computeCostMulti(X, y, theta))\n", - " \n", - " return theta, J_history" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[6] = gradientDescentMulti\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 3.2.1 Optional 
(ungraded) exercise: Selecting learning rates\n", - "\n", - "In this part of the exercise, you will get to try out different learning rates for the dataset and find a learning rate that converges quickly. You can change the learning rate by modifying the line in the code below that sets `alpha`.\n", - "\n", - "Use your implementation of the `gradientDescentMulti` function to run gradient descent for about 50 iterations at the chosen learning rate. The function should also return the history of $J(\\theta)$ values in a vector $J$.\n", - "\n", - "After the last iteration, plot the $J$ values against the iteration number.\n", - "\n", - "If you picked a learning rate within a good range, your plot should look similar to the following figure. \n", - "\n", - "![](Figures/learning_rate.png)\n", - "\n", - "If your graph looks very different, especially if your value of $J(\\theta)$ increases or even blows up, adjust your learning rate and try again. We recommend trying values of the learning rate $\\alpha$ on a log-scale, at multiplicative steps of about 3 times the previous value (e.g., 0.3, 0.1, 0.03, 0.01 and so on). You may also want to adjust the number of iterations you are running if that will help you see the overall trend in the curve.\n", - "\n", - "
\n", - "**Implementation Note:** If your learning rate is too large, $J(\\theta)$ can diverge and ‘blow up’, resulting in values which are too large for computer calculations. In these situations, `numpy` will tend to return\n", - "NaNs. NaN stands for ‘not a number’ and is often caused by undefined operations that involve −∞ and +∞.\n", - "
\n", - "\n", - "
\n", - "**MATPLOTLIB tip:** To compare how different learning rates affect convergence, it is helpful to plot $J$ for several learning rates on the same figure. This can be done by making `alpha` a python list, looping over the values within this list, and calling the plot function in every iteration of the loop. It is also useful to have a legend to distinguish the different lines within the plot. Search online for `pyplot.legend` for help on showing legends in `matplotlib`.\n", - "
\n", - "\n", - "Notice the changes in the convergence curves as the learning rate changes. With a small learning rate, you should find that gradient descent takes a very long time to converge to the optimal value. Conversely, with a large learning rate, gradient descent might not converge or might even diverge!\n", - "Using the best learning rate that you found, run gradient descent\n", - "until convergence to find the final values of $\\theta$. Next,\n", - "use this value of $\\theta$ to predict the price of a house with 1650 square feet and\n", - "3 bedrooms. You will use this value later to check your implementation of the normal equations. Don’t forget to normalize your features when you make this prediction!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "\"\"\"\n", - "Instructions\n", - "------------\n", - "We have provided you with the following starter code that runs\n", - "gradient descent with a particular learning rate (alpha). 
\n", - "\n", - "Your task is to first make sure that your functions `computeCostMulti`\n", - "and `gradientDescentMulti` already work with this starter code and\n", - "support multiple variables.\n", - "\n", - "After that, try running gradient descent with different values of\n", - "alpha and see which one gives you the best result.\n", - "\n", - "Finally, you should complete the code at the end to predict the price\n", - "of a 1650 sq-ft, 3 br house.\n", - "\n", - "Hint\n", - "----\n", - "At prediction, make sure you do the same feature normalization.\n", - "\"\"\"\n", - "# Choose some alpha value - change this\n", - "alpha = 0.1\n", - "num_iters = 400\n", - "\n", - "# init theta and run gradient descent\n", - "theta = np.zeros(3)\n", - "theta, J_history = gradientDescentMulti(X, y, theta, alpha, num_iters)\n", - "\n", - "# Plot the convergence graph\n", - "pyplot.plot(np.arange(len(J_history)), J_history, lw=2)\n", - "pyplot.xlabel('Number of iterations')\n", - "pyplot.ylabel('Cost J')\n", - "\n", - "# Display the gradient descent's result\n", - "print('theta computed from gradient descent: {:s}'.format(str(theta)))\n", - "\n", - "# Estimate the price of a 1650 sq-ft, 3 br house\n", - "# ======================= YOUR CODE HERE ===========================\n", - "# Recall that the first column of X is all-ones. 
\n", - "# Thus, it does not need to be normalized.\n", - "\n", - "price = 0 # You should change this\n", - "\n", - "# ===================================================================\n", - "\n", - "print('Predicted price of a 1650 sq-ft, 3 br house (using gradient descent): ${:.0f}'.format(price))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You do not need to submit any solutions for this optional (ungraded) part.*" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 3.3 Normal Equations\n", - "\n", - "In the lecture videos, you learned that the closed-form solution to linear regression is\n", - "\n", - "$$ \\theta = \\left( X^T X\\right)^{-1} X^T\\vec{y}$$\n", - "\n", - "Using this formula does not require any feature scaling, and you will get an exact solution in one calculation: there is no “loop until convergence” like in gradient descent. \n", - "\n", - "First, we will reload the data to ensure that the variables have not been modified. Remember that while you do not need to scale your features, we still need to add a column of 1’s to the $X$ matrix to have an intercept term ($\\theta_0$). The code in the next cell will add the column of 1’s to X for you." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load data\n", - "data = np.loadtxt(os.path.join('Data', 'ex1data2.txt'), delimiter=',')\n", - "X = data[:, :2]\n", - "y = data[:, 2]\n", - "m = y.size\n", - "X = np.concatenate([np.ones((m, 1)), X], axis=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Complete the code for the function `normalEqn` below to use the formula above to calculate $\\theta$. 
\n", - "\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def normalEqn(X, y):\n", - " \"\"\"\n", - " Computes the closed-form solution to linear regression using the normal equations.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The dataset of shape (m x n+1).\n", - " \n", - " y : array_like\n", - " The value at each data point. A vector of shape (m, ).\n", - " \n", - " Returns\n", - " -------\n", - " theta : array_like\n", - " Estimated linear regression parameters. A vector of shape (n+1, ).\n", - " \n", - " Instructions\n", - " ------------\n", - " Complete the code to compute the closed form solution to linear\n", - " regression and put the result in theta.\n", - " \n", - " Hint\n", - " ----\n", - " Look up the function `np.linalg.pinv` for computing the (pseudo-)inverse of a matrix.\n", - " \"\"\"\n", - " theta = np.zeros(X.shape[1])\n", - " \n", - " # ===================== YOUR CODE HERE ============================\n", - "\n", - " \n", - " # =================================================================\n", - " return theta" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[7] = normalEqn\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Optional (ungraded) exercise: Now, once you have found $\\theta$ using this\n", - "method, use it to make a price prediction for a 1650-square-foot house with\n", - "3 bedrooms. You should find that it gives the same predicted price as the value\n", - "you obtained using the model fit with gradient descent (in Section 3.2.1)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Calculate the parameters from the normal equation\n", - "theta = normalEqn(X, y)\n", - "\n", - "# Display normal equation's result\n", - "print('Theta computed from the normal equations: {:s}'.format(str(theta)))\n", - "\n", - "# Estimate the price of a 1650 sq-ft, 3 br house\n", - "# ====================== YOUR CODE HERE ======================\n", - "\n", - "price = 0 # You should change this\n", - "\n", - "# ============================================================\n", - "\n", - "print('Predicted price of a 1650 sq-ft, 3 br house (using normal equations): ${:.0f}'.format(price))" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Exercise1/utils.py b/Exercise1/utils.py deleted file mode 100755 index d0c909d5..00000000 --- a/Exercise1/utils.py +++ /dev/null @@ -1,48 +0,0 @@ -import numpy as np -import sys -sys.path.append('..') - -from submission import SubmissionBase - - -class Grader(SubmissionBase): - X1 = np.column_stack((np.ones(20), np.exp(1) + np.exp(2) * np.linspace(0.1, 2, 20))) - Y1 = X1[:, 1] + np.sin(X1[:, 0]) + np.cos(X1[:, 1]) - X2 = np.column_stack((X1, X1[:, 1]**0.5, X1[:, 1]**0.25)) - Y2 = np.power(Y1, 0.5) + Y1 - - def __init__(self): - part_names = ['Warm up exercise', - 'Computing Cost (for one variable)', - 'Gradient Descent (for one variable)', - 'Feature Normalization', - 'Computing Cost (for multiple variables)', - 'Gradient Descent (for multiple variables)', - 'Normal Equations'] - super().__init__('linear-regression', part_names) - - def 
__iter__(self): - for part_id in range(1, 8): - try: - func = self.functions[part_id] - - # Each part has different expected arguments/different function - if part_id == 1: - res = func() - elif part_id == 2: - res = func(self.X1, self.Y1, np.array([0.5, -0.5])) - elif part_id == 3: - res = func(self.X1, self.Y1, np.array([0.5, -0.5]), 0.01, 10) - elif part_id == 4: - res = func(self.X2[:, 1:4]) - elif part_id == 5: - res = func(self.X2, self.Y2, np.array([0.1, 0.2, 0.3, 0.4])) - elif part_id == 6: - res = func(self.X2, self.Y2, np.array([-0.1, -0.2, -0.3, -0.4]), 0.01, 10) - elif part_id == 7: - res = func(self.X2, self.Y2) - else: - raise KeyError - yield part_id, res - except KeyError: - yield part_id, 0 diff --git a/Exercise2/Figures/decision_boundary1.png b/Exercise2/Figures/decision_boundary1.png deleted file mode 100755 index f1399c5c..00000000 Binary files a/Exercise2/Figures/decision_boundary1.png and /dev/null differ diff --git a/Exercise2/Figures/decision_boundary2.png b/Exercise2/Figures/decision_boundary2.png deleted file mode 100755 index 52c9f755..00000000 Binary files a/Exercise2/Figures/decision_boundary2.png and /dev/null differ diff --git a/Exercise2/Figures/decision_boundary3.png b/Exercise2/Figures/decision_boundary3.png deleted file mode 100755 index 82b6ea1b..00000000 Binary files a/Exercise2/Figures/decision_boundary3.png and /dev/null differ diff --git a/Exercise2/Figures/decision_boundary4.png b/Exercise2/Figures/decision_boundary4.png deleted file mode 100755 index b6c1370b..00000000 Binary files a/Exercise2/Figures/decision_boundary4.png and /dev/null differ diff --git a/Exercise2/exercise2.ipynb b/Exercise2/exercise2.ipynb deleted file mode 100755 index 39983d90..00000000 --- a/Exercise2/exercise2.ipynb +++ /dev/null @@ -1,965 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Programming Exercise 2: Logistic Regression\n", - "\n", - "## Introduction\n", - "\n", - "In this exercise, you will 
implement logistic regression and apply it to two different datasets. Before starting on the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.\n", - "\n", - "All the information you need for solving this assignment is in this notebook, and all the code you will be implementing will take place within this notebook. The assignment can be promptly submitted to the coursera grader directly from this notebook (code and instructions are included below).\n", - "\n", - "Before we begin with the exercises, we need to import all libraries required for this programming exercise. Throughout the course, we will be using [`numpy`](http://www.numpy.org/) for all arrays and matrix operations, and [`matplotlib`](https://matplotlib.org/) for plotting. In this assignment, we will also use [`scipy`](https://docs.scipy.org/doc/scipy/reference/), which contains scientific and numerical computation functions and tools. \n", - "\n", - "You can find instructions on how to install required libraries in the README file in the [github repository](https://github.com/dibgerge/ml-coursera-python-assignments)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# used for manipulating directory paths\n", - "import os\n", - "\n", - "# Scientific and vector computation for python\n", - "import numpy as np\n", - "\n", - "# Plotting library\n", - "from matplotlib import pyplot\n", - "\n", - "# Optimization module in scipy\n", - "from scipy import optimize\n", - "\n", - "# library written for this exercise providing additional functions for assignment submission, and others\n", - "import utils\n", - "\n", - "# define the submission/grader object for this exercise\n", - "grader = utils.Grader()\n", - "\n", - "# tells matplotlib to embed plots within the notebook\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Submission and Grading\n", - "\n", - "\n", - "After completing each part of the assignment, be sure to submit your solutions to the grader. The following is a breakdown of how each part of this exercise is scored.\n", - "\n", - "\n", - "| Section | Part | Submission function | Points \n", - "| :- |:- | :- | :-:\n", - "| 1 | [Sigmoid Function](#section1) | [`sigmoid`](#sigmoid) | 5 \n", - "| 2 | [Compute cost for logistic regression](#section2) | [`costFunction`](#costFunction) | 30 \n", - "| 3 | [Gradient for logistic regression](#section2) | [`costFunction`](#costFunction) | 30 \n", - "| 4 | [Predict Function](#section4) | [`predict`](#predict) | 5 \n", - "| 5 | [Compute cost for regularized LR](#section5) | [`costFunctionReg`](#costFunctionReg) | 15 \n", - "| 6 | [Gradient for regularized LR](#section5) | [`costFunctionReg`](#costFunctionReg) | 15 \n", - "| | Total Points | | 100 \n", - "\n", - "\n", - "\n", - "You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "
\n", - "At the end of each section in this notebook, we have a cell which contains code for submitting the solutions thus far to the grader. Execute the cell to see your score up to the current section. For all your work to be submitted properly, you must execute those cells at least once. They must also be re-executed every time the submitted function is updated.\n", - "
\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1 Logistic Regression\n", - "\n", - "In this part of the exercise, you will build a logistic regression model to predict whether a student gets admitted into a university. Suppose that you are the administrator of a university department and\n", - "you want to determine each applicant’s chance of admission based on their results on two exams. You have historical data from previous applicants that you can use as a training set for logistic regression. For each training example, you have the applicant’s scores on two exams and the admissions\n", - "decision. Your task is to build a classification model that estimates an applicant’s probability of admission based on the scores from those two exams. \n", - "\n", - "The following cell will load the data and corresponding labels:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load data\n", - "# The first two columns contain the exam scores and the third column\n", - "# contains the label.\n", - "data = np.loadtxt(os.path.join('Data', 'ex2data1.txt'), delimiter=',')\n", - "X, y = data[:, 0:2], data[:, 2]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.1 Visualizing the data\n", - "\n", - "Before starting to implement any learning algorithm, it is always good to visualize the data if possible. We display the data on a 2-dimensional plot by calling the function `plotData`. You will now complete the code in `plotData` so that it displays a figure where the axes are the two exam scores, and the positive and negative examples are shown with different markers.\n", - "\n", - "To help you get more familiar with plotting, we have left `plotData` empty so you can try to implement it yourself. However, this is an optional (ungraded) exercise. We also provide our implementation below so you can\n", - "copy it or refer to it. 
If you choose to copy our example, make sure you learn\n", - "what each of its commands is doing by consulting the `matplotlib` and `numpy` documentation.\n", - "\n", - "```python\n", - "# Find Indices of Positive and Negative Examples\n", - "pos = y == 1\n", - "neg = y == 0\n", - "\n", - "# Plot Examples\n", - "pyplot.plot(X[pos, 0], X[pos, 1], 'k*', lw=2, ms=10)\n", - "pyplot.plot(X[neg, 0], X[neg, 1], 'ko', mfc='y', ms=8, mec='k', mew=1)\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def plotData(X, y):\n", - " \"\"\"\n", - " Plots the data points X and y into a new figure. Plots the data \n", - " points with * for the positive examples and o for the negative examples.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " An Mx2 matrix representing the dataset. \n", - " \n", - " y : array_like\n", - " Label values for the dataset. A vector of size (M, ).\n", - " \n", - " Instructions\n", - " ------------\n", - " Plot the positive and negative examples on a 2D plot, using the\n", - " option 'k*' for the positive examples and 'ko' for the negative examples. 
\n", - " \"\"\"\n", - " # Create New Figure\n", - " fig = pyplot.figure()\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", - " # ============================================================" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now, we call the implemented function to display the loaded data:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plotData(X, y)\n", - "# add axes labels\n", - "pyplot.xlabel('Exam 1 score')\n", - "pyplot.ylabel('Exam 2 score')\n", - "pyplot.legend(['Admitted', 'Not admitted'])\n", - "pass" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 1.2 Implementation\n", - "\n", - "#### 1.2.1 Warmup exercise: sigmoid function\n", - "\n", - "Before you start with the actual cost function, recall that the logistic regression hypothesis is defined as:\n", - "\n", - "$$ h_\\theta(x) = g(\\theta^T x)$$\n", - "\n", - "where function $g$ is the sigmoid function. The sigmoid function is defined as: \n", - "\n", - "$$g(z) = \\frac{1}{1+e^{-z}}$$.\n", - "\n", - "Your first step is to implement this function `sigmoid` so it can be\n", - "called by the rest of your program. When you are finished, try testing a few\n", - "values by calling `sigmoid(x)` in a new cell. For large positive values of `x`, the sigmoid should be close to 1, while for large negative values, the sigmoid should be close to 0. Evaluating `sigmoid(0)` should give you exactly 0.5. Your code should also work with vectors and matrices. 
**For a matrix, your function should perform the sigmoid function on every element.**\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def sigmoid(z):\n", - " \"\"\"\n", - " Compute sigmoid function given the input z.\n", - " \n", - " Parameters\n", - " ----------\n", - " z : array_like\n", - " The input to the sigmoid function. This can be a 1-D vector \n", - " or a 2-D matrix. \n", - " \n", - " Returns\n", - " -------\n", - " g : array_like\n", - " The computed sigmoid function. g has the same shape as z, since\n", - " the sigmoid is computed element-wise on z.\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the sigmoid of each value of z (z can be a matrix, vector or scalar).\n", - " \"\"\"\n", - " # convert input to a numpy array\n", - " z = np.array(z)\n", - " \n", - " # You need to return the following variables correctly \n", - " g = np.zeros(z.shape)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", - "\n", - " # =============================================================\n", - " return g" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following cell evaluates the sigmoid function at `z=0`. You should get a value of 0.5. You can also try different values for `z` to experiment with the sigmoid function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Test the implementation of sigmoid function here\n", - "z = 0\n", - "g = sigmoid(z)\n", - "\n", - "print('g(', z, ') = ', g)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After completing a part of the exercise, you can submit your solutions for grading by first adding the function you modified to the submission object, and then sending your function to Coursera for grading. 
\n", - "\n", - "The submission script will prompt you for your login e-mail and submission token. You can obtain a submission token from the web page for the assignment. You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "Execute the following cell to grade your solution to the first part of this exercise.\n", - "\n", - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# appends the implemented function in part 1 to the grader object\n", - "grader[1] = sigmoid\n", - "\n", - "# send the added functions to coursera grader for getting a grade on this part\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 1.2.2 Cost function and gradient\n", - "\n", - "Now you will implement the cost function and gradient for logistic regression. Before proceeding we add the intercept term to X. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Setup the data matrix appropriately, and add ones for the intercept term\n", - "m, n = X.shape\n", - "\n", - "# Add intercept term to X\n", - "X = np.concatenate([np.ones((m, 1)), X], axis=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now, complete the code for the function `costFunction` to return the cost and gradient. 
Recall that the cost function in logistic regression is\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left[ -y^{(i)} \\log\\left(h_\\theta\\left( x^{(i)} \\right) \\right) - \\left( 1 - y^{(i)}\\right) \\log \\left( 1 - h_\\theta\\left( x^{(i)} \\right) \\right) \\right]$$\n", - "\n", - "and the gradient of the cost is a vector of the same length as $\\theta$ where the $j^{th}$\n", - "element (for $j = 0, 1, \\cdots , n$) is defined as follows:\n", - "\n", - "$$ \\frac{\\partial J(\\theta)}{\\partial \\theta_j} = \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta \\left( x^{(i)} \\right) - y^{(i)} \\right) x_j^{(i)} $$\n", - "\n", - "Note that while this gradient looks identical to the linear regression gradient, the formula is actually different because linear and logistic regression have different definitions of $h_\\theta(x)$.\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def costFunction(theta, X, y):\n", - " \"\"\"\n", - " Compute cost and gradient for logistic regression. \n", - " \n", - " Parameters\n", - " ----------\n", - " theta : array_like\n", - " The parameters for logistic regression. This is a vector\n", - " of shape (n+1, ).\n", - " \n", - " X : array_like\n", - " The input dataset of shape (m x n+1) where m is the total number\n", - " of data points and n is the number of features. We assume the \n", - " intercept has already been added to the input.\n", - " \n", - " y : array_like\n", - " Labels for the input. This is a vector of shape (m, ).\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The computed value for the cost function. \n", - " \n", - " grad : array_like\n", - " A vector of shape (n+1, ) which is the gradient of the cost\n", - " function with respect to theta, at the current values of theta.\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost of a particular choice of theta. 
You should set J to \n", - " the cost. Compute the partial derivatives and set grad to the partial\n", - " derivatives of the cost w.r.t. each parameter in theta.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.size # number of training examples\n", - "\n", - " # You need to return the following variables correctly \n", - " J = 0\n", - " grad = np.zeros(theta.shape)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", - " \n", - " # =============================================================\n", - " return J, grad" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done call your `costFunction` using two test cases for $\\theta$ by executing the next cell." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize fitting parameters\n", - "initial_theta = np.zeros(n+1)\n", - "\n", - "cost, grad = costFunction(initial_theta, X, y)\n", - "\n", - "print('Cost at initial theta (zeros): {:.3f}'.format(cost))\n", - "print('Expected cost (approx): 0.693\\n')\n", - "\n", - "print('Gradient at initial theta (zeros):')\n", - "print('\\t[{:.4f}, {:.4f}, {:.4f}]'.format(*grad))\n", - "print('Expected gradients (approx):\\n\\t[-0.1000, -12.0092, -11.2628]\\n')\n", - "\n", - "# Compute and display cost and gradient with non-zero theta\n", - "test_theta = np.array([-24, 0.2, 0.2])\n", - "cost, grad = costFunction(test_theta, X, y)\n", - "\n", - "print('Cost at test theta: {:.3f}'.format(cost))\n", - "print('Expected cost (approx): 0.218\\n')\n", - "\n", - "print('Gradient at test theta:')\n", - "print('\\t[{:.3f}, {:.3f}, {:.3f}]'.format(*grad))\n", - "print('Expected gradients (approx):\\n\\t[0.043, 2.566, 2.647]')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": 
{}, - "outputs": [], - "source": [ - "grader[2] = costFunction\n", - "grader[3] = costFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 1.2.3 Learning parameters using `scipy.optimize`\n", - "\n", - "In the previous assignment, you found the optimal parameters of a linear regression model by implementing gradient descent. You wrote a cost function and calculated its gradient, then took a gradient descent step accordingly. This time, instead of taking gradient descent steps, you will use the [`scipy.optimize` module](https://docs.scipy.org/doc/scipy/reference/optimize.html). SciPy is a numerical computing library for `python`. It provides an optimization module for root finding and minimization. As of `scipy 1.0`, the function `scipy.optimize.minimize` is the method to use for optimization problems(both constrained and unconstrained).\n", - "\n", - "For logistic regression, you want to optimize the cost function $J(\\theta)$ with parameters $\\theta$.\n", - "Concretely, you are going to use `optimize.minimize` to find the best parameters $\\theta$ for the logistic regression cost function, given a fixed dataset (of X and y values). You will pass to `optimize.minimize` the following inputs:\n", - "- `costFunction`: A cost function that, when given the training set and a particular $\\theta$, computes the logistic regression cost and gradient with respect to $\\theta$ for the dataset (X, y). It is important to note that we only pass the name of the function without the parenthesis. This indicates that we are only providing a reference to this function, and not evaluating the result from this function.\n", - "- `initial_theta`: The initial values of the parameters we are trying to optimize.\n", - "- `(X, y)`: These are additional arguments to the cost function.\n", - "- `jac`: Indication if the cost function returns the Jacobian (gradient) along with cost value. 
(True)\n", - "- `method`: Optimization method/algorithm to use\n", - "- `options`: Additional options which might be specific to the chosen optimization method. In the following, we only tell the algorithm the maximum number of iterations before it terminates.\n", - "\n", - "If you have completed the `costFunction` correctly, `optimize.minimize` will converge on the right optimization parameters and return the final values of the cost and $\\theta$ in a class object. Notice that by using `optimize.minimize`, you did not have to write any loops yourself, or set a learning rate like you did for gradient descent. This is all done by `optimize.minimize`: you only needed to provide a function calculating the cost and the gradient.\n", - "\n", - "In the following, we already have code written to call `optimize.minimize` with the correct arguments." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# set options for optimize.minimize\n", - "options = {'maxiter': 400}\n", - "\n", - "# see documentation for scipy's optimize.minimize for a description of\n", - "# the different parameters\n", - "# The function returns an object `OptimizeResult`\n", - "# We use the truncated Newton algorithm for optimization, which serves as \n", - "# an alternative to MATLAB's fminunc\n", - "# See https://stackoverflow.com/questions/18801002/fminunc-alternate-in-numpy\n", - "res = optimize.minimize(costFunction,\n", - " initial_theta,\n", - " (X, y),\n", - " jac=True,\n", - " method='TNC',\n", - " options=options)\n", - "\n", - "# the fun property of `OptimizeResult` object returns\n", - "# the value of costFunction at optimized theta\n", - "cost = res.fun\n", - "\n", - "# the optimized theta is in the x property\n", - "theta = res.x\n", - "\n", - "# Print theta to screen\n", - "print('Cost at theta found by optimize.minimize: {:.3f}'.format(cost))\n", - "print('Expected cost (approx): 0.203\\n')\n", - "\n", - 
"print('theta:')\n", -
"print('\\t[{:.3f}, {:.3f}, {:.3f}]'.format(*theta))\n", - "print('Expected theta (approx):\\n\\t[-25.161, 0.206, 0.201]')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once `optimize.minimize` completes, we want to use the final value for $\\theta$ to visualize the decision boundary on the training data as shown in the figure below. \n", - "\n", - "![](Figures/decision_boundary1.png)\n", - "\n", - "To do so, we have written a function `plotDecisionBoundary` for plotting the decision boundary on top of training data. You do not need to write any code for plotting the decision boundary, but we also encourage you to look at the code in `plotDecisionBoundary` to see how to plot such a boundary using the $\\theta$ values. You can find this function in the `utils.py` file which comes with this assignment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Plot Boundary\n", - "utils.plotDecisionBoundary(plotData, theta, X, y)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 1.2.4 Evaluating logistic regression\n", - "\n", - "After learning the parameters, you can use the model to predict whether a particular student will be admitted. For a student with an Exam 1 score of 45 and an Exam 2 score of 85, you should expect to see an admission\n", - "probability of 0.776. Another way to evaluate the quality of the parameters we have found is to see how well the learned model predicts on our training set. In this part, your task is to complete the code in function `predict`. The predict function will produce “1” or “0” predictions given a dataset and a learned parameter vector $\\theta$. 
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def predict(theta, X):\n", - " \"\"\"\n", - " Predict whether the label is 0 or 1 using learned logistic regression.\n", - " Computes the predictions for X using a threshold at 0.5 \n", - " (i.e., if sigmoid(theta.T*x) >= 0.5, predict 1)\n", - " \n", - " Parameters\n", - " ----------\n", - " theta : array_like\n", - " Parameters for logistic regression. A vector of shape (n+1, ).\n", - " \n", - " X : array_like\n", - " The data to use for computing predictions. The number of rows is the number \n", - " of points to predict, and the number of columns is the number of\n", - " features.\n", - "\n", - " Returns\n", - " -------\n", - " p : array_like\n", - " Predictions, 0 or 1, for each row in X. \n", - " \n", - " Instructions\n", - " ------------\n", - " Complete the following code to make predictions using your learned \n", - " logistic regression parameters. You should set p to a vector of 0's and 1's \n", - " \"\"\"\n", - " m = X.shape[0] # Number of training examples\n", - "\n", - " # You need to return the following variables correctly\n", - " p = np.zeros(m)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", - " \n", - " # ============================================================\n", - " return p" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After you have completed the code in `predict`, we proceed to report the training accuracy of your classifier by computing the percentage of examples it got correct." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Predict probability for a student with score 45 on exam 1 \n", - "# and score 85 on exam 2 \n", - "prob = sigmoid(np.dot([1, 45, 85], theta))\n", - "print('For a student with scores 45 and 85, '\n", - " 'we predict an admission probability of {:.3f}'.format(prob))\n", - "print('Expected value: 0.775 +/- 0.002\\n')\n", - "\n", - "# Compute accuracy on our training set\n", - "p = predict(theta, X)\n", - "print('Train Accuracy: {:.2f} %'.format(np.mean(p == y) * 100))\n", - "print('Expected accuracy (approx): 89.00 %')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[4] = predict\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Regularized logistic regression\n", - "\n", - "In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to ensure it is functioning correctly.\n", - "Suppose you are the product manager of the factory and you have the test results for some microchips on two different tests. From these two tests, you would like to determine whether the microchips should be accepted or rejected. 
To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model.\n", - "\n", - "First, we load the data from a CSV file:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load Data\n", - "# The first two columns contain the X values and the third column\n", - "# contains the label (y).\n", - "data = np.loadtxt(os.path.join('Data', 'ex2data2.txt'), delimiter=',')\n", - "X = data[:, :2]\n", - "y = data[:, 2]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.1 Visualize the data\n", - "\n", - "Similar to the previous parts of this exercise, `plotData` is used to generate a figure, where the axes are the two test scores, and the positive (y = 1, accepted) and negative (y = 0, rejected) examples are shown with\n", - "different markers." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plotData(X, y)\n", - "# Labels and Legend\n", - "pyplot.xlabel('Microchip Test 1')\n", - "pyplot.ylabel('Microchip Test 2')\n", - "\n", - "# Specified in plot order\n", - "pyplot.legend(['y = 1', 'y = 0'], loc='upper right')\n", - "pass" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The above figure shows that our dataset cannot be separated into positive and negative examples by a straight line through the plot. Therefore, a straightforward application of logistic regression will not perform well on this dataset since logistic regression will only be able to find a linear decision boundary.\n", - "\n", - "### 2.2 Feature mapping\n", - "\n", - "One way to fit the data better is to create more features from each data point. 
In the function `mapFeature` defined in the file `utils.py`, we will map the features into all polynomial terms of $x_1$ and $x_2$ up to the sixth power.\n", - "\n", - "$$ \\text{mapFeature}(x) = \\begin{bmatrix} 1 & x_1 & x_2 & x_1^2 & x_1 x_2 & x_2^2 & x_1^3 & \\dots & x_1 x_2^5 & x_2^6 \\end{bmatrix}^T $$\n", - "\n", - "As a result of this mapping, our vector of two features (the scores on two QA tests) has been transformed into a 28-dimensional vector. A logistic regression classifier trained on this higher-dimensional feature vector will have a more complex decision boundary, which will appear nonlinear when drawn in our 2-dimensional plot.\n", - "While the feature mapping allows us to build a more expressive classifier, it is also more susceptible to overfitting. In the next parts of the exercise, you will implement regularized logistic regression to fit the data and also see for yourself how regularization can help combat the overfitting problem.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Note that mapFeature also adds a column of ones for us, so the intercept\n", - "# term is handled\n", - "X = utils.mapFeature(X[:, 0], X[:, 1])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.3 Cost function and gradient\n", - "\n", - "Now you will implement code to compute the cost function and gradient for regularized logistic regression. 
Complete the code for the function `costFunctionReg` below to return the cost and gradient.\n", - "\n", - "Recall that the regularized cost function in logistic regression is\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^m \\left[ -y^{(i)}\\log \\left( h_\\theta \\left(x^{(i)} \\right) \\right) - \\left( 1 - y^{(i)} \\right) \\log \\left( 1 - h_\\theta \\left( x^{(i)} \\right) \\right) \\right] + \\frac{\\lambda}{2m} \\sum_{j=1}^n \\theta_j^2 $$\n", - "\n", - "Note that you should not regularize the parameters $\\theta_0$. The gradient of the cost function is a vector where the $j^{th}$ element is defined as follows:\n", - "\n", - "$$ \\frac{\\partial J(\\theta)}{\\partial \\theta_0} = \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta \\left(x^{(i)}\\right) - y^{(i)} \\right) x_j^{(i)} \\qquad \\text{for } j =0 $$\n", - "\n", - "$$ \\frac{\\partial J(\\theta)}{\\partial \\theta_j} = \\left( \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta \\left(x^{(i)}\\right) - y^{(i)} \\right) x_j^{(i)} \\right) + \\frac{\\lambda}{m}\\theta_j \\qquad \\text{for } j \\ge 1 $$\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def costFunctionReg(theta, X, y, lambda_):\n", - " \"\"\"\n", - " Compute cost and gradient for logistic regression with regularization.\n", - " \n", - " Parameters\n", - " ----------\n", - " theta : array_like\n", - " Logistic regression parameters. A vector with shape (n, ). n is \n", - " the number of features including any intercept. If we have mapped\n", - " our initial features into polynomial features, then n is the total \n", - " number of polynomial features. \n", - " \n", - " X : array_like\n", - " The data set with shape (m x n). m is the number of examples, and\n", - " n is the number of features (after feature mapping).\n", - " \n", - " y : array_like\n", - " The data labels. A vector with shape (m, ).\n", - " \n", - " lambda_ : float\n", - " The regularization parameter. 
\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The computed value for the regularized cost function. \n", - " \n", - " grad : array_like\n", - " A vector of shape (n, ) which is the gradient of the cost\n", - " function with respect to theta, at the current values of theta.\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost `J` of a particular choice of theta.\n", - " Compute the partial derivatives and set `grad` to the partial\n", - " derivatives of the cost w.r.t. each parameter in theta.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.size # number of training examples\n", - "\n", - " # You need to return the following variables correctly \n", - " J = 0\n", - " grad = np.zeros(theta.shape)\n", - "\n", - " # ===================== YOUR CODE HERE ======================\n", - "\n", - " \n", - " \n", - " # =============================================================\n", - " return J, grad" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done with the `costFunctionReg`, we call it below using the initial value of $\\theta$ (initialized to all zeros), and also another test case where $\\theta$ is all ones." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize fitting parameters\n", - "initial_theta = np.zeros(X.shape[1])\n", - "\n", - "# Set regularization parameter lambda to 1\n", - "# DO NOT use `lambda` as a variable name in python\n", - "# because it is a python keyword\n", - "lambda_ = 1\n", - "\n", - "# Compute and display initial cost and gradient for regularized logistic\n", - "# regression\n", - "cost, grad = costFunctionReg(initial_theta, X, y, lambda_)\n", - "\n", - "print('Cost at initial theta (zeros): {:.3f}'.format(cost))\n", - "print('Expected cost (approx) : 0.693\\n')\n", - "\n", - "print('Gradient at initial theta (zeros) - first five values only:')\n", - "print('\\t[{:.4f}, {:.4f}, {:.4f}, {:.4f}, {:.4f}]'.format(*grad[:5]))\n", - "print('Expected gradients (approx) - first five values only:')\n", - "print('\\t[0.0085, 0.0188, 0.0001, 0.0503, 0.0115]\\n')\n", - "\n", - "\n", - "# Compute and display cost and gradient\n", - "# with all-ones theta and lambda = 10\n", - "test_theta = np.ones(X.shape[1])\n", - "cost, grad = costFunctionReg(test_theta, X, y, 10)\n", - "\n", - "print('------------------------------\\n')\n", - "print('Cost at test theta : {:.2f}'.format(cost))\n", - "print('Expected cost (approx): 3.16\\n')\n", - "\n", - "print('Gradient at test theta (ones) - first five values only:')\n", - "print('\\t[{:.4f}, {:.4f}, {:.4f}, {:.4f}, {:.4f}]'.format(*grad[:5]))\n", - "print('Expected gradients (approx) - first five values only:')\n", - "print('\\t[0.3460, 0.1614, 0.1948, 0.2269, 0.0922]')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[5] = costFunctionReg\n", - "grader[6] = costFunctionReg\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - 
"source": [ - "#### 2.3.1 Learning parameters using `scipy.optimize.minimize`\n", - "\n", - "Similar to the previous parts, you will use `optimize.minimize` to learn the optimal parameters $\\theta$. If you have completed the cost and gradient for regularized logistic regression (`costFunctionReg`) correctly, you should be able to step through the next part of to learn the parameters $\\theta$ using `optimize.minimize`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.4 Plotting the decision boundary\n", - "\n", - "To help you visualize the model learned by this classifier, we have provided the function `plotDecisionBoundary` which plots the (non-linear) decision boundary that separates the positive and negative examples. In `plotDecisionBoundary`, we plot the non-linear decision boundary by computing the classifier’s predictions on an evenly spaced grid and then and draw a contour plot where the predictions change from y = 0 to y = 1. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.5 Optional (ungraded) exercises\n", - "\n", - "In this part of the exercise, you will get to try out different regularization parameters for the dataset to understand how regularization prevents overfitting.\n", - "\n", - "Notice the changes in the decision boundary as you vary $\\lambda$. With a small\n", - "$\\lambda$, you should find that the classifier gets almost every training example correct, but draws a very complicated boundary, thus overfitting the data. See the following figures for the decision boundaries you should get for different values of $\\lambda$. \n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " No regularization (overfitting)\n", - " \n", - " Decision boundary with regularization\n", - " \n", - " \n", - " Decision boundary with too much regularization\n", - " \n", - "
\n", - "\n", - "This is not a good decision boundary: for example, it predicts that a point at $x = (-0.25, 1.5)$ is accepted $(y = 1)$, which seems to be an incorrect decision given the training set.\n", - "With a larger $\\lambda$, you should see a plot that shows a simpler decision boundary which still separates the positives and negatives fairly well. However, if $\\lambda$ is set to too high a value, you will not get a good fit and the decision boundary will not follow the data so well, thus underfitting the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize fitting parameters\n", - "initial_theta = np.zeros(X.shape[1])\n", - "\n", - "# Set regularization parameter lambda to 1 (you should vary this)\n", - "lambda_ = 1\n", - "\n", - "# set options for optimize.minimize\n", - "options = {'maxiter': 100}\n", - "\n", - "res = optimize.minimize(costFunctionReg,\n", - " initial_theta,\n", - " (X, y, lambda_),\n", - " jac=True,\n", - " method='TNC',\n", - " options=options)\n", - "\n", - "# the fun property of OptimizeResult object returns\n", - "# the value of costFunctionReg at optimized theta\n", - "cost = res.fun\n", - "\n", - "# the optimized theta is in the x property of the result\n", - "theta = res.x\n", - "\n", - "utils.plotDecisionBoundary(plotData, theta, X, y)\n", - "pyplot.xlabel('Microchip Test 1')\n", - "pyplot.ylabel('Microchip Test 2')\n", - "pyplot.legend(['y = 1', 'y = 0'])\n", - "pyplot.grid(False)\n", - "pyplot.title('lambda = %0.2f' % lambda_)\n", - "\n", - "# Compute accuracy on our training set\n", - "p = predict(theta, X)\n", - "\n", - "print('Train Accuracy: %.1f %%' % (np.mean(p == y) * 100))\n", - "print('Expected accuracy (with lambda = 1): 83.1 % (approx)\\n')\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You do not need to submit any solutions for these optional (ungraded) exercises.*" - ] - } - ], - "metadata": { -
"kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Exercise2/utils.py b/Exercise2/utils.py deleted file mode 100755 index 7c52dbe4..00000000 --- a/Exercise2/utils.py +++ /dev/null @@ -1,147 +0,0 @@ -import sys -import numpy as np -from matplotlib import pyplot - -sys.path.append('..') -from submission import SubmissionBase - - -def mapFeature(X1, X2, degree=6): - """ - Maps the two input features to quadratic features used in the regularization exercise. - - Returns a new feature array with more features, comprising of - X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc.. - - Parameters - ---------- - X1 : array_like - A vector of shape (m, 1), containing one feature for all examples. - - X2 : array_like - A vector of shape (m, 1), containing a second feature for all examples. - Inputs X1, X2 must be the same size. - - degree: int, optional - The polynomial degree. - - Returns - ------- - : array_like - A matrix of of m rows, and columns depend on the degree of polynomial. - """ - if X1.ndim > 0: - out = [np.ones(X1.shape[0])] - else: - out = [np.ones(1)] - - for i in range(1, degree + 1): - for j in range(i + 1): - out.append((X1 ** (i - j)) * (X2 ** j)) - - if X1.ndim > 0: - return np.stack(out, axis=1) - else: - return np.array(out) - - -def plotDecisionBoundary(plotData, theta, X, y): - """ - Plots the data points X and y into a new figure with the decision boundary defined by theta. - Plots the data points with * for the positive examples and o for the negative examples. - - Parameters - ---------- - plotData : func - A function reference for plotting the X, y data. 
- - theta : array_like - Parameters for logistic regression. A vector of shape (n+1, ). - - X : array_like - The input dataset. X is assumed to be a either: - 1) Mx3 matrix, where the first column is an all ones column for the intercept. - 2) MxN, N>3 matrix, where the first column is all ones. - - y : array_like - Vector of data labels of shape (m, ). - """ - # make sure theta is a numpy array - theta = np.array(theta) - - # Plot Data (remember first column in X is the intercept) - plotData(X[:, 1:3], y) - - if X.shape[1] <= 3: - # Only need 2 points to define a line, so choose two endpoints - plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2]) - - # Calculate the decision boundary line - plot_y = (-1. / theta[2]) * (theta[1] * plot_x + theta[0]) - - # Plot, and adjust axes for better viewing - pyplot.plot(plot_x, plot_y) - - # Legend, specific for the exercise - pyplot.legend(['Admitted', 'Not admitted', 'Decision Boundary']) - pyplot.xlim([30, 100]) - pyplot.ylim([30, 100]) - else: - # Here is the grid range - u = np.linspace(-1, 1.5, 50) - v = np.linspace(-1, 1.5, 50) - - z = np.zeros((u.size, v.size)) - # Evaluate z = theta*x over the grid - for i, ui in enumerate(u): - for j, vj in enumerate(v): - z[i, j] = np.dot(mapFeature(ui, vj), theta) - - z = z.T # important to transpose z before calling contour - # print(z) - - # Plot z = 0 - pyplot.contour(u, v, z, levels=[0], linewidths=2, colors='g') - pyplot.contourf(u, v, z, levels=[np.min(z), 0, np.max(z)], cmap='Greens', alpha=0.4) - - -class Grader(SubmissionBase): - X = np.stack([np.ones(20), - np.exp(1) * np.sin(np.arange(1, 21)), - np.exp(0.5) * np.cos(np.arange(1, 21))], axis=1) - - y = (np.sin(X[:, 0] + X[:, 1]) > 0).astype(float) - - def __init__(self): - part_names = ['Sigmoid Function', - 'Logistic Regression Cost', - 'Logistic Regression Gradient', - 'Predict', - 'Regularized Logistic Regression Cost', - 'Regularized Logistic Regression Gradient'] - super().__init__('logistic-regression', 
part_names) - - def __iter__(self): - for part_id in range(1, 7): - try: - func = self.functions[part_id] - - # Each part has different expected arguments/different function - if part_id == 1: - res = func(self.X) - elif part_id == 2: - res = func(np.array([0.25, 0.5, -0.5]), self.X, self.y) - elif part_id == 3: - J, grad = func(np.array([0.25, 0.5, -0.5]), self.X, self.y) - res = grad - elif part_id == 4: - res = func(np.array([0.25, 0.5, -0.5]), self.X) - elif part_id == 5: - res = func(np.array([0.25, 0.5, -0.5]), self.X, self.y, 0.1) - elif part_id == 6: - res = func(np.array([0.25, 0.5, -0.5]), self.X, self.y, 0.1)[1] - else: - raise KeyError - yield part_id, res - except KeyError: - yield part_id, 0 diff --git a/Exercise3/Figures/neuralnetwork.png b/Exercise3/Figures/neuralnetwork.png deleted file mode 100755 index 140fdb01..00000000 Binary files a/Exercise3/Figures/neuralnetwork.png and /dev/null differ diff --git a/Exercise3/exercise3.ipynb b/Exercise3/exercise3.ipynb deleted file mode 100755 index e37be91f..00000000 --- a/Exercise3/exercise3.ipynb +++ /dev/null @@ -1,923 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Programming Exercise 3\n", - "# Multi-class Classification and Neural Networks\n", - "\n", - "## Introduction\n", - "\n", - "\n", - "In this exercise, you will implement one-vs-all logistic regression and neural networks to recognize handwritten digits. Before starting the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics. \n", - "\n", - "All the information you need for solving this assignment is in this notebook, and all the code you will be implementing will take place within this notebook. 
The assignment can be promptly submitted to the coursera grader directly from this notebook (code and instructions are included below).\n", - "\n", - "Before we begin with the exercises, we need to import all libraries required for this programming exercise. Throughout the course, we will be using [`numpy`](http://www.numpy.org/) for all arrays and matrix operations, [`matplotlib`](https://matplotlib.org/) for plotting, and [`scipy`](https://docs.scipy.org/doc/scipy/reference/) for scientific and numerical computation functions and tools. You can find instructions on how to install required libraries in the README file in the [github repository](https://github.com/dibgerge/ml-coursera-python-assignments)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# used for manipulating directory paths\n", - "import os\n", - "\n", - "# Scientific and vector computation for python\n", - "import numpy as np\n", - "\n", - "# Plotting library\n", - "from matplotlib import pyplot\n", - "\n", - "# Optimization module in scipy\n", - "from scipy import optimize\n", - "\n", - "# will be used to load MATLAB mat datafile format\n", - "from scipy.io import loadmat\n", - "\n", - "# library written for this exercise providing additional functions for assignment submission, and others\n", - "import utils\n", - "\n", - "# define the submission/grader object for this exercise\n", - "grader = utils.Grader()\n", - "\n", - "# tells matplotlib to embed plots within the notebook\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Submission and Grading\n", - "\n", - "\n", - "After completing each part of the assignment, be sure to submit your solutions to the grader. 
The following is a breakdown of how each part of this exercise is scored.\n", - "\n", - "\n", - "| Section | Part | Submission function | Points \n", - "| :- |:- | :- | :-: \n", - "| 1 | [Regularized Logistic Regression](#section1) | [`lrCostFunction`](#lrCostFunction) | 30 \n", - "| 2 | [One-vs-all classifier training](#section2) | [`oneVsAll`](#oneVsAll) | 20 \n", - "| 3 | [One-vs-all classifier prediction](#section3) | [`predictOneVsAll`](#predictOneVsAll) | 20 \n", - "| 4 | [Neural Network Prediction Function](#section4) | [`predict`](#predict) | 30\n", - "| | Total Points | | 100 \n", - "\n", - "\n", - "You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "
\n", - "At the end of each section in this notebook, we have a cell which contains code for submitting the solutions thus far to the grader. Execute the cell to see your score up to the current section. For all your work to be submitted properly, you must execute those cells at least once. They must also be re-executed every time the submitted function is updated.\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1 Multi-class Classification\n", - "\n", - "For this exercise, you will use logistic regression and neural networks to recognize handwritten digits (from 0 to 9). Automated handwritten digit recognition is widely used today - from recognizing zip codes (postal codes)\n", - "on mail envelopes to recognizing amounts written on bank checks. This exercise will show you how the methods you have learned can be used for this classification task.\n", - "\n", - "In the first part of the exercise, you will extend your previous implementation of logistic regression and apply it to one-vs-all classification.\n", - "\n", - "### 1.1 Dataset\n", - "\n", - "You are given a data set in `ex3data1.mat` that contains 5000 training examples of handwritten digits (This is a subset of the [MNIST](http://yann.lecun.com/exdb/mnist) handwritten digit dataset). The `.mat` format means that that the data has been saved in a native Octave/MATLAB matrix format, instead of a text (ASCII) format like a csv-file. We use the `.mat` format here because this is the dataset provided in the MATLAB version of this assignment. Fortunately, python provides mechanisms to load MATLAB native format using the `loadmat` function within the `scipy.io` module. This function returns a python dictionary with keys containing the variable names within the `.mat` file. \n", - "\n", - "There are 5000 training examples in `ex3data1.mat`, where each training example is a 20 pixel by 20 pixel grayscale image of the digit. Each pixel is represented by a floating point number indicating the grayscale intensity at that location. The 20 by 20 grid of pixels is “unrolled” into a 400-dimensional vector. Each of these training examples becomes a single row in our data matrix `X`. 
This gives us a 5000 by 400 matrix `X` where every row is a training example for a handwritten digit image.\n", - "\n", - "$$ X = \\begin{bmatrix} - \\: (x^{(1)})^T \\: - \\\\ -\\: (x^{(2)})^T \\:- \\\\ \\vdots \\\\ - \\: (x^{(m)})^T \\:- \\end{bmatrix} $$\n", - "\n", - "The second part of the training set is a 5000-dimensional vector `y` that contains labels for the training set. \n", - "We start the exercise by first loading the dataset. Execute the cell below, you do not need to write any code here." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# 20x20 Input Images of Digits\n", - "input_layer_size = 400\n", - "\n", - "# 10 labels, from 1 to 10 (note that we have mapped \"0\" to label 10)\n", - "num_labels = 10\n", - "\n", - "# training data stored in arrays X, y\n", - "data = loadmat(os.path.join('Data', 'ex3data1.mat'))\n", - "X, y = data['X'], data['y'].ravel()\n", - "\n", - "# set the zero digit to 0, rather than its mapped 10 in this dataset\n", - "# This is an artifact due to the fact that this dataset was used in \n", - "# MATLAB where there is no index 0\n", - "y[y == 10] = 0\n", - "\n", - "m = y.size" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.2 Visualizing the data\n", - "\n", - "You will begin by visualizing a subset of the training set. In the following cell, the code randomly selects selects 100 rows from `X` and passes those rows to the `displayData` function. This function maps each row to a 20 pixel by 20 pixel grayscale image and displays the images together. We have provided the `displayData` function in the file `utils.py`. You are encouraged to examine the code to see how it works. Run the following cell to visualize the data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Randomly select 100 data points to display\n", - "rand_indices = np.random.choice(m, 100, replace=False)\n", - "sel = X[rand_indices, :]\n", - "\n", - "utils.displayData(sel)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "collapsed": true - }, - "source": [ - "### 1.3 Vectorizing Logistic Regression\n", - "\n", - "You will be using multiple one-vs-all logistic regression models to build a multi-class classifier. Since there are 10 classes, you will need to train 10 separate logistic regression classifiers. To make this training efficient, it is important to ensure that your code is well vectorized. In this section, you will implement a vectorized version of logistic regression that does not employ any `for` loops. You can use your code in the previous exercise as a starting point for this exercise. \n", - "\n", - "To test your vectorized logistic regression, we will use custom data as defined in the following cell." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# test values for the parameters theta\n", - "theta_t = np.array([-2, -1, 1, 2], dtype=float)\n", - "\n", - "# test values for the inputs\n", - "X_t = np.concatenate([np.ones((5, 1)), np.arange(1, 16).reshape(5, 3, order='F')/10.0], axis=1)\n", - "\n", - "# test values for the labels\n", - "y_t = np.array([1, 0, 1, 0, 1])\n", - "\n", - "# test value for the regularization parameter\n", - "lambda_t = 3" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 1.3.1 Vectorizing the cost function \n", - "\n", - "We will begin by writing a vectorized version of the cost function. 
Recall that in (unregularized) logistic regression, the cost function is\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^m \\left[ -y^{(i)} \\log \\left( h_\\theta\\left( x^{(i)} \\right) \\right) - \\left(1 - y^{(i)} \\right) \\log \\left(1 - h_\\theta \\left( x^{(i)} \\right) \\right) \\right] $$\n", - "\n", - "To compute each element in the summation, we have to compute $h_\\theta(x^{(i)})$ for every example $i$, where $h_\\theta(x^{(i)}) = g(\\theta^T x^{(i)})$ and $g(z) = \\frac{1}{1+e^{-z}}$ is the sigmoid function. It turns out that we can compute this quickly for all our examples by using matrix multiplication. Let us define $X$ and $\\theta$ as\n", - "\n", - "$$ X = \\begin{bmatrix} - \\left( x^{(1)} \\right)^T - \\\\ - \\left( x^{(2)} \\right)^T - \\\\ \\vdots \\\\ - \\left( x^{(m)} \\right)^T - \\end{bmatrix} \\qquad \\text{and} \\qquad \\theta = \\begin{bmatrix} \\theta_0 \\\\ \\theta_1 \\\\ \\vdots \\\\ \\theta_n \\end{bmatrix} $$\n", - "\n", - "Then, by computing the matrix product $X\\theta$, we have: \n", - "\n", - "$$ X\\theta = \\begin{bmatrix} - \\left( x^{(1)} \\right)^T\\theta - \\\\ - \\left( x^{(2)} \\right)^T\\theta - \\\\ \\vdots \\\\ - \\left( x^{(m)} \\right)^T\\theta - \\end{bmatrix} = \\begin{bmatrix} - \\theta^T x^{(1)} - \\\\ - \\theta^T x^{(2)} - \\\\ \\vdots \\\\ - \\theta^T x^{(m)} - \\end{bmatrix} $$\n", - "\n", - "In the last equality, we used the fact that $a^Tb = b^Ta$ if $a$ and $b$ are vectors. 
This allows us to compute the products $\\theta^T x^{(i)}$ for all our examples $i$ in one line of code.\n", - "\n", - "#### 1.3.2 Vectorizing the gradient\n", - "\n", - "Recall that the gradient of the (unregularized) logistic regression cost is a vector where the $j^{th}$ element is defined as\n", - "\n", - "$$ \\frac{\\partial J }{\\partial \\theta_j} = \\frac{1}{m} \\sum_{i=1}^m \\left( \\left( h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x_j^{(i)} \\right) $$\n", - "\n", - "To vectorize this operation over the dataset, we start by writing out all the partial derivatives explicitly for all $\\theta_j$,\n", - "\n", - "$$\n", - "\\begin{align*}\n", - "\\begin{bmatrix} \n", - "\\frac{\\partial J}{\\partial \\theta_0} \\\\\n", - "\\frac{\\partial J}{\\partial \\theta_1} \\\\\n", - "\\frac{\\partial J}{\\partial \\theta_2} \\\\\n", - "\\vdots \\\\\n", - "\\frac{\\partial J}{\\partial \\theta_n}\n", - "\\end{bmatrix} = &\n", - "\\frac{1}{m} \\begin{bmatrix}\n", - "\\sum_{i=1}^m \\left( \\left(h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x_0^{(i)}\\right) \\\\\n", - "\\sum_{i=1}^m \\left( \\left(h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x_1^{(i)}\\right) \\\\\n", - "\\sum_{i=1}^m \\left( \\left(h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x_2^{(i)}\\right) \\\\\n", - "\\vdots \\\\\n", - "\\sum_{i=1}^m \\left( \\left(h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x_n^{(i)}\\right) \\\\\n", - "\\end{bmatrix} \\\\\n", - "= & \\frac{1}{m} \\sum_{i=1}^m \\left( \\left(h_\\theta\\left(x^{(i)}\\right) - y^{(i)} \\right)x^{(i)}\\right) \\\\\n", - "= & \\frac{1}{m} X^T \\left( h_\\theta(x) - y\\right)\n", - "\\end{align*}\n", - "$$\n", - "\n", - "where\n", - "\n", - "$$ h_\\theta(x) - y = \n", - "\\begin{bmatrix}\n", - "h_\\theta\\left(x^{(1)}\\right) - y^{(1)} \\\\\n", - "h_\\theta\\left(x^{(2)}\\right) - y^{(2)} \\\\\n", - "\\vdots \\\\\n", - "h_\\theta\\left(x^{(m)}\\right) - y^{(m)} \n", - "\\end{bmatrix} $$\n", - "\n", - "Note that $x^{(i)}$ is a 
vector, while $h_\\theta\\left(x^{(i)}\\right) - y^{(i)}$ is a scalar (single number).\n", - "To understand the last step of the derivation, let $\\beta_i = (h_\\theta\\left(x^{(i)}\\right) - y^{(i)})$ and\n", - "observe that:\n", - "\n", - "$$ \\sum_i \\beta_ix^{(i)} = \\begin{bmatrix} \n", - "| & | & & | \\\\\n", - "x^{(1)} & x^{(2)} & \\cdots & x^{(m)} \\\\\n", - "| & | & & | \n", - "\\end{bmatrix}\n", - "\\begin{bmatrix}\n", - "\\beta_1 \\\\\n", - "\\beta_2 \\\\\n", - "\\vdots \\\\\n", - "\\beta_m\n", - "\\end{bmatrix} = X^T \\beta\n", - "$$\n", - "\n", - "where the values $\\beta_i = \\left( h_\\theta(x^{(i)}) - y^{(i)} \\right)$.\n", - "\n", - "The expression above allows us to compute all the partial derivatives\n", - "without any loops. If you are comfortable with linear algebra, we encourage you to work through the matrix multiplications above to convince yourself that the vectorized version does the same computations. \n", - "\n", - "Your job is to write the unregularized cost function `lrCostFunction` which returns both the cost function $J(\\theta)$ and its gradient $\\frac{\\partial J}{\\partial \\theta}$. Your implementation should use the strategy we presented above to calculate $\\theta^T x^{(i)}$. You should also use a vectorized approach for the rest of the cost function. A fully vectorized version of `lrCostFunction` should not contain any loops.\n", - "\n", - "
\n", - "**Debugging Tip:** Vectorizing code can sometimes be tricky. One common strategy for debugging is to print out the sizes of the matrices you are working with using the `shape` property of `numpy` arrays. For example, given a data matrix $X$ of size $100 \\times 20$ (100 examples, 20 features) and $\\theta$, a vector with size $20$, you can observe that `np.dot(X, theta)` is a valid multiplication operation, while `np.dot(theta, X)` is not. Furthermore, if you have a non-vectorized version of your code, you can compare the output of your vectorized code and non-vectorized code to make sure that they produce the same outputs.\n", - "
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def lrCostFunction(theta, X, y, lambda_):\n", - " \"\"\"\n", - " Computes the cost of using theta as the parameter for regularized\n", - " logistic regression and the gradient of the cost w.r.t. to the parameters.\n", - " \n", - " Parameters\n", - " ----------\n", - " theta : array_like\n", - " Logistic regression parameters. A vector with shape (n, ). n is \n", - " the number of features including any intercept. \n", - " \n", - " X : array_like\n", - " The data set with shape (m x n). m is the number of examples, and\n", - " n is the number of features (including intercept).\n", - " \n", - " y : array_like\n", - " The data labels. A vector with shape (m, ).\n", - " \n", - " lambda_ : float\n", - " The regularization parameter. \n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The computed value for the regularized cost function. \n", - " \n", - " grad : array_like\n", - " A vector of shape (n, ) which is the gradient of the cost\n", - " function with respect to theta, at the current values of theta.\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost of a particular choice of theta. You should set J to the cost.\n", - " Compute the partial derivatives and set grad to the partial\n", - " derivatives of the cost w.r.t. each parameter in theta\n", - " \n", - " Hint 1\n", - " ------\n", - " The computation of the cost function and gradients can be efficiently\n", - " vectorized. For example, consider the computation\n", - " \n", - " sigmoid(X * theta)\n", - " \n", - " Each row of the resulting matrix will contain the value of the prediction\n", - " for that example. You can make use of this to vectorize the cost function\n", - " and gradient computations. 
\n", - " \n", - " Hint 2\n", - " ------\n", - " When computing the gradient of the regularized cost function, there are\n", - " many possible vectorized solutions, but one solution looks like:\n", - " \n", - " grad = (unregularized gradient for logistic regression)\n", - " temp = theta.copy() # copy, so the caller's theta is not modified\n", - " temp[0] = 0 # because we don't add anything for j = 0\n", - " grad = grad + YOUR_CODE_HERE (using the temp variable)\n", - " \n", - " Hint 3\n", - " ------\n", - " We have provided the implementation of the sigmoid function within \n", - " the file `utils.py`. At the start of the notebook, we imported this file\n", - " as a module. Thus to access the sigmoid function within that file, you can\n", - " do the following: `utils.sigmoid(z)`.\n", - " \n", - " \"\"\"\n", - " #Initialize some useful values\n", - " m = y.size\n", - " \n", - " # convert labels to ints if their type is bool\n", - " if y.dtype == bool:\n", - " y = y.astype(int)\n", - " \n", - " # You need to return the following variables correctly\n", - " J = 0\n", - " grad = np.zeros(theta.shape)\n", - " \n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - " \n", - " # =============================================================\n", - " return J, grad" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 1.3.3 Vectorizing regularized logistic regression\n", - "\n", - "After you have implemented vectorization for logistic regression, you will now\n", - "add regularization to the cost function. 
Recall that for regularized logistic\n", - "regression, the cost function is defined as\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^m \\left[ -y^{(i)} \\log \\left(h_\\theta\\left(x^{(i)} \\right)\\right) - \\left( 1 - y^{(i)} \\right) \\log\\left(1 - h_\\theta \\left(x^{(i)} \\right) \\right) \\right] + \\frac{\\lambda}{2m} \\sum_{j=1}^n \\theta_j^2 $$\n", - "\n", - "Note that you should not be regularizing $\\theta_0$ which is used for the bias term.\n", - "Correspondingly, the partial derivative of regularized logistic regression cost for $\\theta_j$ is defined as\n", - "\n", - "$$\n", - "\\begin{align*}\n", - "& \\frac{\\partial J(\\theta)}{\\partial \\theta_j} = \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta\\left( x^{(i)} \\right) - y^{(i)} \\right) x_j^{(i)} & \\text{for } j = 0 \\\\\n", - "& \\frac{\\partial J(\\theta)}{\\partial \\theta_j} = \\left( \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta\\left( x^{(i)} \\right) - y^{(i)} \\right) x_j^{(i)} \\right) + \\frac{\\lambda}{m} \\theta_j & \\text{for } j \\ge 1\n", - "\\end{align*}\n", - "$$\n", - "\n", - "Now modify your code in lrCostFunction in the [**previous cell**](#lrCostFunction) to account for regularization. Once again, you should not put any loops into your code.\n", - "\n", - "
\n", - "**python/numpy Tip:** When implementing the vectorization for regularized logistic regression, you might often want to only sum and update certain elements of $\\theta$. In `numpy`, you can index into the matrices to access and update only certain elements. For example, A[:, 3:5]\n", - "= B[:, 1:3] will replace columns 3 and 4 of A with columns 1 and 2 of B (the end index of a slice is exclusive). To select columns (or rows) until the end of the matrix, you can leave the right hand side of the colon blank. For example, A[:, 2:] will only return elements from the $3^{rd}$ to last columns of $A$. If you leave the left hand side of the colon blank, you will select elements from the beginning of the matrix. For example, A[:, :2] selects the first two columns, and is equivalent to A[:, 0:2]. In addition, you can use negative indices to index arrays from the end. Thus, A[:, :-1] selects all columns of A except the last column, and A[:, -5:] selects the $5^{th}$ column from the end to the last column. You could use this together with the sum and power ($^{**}$) operations to compute the sum of only the elements you are interested in (e.g., `np.sum(z[1:]**2)`). In the starter code, `lrCostFunction`, we have also provided hints on yet another possible method for computing the regularized gradient.\n", - "
\n", - "\n", - "Once you finished your implementation, you can call the function `lrCostFunction` to test your solution using the following cell:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "J, grad = lrCostFunction(theta_t, X_t, y_t, lambda_t)\n", - "\n", - "print('Cost : {:.6f}'.format(J))\n", - "print('Expected cost: 2.534819')\n", - "print('-----------------------')\n", - "print('Gradients:')\n", - "print(' [{:.6f}, {:.6f}, {:.6f}, {:.6f}]'.format(*grad))\n", - "print('Expected gradients:')\n", - "print(' [0.146561, -0.548558, 0.724722, 1.398003]');" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After completing a part of the exercise, you can submit your solutions for grading by first adding the function you modified to the submission object, and then sending your function to Coursera for grading. \n", - "\n", - "The submission script will prompt you for your login e-mail and submission token. You can obtain a submission token from the web page for the assignment. You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "*Execute the following cell to grade your solution to the first part of this exercise.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# appends the implemented function in part 1 to the grader object\n", - "grader[1] = lrCostFunction\n", - "\n", - "# send the added functions to coursera grader for getting a grade on this part\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 1.4 One-vs-all Classification\n", - "\n", - "In this part of the exercise, you will implement one-vs-all classification by training multiple regularized logistic regression classifiers, one for each of the $K$ classes in our dataset. 
In the handwritten digits dataset, $K = 10$, but your code should work for any value of $K$. \n", - "\n", - "You should now complete the code for the function `oneVsAll` below, to train one classifier for each class. In particular, your code should return all the classifier parameters in a matrix $\\theta \\in \\mathbb{R}^{K \\times (N +1)}$, where each row of $\\theta$ corresponds to the learned logistic regression parameters for one class. You can do this with a “for”-loop from $0$ to $K-1$, training each classifier independently.\n", - "\n", - "Note that the `y` argument to this function is a vector of labels from 0 to 9. When training the classifier for class $k \\in \\{0, ..., K-1\\}$, you will want an m-dimensional vector of labels $y$, where $y_j \\in \\{0, 1\\}$ indicates whether the $j^{th}$ training instance belongs to class $k$ $(y_j = 1)$, or if it belongs to a different\n", - "class $(y_j = 0)$. You may find logical arrays helpful for this task. \n", - "\n", - "Furthermore, you will be using scipy's `optimize.minimize` for this exercise. \n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def oneVsAll(X, y, num_labels, lambda_):\n", - " \"\"\"\n", - " Trains num_labels logistic regression classifiers and returns\n", - " each of these classifiers in a matrix all_theta, where the i-th\n", - " row of all_theta corresponds to the classifier for label i.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The input dataset of shape (m x n). m is the number of \n", - " data points, and n is the number of features. Note that we \n", - " do not assume that the intercept term (or bias) is in X, however\n", - " we provide the code below to add the bias term to X. \n", - " \n", - " y : array_like\n", - " The data labels. 
A vector of shape (m, ).\n", - " \n", - " num_labels : int\n", - " Number of possible labels.\n", - " \n", - " lambda_ : float\n", - " The logistic regularization parameter.\n", - " \n", - " Returns\n", - " -------\n", - " all_theta : array_like\n", - " The trained parameters for logistic regression for each class.\n", - " This is a matrix of shape (K x n+1) where K is number of classes\n", - " (i.e. `num_labels`) and n is number of features without the bias.\n", - " \n", - " Instructions\n", - " ------------\n", - " You should complete the following code to train `num_labels`\n", - " logistic regression classifiers with regularization parameter `lambda_`. \n", - " \n", - " Hint\n", - " ----\n", - " You can use y == c to obtain a vector of 1's and 0's that tell you\n", - " whether the ground truth is true/false for this class.\n", - " \n", - " Note\n", - " ----\n", - " For this assignment, we recommend using `scipy.optimize.minimize(method='CG')`\n", - " to optimize the cost function. It is okay to use a for-loop \n", - " (`for c in range(num_labels):`) to loop over the different classes.\n", - " \n", - " Example Code\n", - " ------------\n", - " \n", - " # Set Initial theta\n", - " initial_theta = np.zeros(n + 1)\n", - " \n", - " # Set options for minimize\n", - " options = {'maxiter': 50}\n", - " \n", - " # Run minimize to obtain the optimal theta. 
This function will \n", - " # return a class object where theta is in `res.x` and cost in `res.fun`\n", - " res = optimize.minimize(lrCostFunction, \n", - " initial_theta, \n", - " (X, (y == c), lambda_), \n", - " jac=True, \n", - " method='TNC',\n", - " options=options) \n", - " \"\"\"\n", - " # Some useful variables\n", - " m, n = X.shape\n", - " \n", - " # You need to return the following variables correctly \n", - " all_theta = np.zeros((num_labels, n + 1))\n", - "\n", - " # Add ones to the X data matrix\n", - " X = np.concatenate([np.ones((m, 1)), X], axis=1)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - " \n", - "\n", - "\n", - " # ============================================================\n", - " return all_theta" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After you have completed the code for `oneVsAll`, the following cell will use your implementation to train a multi-class classifier. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "lambda_ = 0.1\n", - "all_theta = oneVsAll(X, y, num_labels, lambda_)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[2] = oneVsAll\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### 1.4.1 One-vs-all Prediction\n", - "\n", - "After training your one-vs-all classifier, you can now use it to predict the digit contained in a given image. For each input, you should compute the “probability” that it belongs to each class using the trained logistic regression classifiers. 
Your one-vs-all prediction function will pick the class for which the corresponding logistic regression classifier outputs the highest probability and return the class label (0, 1, ..., K-1) as the prediction for the input example. You should now complete the code in the function `predictOneVsAll` to use the one-vs-all classifier for making predictions. \n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def predictOneVsAll(all_theta, X):\n", - " \"\"\"\n", - " Return a vector of predictions for each example in the matrix X. \n", - " Note that X contains the examples in rows. all_theta is a matrix where\n", - " the i-th row is a trained logistic regression theta vector for the \n", - " i-th class. You should set p to a vector of values from 0..K-1 \n", - " (e.g., p = [0, 2, 0, 1] predicts classes 0, 2, 0, 1 for 4 examples) .\n", - " \n", - " Parameters\n", - " ----------\n", - " all_theta : array_like\n", - " The trained parameters for logistic regression for each class.\n", - " This is a matrix of shape (K x n+1) where K is number of classes\n", - " and n is number of features without the bias.\n", - " \n", - " X : array_like\n", - " Data points to predict their labels. This is a matrix of shape \n", - " (m x n) where m is number of data points to predict, and n is number \n", - " of features without the bias term. Note we add the bias term for X in \n", - " this function. \n", - " \n", - " Returns\n", - " -------\n", - " p : array_like\n", - " The predictions for each data point in X. This is a vector of shape (m, ).\n", - " \n", - " Instructions\n", - " ------------\n", - " Complete the following code to make predictions using your learned logistic\n", - " regression parameters (one-vs-all). 
You should set p to a vector of predictions\n", - " (from 0 to num_labels-1).\n", - " \n", - " Hint\n", - " ----\n", - " This code can be done all vectorized using the numpy argmax function.\n", - " In particular, the argmax function returns the index of the max element,\n", - " for more information see '?np.argmax' or search online. If your examples\n", - " are in rows, then, you can use np.argmax(A, axis=1) to obtain the index \n", - " of the max for each row.\n", - " \"\"\"\n", - " m = X.shape[0];\n", - " num_labels = all_theta.shape[0]\n", - "\n", - " # You need to return the following variables correctly \n", - " p = np.zeros(m)\n", - "\n", - " # Add ones to the X data matrix\n", - " X = np.concatenate([np.ones((m, 1)), X], axis=1)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - " \n", - " # ============================================================\n", - " return p" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done, call your `predictOneVsAll` function using the learned value of $\\theta$. You should see that the training set accuracy is about 95.1% (i.e., it classifies 95.1% of the examples in the training set correctly)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pred = predictOneVsAll(all_theta, X)\n", - "print('Training Set Accuracy: {:.2f}%'.format(np.mean(pred == y) * 100))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[3] = predictOneVsAll\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Neural Networks\n", - "\n", - "In the previous part of this exercise, you implemented multi-class logistic regression to recognize handwritten digits. 
However, logistic regression cannot form more complex hypotheses as it is only a linear classifier (You could add more features - such as polynomial features - to logistic regression, but that can be very expensive to train).\n", - "\n", - "In this part of the exercise, you will implement a neural network to recognize handwritten digits using the same training set as before. The neural network will be able to represent complex models that form non-linear hypotheses. For this week, you will be using parameters from a neural network that we have already trained. Your goal is to implement the feedforward propagation algorithm to use our weights for prediction. In next week’s exercise, you will write the backpropagation algorithm for learning the neural network parameters. \n", - "\n", - "We start by first reloading and visualizing the dataset which contains the MNIST handwritten digits (this is the same as we did in the first part of this exercise, we reload it here to ensure the variables have not been modified). 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# training data stored in arrays X, y\n", - "data = loadmat(os.path.join('Data', 'ex3data1.mat'))\n", - "X, y = data['X'], data['y'].ravel()\n", - "\n", - "# set the zero digit to 0, rather than its mapped 10 in this dataset\n", - "# This is an artifact due to the fact that this dataset was used in \n", - "# MATLAB where there is no index 0\n", - "y[y == 10] = 0\n", - "\n", - "# get number of examples in dataset\n", - "m = y.size\n", - "\n", - "# randomly permute examples, to be used for visualizing one \n", - "# picture at a time\n", - "indices = np.random.permutation(m)\n", - "\n", - "# Randomly select 100 data points to display\n", - "rand_indices = np.random.choice(m, 100, replace=False)\n", - "sel = X[rand_indices, :]\n", - "\n", - "utils.displayData(sel)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.1 Model representation \n", - "\n", - "Our neural network is shown in the following figure.\n", - "\n", - "![Neural network](Figures/neuralnetwork.png)\n", - "\n", - "It has 3 layers: an input layer, a hidden layer and an output layer. Recall that our inputs are pixel values of digit images. Since the images are of size 20×20, this gives us 400 input layer units (excluding the extra bias unit which always outputs +1). As before, the training data will be loaded into the variables X and y. \n", - "\n", - "You have been provided with a set of network parameters ($\\Theta^{(1)}$, $\\Theta^{(2)}$) already trained by us. These are stored in `ex3weights.mat`. The following cell loads those parameters into `Theta1` and `Theta2`. The parameters have dimensions that are sized for a neural network with 25 units in the second layer and 10 output units (corresponding to the 10 digit classes)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Setup the parameters you will use for this exercise\n", - "input_layer_size = 400 # 20x20 Input Images of Digits\n", - "hidden_layer_size = 25 # 25 hidden units\n", - "num_labels = 10 # 10 labels, from 0 to 9\n", - "\n", - "# Load the .mat file, which returns a dictionary \n", - "weights = loadmat(os.path.join('Data', 'ex3weights.mat'))\n", - "\n", - "# get the model weights from the dictionary\n", - "# Theta1 has size 25 x 401\n", - "# Theta2 has size 10 x 26\n", - "Theta1, Theta2 = weights['Theta1'], weights['Theta2']\n", - "\n", - "# move the last row of Theta2 to the front (np.roll along axis 0), due to legacy from MATLAB indexing, \n", - "# since the weight file ex3weights.mat stores the digit 0 as label 10 (MATLAB has no index 0)\n", - "Theta2 = np.roll(Theta2, 1, axis=0)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.2 Feedforward Propagation and Prediction\n", - "\n", - "Now you will implement feedforward propagation for the neural network. You will need to complete the code in the function `predict` to return the neural network’s prediction. You should implement the feedforward computation that computes $h_\\theta(x^{(i)})$ for every example $i$ and returns the associated predictions. Similar to the one-vs-all classification strategy, the prediction from the neural network will be the label that has the largest output $\\left( h_\\theta(x) \\right)_k$.\n", - "\n", - "
\n", - "**Implementation Note:** The matrix $X$ contains the examples in rows. When you complete the code in the function `predict`, you will need to add a column of 1’s to the matrix. The matrices `Theta1` and `Theta2` contain the parameters for each unit in rows. Specifically, the first row of `Theta1` corresponds to the first hidden unit in the second layer. In `numpy`, when you compute $z^{(2)} = \\Theta^{(1)}a^{(1)}$, be sure that you index (and if necessary, transpose) $X$ correctly so that you get $a^{(l)}$ as a 1-D vector.\n", - "
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def predict(Theta1, Theta2, X):\n", - " \"\"\"\n", - " Predict the label of an input given a trained neural network.\n", - " \n", - " Parameters\n", - " ----------\n", - " Theta1 : array_like\n", - " Weights for the first layer in the neural network.\n", - " It has shape (2nd hidden layer size x input size)\n", - " \n", - " Theta2: array_like\n", - " Weights for the second layer in the neural network. \n", - " It has shape (output layer size x 2nd hidden layer size)\n", - " \n", - " X : array_like\n", - " The image inputs having shape (number of examples x image dimensions).\n", - " \n", - " Return \n", - " ------\n", - " p : array_like\n", - " Predictions vector containing the predicted label for each example.\n", - " It has a length equal to the number of examples.\n", - " \n", - " Instructions\n", - " ------------\n", - " Complete the following code to make predictions using your learned neural\n", - " network. You should set p to a vector containing labels \n", - " between 0 to (num_labels-1).\n", - " \n", - " Hint\n", - " ----\n", - " This code can be done all vectorized using the numpy argmax function.\n", - " In particular, the argmax function returns the index of the max element,\n", - " for more information see '?np.argmax' or search online. If your examples\n", - " are in rows, then, you can use np.argmax(A, axis=1) to obtain the index\n", - " of the max for each row.\n", - " \n", - " Note\n", - " ----\n", - " Remember, we have supplied the `sigmoid` function in the `utils.py` file. 
\n", - " You can use this function by calling `utils.sigmoid(z)`, where you can \n", - " replace `z` by the required input variable to sigmoid.\n", - " \"\"\"\n", - " # Make sure the input has two dimensions\n", - " if X.ndim == 1:\n", - " X = X[None] # promote to 2-dimensions\n", - " \n", - " # useful variables\n", - " m = X.shape[0]\n", - " num_labels = Theta2.shape[0]\n", - "\n", - " # You need to return the following variables correctly \n", - " p = np.zeros(X.shape[0])\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # =============================================================\n", - " return p" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done, call your `predict` function using the loaded set of parameters for `Theta1` and `Theta2`. You should see that the accuracy is about 97.5%." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pred = predict(Theta1, Theta2, X)\n", - "print('Training Set Accuracy: {:.1f}%'.format(np.mean(pred == y) * 100))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After that, we will display images from the training set one at a time, while at the same time printing out the predicted label for the displayed image. \n", - "\n", - "Run the following cell to display a single image and the neural network's prediction. You can run the cell multiple times to see predictions for different images."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "if indices.size > 0:\n", - " i, indices = indices[0], indices[1:]\n", - " utils.displayData(X[i, :], figsize=(4, 4))\n", - " pred = predict(Theta1, Theta2, X[i, :])\n", - " print('Neural Network Prediction: {}'.format(*pred))\n", - "else:\n", - " print('No more images to display!')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[4] = predict\n", - "grader.grade()" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Exercise3/utils.py b/Exercise3/utils.py deleted file mode 100755 index 633a5636..00000000 --- a/Exercise3/utils.py +++ /dev/null @@ -1,104 +0,0 @@ -import sys -import numpy as np -from matplotlib import pyplot - -sys.path.append('..') -from submission import SubmissionBase - - -def displayData(X, example_width=None, figsize=(10, 10)): - """ - Displays 2D data stored in X in a nice grid. 
- """ - # Compute rows, cols - if X.ndim == 2: - m, n = X.shape - elif X.ndim == 1: - n = X.size - m = 1 - X = X[None] # Promote to a 2 dimensional array - else: - raise IndexError('Input X should be 1 or 2 dimensional.') - - example_width = example_width or int(np.round(np.sqrt(n))) - example_height = n / example_width - - # Compute number of items to display - display_rows = int(np.floor(np.sqrt(m))) - display_cols = int(np.ceil(m / display_rows)) - - fig, ax_array = pyplot.subplots(display_rows, display_cols, figsize=figsize) - fig.subplots_adjust(wspace=0.025, hspace=0.025) - - ax_array = [ax_array] if m == 1 else ax_array.ravel() - - for i, ax in enumerate(ax_array): - ax.imshow(X[i].reshape(example_width, example_width, order='F'), - cmap='Greys', extent=[0, 1, 0, 1]) - ax.axis('off') - - -def sigmoid(z): - """ - Computes the sigmoid of z. - """ - return 1.0 / (1.0 + np.exp(-z)) - - -class Grader(SubmissionBase): - # Random Test Cases - X = np.stack([np.ones(20), - np.exp(1) * np.sin(np.arange(1, 21)), - np.exp(0.5) * np.cos(np.arange(1, 21))], axis=1) - - y = (np.sin(X[:, 0] + X[:, 1]) > 0).astype(float) - - Xm = np.array([[-1, -1], - [-1, -2], - [-2, -1], - [-2, -2], - [1, 1], - [1, 2], - [2, 1], - [2, 2], - [-1, 1], - [-1, 2], - [-2, 1], - [-2, 2], - [1, -1], - [1, -2], - [-2, -1], - [-2, -2]]) - ym = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]) - - t1 = np.sin(np.reshape(np.arange(1, 25, 2), (4, 3), order='F')) - t2 = np.cos(np.reshape(np.arange(1, 41, 2), (4, 5), order='F')) - - def __init__(self): - part_names = ['Regularized Logistic Regression', - 'One-vs-All Classifier Training', - 'One-vs-All Classifier Prediction', - 'Neural Network Prediction Function'] - - super().__init__('multi-class-classification-and-neural-networks', part_names) - - def __iter__(self): - for part_id in range(1, 5): - try: - func = self.functions[part_id] - - # Each part has different expected arguments/different function - if part_id == 1: - res = 
func(np.array([0.25, 0.5, -0.5]), self.X, self.y, 0.1) - res = np.hstack(res).tolist() - elif part_id == 2: - res = func(self.Xm, self.ym, 4, 0.1) - elif part_id == 3: - res = func(self.t1, self.Xm) + 1 - elif part_id == 4: - res = func(self.t1, self.t2, self.Xm) + 1 - else: - raise KeyError - yield part_id, res - except KeyError: - yield part_id, 0 diff --git a/Exercise4/Figures/ex4-backpropagation.png b/Exercise4/Figures/ex4-backpropagation.png deleted file mode 100755 index 62e1861f..00000000 Binary files a/Exercise4/Figures/ex4-backpropagation.png and /dev/null differ diff --git a/Exercise4/Figures/neural_network.png b/Exercise4/Figures/neural_network.png deleted file mode 100755 index 140fdb01..00000000 Binary files a/Exercise4/Figures/neural_network.png and /dev/null differ diff --git a/Exercise4/exercise4.ipynb b/Exercise4/exercise4.ipynb deleted file mode 100755 index ab2e6145..00000000 --- a/Exercise4/exercise4.ipynb +++ /dev/null @@ -1,928 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Programming Exercise 4: Neural Networks Learning\n", - "\n", - "## Introduction\n", - "\n", - "In this exercise, you will implement the backpropagation algorithm for neural networks and apply it to the task of hand-written digit recognition. Before starting on the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.\n", - "\n", - "\n", - "All the information you need for solving this assignment is in this notebook, and all the code you will be implementing will take place within this notebook. The assignment can be promptly submitted to the coursera grader directly from this notebook (code and instructions are included below).\n", - "\n", - "Before we begin with the exercises, we need to import all libraries required for this programming exercise. 
Throughout the course, we will be using [`numpy`](http://www.numpy.org/) for all arrays and matrix operations, [`matplotlib`](https://matplotlib.org/) for plotting, and [`scipy`](https://docs.scipy.org/doc/scipy/reference/) for scientific and numerical computation functions and tools. You can find instructions on how to install required libraries in the README file in the [github repository](https://github.com/dibgerge/ml-coursera-python-assignments)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# used for manipulating directory paths\n", - "import os\n", - "\n", - "# Scientific and vector computation for python\n", - "import numpy as np\n", - "\n", - "# Plotting library\n", - "from matplotlib import pyplot\n", - "\n", - "# Optimization module in scipy\n", - "from scipy import optimize\n", - "\n", - "# will be used to load MATLAB mat datafile format\n", - "from scipy.io import loadmat\n", - "\n", - "# library written for this exercise providing additional functions for assignment submission, and others\n", - "import utils\n", - "\n", - "# define the submission/grader object for this exercise\n", - "grader = utils.Grader()\n", - "\n", - "# tells matplotlib to embed plots within the notebook\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Submission and Grading\n", - "\n", - "\n", - "After completing each part of the assignment, be sure to submit your solutions to the grader. 
The following is a breakdown of how each part of this exercise is scored.\n", - "\n", - "\n", - "| Section | Part | Submission function | Points \n", - "| :- |:- | :- | :-: \n", - "| 1 | [Feedforward and Cost Function](#section1) | [`nnCostFunction`](#nnCostFunction) | 30 \n", - "| 2 | [Regularized Cost Function](#section2) | [`nnCostFunction`](#nnCostFunction) | 15 \n", - "| 3 | [Sigmoid Gradient](#section3) | [`sigmoidGradient`](#sigmoidGradient) | 5 \n", - "| 4 | [Neural Net Gradient Function (Backpropagation)](#section4) | [`nnCostFunction`](#nnCostFunction) | 40 \n", - "| 5 | [Regularized Gradient](#section5) | [`nnCostFunction`](#nnCostFunction) |10 \n", - "| | Total Points | | 100 \n", - "\n", - "\n", - "You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "
\n", - "At the end of each section in this notebook, we have a cell which contains code for submitting the solutions thus far to the grader. Execute the cell to see your score up to the current section. For all your work to be submitted properly, you must execute those cells at least once.\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Neural Networks\n", - "\n", - "In the previous exercise, you implemented feedforward propagation for neural networks and used it to predict handwritten digits with the weights we provided. In this exercise, you will implement the backpropagation algorithm to learn the parameters for the neural network.\n", - "\n", - "We start the exercise by first loading the dataset. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# training data stored in arrays X, y\n", - "data = loadmat(os.path.join('Data', 'ex4data1.mat'))\n", - "X, y = data['X'], data['y'].ravel()\n", - "\n", - "# set the zero digit to 0, rather than its mapped 10 in this dataset\n", - "# This is an artifact due to the fact that this dataset was used in \n", - "# MATLAB where there is no index 0\n", - "y[y == 10] = 0\n", - "\n", - "# Number of training examples\n", - "m = y.size" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.1 Visualizing the data\n", - "\n", - "You will begin by visualizing a subset of the training set, using the function `displayData`, which is the same function we used in Exercise 3. It is provided in the `utils.py` file for this assignment as well. The dataset is also the same one you used in the previous exercise.\n", - "\n", - "There are 5000 training examples in `ex4data1.mat`, where each training example is a 20 pixel by 20 pixel grayscale image of the digit. Each pixel is represented by a floating point number indicating the grayscale intensity at that location. The 20 by 20 grid of pixels is “unrolled” into a 400-dimensional vector. Each\n", - "of these training examples becomes a single row in our data matrix $X$. 
This gives us a 5000 by 400 matrix $X$ where every row is a training example for a handwritten digit image.\n", - "\n", - "$$ X = \\begin{bmatrix} - \\left(x^{(1)} \\right)^T - \\\\\n", - "- \\left(x^{(2)} \\right)^T - \\\\\n", - "\\vdots \\\\\n", - "- \\left(x^{(m)} \\right)^T - \\\\\n", - "\\end{bmatrix}\n", - "$$\n", - "\n", - "The second part of the training set is a 5000-dimensional vector `y` that contains labels for the training set. \n", - "The following cell randomly selects 100 images from the dataset and plots them." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Randomly select 100 data points to display\n", - "rand_indices = np.random.choice(m, 100, replace=False)\n", - "sel = X[rand_indices, :]\n", - "\n", - "utils.displayData(sel)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.2 Model representation\n", - "\n", - "Our neural network is shown in the following figure.\n", - "\n", - "![](Figures/neural_network.png)\n", - "\n", - "It has 3 layers - an input layer, a hidden layer and an output layer. Recall that our inputs are pixel values\n", - "of digit images. Since the images are of size $20 \\times 20$, this gives us 400 input layer units (not counting the extra bias unit which always outputs +1). The training data was loaded into the variables `X` and `y` above.\n", - "\n", - "You have been provided with a set of network parameters ($\\Theta^{(1)}, \\Theta^{(2)}$) already trained by us. These are stored in `ex4weights.mat` and will be loaded in the next cell of this notebook into `Theta1` and `Theta2`. The parameters have dimensions that are sized for a neural network with 25 units in the second layer and 10 output units (corresponding to the 10 digit classes)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Setup the parameters you will use for this exercise\n", - "input_layer_size = 400 # 20x20 Input Images of Digits\n", - "hidden_layer_size = 25 # 25 hidden units\n", - "num_labels = 10 # 10 labels, from 0 to 9\n", - "\n", - "# Load the weights into variables Theta1 and Theta2\n", - "weights = loadmat(os.path.join('Data', 'ex4weights.mat'))\n", - "\n", - "# Theta1 has size 25 x 401\n", - "# Theta2 has size 10 x 26\n", - "Theta1, Theta2 = weights['Theta1'], weights['Theta2']\n", - "\n", - "# move the last row of Theta2 to the front (np.roll along axis 0), due to legacy from MATLAB indexing, \n", - "# since the weight file ex4weights.mat stores the digit 0 as label 10 (MATLAB has no index 0)\n", - "Theta2 = np.roll(Theta2, 1, axis=0)\n", - "\n", - "# Unroll parameters \n", - "nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 1.3 Feedforward and cost function\n", - "\n", - "Now you will implement the cost function and gradient for the neural network. First, complete the code for the function `nnCostFunction` in the next cell to return the cost.\n", - "\n", - "Recall that the cost function for the neural network (without regularization) is:\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^{m}\\sum_{k=1}^{K} \\left[ - y_k^{(i)} \\log \\left( \\left( h_\\theta \\left( x^{(i)} \\right) \\right)_k \\right) - \\left( 1 - y_k^{(i)} \\right) \\log \\left( 1 - \\left( h_\\theta \\left( x^{(i)} \\right) \\right)_k \\right) \\right]$$\n", - "\n", - "where $h_\\theta \\left( x^{(i)} \\right)$ is computed as shown in the neural network figure above, and K = 10 is the total number of possible labels. Note that $h_\\theta(x^{(i)})_k = a_k^{(3)}$ is the activation (output\n", - "value) of the $k^{th}$ output unit. 
Also, recall that whereas the original labels (in the variable y) were 0, 1, ..., 9, for the purpose of training a neural network, we need to encode the labels as vectors containing only values 0 or 1, so that\n", - "\n", - "$$ y = \n", - "\\begin{bmatrix} 1 \\\\ 0 \\\\ 0 \\\\\\vdots \\\\ 0 \\end{bmatrix}, \\quad\n", - "\\begin{bmatrix} 0 \\\\ 1 \\\\ 0 \\\\ \\vdots \\\\ 0 \\end{bmatrix}, \\quad \\cdots \\quad \\text{or} \\qquad\n", - "\\begin{bmatrix} 0 \\\\ 0 \\\\ 0 \\\\ \\vdots \\\\ 1 \\end{bmatrix}.\n", - "$$\n", - "\n", - "For example, if $x^{(i)}$ is an image of the digit 5, then the corresponding $y^{(i)}$ (that you should use with the cost function) should be a 10-dimensional vector with $y_5 = 1$, and the other elements equal to 0.\n", - "\n", - "You should implement the feedforward computation that computes $h_\\theta(x^{(i)})$ for every example $i$ and sum the cost over all examples. **Your code should also work for a dataset of any size, with any number of labels** (you can assume that there are always at least $K \\ge 3$ labels).\n", - "\n", - "
\n", - "**Implementation Note:** The matrix $X$ contains the examples in rows (i.e., `X[i,:]` is the i-th training example $x^{(i)}$, expressed as an $n \\times 1$ vector). When you complete the code in `nnCostFunction`, you will need to add a column of 1’s to the `X` matrix. The parameters for each unit in the neural network are represented in `Theta1` and `Theta2` as one row. Specifically, the first row of `Theta1` corresponds to the first hidden unit in the second layer. You can use a for-loop over the examples to compute the cost.\n", - "
\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def nnCostFunction(nn_params,\n", - " input_layer_size,\n", - " hidden_layer_size,\n", - " num_labels,\n", - " X, y, lambda_=0.0):\n", - " \"\"\"\n", - " Implements the neural network cost function and gradient for a two layer neural \n", - " network which performs classification. \n", - " \n", - " Parameters\n", - " ----------\n", - " nn_params : array_like\n", - " The parameters for the neural network which are \"unrolled\" into \n", - " a vector. This needs to be converted back into the weight matrices Theta1\n", - " and Theta2.\n", - " \n", - " input_layer_size : int\n", - " Number of features for the input layer. \n", - " \n", - " hidden_layer_size : int\n", - " Number of hidden units in the second layer.\n", - " \n", - " num_labels : int\n", - " Total number of labels, or equivalently number of units in output layer. \n", - " \n", - " X : array_like\n", - " Input dataset. A matrix of shape (m x input_layer_size).\n", - " \n", - " y : array_like\n", - " Dataset labels. A vector of shape (m,).\n", - " \n", - " lambda_ : float, optional\n", - " Regularization parameter.\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The computed value for the cost function at the current weight values.\n", - " \n", - " grad : array_like\n", - " An \"unrolled\" vector of the partial derivatives of the concatenatation of\n", - " neural network weights Theta1 and Theta2.\n", - " \n", - " Instructions\n", - " ------------\n", - " You should complete the code by working through the following parts.\n", - " \n", - " - Part 1: Feedforward the neural network and return the cost in the \n", - " variable J. 
After implementing Part 1, you can verify that your\n", - " cost function computation is correct by verifying the cost\n", - " computed in the following cell.\n", - " \n", - " - Part 2: Implement the backpropagation algorithm to compute the gradients\n", - " Theta1_grad and Theta2_grad. You should return the partial derivatives of\n", - " the cost function with respect to Theta1 and Theta2 in Theta1_grad and\n", - " Theta2_grad, respectively. After implementing Part 2, you can check\n", - " that your implementation is correct by running checkNNGradients provided\n", - " in the utils.py module.\n", - " \n", - " Note: The vector y passed into the function is a vector of labels\n", - " containing values from 0..K-1. You need to map this vector into a \n", - " binary vector of 1's and 0's to be used with the neural network\n", - " cost function.\n", - " \n", - " Hint: We recommend implementing backpropagation using a for-loop\n", - " over the training examples if you are implementing it for the \n", - " first time.\n", - " \n", - " - Part 3: Implement regularization with the cost function and gradients.\n", - " \n", - " Hint: You can implement this around the code for\n", - " backpropagation. 
That is, you can compute the gradients for\n", - " the regularization separately and then add them to Theta1_grad\n", - " and Theta2_grad from Part 2.\n", - " \n", - " Note \n", - " ----\n", - " We have provided an implementation for the sigmoid function in the file \n", - " `utils.py` accompanying this assignment.\n", - " \"\"\"\n", - " # Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices\n", - " # for our 2 layer neural network\n", - " Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],\n", - " (hidden_layer_size, (input_layer_size + 1)))\n", - "\n", - " Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],\n", - " (num_labels, (hidden_layer_size + 1)))\n", - "\n", - " # Setup some useful variables\n", - " m = y.size\n", - " \n", - " # You need to return the following variables correctly \n", - " J = 0\n", - " Theta1_grad = np.zeros(Theta1.shape)\n", - " Theta2_grad = np.zeros(Theta2.shape)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", - " \n", - " # ================================================================\n", - " # Unroll gradients\n", - " # grad = np.concatenate([Theta1_grad.ravel(order=order), Theta2_grad.ravel(order=order)])\n", - " grad = np.concatenate([Theta1_grad.ravel(), Theta2_grad.ravel()])\n", - "\n", - " return J, grad" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "Use the following links to go back to the different parts of this exercise that require modifying the function `nnCostFunction`.
\n", - "\n", - "Back to:\n", - "- [Feedforward and cost function](#section1)\n", - "- [Regularized cost](#section2)\n", - "- [Neural Network Gradient (Backpropagation)](#section4)\n", - "- [Regularized Gradient](#section5)\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done, call your `nnCostFunction` using the loaded set of parameters for `Theta1` and `Theta2`. You should see that the cost is about 0.287629." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "lambda_ = 0\n", - "J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size,\n", - " num_labels, X, y, lambda_)\n", - "print('Cost at parameters (loaded from ex4weights): %.6f ' % J)\n", - "print('The cost should be about : 0.287629.')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader = utils.Grader()\n", - "grader[1] = nnCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 1.4 Regularized cost function\n", - "\n", - "The cost function for neural networks with regularization is given by:\n", - "\n", - "\n", - "$$ J(\\theta) = \\frac{1}{m} \\sum_{i=1}^{m}\\sum_{k=1}^{K} \\left[ - y_k^{(i)} \\log \\left( \\left( h_\\theta \\left( x^{(i)} \\right) \\right)_k \\right) - \\left( 1 - y_k^{(i)} \\right) \\log \\left( 1 - \\left( h_\\theta \\left( x^{(i)} \\right) \\right)_k \\right) \\right] + \\frac{\\lambda}{2 m} \\left[ \\sum_{j=1}^{25} \\sum_{k=1}^{400} \\left( \\Theta_{j,k}^{(1)} \\right)^2 + \\sum_{j=1}^{10} \\sum_{k=1}^{25} \\left( \\Theta_{j,k}^{(2)} \\right)^2 \\right] $$\n", - "\n", - "You can assume that the neural network will only have 3 layers - an input layer, a hidden layer and an output layer. However, your code should work for any number of input units, hidden units and outputs units. 
While we\n", - "have explicitly listed the indices above for $\\Theta^{(1)}$ and $\\Theta^{(2)}$ for clarity, do note that your code should in general work with $\\Theta^{(1)}$ and $\\Theta^{(2)}$ of any size. Note that you should not be regularizing the terms that correspond to the bias. For the matrices `Theta1` and `Theta2`, this corresponds to the first column of each matrix. You should now add regularization to your cost function. Notice that you can first compute the unregularized cost function $J$ using your existing `nnCostFunction` and then later add the cost for the regularization terms.\n", - "\n", - "[Click here to go back to `nnCostFunction` for editing.](#nnCostFunction)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you are done, the next cell will call your `nnCostFunction` using the loaded set of parameters for `Theta1` and `Theta2`, and $\\lambda = 1$. You should see that the cost is about 0.383770." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Weight regularization parameter (we set this to 1 here).\n", - "lambda_ = 1\n", - "J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size,\n", - " num_labels, X, y, lambda_)\n", - "\n", - "print('Cost at parameters (loaded from ex4weights): %.6f' % J)\n", - "print('This value should be about : 0.383770.')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[2] = nnCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Backpropagation\n", - "\n", - "In this part of the exercise, you will implement the backpropagation algorithm to compute the gradient for the neural network cost function. 
You will need to update the function `nnCostFunction` so that it returns an appropriate value for `grad`. Once you have computed the gradient, you will be able to train the neural network by minimizing the cost function $J(\\theta)$ using an advanced optimizer such as `scipy`'s `optimize.minimize`.\n", - "You will first implement the backpropagation algorithm to compute the gradients for the parameters for the (unregularized) neural network. After you have verified that your gradient computation for the unregularized case is correct, you will implement the gradient for the regularized neural network." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.1 Sigmoid Gradient\n", - "\n", - "To help you get started with this part of the exercise, you will first implement\n", - "the sigmoid gradient function. The gradient for the sigmoid function can be\n", - "computed as\n", - "\n", - "$$ g'(z) = \\frac{d}{dz} g(z) = g(z)\\left(1-g(z)\\right) $$\n", - "\n", - "where\n", - "\n", - "$$ \\text{sigmoid}(z) = g(z) = \\frac{1}{1 + e^{-z}} $$\n", - "\n", - "Now complete the implementation of `sigmoidGradient` in the next cell.\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def sigmoidGradient(z):\n", - " \"\"\"\n", - " Computes the gradient of the sigmoid function evaluated at z. \n", - " This should work regardless if z is a matrix or a vector. \n", - " In particular, if z is a vector or matrix, you should return\n", - " the gradient for each element.\n", - " \n", - " Parameters\n", - " ----------\n", - " z : array_like\n", - " A vector or matrix as input to the sigmoid function. \n", - " \n", - " Returns\n", - " --------\n", - " g : array_like\n", - " Gradient of the sigmoid function. Has the same shape as z. 
\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the gradient of the sigmoid function evaluated at\n", - " each value of z (z can be a matrix, vector or scalar).\n", - " \n", - " Note\n", - " ----\n", - " We have provided an implementation of the sigmoid function \n", - " in the `utils.py` file accompanying this assignment.\n", - " \"\"\"\n", - "\n", - " g = np.zeros(z.shape)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # =============================================================\n", - " return g" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you are done, the following cell calls `sigmoidGradient` on a given vector `z`. Try testing a few values by calling `sigmoidGradient(z)`. For large values (both positive and negative) of z, the gradient should be close to 0. When $z = 0$, the gradient should be exactly 0.25. Your code should also work with vectors and matrices. For a matrix, your function should apply the sigmoid gradient function to every element." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = np.array([-1, -0.5, 0, 0.5, 1])\n", - "g = sigmoidGradient(z)\n", - "print('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\\n ')\n", - "print(g)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[3] = sigmoidGradient\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.2 Random Initialization\n", - "\n", - "When training neural networks, it is important to randomly initialize the parameters for symmetry breaking. 
One effective strategy for random initialization is to randomly select values for $\\Theta^{(l)}$ uniformly in the range $[-\\epsilon_{init}, \\epsilon_{init}]$. You should use $\\epsilon_{init} = 0.12$. This range of values ensures that the parameters are kept small and makes the learning more efficient.\n", - "\n", - "
\n", - "One effective strategy for choosing $\\epsilon_{init}$ is to base it on the number of units in the network. A good choice of $\\epsilon_{init}$ is $\\epsilon_{init} = \\frac{\\sqrt{6}}{\\sqrt{L_{in} + L_{out}}}$, where $L_{in} = s_l$ and $L_{out} = s_{l+1}$ are the number of units in the layers adjacent to $\\Theta^{(l)}$.\n", - "
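
The heuristic above can be sketched in a few lines. This is only an illustration: the helper name `epsilon_from_layer_sizes` is hypothetical and not part of the exercise, and the layer sizes (400 inputs, 25 hidden units) are assumed from the network used in this notebook. For those sizes the bound comes out close to the default of 0.12.

```python
import numpy as np


def epsilon_from_layer_sizes(L_in, L_out):
    """Hypothetical helper: sqrt(6) / sqrt(L_in + L_out)."""
    return np.sqrt(6.0) / np.sqrt(L_in + L_out)


# Layer sizes assumed from this exercise: 400 input pixels, 25 hidden units.
eps = epsilon_from_layer_sizes(400, 25)

# Draw weights uniformly from [-eps, eps); the extra column holds the bias terms.
rng = np.random.default_rng(0)
W = rng.uniform(-eps, eps, size=(25, 1 + 400))

print('epsilon_init = %.4f, W.shape = %s' % (eps, W.shape))
```

The seeded generator is used here only so the sketch is repeatable; the exercise itself uses `np.random.rand`.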
\n", - "\n", - "Your job is to complete the function `randInitializeWeights` to initialize the weights for $\\Theta$. Modify the function by filling in the following code:\n", - "\n", - "```python\n", - "# Randomly initialize the weights to small values\n", - "W = np.random.rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init\n", - "```\n", - "Note that we give the function an argument for $\\epsilon$ with default value `epsilon_init = 0.12`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def randInitializeWeights(L_in, L_out, epsilon_init=0.12):\n", - " \"\"\"\n", - " Randomly initialize the weights of a layer in a neural network.\n", - " \n", - " Parameters\n", - " ----------\n", - " L_in : int\n", - " Number of incoming connections.\n", - " \n", - " L_out : int\n", - " Number of outgoing connections. \n", - " \n", - " epsilon_init : float, optional\n", - " Range of values which the weights can take from a uniform \n", - " distribution.\n", - " \n", - " Returns\n", - " -------\n", - " W : array_like\n", - " The weights initialized to random values. Note that W should\n", - " be set to a matrix of size (L_out, 1 + L_in) as\n", - " the first column of W handles the \"bias\" terms.\n", - " \n", - " Instructions\n", - " ------------\n", - " Initialize W randomly so that we break the symmetry while training\n", - " the neural network. 
Note that the first column of W corresponds \n", - " to the parameters for the bias unit.\n", - " \"\"\"\n", - "\n", - " # You need to return the following variables correctly \n", - " W = np.zeros((L_out, 1 + L_in))\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # ============================================================\n", - " return W" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You do not need to submit any code for this part of the exercise.*\n", - "\n", - "Execute the following cell to initialize the weights for the 2 layers in the neural network using the `randInitializeWeights` function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print('Initializing Neural Network Parameters ...')\n", - "\n", - "initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size)\n", - "initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels)\n", - "\n", - "# Unroll parameters\n", - "initial_nn_params = np.concatenate([initial_Theta1.ravel(), initial_Theta2.ravel()], axis=0)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.3 Backpropagation\n", - "\n", - "![](Figures/ex4-backpropagation.png)\n", - "\n", - "Now, you will implement the backpropagation algorithm. Recall that the intuition behind the backpropagation algorithm is as follows. Given a training example $(x^{(t)}, y^{(t)})$, we will first run a “forward pass” to compute all the activations throughout the network, including the output value of the hypothesis $h_\\theta(x)$. 
Then, for each node $j$ in layer $l$, we would like to compute an “error term” $\\delta_j^{(l)}$ that measures how much that node was “responsible” for any errors in our output.\n", - "\n", - "For an output node, we can directly measure the difference between the network’s activation and the true target value, and use that to define $\\delta_j^{(3)}$ (since layer 3 is the output layer). For the hidden units, you will compute $\\delta_j^{(l)}$ based on a weighted average of the error terms of the nodes in layer $(l+1)$. In detail, here is the backpropagation algorithm (also depicted in the figure above). You should implement steps 1 to 4 in a loop that processes one example at a time. Concretely, you should implement a for-loop `for t in range(m)` and place steps 1-4 below inside the for-loop, with the $t^{th}$ iteration performing the calculation on the $t^{th}$ training example $(x^{(t)}, y^{(t)})$. Step 5 will divide the accumulated gradients by $m$ to obtain the gradients for the neural network cost function.\n", - "\n", - "1. Set the input layer’s values $(a^{(1)})$ to the $t^{th}$ training example $x^{(t)}$. Perform a feedforward pass, computing the activations $(z^{(2)}, a^{(2)}, z^{(3)}, a^{(3)})$ for layers 2 and 3. Note that you need to add a `+1` term to ensure that the vectors of activations for layers $a^{(1)}$ and $a^{(2)}$ also include the bias unit. In `numpy`, if `a_1` is a 2-D array with $m$ rows, adding the bias column corresponds to `a_1 = np.concatenate([np.ones((m, 1)), a_1], axis=1)`.\n", - "\n", - "1. For each output unit $k$ in layer 3 (the output layer), set \n", - "$$\\delta_k^{(3)} = \\left(a_k^{(3)} - y_k \\right)$$\n", - "where $y_k \\in \\{0, 1\\}$ indicates whether the current training example belongs to class $k$ $(y_k = 1)$, or if it belongs to a different class $(y_k = 0)$. You may find logical arrays helpful for this task (explained in the previous programming exercise).\n", - "\n", - "1. 
For the hidden layer $l = 2$, set \n", - "$$ \\delta^{(2)} = \\left( \\Theta^{(2)} \\right)^T \\delta^{(3)} * g'\\left(z^{(2)} \\right)$$\n", - "Note that the symbol $*$ performs element-wise multiplication in `numpy`.\n", - "\n", - "1. Accumulate the gradient from this example using the following formula. Note that you should skip or remove $\\delta_0^{(2)}$. In `numpy`, removing $\\delta_0^{(2)}$ corresponds to `delta_2 = delta_2[1:]`.\n", - "$$ \\Delta^{(l)} = \\Delta^{(l)} + \\delta^{(l+1)} (a^{(l)})^T $$\n", - "\n", - "1. Obtain the (unregularized) gradient for the neural network cost function by dividing the accumulated gradients by $m$:\n", - "$$ \\frac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = D_{ij}^{(l)} = \\frac{1}{m} \\Delta_{ij}^{(l)}$$\n", - "\n", - "
\n", - "**Python/Numpy tip**: You should implement the backpropagation algorithm only after you have successfully completed the feedforward and cost functions. While implementing the backpropagation algorithm, it is often useful to use the `shape` attribute to print out the shapes of the variables you are working with if you run into dimension mismatch errors.\n", - "
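
The five steps above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the reference solution for `nnCostFunction`: the name `backprop_gradients` is hypothetical, the labels are assumed to be already one-hot encoded in `Y`, and regularization is left out.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def backprop_gradients(Theta1, Theta2, X, Y):
    """Unregularized gradients for a 3-layer network (hypothetical helper).

    X is (m, n) without a bias column; Y is (m, K) one-hot encoded.
    """
    m = X.shape[0]
    Delta1 = np.zeros_like(Theta1)
    Delta2 = np.zeros_like(Theta2)

    for t in range(m):
        # Step 1: forward pass, prepending the bias unit to a1 and a2.
        a1 = np.concatenate([[1.0], X[t]])
        z2 = Theta1 @ a1
        a2 = np.concatenate([[1.0], sigmoid(z2)])
        z3 = Theta2 @ a2
        a3 = sigmoid(z3)

        # Step 2: error term of the output layer.
        delta3 = a3 - Y[t]

        # Step 3: error term of the hidden layer. Dropping the bias column
        # of Theta2 has the same effect as removing delta2[0] afterwards.
        g2 = sigmoid(z2)
        delta2 = (Theta2[:, 1:].T @ delta3) * g2 * (1.0 - g2)

        # Step 4: accumulate the outer products delta^(l+1) (a^(l))^T.
        Delta1 += np.outer(delta2, a1)
        Delta2 += np.outer(delta3, a2)

    # Step 5: divide the accumulated gradients by m.
    return Delta1 / m, Delta2 / m
```

The returned arrays have the same shapes as `Theta1` and `Theta2`, so they can be unrolled with `ravel()` and concatenated in the same order as the parameters.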
\n", - "\n", - "[Click here to go back and update the function `nnCostFunction` with the backpropagation algorithm](#nnCostFunction).\n", - "\n", - "\n", - "**Note:** If the iterative solution provided above is proving to be difficult to implement, try implementing the vectorized approach which is easier to implement in the opinion of the moderators of this course. You can find the tutorial for the vectorized approach [here](https://www.coursera.org/learn/machine-learning/discussions/all/threads/a8Kce_WxEeS16yIACyoj1Q)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After you have implemented the backpropagation algorithm, we will proceed to run gradient checking on your implementation. The gradient check will allow you to increase your confidence that your code is\n", - "computing the gradients correctly.\n", - "\n", - "### 2.4 Gradient checking \n", - "\n", - "In your neural network, you are minimizing the cost function $J(\\Theta)$. To perform gradient checking on your parameters, you can imagine “unrolling” the parameters $\\Theta^{(1)}$, $\\Theta^{(2)}$ into a long vector $\\theta$. By doing so, you can think of the cost function being $J(\\theta)$ instead and use the following gradient checking procedure.\n", - "\n", - "Suppose you have a function $f_i(\\theta)$ that purportedly computes $\\frac{\\partial}{\\partial \\theta_i} J(\\theta)$; you’d like to check if $f_i$ is outputting correct derivative values.\n", - "\n", - "$$\n", - "\\text{Let } \\theta^{(i+)} = \\theta + \\begin{bmatrix} 0 \\\\ 0 \\\\ \\vdots \\\\ \\epsilon \\\\ \\vdots \\\\ 0 \\end{bmatrix}\n", - "\\quad \\text{and} \\quad \\theta^{(i-)} = \\theta - \\begin{bmatrix} 0 \\\\ 0 \\\\ \\vdots \\\\ \\epsilon \\\\ \\vdots \\\\ 0 \\end{bmatrix}\n", - "$$\n", - "\n", - "So, $\\theta^{(i+)}$ is the same as $\\theta$, except its $i^{th}$ element has been incremented by $\\epsilon$. 
Similarly, $\\theta^{(i-)}$ is the corresponding vector with the $i^{th}$ element decreased by $\\epsilon$. You can now numerically verify $f_i(\\theta)$’s correctness by checking, for each $i$, that:\n", - "\n", - "$$ f_i\\left( \\theta \\right) \\approx \\frac{J\\left( \\theta^{(i+)}\\right) - J\\left( \\theta^{(i-)} \\right)}{2\\epsilon} $$\n", - "\n", - "The degree to which these two values should approximate each other will depend on the details of $J$. But assuming $\\epsilon = 10^{-4}$, you’ll usually find that the left- and right-hand sides of the above will agree to at least 4 significant digits (and often many more).\n", - "\n", - "We have implemented the function to compute the numerical gradient for you in `computeNumericalGradient` (within the file `utils.py`). While you are not required to modify the file, we highly encourage you to take a look at the code to understand how it works.\n", - "\n", - "In the next cell we will run the provided function `checkNNGradients` which will create a small neural network and dataset that will be used for checking your gradients. If your backpropagation implementation is correct,\n", - "you should see a relative difference that is less than 1e-9.\n", - "\n", - "
\n", - "**Practical Tip**: When performing gradient checking, it is much more efficient to use a small neural network with a relatively small number of input units and hidden units, thus having a relatively small number\n", - "of parameters. Each dimension of $\\theta$ requires two evaluations of the cost function and this can be expensive. In the function `checkNNGradients`, our code creates a small random model and dataset which is used with `computeNumericalGradient` for gradient checking. Furthermore, after you are confident that your gradient computations are correct, you should turn off gradient checking before running your learning algorithm.\n", - "
\n", - "\n", - "
\n", - "**Practical Tip:** Gradient checking works for any function where you are computing the cost and the gradient. Concretely, you can use the same `computeNumericalGradient` function to check if your gradient implementations for the other exercises are correct too (e.g., logistic regression’s cost function).\n", - "
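
As a minimal sketch of the procedure, the check can be tried first on a toy cost whose gradient is known in closed form. The names `quadratic_cost` and `numerical_gradient` are hypothetical and chosen only for this illustration; the cost function is assumed to return a `(cost, gradient)` pair, as `nnCostFunction` does.

```python
import numpy as np


def quadratic_cost(theta):
    """Toy cost J(theta) = 0.5 * sum(theta**2), whose gradient is theta."""
    return 0.5 * np.sum(theta ** 2), theta.copy()


def numerical_gradient(J, theta, eps=1e-4):
    """Central-difference estimate of the gradient of J at theta."""
    numgrad = np.zeros_like(theta, dtype=float)
    for i in range(theta.size):
        # Perturb only the i-th component by +/- eps.
        step = np.zeros_like(theta, dtype=float)
        step[i] = eps
        cost_plus, _ = J(theta + step)
        cost_minus, _ = J(theta - step)
        numgrad[i] = (cost_plus - cost_minus) / (2 * eps)
    return numgrad


theta = np.array([0.3, -1.2, 2.0])
_, grad = quadratic_cost(theta)
numgrad = numerical_gradient(quadratic_cost, theta)

# Relative difference between the numerical and analytical gradients.
rel_diff = np.linalg.norm(numgrad - grad) / np.linalg.norm(numgrad + grad)
print('Relative difference: %g' % rel_diff)
```

Each component costs two evaluations of `J`, which is why the check is run on a small model and switched off before training.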
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "utils.checkNNGradients(nnCostFunction)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*Once your cost function passes the gradient check for the (unregularized) neural network cost function, you should submit the neural network gradient function (backpropagation).*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[4] = nnCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 2.5 Regularized Neural Network\n", - "\n", - "After you have successfully implemented the backpropagation algorithm, you will add regularization to the gradient. To account for regularization, it turns out that you can add this as an additional term *after* computing the gradients using backpropagation.\n", - "\n", - "Specifically, after you have computed $\\Delta_{ij}^{(l)}$ using backpropagation, you should add regularization using\n", - "\n", - "$$ \\begin{align} \n", - "& \\frac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = D_{ij}^{(l)} = \\frac{1}{m} \\Delta_{ij}^{(l)} & \\qquad \\text{for } j = 0 \\\\\n", - "& \\frac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = D_{ij}^{(l)} = \\frac{1}{m} \\Delta_{ij}^{(l)} + \\frac{\\lambda}{m} \\Theta_{ij}^{(l)} & \\qquad \\text{for } j \\ge 1\n", - "\\end{align}\n", - "$$\n", - "\n", - "Note that you should *not* be regularizing the first column of $\\Theta^{(l)}$ which is used for the bias term. Furthermore, in the parameters $\\Theta_{ij}^{(l)}$, $i$ is indexed starting from 1, and $j$ is indexed starting from 0. 
Thus, \n", - "\n", - "$$\n", - "\\Theta^{(l)} = \\begin{bmatrix}\n", - "\\Theta_{1,0}^{(l)} & \\Theta_{1,1}^{(l)} & \\cdots \\\\\n", - "\\Theta_{2,0}^{(l)} & \\Theta_{2,1}^{(l)} & \\cdots \\\\\n", - "\\vdots & ~ & \\ddots\n", - "\\end{bmatrix}\n", - "$$\n", - "\n", - "[Now modify your code that computes grad in `nnCostFunction` to account for regularization.](#nnCostFunction)\n", - "\n", - "After you are done, the following cell runs gradient checking on your implementation. If your code is correct, you should expect to see a relative difference that is less than 1e-9." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Check gradients by running checkNNGradients\n", - "lambda_ = 3\n", - "utils.checkNNGradients(nnCostFunction, lambda_)\n", - "\n", - "# Also output the costFunction debugging values\n", - "debug_J, _ = nnCostFunction(nn_params, input_layer_size,\n", - " hidden_layer_size, num_labels, X, y, lambda_)\n", - "\n", - "print('\\n\\nCost at (fixed) debugging parameters (w/ lambda = %f): %f ' % (lambda_, debug_J))\n", - "print('(for lambda = 3, this value should be about 0.576051)')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[5] = nnCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.6 Learning parameters using `scipy.optimize.minimize`\n", - "\n", - "After you have successfully implemented the neural network cost function\n", - "and gradient computation, the next step is to use `scipy`'s minimization to learn a good set of parameters." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# After you have completed the assignment, change the maxiter to a larger\n", - "# value to see how more training helps.\n", - "options= {'maxiter': 100}\n", - "\n", - "# You should also try different values of lambda\n", - "lambda_ = 1\n", - "\n", - "# Create \"short hand\" for the cost function to be minimized\n", - "costFunction = lambda p: nnCostFunction(p, input_layer_size,\n", - " hidden_layer_size,\n", - " num_labels, X, y, lambda_)\n", - "\n", - "# Now, costFunction is a function that takes in only one argument\n", - "# (the neural network parameters)\n", - "res = optimize.minimize(costFunction,\n", - " initial_nn_params,\n", - " jac=True,\n", - " method='TNC',\n", - " options=options)\n", - "\n", - "# get the solution of the optimization\n", - "nn_params = res.x\n", - " \n", - "# Obtain Theta1 and Theta2 back from nn_params\n", - "Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],\n", - " (hidden_layer_size, (input_layer_size + 1)))\n", - "\n", - "Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],\n", - " (num_labels, (hidden_layer_size + 1)))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After the training completes, we will proceed to report the training accuracy of your classifier by computing the percentage of examples it got correct. If your implementation is correct, you should see a reported\n", - "training accuracy of about 95.3% (this may vary by about 1% due to the random initialization). It is possible to get higher training accuracies by training the neural network for more iterations. We encourage you to try\n", - "training the neural network for more iterations (e.g., set `maxiter` to 400) and also vary the regularization parameter $\\lambda$. 
With the right learning settings, it is possible to get the neural network to perfectly fit the training set." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pred = utils.predict(Theta1, Theta2, X)\n", - "print('Training Set Accuracy: %f' % (np.mean(pred == y) * 100))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3 Visualizing the Hidden Layer\n", - "\n", - "One way to understand what your neural network is learning is to visualize the representations captured by the hidden units. Informally, given a particular hidden unit, one way to visualize what it computes is to find an input $x$ that will cause it to activate (that is, to have an activation value \n", - "($a_i^{(l)}$) close to 1). For the neural network you trained, notice that the $i^{th}$ row of $\\Theta^{(1)}$ is a 401-dimensional vector that represents the parameters for the $i^{th}$ hidden unit. If we discard the bias term, we get a 400-dimensional vector that represents the weights from each input pixel to the hidden unit.\n", - "\n", - "Thus, one way to visualize the “representation” captured by the hidden unit is to reshape this 400-dimensional vector into a 20 × 20 image and display it (it turns out that this is equivalent to finding the input that gives the highest activation for the hidden unit, given a “norm” constraint on the input (i.e., $||x||_2 \\le 1$)). \n", - "\n", - "The next cell does this by using the `displayData` function and it will show you an image with 25 units,\n", - "each corresponding to one hidden unit in the network. In your trained network, you should find that the hidden units correspond roughly to detectors that look for strokes and other patterns in the input." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "utils.displayData(Theta1[:, 1:])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1 Optional (ungraded) exercise\n", - "\n", - "In this part of the exercise, you will get to try out different learning settings for the neural network to see how the performance of the neural network varies with the regularization parameter $\\lambda$ and number of training steps (the `maxiter` option when using `scipy.optimize.minimize`). Neural networks are very powerful models that can form highly complex decision boundaries. Without regularization, it is possible for a neural network to “overfit” a training set so that it obtains close to 100% accuracy on the training set but does not do as well on new examples that it has not seen before. You can set the regularization $\\lambda$ to a smaller value and the `maxiter` parameter to a higher number of iterations to see this for yourself." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Exercise4/utils.py b/Exercise4/utils.py deleted file mode 100755 index 6b7c3bdc..00000000 --- a/Exercise4/utils.py +++ /dev/null @@ -1,226 +0,0 @@ -import sys -import numpy as np -from matplotlib import pyplot - -sys.path.append('..') -from submission import SubmissionBase - - -def displayData(X, example_width=None, figsize=(10, 10)): - """ - Displays 2D data stored in X in a nice grid. 
- """ - # Compute rows, cols - if X.ndim == 2: - m, n = X.shape - elif X.ndim == 1: - n = X.size - m = 1 - X = X[None] # Promote to a 2 dimensional array - else: - raise IndexError('Input X should be 1 or 2 dimensional.') - - example_width = example_width or int(np.round(np.sqrt(n))) - example_height = n // example_width - - # Compute number of items to display - display_rows = int(np.floor(np.sqrt(m))) - display_cols = int(np.ceil(m / display_rows)) - - fig, ax_array = pyplot.subplots(display_rows, display_cols, figsize=figsize) - fig.subplots_adjust(wspace=0.025, hspace=0.025) - - ax_array = [ax_array] if m == 1 else ax_array.ravel() - - for i, ax in enumerate(ax_array): - # Display Image - h = ax.imshow(X[i].reshape(example_height, example_width, order='F'), - cmap='Greys', extent=[0, 1, 0, 1]) - ax.axis('off') - - -def predict(Theta1, Theta2, X): - """ - Predict the label of an input given a trained neural network - Outputs the predicted label of X given the trained weights of a neural - network(Theta1, Theta2) - """ - # Useful values - m = X.shape[0] - num_labels = Theta2.shape[0] - - # You need to return the following variables correctly - p = np.zeros(m) - h1 = sigmoid(np.dot(np.concatenate([np.ones((m, 1)), X], axis=1), Theta1.T)) - h2 = sigmoid(np.dot(np.concatenate([np.ones((m, 1)), h1], axis=1), Theta2.T)) - p = np.argmax(h2, axis=1) - return p - - -def debugInitializeWeights(fan_out, fan_in): - """ - Initialize the weights of a layer with fan_in incoming connections and fan_out outgoing - connections using a fixed strategy. This will help you later in debugging. - - Note that W should be set to a matrix of size (fan_out, 1+fan_in) as the first column of W handles - the "bias" terms. - - Parameters - ---------- - fan_out : int - The number of outgoing connections. - - fan_in : int - The number of incoming connections. - - Returns - ------- - W : array_like (fan_out, 1+fan_in) - The initialized weights array given the dimensions. 
- """ - # Initialize W using "sin". This ensures that W always has the same values and will be - # useful for debugging - W = np.sin(np.arange(1, 1 + (1+fan_in)*fan_out))/10.0 - W = W.reshape(fan_out, 1+fan_in, order='F') - return W - - -def computeNumericalGradient(J, theta, e=1e-4): - """ - Computes the gradient using "finite differences" and gives us a numerical estimate of the - gradient. - - Parameters - ---------- - J : func - The cost function which will be used to estimate its numerical gradient. - - theta : array_like - The one dimensional unrolled network parameters. The numerical gradient is computed at - those given parameters. - - e : float (optional) - The value to use for epsilon for computing the finite difference. - - Notes - ----- - The following code implements numerical gradient checking, and - returns the numerical gradient. It sets `numgrad[i]` to (a numerical - approximation of) the partial derivative of J with respect to the - i-th input argument, evaluated at theta. (i.e., `numgrad[i]` should - be (approximately) the partial derivative of J with respect - to theta[i].) - """ - numgrad = np.zeros(theta.shape) - perturb = np.diag(e * np.ones(theta.shape)) - for i in range(theta.size): - loss1, _ = J(theta - perturb[:, i]) - loss2, _ = J(theta + perturb[:, i]) - numgrad[i] = (loss2 - loss1)/(2*e) - return numgrad - - -def checkNNGradients(nnCostFunction, lambda_=0): - """ - Creates a small neural network to check the backpropagation gradients. It will output the - analytical gradients produced by your backprop code and the numerical gradients - (computed using computeNumericalGradient). These two gradient computations should result in - very similar values. - - Parameters - ---------- - nnCostFunction : func - A reference to the cost function implemented by the student. - - lambda_ : float (optional) - The regularization parameter value. 
- """ - input_layer_size = 3 - hidden_layer_size = 5 - num_labels = 3 - m = 5 - - # We generate some 'random' test data - Theta1 = debugInitializeWeights(hidden_layer_size, input_layer_size) - Theta2 = debugInitializeWeights(num_labels, hidden_layer_size) - - # Reusing debugInitializeWeights to generate X - X = debugInitializeWeights(m, input_layer_size - 1) - y = np.arange(1, 1+m) % num_labels - # print(y) - # Unroll parameters - nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()]) - - # short hand for cost function - costFunc = lambda p: nnCostFunction(p, input_layer_size, hidden_layer_size, - num_labels, X, y, lambda_) - cost, grad = costFunc(nn_params) - numgrad = computeNumericalGradient(costFunc, nn_params) - - # Visually examine the two gradient computations. The two columns you get should be very similar. - print(np.stack([numgrad, grad], axis=1)) - print('The above two columns you get should be very similar.') - print('(Left-Your Numerical Gradient, Right-Analytical Gradient)\n') - - # Evaluate the norm of the difference between the two solutions. If you have a correct - # implementation, and assuming you used e = 0.0001 in computeNumericalGradient, then diff - # should be less than 1e-9. - diff = np.linalg.norm(numgrad - grad)/np.linalg.norm(numgrad + grad) - - print('If your backpropagation implementation is correct, then \n' - 'the relative difference will be small (less than 1e-9). \n' - 'Relative Difference: %g' % diff) - - -def sigmoid(z): - """ - Computes the sigmoid of z. 
- """ - return 1.0 / (1.0 + np.exp(-z)) - - -class Grader(SubmissionBase): - X = np.reshape(3 * np.sin(np.arange(1, 31)), (3, 10), order='F') - Xm = np.reshape(np.sin(np.arange(1, 33)), (16, 2), order='F') / 5 - ym = np.arange(1, 17) % 4 - t1 = np.sin(np.reshape(np.arange(1, 25, 2), (4, 3), order='F')) - t2 = np.cos(np.reshape(np.arange(1, 41, 2), (4, 5), order='F')) - t = np.concatenate([t1.ravel(), t2.ravel()], axis=0) - - def __init__(self): - part_names = ['Feedforward and Cost Function', - 'Regularized Cost Function', - 'Sigmoid Gradient', - 'Neural Network Gradient (Backpropagation)', - 'Regularized Gradient'] - super().__init__('neural-network-learning', part_names) - - def __iter__(self): - for part_id in range(1, 6): - try: - func = self.functions[part_id] - - # Each part has different expected arguments/different function - if part_id == 1: - res = func(self.t, 2, 4, 4, self.Xm, self.ym, 0)[0] - elif part_id == 2: - res = func(self.t, 2, 4, 4, self.Xm, self.ym, 1.5) - elif part_id == 3: - res = func(self.X, ) - elif part_id == 4: - J, grad = func(self.t, 2, 4, 4, self.Xm, self.ym, 0) - grad1 = np.reshape(grad[:12], (4, 3)) - grad2 = np.reshape(grad[12:], (4, 5)) - grad = np.concatenate([grad1.ravel('F'), grad2.ravel('F')]) - res = np.hstack([J, grad]).tolist() - elif part_id == 5: - J, grad = func(self.t, 2, 4, 4, self.Xm, self.ym, 1.5) - grad1 = np.reshape(grad[:12], (4, 3)) - grad2 = np.reshape(grad[12:], (4, 5)) - grad = np.concatenate([grad1.ravel('F'), grad2.ravel('F')]) - res = np.hstack([J, grad]).tolist() - else: - raise KeyError - yield part_id, res - except KeyError: - yield part_id, 0 diff --git a/Exercise5/Figures/cross_validation.png b/Exercise5/Figures/cross_validation.png deleted file mode 100644 index e6a8f28f..00000000 Binary files a/Exercise5/Figures/cross_validation.png and /dev/null differ diff --git a/Exercise5/Figures/learning_curve.png b/Exercise5/Figures/learning_curve.png deleted file mode 100755 index c4d3e1fb..00000000 Binary 
files a/Exercise5/Figures/learning_curve.png and /dev/null differ diff --git a/Exercise5/Figures/learning_curve_random.png b/Exercise5/Figures/learning_curve_random.png deleted file mode 100644 index ee965256..00000000 Binary files a/Exercise5/Figures/learning_curve_random.png and /dev/null differ diff --git a/Exercise5/Figures/linear_fit.png b/Exercise5/Figures/linear_fit.png deleted file mode 100755 index 826912f6..00000000 Binary files a/Exercise5/Figures/linear_fit.png and /dev/null differ diff --git a/Exercise5/Figures/polynomial_learning_curve.png b/Exercise5/Figures/polynomial_learning_curve.png deleted file mode 100644 index 39e4af46..00000000 Binary files a/Exercise5/Figures/polynomial_learning_curve.png and /dev/null differ diff --git a/Exercise5/Figures/polynomial_learning_curve_reg_1.png b/Exercise5/Figures/polynomial_learning_curve_reg_1.png deleted file mode 100644 index 01b52b04..00000000 Binary files a/Exercise5/Figures/polynomial_learning_curve_reg_1.png and /dev/null differ diff --git a/Exercise5/Figures/polynomial_regression.png b/Exercise5/Figures/polynomial_regression.png deleted file mode 100644 index 530ae53e..00000000 Binary files a/Exercise5/Figures/polynomial_regression.png and /dev/null differ diff --git a/Exercise5/Figures/polynomial_regression_reg_1.png b/Exercise5/Figures/polynomial_regression_reg_1.png deleted file mode 100644 index e27bb13d..00000000 Binary files a/Exercise5/Figures/polynomial_regression_reg_1.png and /dev/null differ diff --git a/Exercise5/Figures/polynomial_regression_reg_100.png b/Exercise5/Figures/polynomial_regression_reg_100.png deleted file mode 100644 index cb060bc9..00000000 Binary files a/Exercise5/Figures/polynomial_regression_reg_100.png and /dev/null differ diff --git a/Exercise5/exercise5.ipynb b/Exercise5/exercise5.ipynb deleted file mode 100755 index c5e5c679..00000000 --- a/Exercise5/exercise5.ipynb +++ /dev/null @@ -1,927 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - 
"source": [ - "# Programming Exercise 5:\n", - "# Regularized Linear Regression and Bias vs Variance\n", - "\n", - "## Introduction\n", - "\n", - "In this exercise, you will implement regularized linear regression and use it to study models with different bias-variance properties. Before starting on the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.\n", - "\n", - "All the information you need for solving this assignment is in this notebook, and all the code you will be implementing will take place within this notebook. The assignment can be promptly submitted to the coursera grader directly from this notebook (code and instructions are included below).\n", - "\n", - "Before we begin with the exercises, we need to import all libraries required for this programming exercise. Throughout the course, we will be using [`numpy`](http://www.numpy.org/) for all arrays and matrix operations, [`matplotlib`](https://matplotlib.org/) for plotting, and [`scipy`](https://docs.scipy.org/doc/scipy/reference/) for scientific and numerical computation functions and tools. You can find instructions on how to install required libraries in the README file in the [github repository](https://github.com/dibgerge/ml-coursera-python-assignments)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [ - "# used for manipulating directory paths\n", - "import os\n", - "\n", - "# Scientific and vector computation for python\n", - "import numpy as np\n", - "\n", - "# Plotting library\n", - "from matplotlib import pyplot\n", - "\n", - "# Optimization module in scipy\n", - "from scipy import optimize\n", - "\n", - "# will be used to load MATLAB mat datafile format\n", - "from scipy.io import loadmat\n", - "\n", - "# library written for this exercise providing additional functions for assignment submission, and others\n", - "import utils\n", - "\n", - "# define the submission/grader object for this exercise\n", - "grader = utils.Grader()\n", - "\n", - "# tells matplotlib to embed plots within the notebook\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Submission and Grading\n", - "\n", - "\n", - "After completing each part of the assignment, be sure to submit your solutions to the grader. The following is a breakdown of how each part of this exercise is scored.\n", - "\n", - "\n", - "| Section | Part | Submitted Function | Points |\n", - "| :- |:- |:- | :-: |\n", - "| 1 | [Regularized Linear Regression Cost Function](#section1) | [`linearRegCostFunction`](#linearRegCostFunction) | 25 |\n", - "| 2 | [Regularized Linear Regression Gradient](#section2) | [`linearRegCostFunction`](#linearRegCostFunction) |25 |\n", - "| 3 | [Learning Curve](#section3) | [`learningCurve`](#func2) | 20 |\n", - "| 4 | [Polynomial Feature Mapping](#section4) | [`polyFeatures`](#polyFeatures) | 10 |\n", - "| 5 | [Cross Validation Curve](#section5) | [`validationCurve`](#validationCurve) | 20 |\n", - "| | Total Points | |100 |\n", - "\n", - "\n", - "You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "
\n", - "At the end of each section in this notebook, we have a cell which contains code for submitting the solutions thus far to the grader. Execute the cell to see your score up to the current section. For all your work to be submitted properly, you must execute those cells at least once.\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "## 1 Regularized Linear Regression\n", - "\n", - "In the first half of the exercise, you will implement regularized linear regression to predict the amount of water flowing out of a dam using the change of water level in a reservoir. In the next half, you will go through some diagnostics of debugging learning algorithms and examine the effects of bias v.s.\n", - "variance. \n", - "\n", - "### 1.1 Visualizing the dataset\n", - "\n", - "We will begin by visualizing the dataset containing historical records on the change in the water level, $x$, and the amount of water flowing out of the dam, $y$. This dataset is divided into three parts:\n", - "\n", - "- A **training** set that your model will learn on: `X`, `y`\n", - "- A **cross validation** set for determining the regularization parameter: `Xval`, `yval`\n", - "- A **test** set for evaluating performance. These are “unseen” examples which your model did not see during training: `Xtest`, `ytest`\n", - "\n", - "Run the next cell to plot the training data. In the following parts, you will implement linear regression and use that to fit a straight line to the data and plot learning curves. Following that, you will implement polynomial regression to find a better fit to the data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load from ex5data1.mat, where all variables will be stored in a dictionary\n", - "data = loadmat(os.path.join('Data', 'ex5data1.mat'))\n", - "\n", - "# Extract train, test, validation data from dictionary\n", - "# and also convert y's from 2-D matrix (MATLAB format) to a numpy vector\n", - "X, y = data['X'], data['y'][:, 0]\n", - "Xtest, ytest = data['Xtest'], data['ytest'][:, 0]\n", - "Xval, yval = data['Xval'], data['yval'][:, 0]\n", - "\n", - "# m = Number of examples\n", - "m = y.size\n", - "\n", - "# Plot training data\n", - "pyplot.plot(X, y, 'ro', ms=10, mec='k', mew=1)\n", - "pyplot.xlabel('Change in water level (x)')\n", - "pyplot.ylabel('Water flowing out of the dam (y)');" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.2 Regularized linear regression cost function\n", - "\n", - "Recall that regularized linear regression has the following cost function:\n", - "\n", - "$$ J(\\theta) = \\frac{1}{2m} \\left( \\sum_{i=1}^m \\left( h_\\theta\\left( x^{(i)} \\right) - y^{(i)} \\right)^2 \\right) + \\frac{\\lambda}{2m} \\left( \\sum_{j=1}^n \\theta_j^2 \\right)$$\n", - "\n", - "where $\\lambda$ is a regularization parameter which controls the degree of regularization (and thus helps prevent overfitting). The regularization term puts a penalty on the overall cost J. As the magnitudes of the model parameters $\\theta_j$ increase, the penalty increases as well. Note that you should not regularize\n", - "the $\\theta_0$ term.\n", - "\n", - "You should now complete the code in the function `linearRegCostFunction` in the next cell. Your task is to calculate the regularized linear regression cost function. 
If possible, try to vectorize your code and avoid writing loops.\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [ - "def linearRegCostFunction(X, y, theta, lambda_=0.0):\n", - " \"\"\"\n", - " Compute cost and gradient for regularized linear regression \n", - " with multiple variables. Computes the cost of using theta as\n", - " the parameter for linear regression to fit the data points in X and y. \n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The dataset. Matrix with shape (m x n + 1) where m is the \n", - " total number of examples, and n is the number of features \n", - " before adding the bias term.\n", - " \n", - " y : array_like\n", - " The functions values at each datapoint. A vector of\n", - " shape (m, ).\n", - " \n", - " theta : array_like\n", - " The parameters for linear regression. A vector of shape (n+1,).\n", - " \n", - " lambda_ : float, optional\n", - " The regularization parameter.\n", - " \n", - " Returns\n", - " -------\n", - " J : float\n", - " The computed cost function. \n", - " \n", - " grad : array_like\n", - " The value of the cost function gradient w.r.t theta. 
\n", - " A vector of shape (n+1, ).\n", - " \n", - " Instructions\n", - " ------------\n", - " Compute the cost and gradient of regularized linear regression for\n", - " a particular choice of theta.\n", - " You should set J to the cost and grad to the gradient.\n", - " \"\"\"\n", - " # Initialize some useful values\n", - " m = y.size # number of training examples\n", - "\n", - " # You need to return the following variables correctly \n", - " J = 0\n", - " grad = np.zeros(theta.shape)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # ============================================================\n", - " return J, grad" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you are finished, the next cell will run your cost function using `theta` initialized at `[1, 1]`. You should expect to see an output of 303.993." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "theta = np.array([1, 1])\n", - "J, _ = linearRegCostFunction(np.concatenate([np.ones((m, 1)), X], axis=1), y, theta, 1)\n", - "\n", - "print('Cost at theta = [1, 1]:\\t %f ' % J)\n", - "print('(This value should be about 303.993192)\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After completing a part of the exercise, you can submit your solutions for grading by first adding the function you modified to the submission object, and then sending your function to Coursera for grading. \n", - "\n", - "The submission script will prompt you for your login e-mail and submission token. You can obtain a submission token from the web page for the assignment. 
You are allowed to submit your solutions multiple times, and we will take only the highest score into consideration.\n", - "\n", - "*Execute the following cell to grade your solution to the first part of this exercise.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[1] = linearRegCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 1.3 Regularized linear regression gradient\n", - "\n", - "Correspondingly, the partial derivative of the cost function for regularized linear regression is defined as:\n", - "\n", - "$$\n", - "\\begin{align}\n", - "& \\frac{\\partial J(\\theta)}{\\partial \\theta_0} = \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta \\left(x^{(i)} \\right) - y^{(i)} \\right) x_j^{(i)} & \\qquad \\text{for } j = 0 \\\\\n", - "& \\frac{\\partial J(\\theta)}{\\partial \\theta_j} = \\left( \\frac{1}{m} \\sum_{i=1}^m \\left( h_\\theta \\left( x^{(i)} \\right) - y^{(i)} \\right) x_j^{(i)} \\right) + \\frac{\\lambda}{m} \\theta_j & \\qquad \\text{for } j \\ge 1\n", - "\\end{align}\n", - "$$\n", - "\n", - "In the function [`linearRegCostFunction`](#linearRegCostFunction) above, add code to calculate the gradient, returning it in the variable `grad`. Do not forget to re-execute the cell containing this function to update the function's definition.\n", - "\n", - "\n", - "When you are finished, use the next cell to run your gradient function using theta initialized at `[1, 1]`. You should expect to see a gradient of `[-15.30, 598.250]`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "theta = np.array([1, 1])\n", - "J, grad = linearRegCostFunction(np.concatenate([np.ones((m, 1)), X], axis=1), y, theta, 1)\n", - "\n", - "print('Gradient at theta = [1, 1]: [{:.6f}, {:.6f}] '.format(*grad))\n", - "print(' (this value should be about [-15.303016, 598.250744])\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[2] = linearRegCostFunction\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Fitting linear regression\n", - "\n", - "Once your cost function and gradient are working correctly, the next cell will run the code in `trainLinearReg` (found in the module `utils.py`) to compute the optimal values of $\\theta$. This training function uses `scipy`'s optimization module to minimize the cost function.\n", - "\n", - "In this part, we set regularization parameter $\\lambda$ to zero. Because our current implementation of linear regression is trying to fit a 2-dimensional $\\theta$, regularization will not be incredibly helpful for a $\\theta$ of such low dimension. In the later parts of the exercise, you will be using polynomial regression with regularization.\n", - "\n", - "Finally, the code in the next cell should also plot the best fit line, which should look like the figure below. \n", - "\n", - "![](Figures/linear_fit.png)\n", - "\n", - "The best fit line tells us that the model is not a good fit to the data because the data has a non-linear pattern. While visualizing the best fit as shown is one possible way to debug your learning algorithm, it is not always easy to visualize the data and model. 
In the next section, you will implement a function to generate learning curves that can help you debug your learning algorithm even if it is not easy to visualize the\n", - "data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# add a column of ones for the y-intercept\n", - "X_aug = np.concatenate([np.ones((m, 1)), X], axis=1)\n", - "theta = utils.trainLinearReg(linearRegCostFunction, X_aug, y, lambda_=0)\n", - "\n", - "# Plot fit over the data\n", - "pyplot.plot(X, y, 'ro', ms=10, mec='k', mew=1.5)\n", - "pyplot.xlabel('Change in water level (x)')\n", - "pyplot.ylabel('Water flowing out of the dam (y)')\n", - "pyplot.plot(X, np.dot(X_aug, theta), '--', lw=2);" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "## 2 Bias-variance\n", - "\n", - "An important concept in machine learning is the bias-variance tradeoff. Models with high bias are not complex enough for the data and tend to underfit, while models with high variance overfit to the training data.\n", - "\n", - "In this part of the exercise, you will plot training and cross validation errors on a learning curve to diagnose bias-variance problems.\n", - "\n", - "### 2.1 Learning Curves\n", - "\n", - "You will now implement code to generate the learning curves that will be useful in debugging learning algorithms. Recall that a learning curve plots training and cross validation error as a function of training set size. Your job is to fill in the function `learningCurve` in the next cell, so that it returns a vector of errors for the training set and cross validation set.\n", - "\n", - "To plot the learning curve, we need a training and cross validation set error for different training set sizes. To obtain different training set sizes, you should use different subsets of the original training set `X`. 
Specifically, for a training set size of $i$, you should use the first $i$ examples (i.e., `X[:i, :]`\n", - "and `y[:i]`).\n", - "\n", - "You can use the `trainLinearReg` function (by calling `utils.trainLinearReg(...)`) to find the $\\theta$ parameters. Note that `lambda_` is passed as a parameter to the `learningCurve` function.\n", - "After learning the $\\theta$ parameters, you should compute the error on the training and cross validation sets. Recall that the training error for a dataset is defined as\n", - "\n", - "$$ J_{\\text{train}} = \\frac{1}{2m} \\left[ \\sum_{i=1}^m \\left(h_\\theta \\left( x^{(i)} \\right) - y^{(i)} \\right)^2 \\right] $$\n", - "\n", - "In particular, note that the training error does not include the regularization term. One way to compute the training error is to use your existing cost function and set $\\lambda$ to 0 only when using it to compute the training error and cross validation error. When you are computing the training set error, make sure you compute it on the training subset (i.e., `X[:i, :]` and `y[:i]`) instead of the entire training set. However, for the cross validation error, you should compute it over the entire cross validation set. You should store\n", - "the computed errors in the vectors `error_train` and `error_val`.\n", - "\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [ - "def learningCurve(X, y, Xval, yval, lambda_=0):\n", - " \"\"\"\n", - " Generates the train and cross validation set errors needed to plot a learning curve\n", - " returns the train and cross validation set errors for a learning curve. \n", - " \n", - " In this function, you will compute the train and cross validation errors for\n", - " dataset sizes from 1 up to m. 
In practice, when working with larger\n", - " datasets, you might want to do this in larger intervals.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The training dataset. Matrix with shape (m x n + 1) where m is the \n", - " total number of examples, and n is the number of features \n", - " before adding the bias term.\n", - " \n", - " y : array_like\n", - " The function values at each training datapoint. A vector of\n", - " shape (m, ).\n", - " \n", - " Xval : array_like\n", - " The validation dataset. Matrix with shape (m_val x n + 1) where m_val is the \n", - " total number of validation examples, and n is the number of features \n", - " before adding the bias term.\n", - " \n", - " yval : array_like\n", - " The function values at each validation datapoint. A vector of\n", - " shape (m_val, ).\n", - " \n", - " lambda_ : float, optional\n", - " The regularization parameter.\n", - " \n", - " Returns\n", - " -------\n", - " error_train : array_like\n", - " A vector of shape m. error_train[i-1] contains the training error for\n", - " i examples.\n", - " error_val : array_like\n", - " A vector of shape m. error_val[i-1] contains the validation error for\n", - " i training examples.\n", - " \n", - " Instructions\n", - " ------------\n", - " Fill in this function to return training errors in error_train and the\n", - " cross validation errors in error_val. 
i.e., error_train[i-1] and \n", - " error_val[i-1] should give you the errors obtained after training on i examples.\n", - " \n", - " Notes\n", - " -----\n", - " - You should evaluate the training error on the first i training\n", - " examples (i.e., X[:i, :] and y[:i]).\n", - " \n", - " For the cross-validation error, you should instead evaluate on\n", - " the _entire_ cross validation set (Xval and yval).\n", - " \n", - " - If you are using your cost function (linearRegCostFunction) to compute\n", - " the training and cross validation error, you should call the function with\n", - " the lambda argument set to 0. Do note that you will still need to use\n", - " lambda when running the training to obtain the theta parameters.\n", - " \n", - " Hint\n", - " ----\n", - " You can loop over the examples with the following:\n", - " \n", - " for i in range(1, m+1):\n", - " # Compute train/cross validation errors using training examples \n", - " # X[:i, :] and y[:i], storing the result in \n", - " # error_train[i-1] and error_val[i-1]\n", - " .... \n", - " \"\"\"\n", - " # Number of training examples\n", - " m = y.size\n", - "\n", - " # You need to return these values correctly\n", - " error_train = np.zeros(m)\n", - " error_val = np.zeros(m)\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - " \n", - "\n", - " \n", - " # =============================================================\n", - " return error_train, error_val" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you are finished implementing the function `learningCurve`, executing the next cell prints the learning curve errors and produces a plot similar to the figure below. \n", - "\n", - "![](Figures/learning_curve.png)\n", - "\n", - "In the learning curve figure, you can observe that both the train error and cross validation error remain high as the number of training examples increases. 
This reflects a high bias problem in the model - the linear regression model is too simple and is unable to fit our dataset well. In the next section, you will implement polynomial regression to fit a better model for this dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "X_aug = np.concatenate([np.ones((m, 1)), X], axis=1)\n", - "Xval_aug = np.concatenate([np.ones((yval.size, 1)), Xval], axis=1)\n", - "error_train, error_val = learningCurve(X_aug, y, Xval_aug, yval, lambda_=0)\n", - "\n", - "pyplot.plot(np.arange(1, m+1), error_train, np.arange(1, m+1), error_val, lw=2)\n", - "pyplot.title('Learning curve for linear regression')\n", - "pyplot.legend(['Train', 'Cross Validation'])\n", - "pyplot.xlabel('Number of training examples')\n", - "pyplot.ylabel('Error')\n", - "pyplot.axis([0, 13, 0, 150])\n", - "\n", - "print('# Training Examples\\tTrain Error\\tCross Validation Error')\n", - "for i in range(m):\n", - " print(' \\t%d\\t\\t%f\\t%f' % (i+1, error_train[i], error_val[i]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[3] = learningCurve\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## 3 Polynomial regression\n", - "\n", - "The problem with our linear model was that it was too simple for the data\n", - "and resulted in underfitting (high bias). In this part of the exercise, you will address this problem by adding more features. 
For polynomial regression, our hypothesis has the form:\n", - "\n", - "$$\n", - "\\begin{align}\n", - "h_\\theta(x) &= \\theta_0 + \\theta_1 \\times (\\text{waterLevel}) + \\theta_2 \\times (\\text{waterLevel})^2 + \\cdots + \\theta_p \\times (\\text{waterLevel})^p \\\\\n", - "& = \\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots + \\theta_p x_p\n", - "\\end{align}\n", - "$$\n", - "\n", - "Notice that by defining $x_1 = (\\text{waterLevel})$, $x_2 = (\\text{waterLevel})^2$ , $\\cdots$, $x_p =\n", - "(\\text{waterLevel})^p$, we obtain a linear regression model where the features are the various powers of the original value (waterLevel).\n", - "\n", - "Now, you will add more features using the higher powers of the existing feature $x$ in the dataset. Your task in this part is to complete the code in the function `polyFeatures` in the next cell. The function should map the original training set $X$ of size $m \\times 1$ into its higher powers. Specifically, when a training set $X$ of size $m \\times 1$ is passed into the function, the function should return a $m \\times p$ matrix `X_poly`, where column 1 holds the original values of X, column 2 holds the values of $X^2$, column 3 holds the values of $X^3$, and so on. Note that you don’t have to account for the zero-eth power in this function.\n", - "\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [ - "def polyFeatures(X, p):\n", - " \"\"\"\n", - " Maps X (1D vector) into the p-th power.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " A data vector of size m, where m is the number of examples.\n", - " \n", - " p : int\n", - " The polynomial power to map the features. \n", - " \n", - " Returns \n", - " -------\n", - " X_poly : array_like\n", - " A matrix of shape (m x p) where p is the polynomial \n", - " power and m is the number of examples. 
That is:\n", - " \n", - " X_poly[i, :] = [X[i], X[i]**2, X[i]**3 ... X[i]**p]\n", - " \n", - " Instructions\n", - " ------------\n", - " Given a vector X, return a matrix X_poly where the p-th column of\n", - " X_poly contains the values of X to the p-th power.\n", - " \"\"\"\n", - " # You need to return the following variables correctly.\n", - " X_poly = np.zeros((X.shape[0], p))\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # ============================================================\n", - " return X_poly" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now you have a function that will map features to a higher dimension. The next cell will apply it to the training set, the test set, and the cross validation set." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "p = 8\n", - "\n", - "# Map X onto Polynomial Features and Normalize\n", - "X_poly = polyFeatures(X, p)\n", - "X_poly, mu, sigma = utils.featureNormalize(X_poly)\n", - "X_poly = np.concatenate([np.ones((m, 1)), X_poly], axis=1)\n", - "\n", - "# Map X_poly_test and normalize (using mu and sigma)\n", - "X_poly_test = polyFeatures(Xtest, p)\n", - "X_poly_test -= mu\n", - "X_poly_test /= sigma\n", - "X_poly_test = np.concatenate([np.ones((ytest.size, 1)), X_poly_test], axis=1)\n", - "\n", - "# Map X_poly_val and normalize (using mu and sigma)\n", - "X_poly_val = polyFeatures(Xval, p)\n", - "X_poly_val -= mu\n", - "X_poly_val /= sigma\n", - "X_poly_val = np.concatenate([np.ones((yval.size, 1)), X_poly_val], axis=1)\n", - "\n", - "print('Normalized Training Example 1:')\n", - "X_poly[0, :]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[4] = polyFeatures\n", - 
"grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3.1 Learning Polynomial Regression\n", - "\n", - "After you have completed the function `polyFeatures`, we will proceed to train polynomial regression using your linear regression cost function.\n", - "\n", - "Keep in mind that even though we have polynomial terms in our feature vector, we are still solving a linear regression optimization problem. The polynomial terms have simply turned into features that we can use for linear regression. We are using the same cost function and gradient that you wrote for the earlier part of this exercise.\n", - "\n", - "For this part of the exercise, you will be using a polynomial of degree 8. It turns out that if we run the training directly on the projected data, will not work well as the features would be badly scaled (e.g., an example with $x = 40$ will now have a feature $x_8 = 40^8 = 6.5 \\times 10^{12}$). Therefore, you will\n", - "need to use feature normalization.\n", - "\n", - "Before learning the parameters $\\theta$ for the polynomial regression, we first call `featureNormalize` and normalize the features of the training set, storing the mu, sigma parameters separately. We have already implemented this function for you (in `utils.py` module) and it is the same function from the first exercise.\n", - "\n", - "After learning the parameters $\\theta$, you should see two plots generated for polynomial regression with $\\lambda = 0$, which should be similar to the ones here:\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - "\n", - "You should see that the polynomial fit is able to follow the datapoints very well, thus, obtaining a low training error. The figure on the right shows that the training error essentially stays zero for all numbers of training samples. However, the polynomial fit is very complex and even drops off at the extremes. This is an indicator that the polynomial regression model is overfitting the training data and will not generalize well.\n", - "\n", - "To better understand the problems with the unregularized ($\\lambda = 0$) model, you can see that the learning curve shows the same effect where the training error is low, but the cross validation error is high. There is a gap between the training and cross validation errors, indicating a high variance problem." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "lambda_ = 0\n", - "theta = utils.trainLinearReg(linearRegCostFunction, X_poly, y,\n", - " lambda_=lambda_, maxiter=55)\n", - "\n", - "# Plot training data and fit\n", - "pyplot.plot(X, y, 'ro', ms=10, mew=1.5, mec='k')\n", - "\n", - "utils.plotFit(polyFeatures, np.min(X), np.max(X), mu, sigma, theta, p)\n", - "\n", - "pyplot.xlabel('Change in water level (x)')\n", - "pyplot.ylabel('Water flowing out of the dam (y)')\n", - "pyplot.title('Polynomial Regression Fit (lambda = %f)' % lambda_)\n", - "pyplot.ylim([-20, 50])\n", - "\n", - "pyplot.figure()\n", - "error_train, error_val = learningCurve(X_poly, y, X_poly_val, yval, lambda_)\n", - "pyplot.plot(np.arange(1, 1+m), error_train, np.arange(1, 1+m), error_val)\n", - "\n", - "pyplot.title('Polynomial Regression Learning Curve (lambda = %f)' % lambda_)\n", - "pyplot.xlabel('Number of training examples')\n", - "pyplot.ylabel('Error')\n", - "pyplot.axis([0, 13, 0, 100])\n", - "pyplot.legend(['Train', 'Cross Validation'])\n", - "\n", - "print('Polynomial Regression (lambda = %f)\\n' % lambda_)\n", - "print('# Training Examples\\tTrain 
Error\\tCross Validation Error')\n", - "for i in range(m):\n", - " print(' \\t%d\\t\\t%f\\t%f' % (i+1, error_train[i], error_val[i]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "One way to combat the overfitting (high-variance) problem is to add regularization to the model. In the next section, you will get to try different $\\lambda$ parameters to see how regularization can lead to a better model.\n", - "\n", - "### 3.2 Optional (ungraded) exercise: Adjusting the regularization parameter\n", - "\n", - "In this section, you will get to observe how the regularization parameter affects the bias-variance tradeoff of regularized polynomial regression. You should now modify the lambda parameter and try $\\lambda = 1, 100$. For each of these values, the script should generate a polynomial fit to the data and also a learning curve.\n", - "\n", - "For $\\lambda = 1$, the generated plots should look like the figure below. You should see a polynomial fit that follows the data trend well (left) and a learning curve (right) showing that both the cross validation and training error converge to a relatively low value. This shows the $\\lambda = 1$ regularized polynomial regression model does not have the high-bias or high-variance problems. In effect, it achieves a good trade-off between bias and variance.\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - "\n", - "For $\\lambda = 100$, you should see a polynomial fit (figure below) that does not follow the data well. In this case, there is too much regularization and the model is unable to fit the training data.\n", - "\n", - "![](Figures/polynomial_regression_reg_100.png)\n", - "\n", - "*You do not need to submit any solutions for this optional (ungraded) exercise.*" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### 3.3 Selecting $\\lambda$ using a cross validation set\n", - "\n", - "From the previous parts of the exercise, you observed that the value of $\\lambda$ can significantly affect the results of regularized polynomial regression on the training and cross validation set. In particular, a model without regularization ($\\lambda = 0$) fits the training set well, but does not generalize. Conversely, a model with too much regularization ($\\lambda = 100$) fits neither the training set nor the test set well. A good choice of $\\lambda$ (e.g., $\\lambda = 1$) can provide a good fit to the data.\n", - "\n", - "In this section, you will implement an automated method to select the $\\lambda$ parameter. Concretely, you will use a cross validation set to evaluate how good each $\\lambda$ value is. After selecting the best $\\lambda$ value using the cross validation set, we can then evaluate the model on the test set to estimate\n", - "how well the model will perform on actual unseen data. \n", - "\n", - "Your task is to complete the code in the function `validationCurve`. Specifically, you should use the `utils.trainLinearReg` function to train the model using different values of $\\lambda$ and compute the training error and cross validation error. 
You should try $\\lambda$ in the following range: {0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10}.\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [ - "def validationCurve(X, y, Xval, yval):\n", - " \"\"\"\n", - " Generate the train and validation errors needed to plot a validation\n", - " curve that we can use to select lambda_.\n", - " \n", - " Parameters\n", - " ----------\n", - " X : array_like\n", - " The training dataset. Matrix with shape (m x n) where m is the \n", - " total number of training examples, and n is the number of features \n", - " including any polynomial features.\n", - " \n", - " y : array_like\n", - " The functions values at each training datapoint. A vector of\n", - " shape (m, ).\n", - " \n", - " Xval : array_like\n", - " The validation dataset. Matrix with shape (m_val x n) where m is the \n", - " total number of validation examples, and n is the number of features \n", - " including any polynomial features.\n", - " \n", - " yval : array_like\n", - " The functions values at each validation datapoint. A vector of\n", - " shape (m_val, ).\n", - " \n", - " Returns\n", - " -------\n", - " lambda_vec : list\n", - " The values of the regularization parameters which were used in \n", - " cross validation.\n", - " \n", - " error_train : list\n", - " The training error computed at each value for the regularization\n", - " parameter.\n", - " \n", - " error_val : list\n", - " The validation error computed at each value for the regularization\n", - " parameter.\n", - " \n", - " Instructions\n", - " ------------\n", - " Fill in this function to return training errors in `error_train` and\n", - " the validation errors in `error_val`. 
The vector `lambda_vec` contains\n", - " the different lambda parameters to use for each calculation of the\n", - " errors, i.e., `error_train[i]`, and `error_val[i]` should give you the\n", - " errors obtained after training with `lambda_ = lambda_vec[i]`.\n", - "\n", - " Note\n", - " ----\n", - " You can loop over lambda_vec with the following:\n", - " \n", - " for i in range(len(lambda_vec)):\n", - " lambda_ = lambda_vec[i]\n", - " # Compute train / val errors when training linear \n", - " # regression with regularization parameter lambda_\n", - " # You should store the result in error_train[i]\n", - " # and error_val[i]\n", - " ....\n", - " \"\"\"\n", - " # Selected values of lambda (you should not change this)\n", - " lambda_vec = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]\n", - "\n", - " # You need to return these variables correctly.\n", - " error_train = np.zeros(len(lambda_vec))\n", - " error_val = np.zeros(len(lambda_vec))\n", - "\n", - " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", - "\n", - " # ============================================================\n", - " return lambda_vec, error_train, error_val" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "After you have completed the code, the next cell will run your function and plot a cross validation curve of error vs. $\\lambda$ that allows you to select which $\\lambda$ parameter to use. You should see a plot similar to the figure below. \n", - "\n", - "![](Figures/cross_validation.png)\n", - "\n", - "In this figure, we can see that the best value of $\\lambda$ is around 3. Due to randomness\n", - "in the training and validation splits of the dataset, the cross validation error can sometimes be lower than the training error." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "lambda_vec, error_train, error_val = validationCurve(X_poly, y, X_poly_val, yval)\n", - "\n", - "pyplot.plot(lambda_vec, error_train, '-o', lambda_vec, error_val, '-o', lw=2)\n", - "pyplot.legend(['Train', 'Cross Validation'])\n", - "pyplot.xlabel('lambda')\n", - "pyplot.ylabel('Error')\n", - "\n", - "print('lambda\\t\\tTrain Error\\tValidation Error')\n", - "for i in range(len(lambda_vec)):\n", - " print(' %f\\t%f\\t%f' % (lambda_vec[i], error_train[i], error_val[i]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "*You should now submit your solutions.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "grader[5] = validationCurve\n", - "grader.grade()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.4 Optional (ungraded) exercise: Computing test set error\n", - "\n", - "In the previous part of the exercise, you implemented code to compute the cross validation error for various values of the regularization parameter $\\lambda$. However, to get a better indication of the model’s performance in the real world, it is important to evaluate the “final” model on a test set that was not used in any part of training (that is, it was neither used to select the $\\lambda$ parameters, nor to learn the model parameters $\\theta$). For this optional (ungraded) exercise, you should compute the test error using the best value of $\\lambda$ you found. 
In our cross validation, we obtained a test error of 3.8599 for $\\lambda = 3$.\n", - "\n", - "*You do not need to submit any solutions for this optional (ungraded) exercise.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.5 Optional (ungraded) exercise: Plotting learning curves with randomly selected examples\n", - "\n", - "In practice, especially for small training sets, when you plot learning curves to debug your algorithms, it is often helpful to average across multiple sets of randomly selected examples to determine the training error and cross validation error.\n", - "\n", - "Concretely, to determine the training error and cross validation error for $i$ examples, you should first randomly select $i$ examples from the training set and $i$ examples from the cross validation set. You will then learn the parameters $\\theta$ using the randomly chosen training set and evaluate the parameters $\\theta$ on the randomly chosen training set and cross validation set. The above steps should then be repeated multiple times (say 50) and the averaged error should be used to determine the training error and cross validation error for $i$ examples.\n", - "\n", - "For this optional (ungraded) exercise, you should implement the above strategy for computing the learning curves. For reference, the figure below shows the learning curve we obtained for polynomial regression with $\\lambda = 0.01$. 
Your figure may differ slightly due to the random selection of examples.\n", - "\n", - "![](Figures/learning_curve_random.png)\n", - "\n", - "*You do not need to submit any solutions for this optional (ungraded) exercise.*" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Exercise5/utils.py b/Exercise5/utils.py deleted file mode 100755 index b2340ad7..00000000 --- a/Exercise5/utils.py +++ /dev/null @@ -1,164 +0,0 @@ -import sys -import numpy as np -from scipy import optimize -from matplotlib import pyplot - -sys.path.append('..') -from submission import SubmissionBase - - -def trainLinearReg(linearRegCostFunction, X, y, lambda_=0.0, maxiter=200): - """ - Trains linear regression using scipy's optimize.minimize. - - Parameters - ---------- - X : array_like - The dataset with shape (m x n+1). The bias term is assumed to be concatenated. - - y : array_like - Function values at each datapoint. A vector of shape (m,). - - lambda_ : float, optional - The regularization parameter. - - maxiter : int, optional - Maximum number of iterations for the optimization algorithm. - - Returns - ------- - theta : array_like - The parameters for linear regression. This is a vector of shape (n+1,). 
- """ - # Initialize Theta - initial_theta = np.zeros(X.shape[1]) - - # Create "short hand" for the cost function to be minimized - costFunction = lambda t: linearRegCostFunction(X, y, t, lambda_) - - # Now, costFunction is a function that takes in only one argument - options = {'maxiter': maxiter} - - # Minimize using scipy - res = optimize.minimize(costFunction, initial_theta, jac=True, method='TNC', options=options) - return res.x - - -def featureNormalize(X): - """ - Normalizes the features in X and returns a normalized version of X where the mean value of each - feature is 0 and the standard deviation is 1. This is often a good preprocessing step to do when - working with learning algorithms. - - Parameters - ---------- - X : array_like - A dataset which is an (m x n) matrix, where m is the number of examples, - and n is the number of dimensions for each example. - - Returns - ------- - X_norm : array_like - The normalized input dataset. - - mu : array_like - A vector of size n corresponding to the mean for each dimension across all examples. - - sigma : array_like - A vector of size n corresponding to the standard deviations for each dimension across - all examples. - """ - mu = np.mean(X, axis=0) - X_norm = X - mu - - sigma = np.std(X_norm, axis=0, ddof=1) - X_norm /= sigma - return X_norm, mu, sigma - - -def plotFit(polyFeatures, min_x, max_x, mu, sigma, theta, p): - """ - Plots a learned polynomial regression fit over an existing figure. - Also works with linear regression. - Plots the learned polynomial fit with power p and feature normalization (mu, sigma). - - Parameters - ---------- - polyFeatures : func - A function which generates polynomial features from a single feature. - - min_x : float - The minimum value for the feature. - - max_x : float - The maximum value for the feature. - - mu : float - The mean feature value over the training dataset. - - sigma : float - The feature standard deviation of the training dataset. 
- - theta : array_like - The parameters for the trained polynomial linear regression. - - p : int - The polynomial order. - """ - # We plot a range slightly bigger than the min and max values to get - # an idea of how the fit will vary outside the range of the data points - x = np.arange(min_x - 15, max_x + 25, 0.05).reshape(-1, 1) - - # Map the X values - X_poly = polyFeatures(x, p) - X_poly -= mu - X_poly /= sigma - - # Add ones - X_poly = np.concatenate([np.ones((x.shape[0], 1)), X_poly], axis=1) - - # Plot - pyplot.plot(x, np.dot(X_poly, theta), '--', lw=2) - - -class Grader(SubmissionBase): - # Random test cases - X = np.vstack([np.ones(10), - np.sin(np.arange(1, 15, 1.5)), - np.cos(np.arange(1, 15, 1.5))]).T - y = np.sin(np.arange(1, 31, 3)) - Xval = np.vstack([np.ones(10), - np.sin(np.arange(0, 14, 1.5)), - np.cos(np.arange(0, 14, 1.5))]).T - yval = np.sin(np.arange(1, 11)) - - def __init__(self): - part_names = ['Regularized Linear Regression Cost Function', - 'Regularized Linear Regression Gradient', - 'Learning Curve', - 'Polynomial Feature Mapping', - 'Validation Curve'] - super().__init__('regularized-linear-regression-and-bias-variance', part_names) - - def __iter__(self): - for part_id in range(1, 6): - try: - func = self.functions[part_id] - # Each part has different expected arguments/different function - if part_id == 1: - res = func(self.X, self.y, np.array([0.1, 0.2, 0.3]), 0.5) - elif part_id == 2: - theta = np.array([0.1, 0.2, 0.3]) - res = func(self.X, self.y, theta, 0.5)[1] - elif part_id == 3: - res = np.hstack(func(self.X, self.y, self.Xval, self.yval, 1)).tolist() - elif part_id == 4: - res = func(self.X[1, :].reshape(-1, 1), 8) - elif part_id == 5: - res = np.hstack(func(self.X, self.y, self.Xval, self.yval)).tolist() - else: - raise KeyError - except KeyError: - res = 0 - yield part_id, res - diff --git a/Exercise6/ex6.pdf b/Exercise6/ex6.pdf new file mode 100644 index 00000000..2da2af7c Binary files /dev/null and 
b/Exercise6/ex6.pdf differ diff --git a/Exercise6/exercise6.ipynb b/Exercise6/exercise6.ipynb index cdc43975..1868d109 100755 --- a/Exercise6/exercise6.ipynb +++ b/Exercise6/exercise6.ipynb @@ -18,7 +18,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -101,9 +101,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load from ex6data1\n", "# You will have X, y as keys in the dict data\n", @@ -147,9 +160,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 3, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# You should try to change the C value below and see how the decision\n", "# boundary varies (e.g., try C = 1000)\n", @@ -180,7 +206,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -213,8 +239,8 @@ " sim = 0\n", " # ====================== YOUR CODE HERE ======================\n", "\n", - "\n", - "\n", + " sim = np.exp(-np.sum((x1 - x2)**2) / (2*sigma**2))\n", + " \n", " # =============================================================\n", " return sim" ] @@ -228,9 +254,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Gaussian Kernel between x1 = [1, 2, 1], x2 = [0, 4, -1], sigma = 2.00:\n", + "\t0.324652\n", + "(for sigma = 2, this value should be about 0.324652)\n", + "\n" + ] + } + ], "source": [ "x1 = np.array([1, 2, 1])\n", "x2 = np.array([0, 4, -1])\n", @@ -251,9 +288,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise support-vector-machines\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? 
(Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Gaussian Kernel | 25 / 25 | Nice work!\n", + " Parameters (C, sigma) for Dataset 3 | 0 / 25 | \n", + " Email Processing | 0 / 25 | \n", + " Email Feature Extraction | 0 / 25 | \n", + " --------------------------------\n", + " | 25 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[1] = gaussianKernel\n", "grader.grade()" @@ -272,9 +329,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load from ex6data2\n", "# You will have X, y as keys in the dict data\n", @@ -298,9 +368,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# SVM Parameters\n", "C = 1\n", @@ -326,9 +409,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 9, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load from ex6data3\n", "# You will have X, y, Xval, yval as keys in the dict data\n", @@ -356,7 +452,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 10, "metadata": {}, "outputs": [], "source": [ @@ -406,12 +502,20 @@ " np.mean(predictions != yval)\n", " \"\"\"\n", " # You need to return the following variables correctly.\n", - " C = 1\n", - " sigma = 0.3\n", "\n", " # ====================== YOUR CODE HERE ======================\n", - "\n", - " \n", + " c_sigma = [0.1, 0.3, 1, 3, 10, 30, 100, 300]\n", + " error = float('inf')\n", + " for i in c_sigma:\n", + " for j in c_sigma:\n", + " model = utils.svmTrain(X, y, i, gaussianKernel, args=(j,))\n", + " pred = utils.svmPredict(model, Xval)\n", + " error_temp = np.mean(pred != yval)\n", + " if(error_temp < error):\n", + " error = error_temp\n", + " C = i\n", + " sigma = j\n", + " print(i, j, C, sigma)\n", " \n", " # ============================================================\n", " return C, sigma" @@ -426,9 +530,31 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 11, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.1 0.1 0.1 0.1\n", + "0.3 0.1 0.3 0.1\n", + "0.3 0.1\n" + ] + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Try different SVM Parameters here\n", "C, sigma = dataset3Params(X, y, Xval, yval)\n", @@ -451,9 +577,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise support-vector-machines\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Gaussian Kernel | 25 / 25 | Nice work!\n", + " Parameters (C, sigma) for Dataset 3 | 25 / 25 | Nice work!\n", + " Email Processing | 0 / 25 | \n", + " Email Feature Extraction | 0 / 25 | \n", + " --------------------------------\n", + " | 50 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[2] = lambda : (C, sigma)\n", "grader.grade()" @@ -555,7 +701,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 27, "metadata": {}, "outputs": [], "source": [ @@ -643,6 +789,11 @@ " # Stem the email contents word by word\n", " stemmer = utils.PorterStemmer()\n", " processed_email = []\n", + " \n", + " vocab_to_index = {}\n", + " for i in range(len(vocabList)):\n", + " vocab_to_index[vocabList[i]] = i\n", + " \n", " for word in email_contents:\n", " # Remove any remaining non alphanumeric characters in word\n", " word = re.compile('[^a-zA-Z0-9]').sub('', word).strip()\n", @@ -655,7 +806,8 @@ " # Look up the word in the dictionary and add to word_indices if found\n", " # ====================== YOUR CODE HERE ======================\n", "\n", - " \n", + " if word in vocabList:\n", + " word_indices.append(vocab_to_index[word])\n", "\n", " # =============================================================\n", "\n", @@ -676,9 +828,24 @@ }, { "cell_type": "code", - "execution_count": null, + 
"execution_count": 28, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "----------------\n", + "Processed email:\n", + "----------------\n", + "anyon know how much it cost to host a web portal well it depend on how mani visitor your expect thi can be anywher from less than number buck a month to a coupl of dollar number you should checkout httpaddr or perhap amazon ec number if your run someth big to unsubscrib yourself from thi mail list send an email to emailaddr\n", + "-------------\n", + "Word Indices:\n", + "-------------\n", + "[85, 915, 793, 1076, 882, 369, 1698, 789, 1821, 1830, 882, 430, 1170, 793, 1001, 1894, 591, 1675, 237, 161, 88, 687, 944, 1662, 1119, 1061, 1698, 374, 1161, 476, 1119, 1892, 1509, 798, 1181, 1236, 511, 1119, 809, 1894, 1439, 1546, 180, 1698, 1757, 1895, 687, 1675, 991, 960, 1476, 70, 529, 1698, 530]\n" + ] + } + ], "source": [ "# To use an SVM to classify emails into Spam v.s. Non-Spam, you first need\n", "# to convert each email into a vector of features. In this part, you will\n", @@ -708,9 +875,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 29, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise support-vector-machines\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? 
(Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Gaussian Kernel | 25 / 25 | Nice work!\n", + " Parameters (C, sigma) for Dataset 3 | 25 / 25 | Nice work!\n", + " Email Processing | 25 / 25 | Nice work!\n", + " Email Feature Extraction | 0 / 25 | \n", + " --------------------------------\n", + " | 75 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[3] = processEmail\n", "grader.grade()" @@ -738,7 +925,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 36, "metadata": {}, "outputs": [], "source": [ @@ -799,7 +986,8 @@ "\n", " # ===================== YOUR CODE HERE ======================\n", "\n", - " \n", + " for word in word_indices:\n", + " x[word] = 1\n", " \n", " # ===========================================================\n", " \n", @@ -815,9 +1003,23 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 37, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "----------------\n", + "Processed email:\n", + "----------------\n", + "anyon know how much it cost to host a web portal well it depend on how mani visitor your expect thi can be anywher from less than number buck a month to a coupl of dollar number you should checkout httpaddr or perhap amazon ec number if your run someth big to unsubscrib yourself from thi mail list send an email to emailaddr\n", + "\n", + "Length of feature vector: 1899\n", + "Number of non-zero entries: 45\n" + ] + } + ], "source": [ "# Extract Features\n", "with open(os.path.join('Data', 'emailSample1.txt')) as fid:\n", @@ -840,9 +1042,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 38, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise support-vector-machines\n", + "\n", + "Use token from last successful submission 
(ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Gaussian Kernel | 25 / 25 | Nice work!\n", + " Parameters (C, sigma) for Dataset 3 | 25 / 25 | Nice work!\n", + " Email Processing | 25 / 25 | Nice work!\n", + " Email Feature Extraction | 25 / 25 | Nice work!\n", + " --------------------------------\n", + " | 100 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[4] = emailFeatures\n", "grader.grade()" @@ -862,9 +1084,19 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 39, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training Linear SVM (Spam Classification)\n", + "This may take 1 to 2 minutes ...\n", + "\n" + ] + } + ], "source": [ "# Load the Spam Email dataset\n", "# You will have X, y in your environment\n", @@ -1021,7 +1253,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.4" + "version": "3.6.8" } }, "nbformat": 4, diff --git a/Exercise7/exercise7.ipynb b/Exercise7/exercise7.ipynb index 2dbde786..1bfe12ff 100755 --- a/Exercise7/exercise7.ipynb +++ b/Exercise7/exercise7.ipynb @@ -18,7 +18,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -142,7 +142,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -185,8 +185,12 @@ " idx = np.zeros(X.shape[0], dtype=int)\n", "\n", " # ====================== YOUR CODE HERE ======================\n", - "\n", " \n", + " for i in range(len(X)):\n", + " x = X[i]\n", + " minimum = np.linalg.norm(centroids - x, axis=1)\n", + " ind = np.where(minimum == minimum.min())[0][0]\n", + " idx[i] = ind\n", " \n", " # =============================================================\n", " return idx" @@ -201,9 +205,19 @@ }, { "cell_type": "code", - "execution_count": null, + 
"execution_count": 3, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Closest centroids for the first 3 examples:\n", + "[0 2 1]\n", + "(the closest centroids should be 0, 2, 1 respectively)\n" + ] + } + ], "source": [ "# Load an example dataset that we will be using\n", "data = loadmat(os.path.join('Data', 'ex7data2.mat'))\n", @@ -230,9 +244,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise k-means-clustering-and-pca\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Find Closest Centroids (k-Means) | 30 / 30 | Nice work!\n", + " Compute Centroid Means (k-Means) | 0 / 30 | \n", + " PCA | 0 / 20 | \n", + " Project Data (PCA) | 0 / 10 | \n", + " Recover Data (PCA) | 0 / 10 | \n", + " --------------------------------\n", + " | 30 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[1] = findClosestCentroids\n", "grader.grade()" @@ -257,7 +292,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -304,8 +339,9 @@ "\n", "\n", " # ====================== YOUR CODE HERE ======================\n", - "\n", " \n", + " for i in range(K):\n", + " centroids[i] = np.mean(X[idx == i], axis = 0)\n", " \n", " # =============================================================\n", " return centroids" @@ -320,9 +356,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Centroids computed after initial finding of closest centroids:\n", + "[[2.42830111 3.15792418]\n", + " [5.81350331 
2.63365645]\n", + " [7.11938687 3.6166844 ]]\n", + "\n", + "The centroids should be\n", + " [ 2.428301 3.157924 ]\n", + " [ 5.813503 2.633656 ]\n", + " [ 7.119387 3.616684 ]\n" + ] + } + ], "source": [ "# Compute means based on the closest centroids found in the previous part.\n", "centroids = computeCentroids(X, idx, K)\n", @@ -344,9 +396,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise k-means-clustering-and-pca\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Find Closest Centroids (k-Means) | 30 / 30 | Nice work!\n", + " Compute Centroid Means (k-Means) | 30 / 30 | Nice work!\n", + " PCA | 0 / 20 | \n", + " Project Data (PCA) | 0 / 10 | \n", + " Recover Data (PCA) | 0 / 10 | \n", + " --------------------------------\n", + " | 60 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[2] = computeCentroids\n", "grader.grade()" @@ -367,11 +440,4496 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "metadata": { "scrolled": false }, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
\n", + " \n", + "
\n", + " \n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "
\n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAEICAYAAAB25L6yAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8GearUAAAgAElEQVR4nOydd3iTVdvAfydp00GhFGjZozJkKEtkCoiIgCAuEDd1oX6CC19cvPK6F+LChSLuyVAERVGLyBYosletzELLKhRKR/J8f5yUJmmS58lq0nJ+15WryZl3A71zcp97CE3TUCgUCkXkYgq3AAqFQqHwjlLUCoVCEeEoRa1QKBQRjlLUCoVCEeEoRa1QKBQRjlLUCoVCEeEoRa2IOIQQ+UKIs8ItRyAIIf4VQlwcbjkUVQOlqBVOOCoYIUSaEGJxiPdbKIS43bFN07QETdP+CeW+lR0hxNNCiPVCiBIhxP/CLY8itChFrQgZQoiocMtQ2fHyHu4AxgPzKlAcRZhQilrhFiFEG+BdoIfdFHHU3h4jhJgkhNglhDgghHhXCBFn77tQCLFHCPGwEGI/MF0IkSSEmCuEyBVCHLE/b2Qf/yzQG5hi32OKvV0TQrSwP08UQnxin79TCDFBCGGy96UJIRbb5TkihMgSQgz28jv9K4R4SAixTgiRJ4T4WggR67iWy3hHOT4SQrwthPjJLusSIUQ9IcRr9r23CCE6uWx5vhBik71/eule9vWGCiHWCiGOCiGWCiHau8j5sBBiHXDCnbLWNO1jTdN+Ao4b+fdUVG6Uola4RdO0zcBdwDK7KaKmvesFoBXQEWgBNASecJhaD6gFNAVGI/+PTbe/bgIUAFPsezwO/AmMse8xxo0obwKJwFlAX+Bm4BaH/m7AVqAO8BIwTQghvPxq1wCDgFSgPZCm81a4zp1g36sQWAassb+eAUx2GX8DMBBojnzPJgDYFfqHwJ1AbeA9YI4QIsZh7nXAEKCmpmklPsioqIIoRa0wjF0BjgYe0DTtsKZpx4HngGsdhtmAiZqmFWqaVqBp2iFN02ZqmnbSPv5ZpMI1sp/ZvvajmqYd1zTtX+AV4CaHYTs1TXtf0zQr8DFQH6jrZdk3NE3bp2naYeAH5AeOUWZrmrZa07RTwGzglKZpn9j3/hpwPVFP0TRtt32vZ5HKF+R7+J6maSs0TbNqmvYxUvF3d5Fzt6ZpBT7Ip6iiKBuiwheSgXhgtcOhVQBmhzG5dkUmO4WIB15FnmKT7M3VhRBmu4LzRh0gGtjp0LYTeYovZX/pE03TTtrlSvCy5n6H5yeBBjoyOHLA4XmBm9eu++52eL7TYa+mwCghxFiHfouLLI5zFWc46kSt8IZrasWDSIXUTtO0mvZHoqZpCV7mjAPOBrppmlYD6GNvFx7Gu+5XjFRspTQB9vrwOxjlBPJDCAAhRL0grNnY4XkTYJ/9+W7gWYf3sKamafGapn3pMF6ltVScRilqhTcOAI2EEBYATdNswPvAq0KIFAAhREMhxEAva1RHKvejQohawEQ3e7j1mbafuL8BnhVCVBdCNAUeBD4L4HfyxN9AOyFER/ul3/+CsOY9QohG9t/7caR5BOR7eJcQopuQVBNCDBFCVDe6sBAi2i6nCYgSQsTaTUWKKohS1Apv/A5sBPYLIQ7a2x5GuoYtF0IcA35Fnpg98RoQhzwdLwfmu/S/Dgy3e0a84Wb+WORp9x9gMfAF8iIuqGiatg14Cvn7bLfvFShfAL8gZc8EnrHvtQq4A3mpegT5fqb5uPb7yA/A65AfAgU42+4VVQihCgcoFA
pFZKNO1AqFQhHhKEWtUCgUEY5S1AqFQhHhGFLUQoiaQogZ9jDZzUKIHqEWTKFQKBQSowEvrwPzNU0bbnfVivc2uE6dOlqzZs0ClU2hUCjOGFavXn1Q07Rkd326iloIkYgMUkgD0DStCCjyNqdZs2asWrXKd0kVCoXiDEUIsdNTnxHTRyqQi8yEliGE+EAIUS1o0ikUCoXCK0YUdRTQGXhH07ROyOCDR1wHCSFGCyFWCSFW5ebmBllMhUKhOHMxoqj3AHs0TVthfz0Dqbid0DRtqqZpXTRN65Kc7NbMolAoFAo/0FXUmqbtB3YLIUrDhPsDm0IqleKMID8XfnsUfhkPeaFIs6RQVBGMen2MBT63e3z8g3PidoXCJ6xWmNIKjjpURVz2MlRvCGO2gyUufLIpFJGIIUWtadpaoEuohPh3EXyfBnm7wRwFrYbC5Z+oP9iqyuQGcDKnfPvxvfBKPXg0r+JlUigimbBHJs5Og4/7wtEs0Eqg5BRsmgHPV4PczeGWzhlrEXwzHJ6KgieF/Pn11bJdYYys390r6VKKjsH6LypOHoWiMhBWRZ31O6z72EOnBlPLXVmGj6ICeD4RNs+E0rokmhW2zILna8h+hT6/Pqw/5o+nQi+HQlGZCKui/v5W7/0lpyBrYYWIostHfcF6yn2ftRCm96pYeSorhfn6Y4pOhF4OhaIyEVZFfczATf/fn4ReDiNk/+W9f39Gxcihx/Yf4Y2W8EJNmNJG2v8jiSYX6I9pcF7o5VAoKhNhVdQmA4WD4mqGXo5gYdUr1RpippwNXwyBIzugMA8ObZH2//e7hVcuRwZM0h8z9N3Qy6FQVCbCqqibe6u0Z6dfJbJXmiuwYp3VCl9eBk+a5MXmkwIObXM/dt9K+HFMxcnmiXWfwUsGPnhfqQ8/jtUfp1CcKYRVUV/xCWW1qN3QsDtYEjz3e2LVe/Lir1SBTW4kLy4DITbJe39FlxV9pR5sm4vhWtWrQnBKLSqAb0fApPrwWjNY8pLnsQWHYbYPFf3+miL/HRUKRZgVdVwi3LUWzJbyfQ26wu3LfF/z2xEw7y7p5lXK8b3wSX9YNtl/Wfu61s52QbNWXHTdb49DwUH9cY5oQTbLrJkGz8dLV8oT+yFvp/ToeNri/n2YcZ3veyx4KHA5IwWrVUZgvt8VPhsUea6nisgmJMVtu3Tpovma5nTHz/KrcWxN6PeMVOK+cmAdvNvB+5gJJf6ZKN7rBPvXeh/TfCDc6FpjOwQ8lwDFfnhGTAzSP3X+fmme8ER0NXjMxbvjuWpQfNL3vYIlczhZMw1+uL18e+2zYcyWipdHEZkIIVZrmuY2sDDsAS+ltBgIV30Kl77pn5IGmHWz/phf/+Pf2sf364/J2+Xf2r5S7KfP9jMxMPMGmV9j5RT/Lz9n3uC9v/iEG1NTxPxPq1iy17pX0gCHtkbWRa8icqlSfz5Hs/TH7F7u39qJjfTH1Grp39q+EhXj3zxrEWz4Aha/AD+NhWeiYJYPduNS9hgwSS1/1fn1WRf7vo/Zz98zkpipY/LZt1IFSyn0qVKKOqa6/piEuv6tPWya/pjLPvBvbV/pfEfw1lr/me/KWni5AD49xiWLzFWf+rYHQO/HfZ8TaRzaqj9m9Tuhl0NRualSilrvwg9g8BT/1q7bHpr189x/zvWQEKI03Hm7IH0iLH5Rnr4umSztwJ6o0dS39dd/5psZJNXA6bjPBOfXlgS4YT5evXwcaXEp9P2vcZkqMyWF4ZZAEelUKUV93h0Q48VPt1EPSGzo//qjfofu48DkcFo0x0C/p+Hqz/1f1xMFefByCrzWFBY9Bb89Ij0tPuoDDx+G+i7XDlGxMHQqRHstPeweX051V+hEi8bVch9d2GIgTCiGCx6D5HOgfme4Zhbc+AvUbg3xKdCgC9y9HrreA2+1la5/H/aGIwbMWpFIQj39MZ1U0mCFDhHj9REojifCd9vDQZfSBq2GwXXfV6hIThQchh3zIaEBpF5obM4zsT
KPiDtSzoW719nXzpMnVrMZdi2B6QbCtF254BHo/7zx8Tt+hs8HU86P21IDxu31z/8d5L/j683g+J7yfe1vhis9JfGKULbNlYFJnqjeCB7cXXHyKCIXb14fRgsHRCw/Pwgr3ijzExYm6Hgr3LEKts6BmGrQfHDFRg06UnAY3j4H8rMdGgV0HQuDX/c8L32iZyUNkLNemkQSmzh7yXw3yj852wz3bXyLgTDRJuXc8p284Lzwf9DyUv/2L+WjPu6VNMC6T6BxT+hyZ2B7VCSthsL5Y2QAjyvR1WCMh2hShcKRSn2i/nwo7Jjnvq9RD7jpF5h1o0xUZCuRPtr9n/f8h15wGPathjpnSwUYKNYi6T9sK3HfX60u3LHavTnm2Xgo0fEGaHM1XDPDue1pC9iKfZMzKh4ed/HLXvmONLUUHQME1G4FV38F9TuWn7/+C5j/IBQckgq7023Sju7rh6PVKj1RvGGpUTkLCxzJkpGZuZukgr7gMeh6d7ilUkQSVfJEnbfLs5IG6UL2vIsXyKkjMmpxzQcw2iEb3sFtMK0HnDpc1maOgcs/hHOv91/GOaM9K2mAEwfgtUbSRtt8QFn7d7fqK2mAk4f1x+gi4I4Vzk2fD4EdPzo0aNJ7YWonGPQmdHPIGzL1fMh2+EwuLoGVb8Dq9yC1P+z4idPmkeh4GPwWdEpzL0rWr/riOkacViaSUuHWxeGWQlFZqbSXiXMCcFHLXgVLX5HP83bBW62dlTRIs8OsG4xXG1n7CbyQVJZf5MVasN7gBePng8qeF+XD39ONzWs5yPn1Bz18O02bLDD+IKScU9aW9buLknZh/lh4rrq0i6dPdFbSjlgL7es4fGErPglzboFFz7ifY442LrtCcSZRaRV1oFGAf/xP/vxiGF4TG+kVNwCYexd8PwoKj5a1nToiS4sZQbOVJSCaa/TrsIBe48terngT9voazGOTHhqOzLlNf1pxvsyC50nh6pH+X/fugE366s+Nq+3fngpFZabSKuqazQKbX5Qvbcg5f3sfZy30/qGQv19+zQ+ULbPlz4MGAiQAOt4mbeql/D7B81hPmNwkwzqeXb7NIzbf9yxl+Svl28xmaDHE+7whKjhEcQZSaRX1FQbNA954xmCIsjflOdtPLwtXYmvCzw/pV5IpZe0H8FJteC1VfuAUHfd9z/PvKd9mJOowGGSvcd9+w1yo6+bCEuQ3iHYjQieTQhGpVNrLxIR60PoqWVw21HgrDWUkRNgISakyB4ev5P0rFbYQ4KsDzyX2/NF5u2CB3cPDkihrVYaaep08992VIdOAzrtHVixPaS9D+C1xxtcvyIMZ18gLSs0GpmjoMAqGvBs+V02Fwl8qraIGGDlT5mZe+nLZJZowy5Ni8UnICELujdha5e24jsQlyVzM3kio7+JH7bpGctnlpj8U5UPSWXDkH+NzOt0h7cTvtAveh40v9NDJNZ3cBtL8LPaQnwuT6zvn4LYVy/8Pm2fCQ7lKWSt8Iycnh3379tGxY0evbaGiUitqgP7Pyoc7rMWwLpBINgG36WSKu+QVWZTAG9f/CNXrwRst3OeRLsj1X8RSCn3MT53xvnyEgz5PhFZRTu3kuVDCqSMwc2R5/3OFwhM5OTn069cPm81GRkYGsbGxbttCSaW1URvhyo9koQCjiYAcqdsR7v8X6rQq35e1EJ6vKd3w9JR07bNlkEhCveBXWXGkKA9GZ8h8H5FM5zug35OhW7+oQFb08cbmWfDDaFmXMd9AnnHFmUupQt60aRNbtmxh4sSJbttCTaWOTDTK5Eb6f7ylXDML2lzpuX/Hz85+z95o1h9G2YM49qyEaSFOEv/oSWnHXfU+zBsd2r0CoWFPGPFNYAmyPOFPrpOk5nDPVmUOUTjjqJCbA1kAJhOpqalkZmY6tS1dupRu3QL7Aw+4wosQ4l8hxHohxFohRORoYIMM9aGw6w86Cu7rq/TXqNVanuRH/SpPeNlr4eurjcvgL2+3kz9/vi94a172AV
z8IlzxGSS1gBg/q+84snepjMh8Jg6m94VvhsO/iwJfFyCxme9zjmTKRFAKRSmuSnopMA6w2WxkZmbS1t52v70tLS2NU6dCdwvvi426n6ZpPpZUjQxaDYWzr4Ct3+mP9VY0Nm8vlBio+3d4i/zjn34BnAyC/TkxVV6GeUpWdFq+LFk30kj4uVF+faTsPYlPhsuny6ok/niouGI9BbvsCnrzTIhOgHu3G0sN6onEhtI/3Fbk27zje+R7V7e9/3srqg779u3DZpOBAlnAJOAx4Afk6TbdPq60RKrJZOLEiRMhs1VXaRu1I62vlCfdQNALjnHkrdbBUdKD34b7/4FYgyfZzAWB7+mI4wfXyVz45irp7tYmBN8QivPh1SAkwxr8hn/zfhkX+N6KqkHHjh3JyMhg/PjxYDLxMtADeBippM1AP2AT0LZtW9LT06ldO3Rhs0YVtQb8IoRYLYRwaxwQQowWQqwSQqzKzQ2ChgoS23+Ep8wyxPuwwYrPTwp4vbn05XUkRafCuRNBMP03u6gsw1rXe43NqRbAadQoS16Cy6bBuByZ5zv5HPmz1bDA17YVy0o2gdDlThj0hnTV9AVfPWcUVZvY2FjGjRtHamoqIJXlpUAKUI2ywNzBgweTkpISUlkMXSYKIRpqmrZXCJECLADGaprm0aoYKZeJB7fBW2cHtsYdq5wDXp6rLk9+FUGbEXDNN2WvnzTgvRKdIM0zWgDh3UZwl2K1lFVTYV4AOaOrN4YHg1TRPWsh5G6AtR97TiBVSr+ny5cQU5y5ONqp21J2kq4GxAIrgJ4QOZeJmqbttf/MAWYDXQOSqIL49prA1/j4IufX13tJrRpsNn8LT0WVneyrN9afU5wPnW4PrVwg7bnuWP1+YEoavBdM8JXUC6HrGBipdz8hqpiSLiiAu+6Ciy6CsWOhyEej/RmO62ViqU26D9AZqaS7UXGXibqKWghRTQhRvfQ5cAmwIWQSBZEcD8rEF4qOOfvaNuuDX37Z/qJZ4R37BZfeZWIpf39CyG8f3GWxs1phbhDcAhv3CHwNVxIbykAbT1w3J/h7ho3RoyE+Ht57D9LTYcoUiImBBx8Mt2SVBtfLxBcps0lvRp6kx1H+MjFUGPlzrgssFkL8DawE5mmaNl9nTmQQJBfxPSvLnu/42cC6OopcREEdHy42tRJZccXo72M9RUCZ7YwwyE0ZsfQgnUiv+iyw+Yczy98vgAy0uWOVLKQromT+j6Z9YVy29AyqEjzzDLzvIeT01Vel0lbo4nqZOJmyi8Nx48aVawv1ZWKlDHjJz4Uvh8K+vwBN1klsdRmMmFkWtGAtkn66wVBYY3dAreby+W+Pw+Ln9Od4dBETcM8WGfFoxOZcSr2OsH+t8fGhJKk53LujfPvLKYF7usTUhEeO+Dd3xvWw8SucPtBaDYXrfghMpkqFxQLFXqpHxMXBSQM+porTrFixgrS0NEwmE+np6aSkpLhtC5SAbdSRRN4ueKWu9OUt/YPUbLD1e3ixhvz6bbXCi0kERUmbY8qUNEADt2+j6yRZXqtRL5xO13XawL2ZUknn+6HQqocgks8f3ClpgEI/Uq260v5G/+ZN6wkbv6Tct45tc2FKgBfKlQar1buSBmm7VvhEt27dyMjIYNGiRacVsru2UFLpkjK93xWPJoDik/KkndhYPtdDRMmirQc3eR5zuUtSpzZXyhO8V68KK3xyYdkeY7ZBrVTnIb8/pi+fI+ePkafq9418UISQKC+pRuOT4fjuABYXMPA136cdzpQ1Mj1xaJuMfGzWx3/RKgXuyuYogkJsbGy5YBZ3baGiUp2oiwpkQVhvZP5sv0zTYfgMeKIY7tkI9++BhAbO/dHxMGIGnDuy/FxflIlWAm+eJaMaHSn0oUirKRo63ybdBG9Od6MsK/Bys60XT5ruAYauj1roX74NI/Uzfxrr+7qVDotFv/KDqVL9ySvsVKp/tQNGIgM1aZ/WI9vhgjCxIYzbCxMK4e71MP4oPHYC2n
qIvus2Fhr46DL5nkvK2o4+VIYxRZeVA2vSG1peClHxEFUN2o6ECcUQm6S/jjne+J7uEGYYNtV9X1E+7PrTv3XrdpS5Ufw98Z4wUD6s4JB/a1c6hulEHd10U8XIoQgqlcr0kdRcfwyA2aLvi9ukt/t5jhW5PfHjGNi3wpgspTiGYu9bDcsmI0/CBu5yS07Ca03BFAM2l99r09fy0bAH7NXJnW0N4A4pJhHu3ijfI1cObpMh80a9UpKaQ90OkNwW+v4v8Kx1tVvDQZ2o00BrbFYavvsOUlPh33/L97VpAx99VNESKYJApVLUCcky37K3UlG1z4azLoa/3vI8RpgCc8da5WeB1cNZML2X92ov3nBV0o7oKWmj1Osig2Zia0HT3mCOhnOvlxVXPPFeJ3xyhYyKk9V5gsXl0+AlnYCWKwIpIFHZyMqCr7+Ghx6Cw4chORneeEP/tK2IWCqVogZZO2/WDZ77r50jT2x/fyqDVdxx2TT/9z+wzv/w7DfP8n/fiqD7OBg4ybc5e1YayyjoSHKAybFciasF593puRp8m6udPXfOCEaOlA9FlaBS2ahBnu6u/LT8V3BLDVnhpE4r+VV6/GFo2s95TExNGPk9dErzf/+Co/7PjXR8VdIAG77wfc7lQagg78rQd2WmQUtCWVtUnMylrcpuKSo7le5EDdLXtv2N0jaauxEadC1fLcRs9r84qjcadw/+msGiehO7e5w/MUx+eo7E1/Ft/PljnJVpMOl6d1m2QYWiKlHpTtSO1Gkl/ZpDUdLJE2YLpJxbcfv5hA3qny8DbcwW6cNdsxnc/JsM3PFG457+bdlrvLFxluowdCpc+qZ/+ygUZzKV8kQdbkZnwEu1PNvA/SWhPpzIlb7X/nB8j3Pipis+gQ52b6wrP4UZHnyghcn/pERmC7S4FHb86HnM+EPSjqxQKPyjUp+ow4XZDI/mSftnbC15Wq3RGIZ/G9i6N/0ig3CumR0cOb+7Gaa0lc/bjZCFe6OrOY+p0Rge3BuYIr1hHjR3U/DXHCM/1JSSVigCo1ImZYpkMhfAZwMpZyeOioXb/4KpnWUVE1dck9bn7ZVlr0oTT5kt0OE2KDwMG7/FpzwmnW6HYQ4J1YryIf+ALAQbzMrb1iL442mZmKndNZB6kf4chUIh8ZaUSSnqEGC1wsIJsOU7mUWv13+ckw0tfFL6YpcUQf1OcPXn/hV0fSbWWJJ9YYYn/DSnKBSKikEp6irK0xb3p3N3TAz+P7NCoQgiVSrNqaKMur4U21UoFJUWpagrMcO/MjYuJjG0cigUitCiFHUlplZzuOQV/XFD3g69LAqFInQoRV3J6fGgTMtaw0OF8ouek2H3CoWi8qICXqoAcYnwwC7pHvfrI7KiSYPzofeE4LrfKRSK8KAUdRXCbIGBk8MthcJfcjnOLNaQQz4AdanOVXSmDiFKjqKoNCjTh0IRAawgi3dYxAHy0ZDxUvs5ztv8wWp2hls8RZhRilqhCDNFWPkZzxWW57EBK6pw7ZmMUtQRzrLJ8EYLmNwYZqcZqwepqFzMQb8Y6Fw2VIAkikhF2agjlLy98MZZYHNQzOs+lo+rvnJfHV1ROdmLfjWK3RyuAEkUkYo6UUcoU1o5K2lHZl0LBervtsoQjb5rjpExiqqLYUUthDALITKEEHNDKZACdvysX4dwppe6kYrKRW9a6I7pR6sKkEQRqfhyor4P2BwqQRRlLH9Nf8yuxaGXQ1ExnEtDLF5OzDFE0Qo/0isqqgyGFLUQohEwBPggtOIoQFZc0R0TejEUFciDDKAalnLt1YnhIS4Og0SKSMLoZeJrwHiguqcBQojRwGiAJk2aBC7ZGUyvh72XtgI465KKkUVRMVgwM44BHOEEy/gHgF60IJG4MEumiAR0z25CiKFAjqZpq72N0zRtqqZpXTRN65KcnBw0Ac9EmvUBSw3vY678uGJkUVQsSVTjUs7lUs5VSlpxGiOmj17AMCHEv8BXwEVCiM9CKpWC+7PK1z
cEQMAN88GioooVijMGXdOHpmmPAo8CCCEuBB7SNO1Gr5MUARNXCx7Ll/URFz8HJYXQ+nK48BmVaEmhONNQAS8RTrsR8qFQKM5cfFLUmqYtBBaGRBKFQqFQuEVFJioUCkWEoxS1QqFQRDhKUSsUCkWEoxS1QqFQRDhKUSsUCkWEoxS1QqFQRDhKUSsUCkWEoxS1QqFQRDhKUSsUiqrDtm3QubPMsyAEJCbCtGnhlipglKJWKBRVgzlz4OyzISMDbDbZduwY3H47xMbC4cpbv07l+lAoFJWLvDyYPl2emm+/HeLs6WCvvNLznMJCSE6G/Pyy8ZUIpagVCkXlwGqFdu1g69aytnvvhQ4dIC2t7BTtCZsNbrkFvvoqpGKGAqFpWtAX7dKli7Zq1aqgr6tQKCIXK1b2cQwLJuqSyFIy+YNtFCMVaCxRDOEc2tHQvw2Sk+HgwcCEjI6GoqLA1ggRQojVmqZ1cdenTtQKRYCs/xp2pkNKB+h6d8XvX4SV7/mb3RyiCCvRmKlJPJfQlsYkuZ3zIxtYwy5syINaHaoxgvNI9lxtzyNWrHzCcnZz1Ou4U5Qwk7XkkE8/zvZtkzlzAlfSACUlga8RBtSJWqHwk7WfwJxbQHP5xt17Alz0dMXI8Bf/8hMbPfbXIJaxXIjZocr56/xGHqfcjk+jO02orbuvFSvfs46dHOI4hT7L/TiDnGTS5ZxzYKPn39Mw1avLC8YIRJ2oFYogk7kAvh/lvu/PZyC2JvQcF1oZDpPvVUkDHOMUH7CELjTlFzZTjNXr+E9ZyeMMPv16I3uZxwZOIU+i0ZhpTV3Wsy8g2X9kI5fR3viEvLyA9jvN448HZ50KRrnnKRR+MOsG7/2/PRp6GWaSYWjcAY4zjw26ShrAio3dHAHgN7Ywk7WnlTRAMdaAlTRANj4q3vY+KHVPNG8ODz8c+DphQClqhcIPTuZ677cVQ0GI3XYPcDwk62aSQxFWlpAZkvUBqhPLPo6ygqzTHwxe+fRT/zezWOC++2DHDv/XCDPK9KFQhIiCI7JIcbCwYmURO9jAPgScvggMNrVJ4EfWh2TtUnaQw3ZyTr82IRhMO3ZymC3sx4ZGLeK5kk7UJxFq1YJHHoEXXjC+SQR7ePiKOlErFH4gDPzlJDYL3n57OcJzzOdPdnCEkxzmZPAWd+FcGnKA0F64uX7E2NCYxwY2sI8SbNjQOMgJ3mcxP7FBDnr+eZg/Hxo2lOHhUPbTHbNmGRfIaoVFi2DBAvk8wlCKWqHwg7YjvfcnnyMD54KBFSvTWBqi87N7qhFTgVXYwCEAACAASURBVLt55y92sq/U9W/gQNizRwavaBoUF8OwYc4Ku0EDWLwYhg41tsHVV0NUFPTtC5dcIp/36BFRClspaoXCD4Z/AdU9xG1ExcGdq4O3VzrbgreYQYZwToXv6Y3vWOu+w2yG778vU9yaBnv3Qq9exhbu1s39yXv5cmjSxH+Bg4xS1AqFnzy4By56DizVpSkkKg7OvwceOQ5mS2BrW7FymHwKKGItu4MjsA8kUY2G1KzwfT0RElNPVhasXOm5f98+GWgTAajLREVIyMnJYd++fXTs2NFrW6Sz9BVYOUV6cTS5AIZNB4tDTp/ej8pHsJBmjiXsD5FHhx5mykwIt9GLb1nDZrLDIosjJrzYov3lbgNhpA89JE0rYUYpakXQycnJoV+/fthsNjIyMoiNjXXbFsnk74dXm4LNwWlg49fyMWw6dEoLzb4vs4AiA/7OocKCmZVk0ZVUAEbQGYC3WMghToRNrpbUDf6i+/frjzliwHWwAlCmD0VQKVXImzZtYsuWLUycONFtW6TzRgtnJe3InFsgX8eP2ij5FLKZ/eRynIVsDauSBiighPls4ll+JJ+C0+2X0i6MUsHldAj+oka+2bVoEfx9/UDl+lAEDUeFnERzjpKFMEFqaiqZmZlObUuXLqVbt27hFtktWQvhk37exzS7CEb95v8euRznQ5ZSSOQmCYrGxKMO4e
STWMBJwuOXHIWJEXR2OlkXUcT3bGA/ecQSxSW0pamBPCWnKSiA+HjvY7ZuhVat/JTaN7zl+tA9UQshYoUQK4UQfwshNgohngy+iIrKjquSvo2l9GAcNpuNzMxMkmnLbSylO/djs9lIS0vj1Cn3iYHCzfJX9cfsWeb/+nkU8A6LIlpJAxRj41V+JcfuU/0AF1GT8kn34yrAglqCjS9ZRSbyq8witvMCC9hMNkc4STbH+JjlvMqvWI1+K4mLgwkTPPePHFlhSloPI+9wIXCRpmn5QohoYLEQ4idN05aHWDZFJWLfvn3Y7Inbj5LFUibRm8fYxg8ITIwiHYAdzAfAZDJx4sSJiLRVGwlm8RZnocdX/OX/5ArmOIW8y5+M4DzaUI97uYh8CllBFhoa59OMROKwYmUeG8gkFxsaJ0J08p7JGkbQmYUeXBaPU8j7LOYu+hpb8OmnpQnkrrvK0qgmJMBzz8HYsUGSOnB0FbUmbSP59pfR9kdF+t4rKgEdO3YkIyODiRMnMmnSJJbaXmYbP9CLh2nJpZgwM50+5LKJtm3bkp6eTu3aPnxNDRFWK8y+CTbPkJ4dpmhIvVh/3lkD/NtvPXtDlqMjlMxgNf9lCAAJxNCf1gD8zW5+YuNp27oA2lKfTWSHREmcooTvWed1TA75FGHFYiCNahFWLFdfLYNeIhhD31mEEGZgNdACeEvTtBVuxowGRgM0iSBHcUXFERsby7hx45g5cyaZmZkITLTkUhJIoZhTaPZKH4MHDyYlJSXM0oK1CJ5PBKuDBcZWDJk/6c+9/GPf91tKJr+yxfeJOtxHXzaRwwI2B33tUjQgg110ouxv+0+2lwvG0YCNZFONaE5QHBJZjnnIpe3IWnad9lxxZSeH+JpVTlkBaxLHrfQiIYIiMh0x5PWhaZpV07SOQCOgqxCiXNiSpmlTNU3romlal+Tk5GDLqagElNqpS23So0jHhJliThFNLFfwEQITr776KitWlPusr3A+6OGspMvh7q9DwPXzIC7R9/1CoaRbkUwiCfTgrKCv7coPrCfX/m3AitVrxOQJiokNo/evhpTxY5bxFPNOPybxCx+z3ElJAxylgFf5lYIwXZbq4dM7qWnaUSFEOjAISjOlKBTlLxNLbdLT6YOGxhVMpxHd6M79LLNNJi0tLez+1PvX6AywwcjvYdFTUFIIZw+DC58qn8PjMPms4F/MmOhJcyxEUUQRCQ4XbyvI8km2eKJ5iEsA+IRl/Iv7nKmHOIkVK2bMtKEemzHgGxwA77AIgBiDqkMQXDtpHapRSIluVZn21ONFfqEE5/I7J72c8jXgM1ZwB72DIWpQ0X23hRDJQLFdSccBA4AXQy6ZolLhepm4hBfZwXxy2QTANHrSnfsj5jLRaL6d1sPkwx1FFPE6fzidwpa7KOQYoiiixGdlFY+MQZ9FhkclDXCIE7zEL4znEq6iI5NYQGEF+GIb8VgpwsrDDOAFFgRlTwHcQk/2cpQvvFzIJlON79hQTkkbITvEWQP9xYjpoz6QLoRYB/wFLNA0bW5oxVJUNkovE8ePH48wwTImn744HDduXLm2cF8mGs1st+JNz32T+E33q3KhH0oaYDDtsGJlg4FqKsXYWMh2zJh5mEHU9aNAbSiwYMaCJSjB382ozcMMIA4LLUihF809ju1CU3Y45LquCqiAF0XQWbFiBWlpaZhMJtLT00lJSXHbVhFYrbBgHGRMg5JTEJ8Cg9+AtlfD0xZ5eeiNuDow3k0U4gqy+Nn+bSEUXM/5ZHGIZfxjaLwFM48w6PTr39jMEoNzIx3XQrgFFPERy8g97YwWXJ6we7dUNKq4raJC6datGxkZGZw4ceL0qdldmx4H1sH8B6AwD5oPgAuf8S3Hc1E+vFQHrA7mzPx98O1waNwT6rSDHA/ZM0sp9qALloawTBXAN6ymJcY/zBy/5m9mP2vZEwqxKpyWJDsp6SKKmMSvaCHyEG5EUkjWDRSlqBUhITY2tpz92V2bO6xWeL
M55O0sa8teDYtfgCs/hfY3GpNhShtnJe3I7qXQVCdMHOQJ3K2Mftg/faEEG3WoZnh8LNEAzGANmyIg251RGpNEF5oy202+6SYkcR1dndq+YHXIlLQAbsTtgTbsKEWtiDje6+CspB2ZfRM06Ap1dCJ7C/LguM6hco+B2Nohb7lvr0siWRzUXyAAavmgqM0IprCQwzoZ7hKIYSjnkMlB/sLDm1yB3EJPQJb/WkEWO8glgRguoQ1xlE/qvcvLxaovxBDldCGaQgKj6IHFzZ6RgFLUioii4DDkbvQ+ZuZ1+hVUtv+gv5e1AAZMggUPue9vcD608lDN6Uo6Mplf9TcJgAbU5HLa60biAXZ3Ne8uawADaUMr6tGKelxIK14OkkdGMOhGKt08BKn4QxSCeGKcAmSiMXE5HWlL/aDtUxEoRa2IKJa+oj/mwN/6Y+INxlz1HAe1WsKc26DAfkA2WaDbfXDJS57nJRBDb1rwJzuMbeRCPWqw34srmBkTyVQnmerUJJ7ZrDUUkafHJg7QDllDzN2JtSKpjU7mugDpRBMGcw5WrByhgBpYIvbErIdS1IqIwmok6tiAibLFQHSjLZLsqYZbD4PWfuSX7sfZtCKF2azlCCcNWU4FcBUdaUk9XuJnbB5mXU2n08+bUpv76e/U/y2r/QpuqWa3ZZdyBefyHet9XicYXOnwOxqlMUnsRj+ZvwnBJbQBwIyZOiT4vFckoRS1IqI4/25Y9rL3MYnNjK3V8RZY+6Hn/uFfGRbLIw1JYgzebyW3c4BsjtGYmqRSdtQfx8V8xFJyHezKMURxNZ1ooePxUZru01f6cbbT6/Y0CYuiHkRbGvhRk/FaujCJBV4/FOOxcA99nbxFKjtKUSsiiqRUabY46UUPXT7d2FqXT4OiY7BphnO7MMHVX0GD8/yX0xdaUtdtKak4LNzNhQCnw8CN4o/fgxnh1tzxBEN4inl+rGicFBLQkCfiwbTzW4nGYeEhBvAhS51Kg0VjogON6EULEt3kzK7sKEWtiDju+wdeToGSgvJ9vcZDsz7G1xrxrcyS9+tjcGw3NL8EOt8WPFmDhS+K6zD5pJDAXvJ82qMGnl0j7+QC3mOxT+v5gkBwFz78w3khDgv3+PkBV1lRiloRcVgS4PGTsPw1eblYfBLqngtXfg6JDX1fz2yBgZOCL2dF8wPryGC33/ObUcdjX10SeZxBPGvPxeKNKEw+59EIVWHcM0FJgwohVygqBV+wgh0B+m27hmK7YzU7mechMWZtqnEPF5LFQT7F9zS1AnmybkptrqRjxOZ+9pWN7GUuG077ZQugE40ZSnuf1gmoZqJCoQgvBRTpKmkL5tPRie64ko6GTp/n0ZTr6EK8w1oC6Ezj0+aGVOpg8iPVkgbY0MjiIJP5NeQBQxXBYrYzk7VOwTMasIbdvMPCoO2jTB8KRYTj6YTrSCzR3E9/8ilgJmvZw1FAXt5dRQen3Nh6tKTu6VzYnriWLl5TjRrhM1acLu9VGbFi5XcvxRNyOcF69nIuftjrXFCKWqGIcI7h5lbVhVP2hPgJxDGKHqEWiRakcBs9+ZY1fgfiuCvvVZkwkp1wAZuVolYozgTqkXj6hOyJ6mGw9zYkqVwgTh4F/ME2irCyhWzdK8eNZFdaRb1X598ECFppL2WjViginIH2CDtvXObjxVWoSCSOYXRgOJ2JNnAOjPFiV490ahkIgbcE6SysTtQKRYRjxqybVyS5gqq6LGY7i8mkCCsmBGdTj8vpgMXNReW5NGSVToa+Swx8CEUqF9OaFfzrdUwfWgZlL3WiVigiiL0cYQVZ7HXJZ1FbJ+Xp2/wRSrEAeJN0frebNUB6cGwmmxeZT76bzH0DaePVN6Q21Sp1FKEZM51o7LE/lqigZQNUJ2qFIgLYRDazyHBK0mRCcCUdaEdD3bJfJygij4KQKb75bOQIJ932acB7LGIcA5zazZi5l4t4hz9OK/dSUkjgDi4IiawVyWW0Jx4LS8l0CutvQCK3BP
FSVylqhSLM7CCHGawp125DYyZricZMAfppBTewl160CIWIuiaMExRRQFG5XCKJxPEIg9jLEdawm2jM9KVl2FOsBpP+tKY/rSmgiFMUU4PYoEdMKkWtUISZWWR47Z+NgQTc4DXgJVA8pWN1JItDHhPyNySJhhFajzBYxGEJ2QeQslErFGHmlENUmzsKKSHZQFmujjQKlkh+kegl6ZMiMJSiVigqAVfQwWv/WdQO6Ot2Tk4Oa9eu9dimV41FQJU/MYcTpagVikpAfZK4mo5u+xpSkxvp7vfaOTk59OvXj+uuu45Tp065bRuB9+TdPWnu9/4KfZSiVijCTH1qeO2va/eRbkdDnmAIA2jDWdThHBrwIBdzG7383rtUIW/atIktW7YwceJEt20p1OAmurpNxtSNZvSntd8yKPRRaU4VijBjxcoL/ILVTcC1GRP/4RK3ASWB4qiQazWvz5GsAwggNTWVzMxMp7alS5fSrVs3ALLIZQcHqUU859E06HKdqQSU5lQI0VgIkS6E2CSE2CiEuC/4IioUZy5mzDzCJbSh/unzqgBaU49HKkhJ37b0ZXqOuwKbzUZmZibJbZtw29KX6X7/MGw2G2lpaafNIqkkM4A2SklXIEbc80qAcZqmrRFCVAdWCyEWaJrm3QNfoVAYxoyZEXSusP327duHzSZP8EeyDrB00ix6P3YNW39YiTCZSEt/DoAd86V/t8lk4sSJE8TGKs+OcKB7otY0LVvTtDX258eBzRCEvH0KhSJsdOzYkYyMDMaPH48Alrw8iw96PMQFDw8nLf05TGYTH/V7jNxNu2jbti3p6enUrl073GKfsfh0mSiEaAZ0gvJ1eIQQo4UQq4QQq3Jz/Stlr1AoKo7Y2FjGjRtHaqrMRyFMJlpe2oWElJpEV4tFs5+4Bw8eTEpKSjhFPeMxrKiFEAnATOB+TdOOufZrmjZV07QumqZ1SU5ODqaMCoUiBJTaqUtt0qUn6eJTRUTHWrjyowcQJhOvvvoqK1b4XiNRETwMKWohRDRSSX+uadqs0IqkUChCjetlYqlN+sM+j/Be5/vYs2Irjbqd7fYyUVHx6F4mCiEEMA3YrGna5NCLpFAoQo3rZeLiF2ewY/4acjftAuCDnv+h+/3D1GVihGDkRN0LuAm4SAix1v64NMRyKc4gNn4L346EH8dCUX64pTkzcL1MXDb5u9MXh+PGjSvXpi4Tw4sKeFGEjayF8OnFoDmnKqZZPxj1e1hEOiNZsWIFaWlpmEwm0tPTSUlJcdumCC3eAl6UolaEhbxd8JqXeInmA+HG+RUnz5nOqVOnOHHihNOp2V2bInR4U9QqH7UiLMy4znt/5s9gtYI5+EF5CjfExsaWsz+7a1OEB5WUSREW9i7XH7Py9dDLoVBUBtSJWhEWjFjcCo64b0+fCIufB5u9OpUwQcdbYdj7wZNPoYgk1IlaERbiDJg9299Uvm3mDbDoqTIlDaDZIOMDmHp+8ORTKCIJdaJWhJQ5d8DaD6UyBTBZoM8EGPw6zLrB87yYmlCnlXNbwWHY8IXnOdmrYNcSaOJ/emaFIiJRJ2pFyJjaRZ50NYc0y7YiWPgEbP8Jzr7C/TxhhjGby7fPuUN/z7l3+SerQhHJKEWtCAm7lkD2as/96z+Dyz+CG3+BxKZgioboeOh0Ozx+EhLqlZ9zJFN/3/xsv0VWKCIWZfpQhIQfRuuPmTsaRnwN9/9rbM2k5nDgb+9jqrlR8Pn74eA2qNsB4hKN7aVQRBJKUSv8xmoFrGC2lO/L368///B23/Yb9j5s0UkJdnAjPCkgNgnaj4I1U6HkZFl/bBLcsghSzvFtb4UinCjTxxnMxm/hxdpSsT0p4NlqsOQl/XkZH8mxz0TBMzFy7rRedsVtp1pd/XXydsGvDzvP80ZcLWg30tjYU0dg5WvOSrq0/Z1z4bABM4pCESkoRX2G8vt/YcY1cOpwWVvJSak4v7zc87ylr8CcW8orwD1L4aVaZUp36Lv6MhQckh
8Mz0TBu51g00z9OcO/gt4TpE07EL5QacUUlQilqCOMogJ52jN6yvQHaxH8+Yzn/m1z4MA6930L/uN5XtEx+O1R+bxZH0jpYFymA2vh2+HwdDTs1olavOhp+G8RTNSgs59eHoe2+TdPoQgHSlFHCBu/heeqwfPx8GYLecp8PVX6Dhth1xLI+t2Ygp93j/6Y2aPKt62ZBuhEFK56S/7cMgdy1+vv44qtBD7sAfkGq7ll/+X7HgpFZUNdJkYA6z6D2W6i8I7+Cy+nwH8OefZW+HYEbJrh3NawO9yyWCY0KiqA79Ng2w8ynWitVs5+zZ44urN8mzd3u1KKT0kPi6+9mE+MMPt6uGmB/rj4On5uIPycp1CEAaWoI4Dvb/Xcp1nhm6tg1G/l+6b1hD3LyrfvXQ6vNYbrf4SpnXE6BeduMCZTXFL5tobdYdU73udFx8GMa43t4Y0sD/mot82F72+DkznytdnP5G5Nevs3T6EIB8r0EWYOrHPOW+GOf9PLt+Xtcq+kS8nPhve7oGuq8MQAN94fHW9G9yTa7T59X2cjaDZ4vYVz29JX4MvLypQ0gNWPMn5mC9wwHwoo4mtW8Qa/8y6L2EGO/mSFIgwoRR1mstcaGORG2c6928C0AC4k87Phs8HSA2Sfg8lj8Bue58TVgv7P4veHgytHM8ts5VYrLHgoOOvetgKWxm3lZRawlQMcpYAcjvMFfzGJBVgJ4U2uQuEHSlGHmUbdDQxyc4oNdaj0T2Mhc770AHm/C7xUR9Yz7DoGhn8DlhrO8jUfCOPsB1K/7cZuWP+Z/FnqTaJHjIHIw2/+r4BF7HDbd5Ii3uEPssjlIKqAoyIyUDZqP8jfD3NulxduSc3gsg8hIdm/teq0gqj48n7JTmjwVDSkLSzLDFevE+zP8G9Pfyg4BJMbwyNHoN0I+fDEoDdhVhDs1CBNIEUF3s08jhTm6Y85etAKVpn8yR2HKeBTVgLyM/JCWtGblsYEUChCgDpR+8jXV8Mr9WH7PHkxt20uvJIivS/85Yaf9MdoJTD9gjK3taE6l3qhoPCo50s+R84dCe10Sm35wvPxsHtp8NajbqFHJe2KBqSzjXS2BlEAhcI3lKL2gV8f9pxrYtMMGe3nD836wND3wBynP3b29fKnUf/qYJM+UQbM/PaotB9vnu1+3PAvwJIQxI0NuBQapstRn6f86cFUolBUBMr04QPLJnvvX/KijJrzRvZamHmtPTJOg6gEsJ2SgR5G+Heh/FlcYGy8J7qOgZVTfJ93YL3M71HKuk9ARMFNP0PqRc5jiyLVxPtHbbQCEyLON+2fwU464aV0ukIRItSJ2iBWq74y1XOzW/81TO0Eh7Zy2jOiJN+4koayYJWkVONz3NH0Qv/mFbmxAWsl8El/N4mOIjWo5O9EKPL9v/4CtoRAGIVCH6WoK5BZQbDbVm8okxe91db/YA+ARTonf3/49hpY+Q48X0Nm1AuWm17QsQno1x3tgAXtmBnN4AflKUpYx+7QyqZQuEEpaoOYzbLatTe8XVCteJOgKK6CwzJ50cHN/gV7lHLKdzOtLvvXwE//B0XHg7+2US41Wok8IxEa9ocbOiKmN6YabpJqu+E71jGLCnS3UShQitonOnoJ9QY4z0tVk8yfA98/Kh6KT3gZ4IOp4Vw3uUWqAm0uh2YXGxxsNcHcegyhPeMYYHiPDexjL0f8E1Ch8ANdRS2E+FAIkSOEMJglouoy7H1ocL77vkY9YMjbnucm1A9sb3Osjq81ULuVrPKthyUB+j9N5NqQ/cRSQ/qzCx++ubS7Ds6zF81NxLgt6TuMhJQqFMHByIn6I2BQiOWoNNyxUoYg1+soq5jU6wR3rILbdPx8B0wKbN9mF+qPObwdHjsJ5njv4/5vo/zZd2JgMkUapdn23OVGcSXxLPi/TdKNsJRhGE+gfZQA7E4KhY/ouudpmrZICNEs9KJUHhp1hTt9NFPGJUJqf8hykwVPj17jpX08c76x8Vadk/dP98G1s+
HCiTLb3e+P++Z5EmnUaAzXzoH6HeVrI2lc24w5yY42+zlCPK2ox0Hy+dwejWgEc1X7OqKIaILmRy2EGA2MBmjSpEmwlq1S3PyrjGx0DZqJqw3R8XB8H2CChHpQswnUPw8umSwvMvNzYfEL3tev0xZWeTG/lLJ9XtnzXuMhPgV+HAMl3uzfYWLAK7BwIhQ7+GQLM3S+E4a+5X5OVCyUeDvwRttYfsufCPz/dOpIY7/nKhS+EjRFrWnaVGAqQJcuXSLVMSvsjJwpfbI3zZAeHG2HG8sTkpAMNc+Co/94HnP157D2I/21HE/Pvz0Oi5/TnxMOGvcENGclDTIr4Oq3YfefcLebkmHd7vVSpDfKClccQNT0X0kLBANo7fd8hcJXlNdHGDCbZT6Mrnf7lsxpzDaI9zB+0BtQtz2cfYX+OqUZ5qzWyFXSrYbBqEXeU5vmrJclzFzp9zREVwfpD+l4ZtCkp0fjAjQ/jxIxRPEA/TFjMFmIQhEEVAh5EFn+mjRPFJ+AxGZw5adldtNgYDbDf3IgcwH89ggU5kPjXnDZuzIZPsi8ISIKQ0EcvzwYPNkCoXoDGLMVNs6AmOryw8ZshkVeCvCWMv9+50x+c++C1e/ZX1g0lwhEIfX2e02hbT7ctsewjPFYSKMHdQhmAhOFwhi6iloI8SVwIVBHCLEHmKhp2rRQC1aZsBbZ8zU7BHrkbpDh4m2vgRFfB3e/5gPkwx1FBfpKuvAoLHwy8EosF0+CP582lloUoEFXOJoFJ+0ZAIUZ2t8MV3woX3dKcx6/d4X+mgUHy57/8bSDkk4qgmMe/nufiIInW6Hdugdh8E6wJ2cpJa0IG7qmD03TrtM0rb6madGapjVSSro8b7XzHI236RtYbTRaLgjMucXYuKUvS5t3IHQfa1xJA1w3V34juPJTiKkpbc1/T4fnqrtPECXNF96JdtCdTmHxBWZp5vBErgUOxHjud0AAPWluaKxCEQqUjTpAivLhiE4GzAXjA9tjx8/wahN4OhqejYfvbpWneHdsm2tszeKTcKmXslp61D+vzNxiaHwXeL2pzAEy+yZ5qj8tS76sKDPLIVrSaoWNX+qv2/sx+bOowCUplp4NWgPMxgzVt9LT0DiFIlQoRR0gG2fojykMIK/GF5fB54Pg2G7prVFSIE+hz8ZD3t7y4434EJeS/gTEuqk2rkf1RjLop1y2PA8kd4DsVVJ2b6z/DI5kyee/Pqy/blQ89Bwnn5f74CoWEO3lzWhWgEh2njSaXrShHjFEEUcUnWjM4wyiIX68SQpFEFGXiQFiqRa6tdd9Bts9nJA1K7xzjiyN5Ui9TrDHSDUUDZa/akyOmESIrSl/Dnq1LO909hpj83N9sIV/NwpuWQQbv9IfW8uhOlaca61EmwlMHhR1nBWelBVbBFCPGpxDA8yYGcF5xoX1wC4OsY69xGGhNy2xKA8RRYAoRR0gra/SH1OzmX9rz3/Ae3/hUTi4TdZdLGX4l/BakHPb35nhPv91gy7B3QeMn9IBTC4XgWddDP/86tBQYoIYK0RpUCIgWgOrQNy1k0dGnstW6vE9f5PNMbI5xgK2EIWJ6+hCKr4XwTxIPlP5kxKHcjRLyKQxSdyizCeKAFCmjwAxm6HlUO9jLv/Yv7ULDumPWf+58+vEJnDxi/7t545u93kuUpCUClEGyof5Qlwt+bOVznsK0Ok259fXz4dq9VwGFZrhhAnRoJBalx/n1vVFPDH5LDI5zGzWYnMxZpdg41NWstvH7HhFWHmbP5yUdCm7OcJ0gln0UXGmoRR1ELj+B5k9zx1D3pW+zf5gxHUszo35tNd4GJcNTfuBpTpYEg1m1atuz6ltgsSmcPNvMOg173OuN3h5aZTBr9t/eggPL0VEyXJijpjN8FA2DJ0qE2aZY2WhheHfmHjin3jGfpZE4+ZxHCafWXi323zLaqfX2eTxEct4j0X8xAasWJ36v8e7fWc3RyjCww2wQqGD0PwN0fJCly5dtFWrVgV93U
gnby/8fB/kH4CmfeDCp6Ty8JcPesLeZd7HTCgxtscLNfVd6Rr2gNv9OPjtWgJfXwUnc3yf60hiM7g/q+x11u/wycWU8+AQJrjrb0g5x7f1czjGhyylyEXJeuIJhmDFyhQWkucmW94QzuE8ew3FZ/kJq04F3m40YyDtfBNaccYghFitaZpbg6KyUQeRxIZwjQEvEKOM+Mq7vbnlUOMfBHXbw64/vY9pdan+OvtWww93bbeHHAAACuVJREFUwPG9Msf2kHehSS/4zwHZP6k+nNhvTCZHmvaFtIXObakXwYRi+Pl+2Pq9VNAdRkG/J31fP48C3kXnDXDDByxxq6QB5rGBBtSkPonYDJRJL0CnqKZC4QFl+ohgEpvAzenya74rqf2lycWV3M2Q8REccElWdNVnOpsJ6DPB+5APL4D3u8D+DDiRIyMbP+wBUx3OAM0H6uzjQpvh8luBq5IuxWyGS9+EB3bB/f+WV9K7lsB7nWByY5jep8y9z5Vv8P0bXhFFHMB7XbHZ9rJc1Q0UHTiHBj7LoFCAOlFHPKkXwhPFsqDtppmQkAIXPQ8Wl0u8zAXw5TDnOopmC1z1BbS9Wir9Cx7xnCp15Hfe5fhxDOxe4r4ve7UMwrniQxg2FdbpXJ6KKKjdUsrmLRfKyncgYyogoOu95UPMp7SBQw6FwY/vgTfOgnYjYbiLe182x7wL5UJN4ljJTt1xB5G5YS+jvdd81mYELUjxSQaFohRlo64C7FoC0y/w3B+bBDYr1GoBPR+GhRPgyD+AgEbdZXrURJ0U4k9FSd9tTwgTPGHv3zYXvrzM/Zg7M6QZxhu5m+Gd9uVzlpgsMGaL9Db5YqhzXm1XBkwqC4YBeAovgz3Qnoasw01UkQtPMASAGaxhE9lux9zBBdTH1dlboShD2airOF/r+HKfsnua7V8Ds0bKREhjt/m2hzclDTIi0mqVpopWQ2FCIfxwF+z4SSroc2+A/s8bs6m/c677/WxFMOVseKzAu5IGGXXpqKhNiHKueHpkon87WsPB5DGczmznAHNZTz6FCATNSeYKOhBnsMq5O/IpZAGbOUkRLUmmKx78JRVVFqWoqwC+elus+wRaDIRzrw+NPCDNLqVZ8Xzhj6e9fyjYiuGP/+mv41oI+Dya8JcBU4YjJyimOrEc91If8TKcvx60pC4PUNenfbwxjSXspSwHQSa5/MwmhnMebXB1GldUVdRl4hnKT/caH+vpgs4REeW7K6LVCr//F15Igmdi4fXmsOpd/XkbDCRrcmUw51AdY9nyHLmHCz2GgHcnleZ+RDAa5VOWOynpUjSkn3euzkWnouqgTtRnKEaiHkuZaiBUvLtOuLsrRfnwUrLz5ae3MmOOmKORSTq8WDIsblJHP8DFzGcjq9mJ1YAZJAoTFsw8wiAy2MUitlOMlRRqcBUdSCDIYZkOWLGShfd/pJms4S76hkwGReSgFHUVoEFX2Ge8gLbEYML8/P1w6rD+uOWvwLJJ8uLyklfKe2i4MqW1s5L2hXOuh5wNMte3J/p7qJk4iHYMsgedvM3C014b7uhNi9PPO9GETlRc0ebV7NIdk0O+7hhF1UCZPnzAaoXpfWVO5dLHi7Vg+4/hlWvUQjD7+K2+psH7qKzfjY3TbIAmlfqcW+DzwZ7H5ufKgBl/ECbo+19ZNae+h5P+eXfKepR63Elvj2aNJiTRm5Zu+yqCwgAqpCuqHkpRG8RqhRdrwq5Fzu2njsAXQ2B9kMtt+YIlDh49Bmdfbs/VYYCrPtcfA5DUQn+MO3bM91zEYMss/9ZEwK0OIfWj/4J7tkKTPlC7tfz9xx+CoQbs3ABmzIyhH0kOJowYzNxEV9LCnO2uPY10x8QRXQGSKCIBZfowyLy7ZCUST8y+UVYWDxdmC1zrErTyXmcZRejK4LehcXdj6zbqKk+xvhQkKGXuXfCgm/qxMTWNzT//Htg8WyananetzAroemFZpxXc8ofvsgEsZju/4+ynWIiVz/mLsfQjMYQ2aD
0SiSMeCye9JHIapPKGnDGoE7VB/v7Ee79WYtxMUFHcuQbu3wmthsmES73Gy3BtI2YBR/oZqAbujvwD7tvbDtefW7sVXDoFxu2Vyn7gpMASXLlykPxySroUGxpv46f2DyL30BeTh8uEttTnXBpWsESKcKFO1AaxGchQuWtJWfWTSCGxCVz3fWBr9H5U+jYvnOjbydrs4X+X2SwLLngzgVwdYlOSXprTYqzs4hBNqB1aQbwQh4VHGchCtrOKnVixkUgsw+hIY1Ue7IxCKWqDmKJdiqe6oWHXipElHPSZIB+ZC+DgFmh8Abzf2fucttd47hs5U6YwzfrNpUPAVV96zwESDIx4TPzFzrAqapB29P60pj+twyqHIrwoRW2QdiNl8VVPCLOM9qvqNB8gHwDNB0HmfPfjhBmGve99rZt/hYI8WPAQ5GfLzHvdxgZXXk8IPUdswKwsg//f3v28WFXGcRx/f7jjlD9iMrKFP9CBJJuEmLCwlIo0MBRdRKBiizZtKqcIwlr0DxSRi4pkshZZLUxIpJ9QmzYykwblrxIrHTMcF1nUwqxvi3OsMZ1z78Tcc57D/bxWc8+dmfPly70fzn3Oc5/HEuFXYovWvQFdBStZrnm1tFKSsemDbPz7v67ogYEfshuczUztyQJ9457yQhpgPtc0/Z27KpyeZzaWr6hb1GjAlrMwuPTimRTdM2Dtdrjpgepqq9KG97Kpi8Mvw+9nYPF6mHVj1VU1dz/9PMcn4z5/FVcykzZuMW82AQ7qCWh0ZzMpAP4819oVYydoNMq9Gp4MU+lmI7fyFkOXPDedbjZzd/lFmY2jpaCWtArYCjSAwYgYZ/n5zuGQrr/ruY5nWc3nfMsRTjOFBitZxGxanOhtVpKmQS2pAbwE3AuMAEOSdkfEwXYXZ1aG5SxkucejLWGt3Ey8DTgaEcci4hzwDrCuvWWZmdkFrQT1HODEmMcj+bGLSHpY0rCk4dHR0cmqz8ys403a9LyI2BYRSyJiyaxZ7VtM3cys07QS1CeBeWMez82PmZlZCZruQi6pC/gGWEEW0EPAxog4UPA3ozDBDeom5lrgTBv/f925P8Xcn2LuT7F29Wd+RFx2OKLprI+IOC/pUeAjsul524tCOv+bto59SBoeb1t1c3+acX+KuT/FquhPS/OoI+J9oOJ9TMzMOpPX+jAzS1xdg3pb1QUkzv0p5v4Uc3+Kld6fpjcTzcysWnW9ojYz6xgOajOzxNUqqCWtknRE0lFJW6quJyWS5kn6TNJBSQckDVRdU4okNSTtl7Sn6lpSI+lqSTslHZZ0SNLtVdeUEklP5O+tryW9LalgK5HJVZugHrOK331AH7BBUl+1VSXlPPBkRPQBS4FH3J/LGgAOVV1EorYCH0bEIuBm3Kd/SJoDbAaWRMRisu+UrC/r/LUJaryKX6GIOBUR+/KffyV7k12yeFYnkzQXWA0MVl1LaiT1AHcCrwFExLmI+LnaqpLTBUzNv609DfixrBPXKahbWsXPQNICoB/YW20lyXkReAr4q+pCEtQLjAKv50NDg5K8F1kuIk4CzwPHgVPA2Yj4uKzz1ymorQWSZgDvAo9HxC9V15MKSWuA0xHxRdW1JKoLuAV4JSL6gd8A3wfKSZpJ9gm+F5gNTJe0qazz1ymovYpfE5KmkIX0jojYVXU9iVkGrJX0Pdmw2T2S3qy2pKSMACMRceFT2E6y4LbMSuC7iBiNiD+AXcAdZZ28TkE9BCyU1Cupm2wgf3fFNSVDksjGFw9FxAtV15OaiHg6IuZGxAKy186nEVHaFVHqIuIn4ISkG/JDKwBvt/ev48BSSdPy99oKSrzZWptdyP/PKn4dZhnwIPCVpC/zY8/kC2qZteIxYEd+IXQMeKjiepIREXsl7QT2kc2w2k+JXyX3V8jNzBJXp6EPM7OO5KA2M0ucg9rMLHEOajOzxDmozcwS56A2M0ucg9rMLHF/A8FrZgQgirH3AAAAAElFTkSuQmCC\n", + 
"text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load an example dataset\n", "data = loadmat(os.path.join('Data', 'ex7data2.mat'))\n", @@ -422,7 +4980,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 9, "metadata": {}, "outputs": [], "source": [ @@ -453,8 +5011,9 @@ " centroids = np.zeros((K, n))\n", "\n", " # ====================== YOUR CODE HERE ======================\n", - "\n", - "\n", + " cluster_index = np.random.choice(m, K, replace=False)\n", + " for i in range(len(cluster_index)):\n", + " centroids[i] = X[cluster_index[i]]\n", " \n", " # =============================================================\n", " return centroids" @@ -503,9 +5062,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 10, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# ======= Experiment with these parameters ================\n", "# You should try different values for those parameters\n", @@ -586,9 +5158,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 11, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load the dataset into the variable X \n", "data = loadmat(os.path.join('Data', 'ex7data1.mat'))\n", @@ -627,7 +5212,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 32, "metadata": {}, "outputs": [], "source": [ @@ -674,7 +5259,8 @@ "\n", " # ====================== YOUR CODE HERE ======================\n", "\n", - " \n", + " sigma = np.dot(X.T,X)/m\n", + " U, S, _ = np.linalg.svd(sigma, full_matrices=True)\n", " \n", " # ============================================================\n", " return U, S" @@ -699,9 +5285,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 33, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Top eigenvector: U[:, 0] = [-0.707107 -0.707107]\n", + " (you should expect to see [-0.707107 -0.707107])\n" + ] + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Before running PCA, it is important to first normalize X\n", "X_norm, mu, sigma = utils.featureNormalize(X)\n", @@ -735,9 +5342,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 34, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise k-means-clustering-and-pca\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Find Closest Centroids (k-Means) | 30 / 30 | Nice work!\n", + " Compute Centroid Means (k-Means) | 30 / 30 | Nice work!\n", + " PCA | 20 / 20 | Nice work!\n", + " Project Data (PCA) | 0 / 10 | \n", + " Recover Data (PCA) | 0 / 10 | \n", + " --------------------------------\n", + " | 80 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[3] = pca\n", "grader.grade()" @@ -763,7 +5391,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 39, "metadata": {}, "outputs": [], "source": [ @@ -808,7 +5436,7 @@ "\n", " # ====================== YOUR CODE HERE ======================\n", "\n", - "\n", + " Z = np.dot(X,U[:,:K])\n", " \n", " # =============================================================\n", " return Z" @@ -823,9 +5451,18 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 40, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Projection of the first example: 1.481274\n", + "(this value should be about : 1.481274)\n" + ] + } + ], "source": [ "# Project the data onto K = 1 dimension\n", "K = 1\n", @@ -843,9 +5480,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 41, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", 
+ "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise k-means-clustering-and-pca\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Find Closest Centroids (k-Means) | 30 / 30 | Nice work!\n", + " Compute Centroid Means (k-Means) | 30 / 30 | Nice work!\n", + " PCA | 20 / 20 | Nice work!\n", + " Project Data (PCA) | 10 / 10 | Nice work!\n", + " Recover Data (PCA) | 0 / 10 | \n", + " --------------------------------\n", + " | 90 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[4] = projectData\n", "grader.grade()" @@ -864,7 +5522,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 46, "metadata": {}, "outputs": [], "source": [ @@ -913,7 +5571,7 @@ "\n", " # ====================== YOUR CODE HERE ======================\n", "\n", - " \n", + " X_rec = np.dot(U[:,:K], Z.T).T\n", "\n", " # =============================================================\n", " return X_rec" @@ -932,9 +5590,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 47, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Approximation of the first example: [-1.047419 -1.047419]\n", + " (this value should be about [-1.047419 -1.047419])\n" + ] + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "X_rec = recoverData(Z, U, K)\n", "print('Approximation of the first example: [{:.6f} {:.6f}]'.format(X_rec[0, 0], X_rec[0, 1]))\n", @@ -962,9 +5641,30 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 48, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Submitting Solutions | Programming Exercise k-means-clustering-and-pca\n", + "\n", + "Use token from last successful submission (ajasmineflower@gmail.com)? (Y/n): Y\n", + " Part Name | Score | Feedback\n", + " --------- | ----- | --------\n", + " Find Closest Centroids (k-Means) | 30 / 30 | Nice work!\n", + " Compute Centroid Means (k-Means) | 30 / 30 | Nice work!\n", + " PCA | 20 / 20 | Nice work!\n", + " Project Data (PCA) | 10 / 10 | Nice work!\n", + " Recover Data (PCA) | 10 / 10 | Nice work!\n", + " --------------------------------\n", + " | 100 / 100 | \n", + "\n" + ] + } + ], "source": [ "grader[5] = recoverData\n", "grader.grade()" @@ -985,9 +5685,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 49, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Load Face dataset\n", "data = loadmat(os.path.join('Data', 'ex7faces.mat'))\n", @@ -1010,9 +5723,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 50, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# normalize X by subtracting the mean value from each feature\n", "X_norm, mu, sigma = utils.featureNormalize(X)\n", @@ -1037,9 +5763,17 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 51, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The projected data Z has a shape of: (5000, 100)\n" + ] + } + ], "source": [ "# Project images to the eigen space using the top k eigenvectors \n", "# If you are applying a machine learning algorithm \n", @@ -1068,9 +5802,34 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 52, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], "source": [ "# Project images to the eigen space using the top K eigen vectors and \n", "# visualize only using those K dimensions\n", @@ -1099,9 +5858,799 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 53, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "application/javascript": [ + "/* Put everything inside the global mpl namespace */\n", + "window.mpl = {};\n", + "\n", + "\n", + "mpl.get_websocket_type = function() {\n", + " if (typeof(WebSocket) !== 'undefined') {\n", + " return WebSocket;\n", + " } else if (typeof(MozWebSocket) !== 'undefined') {\n", + " return MozWebSocket;\n", + " } else {\n", + " alert('Your browser does not have WebSocket support. ' +\n", + " 'Please try Chrome, Safari or Firefox ≥ 6. ' +\n", + " 'Firefox 4 and 5 are also supported but you ' +\n", + " 'have to enable WebSockets in about:config.');\n", + " };\n", + "}\n", + "\n", + "mpl.figure = function(figure_id, websocket, ondownload, parent_element) {\n", + " this.id = figure_id;\n", + "\n", + " this.ws = websocket;\n", + "\n", + " this.supports_binary = (this.ws.binaryType != undefined);\n", + "\n", + " if (!this.supports_binary) {\n", + " var warnings = document.getElementById(\"mpl-warnings\");\n", + " if (warnings) {\n", + " warnings.style.display = 'block';\n", + " warnings.textContent = (\n", + " \"This browser does not support binary websocket messages. \" +\n", + " \"Performance may be slow.\");\n", + " }\n", + " }\n", + "\n", + " this.imageObj = new Image();\n", + "\n", + " this.context = undefined;\n", + " this.message = undefined;\n", + " this.canvas = undefined;\n", + " this.rubberband_canvas = undefined;\n", + " this.rubberband_context = undefined;\n", + " this.format_dropdown = undefined;\n", + "\n", + " this.image_mode = 'full';\n", + "\n", + " this.root = $('
');\n", + " this._root_extra_style(this.root)\n", + " this.root.attr('style', 'display: inline-block');\n", + "\n", + " $(parent_element).append(this.root);\n", + "\n", + " this._init_header(this);\n", + " this._init_canvas(this);\n", + " this._init_toolbar(this);\n", + "\n", + " var fig = this;\n", + "\n", + " this.waiting = false;\n", + "\n", + " this.ws.onopen = function () {\n", + " fig.send_message(\"supports_binary\", {value: fig.supports_binary});\n", + " fig.send_message(\"send_image_mode\", {});\n", + " if (mpl.ratio != 1) {\n", + " fig.send_message(\"set_dpi_ratio\", {'dpi_ratio': mpl.ratio});\n", + " }\n", + " fig.send_message(\"refresh\", {});\n", + " }\n", + "\n", + " this.imageObj.onload = function() {\n", + " if (fig.image_mode == 'full') {\n", + " // Full images could contain transparency (where diff images\n", + " // almost always do), so we need to clear the canvas so that\n", + " // there is no ghosting.\n", + " fig.context.clearRect(0, 0, fig.canvas.width, fig.canvas.height);\n", + " }\n", + " fig.context.drawImage(fig.imageObj, 0, 0);\n", + " };\n", + "\n", + " this.imageObj.onunload = function() {\n", + " fig.ws.close();\n", + " }\n", + "\n", + " this.ws.onmessage = this._make_on_message_function(this);\n", + "\n", + " this.ondownload = ondownload;\n", + "}\n", + "\n", + "mpl.figure.prototype._init_header = function() {\n", + " var titlebar = $(\n", + " '
');\n", + " var titletext = $(\n", + " '
');\n", + " titlebar.append(titletext)\n", + " this.root.append(titlebar);\n", + " this.header = titletext[0];\n", + "}\n", + "\n", + "\n", + "\n", + "mpl.figure.prototype._canvas_extra_style = function(canvas_div) {\n", + "\n", + "}\n", + "\n", + "\n", + "mpl.figure.prototype._root_extra_style = function(canvas_div) {\n", + "\n", + "}\n", + "\n", + "mpl.figure.prototype._init_canvas = function() {\n", + " var fig = this;\n", + "\n", + " var canvas_div = $('
');\n", + "\n", + " canvas_div.attr('style', 'position: relative; clear: both; outline: 0');\n", + "\n", + " function canvas_keyboard_event(event) {\n", + " return fig.key_event(event, event['data']);\n", + " }\n", + "\n", + " canvas_div.keydown('key_press', canvas_keyboard_event);\n", + " canvas_div.keyup('key_release', canvas_keyboard_event);\n", + " this.canvas_div = canvas_div\n", + " this._canvas_extra_style(canvas_div)\n", + " this.root.append(canvas_div);\n", + "\n", + " var canvas = $('');\n", + " canvas.addClass('mpl-canvas');\n", + " canvas.attr('style', \"left: 0; top: 0; z-index: 0; outline: 0\")\n", + "\n", + " this.canvas = canvas[0];\n", + " this.context = canvas[0].getContext(\"2d\");\n", + "\n", + " var backingStore = this.context.backingStorePixelRatio ||\n", + "\tthis.context.webkitBackingStorePixelRatio ||\n", + "\tthis.context.mozBackingStorePixelRatio ||\n", + "\tthis.context.msBackingStorePixelRatio ||\n", + "\tthis.context.oBackingStorePixelRatio ||\n", + "\tthis.context.backingStorePixelRatio || 1;\n", + "\n", + " mpl.ratio = (window.devicePixelRatio || 1) / backingStore;\n", + "\n", + " var rubberband = $('');\n", + " rubberband.attr('style', \"position: absolute; left: 0; top: 0; z-index: 1;\")\n", + "\n", + " var pass_mouse_events = true;\n", + "\n", + " canvas_div.resizable({\n", + " start: function(event, ui) {\n", + " pass_mouse_events = false;\n", + " },\n", + " resize: function(event, ui) {\n", + " fig.request_resize(ui.size.width, ui.size.height);\n", + " },\n", + " stop: function(event, ui) {\n", + " pass_mouse_events = true;\n", + " fig.request_resize(ui.size.width, ui.size.height);\n", + " },\n", + " });\n", + "\n", + " function mouse_event_fn(event) {\n", + " if (pass_mouse_events)\n", + " return fig.mouse_event(event, event['data']);\n", + " }\n", + "\n", + " rubberband.mousedown('button_press', mouse_event_fn);\n", + " rubberband.mouseup('button_release', mouse_event_fn);\n", + " // Throttle sequential mouse events to 1 every 
20ms.\n", + " rubberband.mousemove('motion_notify', mouse_event_fn);\n", + "\n", + " rubberband.mouseenter('figure_enter', mouse_event_fn);\n", + " rubberband.mouseleave('figure_leave', mouse_event_fn);\n", + "\n", + " canvas_div.on(\"wheel\", function (event) {\n", + " event = event.originalEvent;\n", + " event['data'] = 'scroll'\n", + " if (event.deltaY < 0) {\n", + " event.step = 1;\n", + " } else {\n", + " event.step = -1;\n", + " }\n", + " mouse_event_fn(event);\n", + " });\n", + "\n", + " canvas_div.append(canvas);\n", + " canvas_div.append(rubberband);\n", + "\n", + " this.rubberband = rubberband;\n", + " this.rubberband_canvas = rubberband[0];\n", + " this.rubberband_context = rubberband[0].getContext(\"2d\");\n", + " this.rubberband_context.strokeStyle = \"#000000\";\n", + "\n", + " this._resize_canvas = function(width, height) {\n", + " // Keep the size of the canvas, canvas container, and rubber band\n", + " // canvas in synch.\n", + " canvas_div.css('width', width)\n", + " canvas_div.css('height', height)\n", + "\n", + " canvas.attr('width', width * mpl.ratio);\n", + " canvas.attr('height', height * mpl.ratio);\n", + " canvas.attr('style', 'width: ' + width + 'px; height: ' + height + 'px;');\n", + "\n", + " rubberband.attr('width', width);\n", + " rubberband.attr('height', height);\n", + " }\n", + "\n", + " // Set the figure to an initial 600x600px, this will subsequently be updated\n", + " // upon first draw.\n", + " this._resize_canvas(600, 600);\n", + "\n", + " // Disable right mouse context menu.\n", + " $(this.rubberband_canvas).bind(\"contextmenu\",function(e){\n", + " return false;\n", + " });\n", + "\n", + " function set_focus () {\n", + " canvas.focus();\n", + " canvas_div.focus();\n", + " }\n", + "\n", + " window.setTimeout(set_focus, 100);\n", + "}\n", + "\n", + "mpl.figure.prototype._init_toolbar = function() {\n", + " var fig = this;\n", + "\n", + " var nav_element = $('
');\n", + " nav_element.attr('style', 'width: 100%');\n", + " this.root.append(nav_element);\n", + "\n", + " // Define a callback function for later on.\n", + " function toolbar_event(event) {\n", + " return fig.toolbar_button_onclick(event['data']);\n", + " }\n", + " function toolbar_mouse_event(event) {\n", + " return fig.toolbar_button_onmouseover(event['data']);\n", + " }\n", + "\n", + " for(var toolbar_ind in mpl.toolbar_items) {\n", + " var name = mpl.toolbar_items[toolbar_ind][0];\n", + " var tooltip = mpl.toolbar_items[toolbar_ind][1];\n", + " var image = mpl.toolbar_items[toolbar_ind][2];\n", + " var method_name = mpl.toolbar_items[toolbar_ind][3];\n", + "\n", + " if (!name) {\n", + " // put a spacer in here.\n", + " continue;\n", + " }\n", + " var button = $('