Commit c506069

review knn
1 parent 4319042 commit c506069

6 files changed: +9 additions, -50 deletions


source/py_tutorials/py_calib3d/py_calibration/py_calibration.rst

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ Basics

Today's cheap pinhole cameras introduces a lot of distortion to images. Two major distortions are radial distortion and tangential distortion.

- Due to radial distortion, straight lines will appear curved. Its effect is more as we move away from the center of image. For example, one image is shown below, where two edges of a chess board are marked with red lines. But you can see that border is not a straight line and doesn't match with the red line. All the expected straight lines are bulged out. Visit `Distortion (optics) <en.wikipedia.org/wiki/Distortion_(optics)>`_ for more details.
+ Due to radial distortion, straight lines will appear curved. Its effect is more as we move away from the center of image. For example, one image is shown below, where two edges of a chess board are marked with red lines. But you can see that border is not a straight line and doesn't match with the red line. All the expected straight lines are bulged out. Visit `Distortion (optics) <http://en.wikipedia.org/wiki/Distortion_%28optics%29>`_ for more details.

.. image:: images/calib_radial.jpg
    :alt: Radial Distortion
@@ -139,7 +139,7 @@ So now we have our object points and image points we are ready to go for calibra
Undistortion
---------------

- We have got what we were trying. Now we can take an image and undistort it. OpenCV comes with two methods, we will see both. But before that, we can refine the camera matrix based on a free scaling parameter using **cv2.getOptimalNewCameraMatrix()**. If the scaling parameter ``alpha=0``, it returns undistorts with minimum unwanted pixels. So it may even remove some pixels at image corners. If ``alpha=1``, all pixels are retained with some extra black images. It also returns an image ROI which can be used to crop the result.
+ We have got what we were trying. Now we can take an image and undistort it. OpenCV comes with two methods, we will see both. But before that, we can refine the camera matrix based on a free scaling parameter using **cv2.getOptimalNewCameraMatrix()**. If the scaling parameter ``alpha=0``, it returns undistorted image with minimum unwanted pixels. So it may even remove some pixels at image corners. If ``alpha=1``, all pixels are retained with some extra black images. It also returns an image ROI which can be used to crop the result.

So we take a new image (``left12.jpg`` in this case. That is the first image in this chapter)
::
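
As a reviewer's aside, the refine-and-undistort step the new wording describes would look roughly like the sketch below. This is not the file's own listing; ``mtx`` and ``dist`` are assumed to come from the ``cv2.calibrateCamera()`` call earlier in the tutorial.
::

    import cv2
    import numpy as np

    # mtx, dist are assumed from cv2.calibrateCamera() in the section above
    img = cv2.imread('left12.jpg')
    h, w = img.shape[:2]

    # alpha=1 keeps all source pixels (may leave black borders); alpha=0 keeps only valid pixels
    newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))

    # undistort using the refined camera matrix
    dst = cv2.undistort(img, mtx, dist, None, newcameramtx)

    # crop away the invalid region using the returned ROI
    x, y, w, h = roi
    dst = dst[y:y+h, x:x+w]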

source/py_tutorials/py_calib3d/py_epipolar_geometry/py_epipolar_geometry.rst

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ Before going to depth images, let's first understand some basic concepts in mult
    :align: center


- If we are using only the left camera, we can't find the 3D point corresponding the point :math:`x` in image because every point on the line :math:`OX` projects to the same point on the image plane. But consider the right image also. Now different points on the line :math:`OX` projects to different points (:math:`x'`) in right plane. So with these two images, we can triangulate the correct 3D point. This is the whole idea.
+ If we are using only the left camera, we can't find the 3D point corresponding to the point :math:`x` in image because every point on the line :math:`OX` projects to the same point on the image plane. But consider the right image also. Now different points on the line :math:`OX` projects to different points (:math:`x'`) in right plane. So with these two images, we can triangulate the correct 3D point. This is the whole idea.

The projection of the different points on :math:`OX` form a line on right plane (line :math:`l'`). We call it **epiline** corresponding to the point :math:`x`. It means, to find the point :math:`x` on the right image, search along this epiline. It should be somewhere on this line (Think of it this way, to find the matching point in other image, you need not search the whole image, just search along the epiline. So it provides better performance and accuracy). This is called **Epipolar Constraint**. Similarly all points will have its corresponding epilines in the other image. The plane :math:`XOO'` is called **Epipolar Plane**.
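
As a reviewer's aside, this constraint is what ``cv2.computeCorrespondEpilines()`` implements. A rough sketch, where the matched points ``pts1``/``pts2`` and the fundamental matrix ``F`` are assumed to have been found already (e.g. with ``cv2.findFundamentalMat()``):
::

    import cv2
    import numpy as np

    # pts1, pts2: matched float32 points in the left/right images (assumed)
    # F: fundamental matrix, e.g. F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_LMEDS)

    # epilines in the right image for points of the left image; each row is (a, b, c) of ax + by + c = 0
    lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F).reshape(-1, 3)

    # epilines in the left image for points of the right image
    lines1 = cv2.computeCorrespondEpilines(pts2.reshape(-1, 1, 2), 2, F).reshape(-1, 3)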

source/py_tutorials/py_calib3d/py_pose/py_pose.rst

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ Basics

This is going to be a small section. During the last session on camera calibration, you have found the camera matrix, distortion coefficients etc. Given a pattern image, we can utilize the above information to calculate its pose, or how the object is situated in space, like how it is rotated, how it is displaced etc. For a planar object, we can assume Z=0, such that, the problem now becomes how camera is placed in space to see our pattern image. So, if we know how the object lies in the space, we can draw some 2D diagrams in it to simulate the 3D effect. Let's see how to do it.

- Our problem is, we want to draw our 3D coordinate axis (X, Y, Z axes) on our chessboard's first corner. X in blue color, Y in green color and Z in red color. So in-effect, Z axis should feel like it is perpendicular to our chessboard plane.
+ Our problem is, we want to draw our 3D coordinate axis (X, Y, Z axes) on our chessboard's first corner. X axis in blue color, Y axis in green color and Z axis in red color. So in-effect, Z axis should feel like it is perpendicular to our chessboard plane.

First, let's load the camera matrix and distortion coefficients from the previous calibration result.
::
@@ -40,7 +40,7 @@ Now let's create a function, ``draw`` which takes the corners in the chessboard
    img = cv2.line(img, corner, tuple(imgpts[2].ravel()), (0,0,255), 5)
    return img

- Then as in previous case, we create termination criteria, object points (3D points of corners in chessboard) and axis points. Axis points are points in 3D space for drawing the axis. We draw axis of length 3 (units will be in terms of chess square size since we calibrated based on that size). So our X axis is drawn from (0,0,0) to (3,0,0), so for Y axis. For Z axis, it is drawn from (0,0,0) to (0,0,-2). Negative denotes it is drawn towards the camera.
+ Then as in previous case, we create termination criteria, object points (3D points of corners in chessboard) and axis points. Axis points are points in 3D space for drawing the axis. We draw axis of length 3 (units will be in terms of chess square size since we calibrated based on that size). So our X axis is drawn from (0,0,0) to (3,0,0), so for Y axis. For Z axis, it is drawn from (0,0,0) to (0,0,-3). Negative denotes it is drawn towards the camera.
::

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
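
As a reviewer's aside, the object points and axis points described in the corrected sentence would be set up roughly as below. This is only a sketch; the 7x6 board size is taken from the calibration tutorial and is an assumption here.
::

    import numpy as np

    # 3D corners of a 7x6 chessboard in its own plane (Z = 0, units = one chess square)
    objp = np.zeros((6*7, 3), np.float32)
    objp[:, :2] = np.mgrid[0:7, 0:6].T.reshape(-1, 2)

    # axis endpoints: X to (3,0,0), Y to (0,3,0), Z to (0,0,-3); negative Z points towards the camera
    axis = np.float32([[3, 0, 0], [0, 3, 0], [0, 0, -3]]).reshape(-1, 3)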

source/py_tutorials/py_ml/py_knn/py_knn_opencv/py_knn_opencv.rst

Lines changed: 2 additions & 43 deletions
@@ -52,47 +52,7 @@ Our goal is to build an application which can read the handwritten digits. For t
    print accuracy


- So our basic OCR app is ready. This particular example gave me an accuracy of 91%. One option to improve accuracy is to put this wrong data in our training set and train again. Iterate this process until you get desired accuracy. For example, I modified above code for 10 iterations.
- ::
-
-     img = cv2.imread('digits.png')
-     gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
-
-     cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]
-
-     x = np.array(cells)
-
-     # train and test data
-     train = x[:,:50].reshape(-1,400).astype(np.float32)
-     test = x[:,50:100].reshape(-1,400).astype(np.float32)
-
-     # labels for train and test data
-     k = np.arange(10)
-     train_labels = np.repeat(k,250)[:,np.newaxis]
-     test_labels = train_labels.copy()
-
-     knn = cv2.KNearest()
-
-     for count in xrange(10):
-         knn.train(train,train_labels)
-         ret,res,neighbours,dist = knn.find_nearest(test,k=5)
-
-         matches = res==test_labels
-         correct = np.count_nonzero(matches)
-         accuracy = correct*100.0/res.size
-         print accuracy
-
-         # show wrong ones
-         loc = np.where(matches==0)[0]
-         wrong_images = test[loc]
-         real_labels = train_labels[loc]
-
-         # add it to train data and train labels
-         train_labels = np.vstack((train_labels,real_labels))
-         train = np.vstack((train,wrong_images))
-
-
- Now after 10 iterations, I get an accuracy of 100%. This final training data gives me 100% accuracy. So instead of finding this training data everytime I start application, I better save it, so that next time, I directly read this data from a file and start classification. You can do it with the help of some Numpy functions like np.savetxt, np.savez, np.load etc. Please check their docs for more details.
+ So our basic OCR app is ready. This particular example gave me an accuracy of 91%. One option to improve accuracy is to add more data for training, especially the wrong ones. So instead of finding this training data everytime I start application, I better save it, so that next time, I directly read this data from a file and start classification. You can do it with the help of some Numpy functions like np.savetxt, np.savez, np.load etc. Please check their docs for more details.
::

    # save the data
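
As a reviewer's aside, the save/load step the retained paragraph points to can be sketched as follows. The filename ``knn_data.npz`` and the exact keys are illustrative, not necessarily the file's own listing.
::

    import numpy as np

    # save the training data and labels built above
    np.savez('knn_data.npz', train=train, train_labels=train_labels)

    # later: load them back and go straight to classification
    data = np.load('knn_data.npz')
    train = data['train']
    train_labels = data['train_labels']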
@@ -138,8 +98,7 @@ There are 20000 samples available, so we take first 10000 data as training sampl
    accuracy = correct*100.0/10000
    print accuracy

- It gives me an accuracy of 93.22%. Again, if you want to increase accuracy, you can iteratively add error data in each level as we did in previous example.
-
+ It gives me an accuracy of 93.22%. Again, if you want to increase accuracy, you can iteratively add error data in each level.

Additional Resources
=======================

source/py_tutorials/py_ml/py_knn/py_knn_understanding/py_knn_understanding.rst

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ Now a new member comes into the town and creates a new home, which is shown as g

One method is to check who is his nearest neighbour. From the image, it is clear it is the Red Triangle family. So he is also added into Red Triangle. This method is called simply **Nearest Neighbour**, because classification depends only on the nearest neighbour.

- But there is a problem with that. Red Triangle may be the nearest. But what if there are lot of Blue Squares near to him. Then Blue Squares have more strength in that locality that Red Triangle. So just checking nearest one is not sufficient. Instead we check some `k` nearest families. Then whoever is majority in them, the new guy belongs to that family. In our image, let's take `k=3`, ie 3 nearest families. He has two Red and one Blue (there are two Blues equidistant, but since k=3, we take only one of them), so again he should be added to Red family. But what if we take `k=7`? Then he has 5 Blue families and 2 Red families. Great!! Now he should be added to Blue family. So it all changes with value of k. More funny thing is, what if `k = 4`? He has 2 Red and 2 Blue neighbours. It is a tie !!! So better take k as an odd number. So this method is called **k-Nearest Neighbour** since classification depends on k nearest neighbours.
+ But there is a problem with that. Red Triangle may be the nearest. But what if there are lot of Blue Squares near to him? Then Blue Squares have more strength in that locality than Red Triangle. So just checking nearest one is not sufficient. Instead we check some `k` nearest families. Then whoever is majority in them, the new guy belongs to that family. In our image, let's take `k=3`, ie 3 nearest families. He has two Red and one Blue (there are two Blues equidistant, but since k=3, we take only one of them), so again he should be added to Red family. But what if we take `k=7`? Then he has 5 Blue families and 2 Red families. Great!! Now he should be added to Blue family. So it all changes with value of k. More funny thing is, what if `k = 4`? He has 2 Red and 2 Blue neighbours. It is a tie !!! So better take k as an odd number. So this method is called **k-Nearest Neighbour** since classification depends on k nearest neighbours.

Again, in kNN, it is true we are considering k neighbours, but we are giving equal importance to all, right? Is it justice? For example, take the case of `k=4`. We told it is a tie. But see, the 2 Red families are more closer to him than the other 2 Blue families. So he is more eligible to be added to Red. So how do we mathematically explain that? We give some weights to each family depending on their distance to the new-comer. For those who are near to him get higher weights while those are far away get lower weights. Then we add total weights of each family separately. Whoever gets highest total weights, new-comer goes to that family. This is called **modified kNN**.
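
As a reviewer's aside, that distance-weighted vote can be illustrated with a tiny made-up example (plain NumPy, not an OpenCV API; the distances are invented for the k=4 picture above):
::

    import numpy as np

    # distances from the new-comer to his 4 nearest neighbours and their families
    dist   = np.array([1.0, 1.5, 3.0, 3.5])           # two close Reds, two farther Blues
    family = np.array(['red', 'red', 'blue', 'blue'])

    weights = 1.0/dist                                 # nearer neighbours get higher weight
    red_score  = weights[family == 'red'].sum()        # 1/1.0 + 1/1.5 ~ 1.67
    blue_score = weights[family == 'blue'].sum()       # 1/3.0 + 1/3.5 ~ 0.62

    if red_score > blue_score:
        print 'new-comer goes to the Red family'       # the k=4 tie is broken in favour of Red
    else:
        print 'new-comer goes to the Blue family'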

@@ -68,7 +68,7 @@ You will get something similar to our first image. Since you are using random nu

Next initiate the kNN algorithm and pass the `trainData` and `responses` to train the kNN (It constructs a search tree).

- Then we will bring one new-comer and classify him to a family with the help of kNN in OpenCV. Before going to kNN, we need to know something on our test data (data of new comers). Our data should be a floating point array with size :math:`number \; of \; test_data \times number \; of \; features`. Then we find the nearest neighbours of new-comer. We can specify how many neighbours we want. It returns:
+ Then we will bring one new-comer and classify him to a family with the help of kNN in OpenCV. Before going to kNN, we need to know something on our test data (data of new comers). Our data should be a floating point array with size :math:`number \; of \; testdata \times number \; of \; features`. Then we find the nearest neighbours of new-comer. We can specify how many neighbours we want. It returns:

1. The label given to new-comer depending upon the kNN theory we saw earlier. If you want Nearest Neighbour algorithm, just specify `k=1` where k is the number of neighbours.
2. The labels of k-Nearest Neighbours.
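
As a reviewer's aside, the train/find step described above uses the ``cv2.KNearest`` interface of this OpenCV 2.4-era tutorial; a minimal sketch with made-up data:
::

    import cv2
    import numpy as np

    # 25 random (x, y) points labelled 0 (Red) or 1 (Blue)
    trainData = np.random.randint(0, 100, (25, 2)).astype(np.float32)
    responses = np.random.randint(0, 2, (25, 1)).astype(np.float32)

    # one new-comer; test data must be float32 of shape (number of test samples) x (number of features)
    newcomer = np.random.randint(0, 100, (1, 2)).astype(np.float32)

    knn = cv2.KNearest()
    knn.train(trainData, responses)
    ret, results, neighbours, dist = knn.find_nearest(newcomer, k=3)

    print "result: ", results         # label assigned to the new-comer
    print "neighbours: ", neighbours  # labels of the k nearest neighbours
    print "distance: ", dist          # distances to those neighbours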
