
Commit 2653833

DOC some fixes to the doc build.
1 parent 4bd1286 commit 2653833

25 files changed (+198, -167 lines)

doc/datasets/index.rst

Lines changed: 6 additions & 6 deletions
@@ -267,26 +267,26 @@ features::

 .. include:: rcv1.rst

-.. _boston_house_prices
+.. _boston_house_prices:

 .. include:: ../../sklearn/datasets/descr/boston_house_prices.rst

-.. _breast_cancer
+.. _breast_cancer:

 .. include:: ../../sklearn/datasets/descr/breast_cancer.rst

-.. _diabetes
+.. _diabetes:

 .. include:: ../../sklearn/datasets/descr/diabetes.rst

-.. _digits
+.. _digits:

 .. include:: ../../sklearn/datasets/descr/digits.rst

-.. _iris
+.. _iris:

 .. include:: ../../sklearn/datasets/descr/iris.rst

-.. _linnerud
+.. _linnerud:

 .. include:: ../../sklearn/datasets/descr/linnerud.rst
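
The targets fixed above anchor the bundled dataset descriptions, which are the same texts the loaders expose at runtime. A minimal sketch (standard loaders only, shapes noted for orientation)::

    from sklearn.datasets import load_boston, load_iris

    # Each toy-dataset loader bundles the description .rst file that the
    # corresponding label above (e.g. ``_iris``) now points to.
    iris = load_iris()
    print(iris.data.shape)      # (150, 4)
    print(iris.DESCR[:200])     # opening of datasets/descr/iris.rst

    boston = load_boston()
    print(boston.data.shape)    # (506, 13)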

doc/datasets/rcv1.rst

Lines changed: 2 additions & 2 deletions
@@ -41,10 +41,10 @@ There are 103 topics, each represented by a string. Their corpus frequencies spa
     >>> rcv1.target_names[:3].tolist()  # doctest: +SKIP
     ['E11', 'ECAT', 'M11']

-The dataset will be downloaded from the `dataset's homepage`_ if necessary.
+The dataset will be downloaded from the `rcv1 homepage`_ if necessary.
 The compressed size is about 656 MB.

-.. _dataset's homepage: http://jmlr.csail.mit.edu/papers/volume5/lewis04a/
+.. _rcv1 homepage: http://jmlr.csail.mit.edu/papers/volume5/lewis04a/


 .. topic:: References
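
A minimal sketch of the download behaviour described above, using the ``fetch_rcv1`` loader::

    from sklearn.datasets import fetch_rcv1

    # The first call downloads the compressed archive (about 656 MB) from
    # the rcv1 homepage and caches it locally; later calls reuse the cache.
    rcv1 = fetch_rcv1()
    print(rcv1.data.shape)                  # (804414, 47236), sparse CSR
    print(rcv1.target_names[:3].tolist())   # e.g. ['E11', 'ECAT', 'M11']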

doc/modules/decomposition.rst

Lines changed: 1 addition & 0 deletions
@@ -776,6 +776,7 @@ a corpus with :math:`D` documents and :math:`K` topics:
 2. For each document :math:`d`, draw :math:`\theta_d \sim Dirichlet(\alpha), \: d=1...D`

 3. For each word :math:`i` in document :math:`d`:
+
    a. Draw a topic index :math:`z_{di} \sim Multinomial(\theta_d)`
    b. Draw the observed word :math:`w_{ij} \sim Multinomial(beta_{z_{di}}.)`
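
The generative process above is what ``decomposition.LatentDirichletAllocation`` implements. A minimal sketch, assuming this version's ``n_topics`` keyword for :math:`K` (``doc_topic_prior`` and ``topic_word_prior`` correspond to the Dirichlet hyperparameters)::

    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    # Toy bag-of-words counts: 6 documents over a vocabulary of 8 terms.
    rng = np.random.RandomState(0)
    X = rng.randint(0, 5, size=(6, 8))

    lda = LatentDirichletAllocation(n_topics=2, random_state=0)
    doc_topic = lda.fit_transform(X)   # rows approximate theta_d
    topic_word = lda.components_       # unnormalized topic-word weights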

doc/modules/feature_selection.rst

Lines changed: 2 additions & 0 deletions
@@ -153,6 +153,8 @@ For examples on how it is to be used refer to the sections below.
   most important features from the Boston dataset without knowing the
   threshold beforehand.

+.. _l1_feature_selection:
+
 L1-based feature selection
 --------------------------
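
With the new target in place, the section can be cross-referenced via the ``l1_feature_selection`` label. A minimal sketch of what it documents, using ``SelectFromModel`` with an L1-penalized estimator::

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectFromModel
    from sklearn.svm import LinearSVC

    iris = load_iris()
    X, y = iris.data, iris.target

    # The L1 penalty drives some coefficients exactly to zero;
    # SelectFromModel keeps the features with non-zero coefficients.
    lsvc = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
    model = SelectFromModel(lsvc, prefit=True)
    X_new = model.transform(X)   # shape (150, 3): one feature dropped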

doc/modules/gaussian_process.rst

Lines changed: 33 additions & 26 deletions
@@ -67,12 +67,15 @@ level from the data (see example below).

 The implementation is based on Algorithm 2.1 of [RW2006]_. In addition to
 the API of standard sklearn estimators, GaussianProcessRegressor:
-   * allows prediction without prior fitting (based on the GP prior)
-   * provides an additional method ``sample_y(X)``, which evaluates samples
-     drawn from the GPR (prior or posterior) at given inputs
-   * exposes a method ``log_marginal_likelihood(theta)``, which can be used
-     externally for other ways of selecting hyperparameters, e.g., via
-     Markov chain Monte Carlo.
+
+   * allows prediction without prior fitting (based on the GP prior)
+
+   * provides an additional method ``sample_y(X)``, which evaluates samples
+     drawn from the GPR (prior or posterior) at given inputs
+
+   * exposes a method ``log_marginal_likelihood(theta)``, which can be used
+     externally for other ways of selecting hyperparameters, e.g., via
+     Markov chain Monte Carlo.


 GPR examples

@@ -171,26 +174,30 @@ model the CO2 concentration as a function of the time t.

 The kernel is composed of several terms that are responsible for explaining
 different properties of the signal:
-- a long term, smooth rising trend is to be explained by an RBF kernel. The
-  RBF kernel with a large length-scale enforces this component to be smooth;
-  it is not enforced that the trend is rising which leaves this choice to the
-  GP. The specific length-scale and the amplitude are free hyperparameters.
-- a seasonal component, which is to be explained by the periodic
-  ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
-  of this periodic component, controlling its smoothness, is a free parameter.
-  In order to allow decaying away from exact periodicity, the product with an
-  RBF kernel is taken. The length-scale of this RBF component controls the
-  decay time and is a further free parameter.
-- smaller, medium term irregularities are to be explained by a
-  RationalQuadratic kernel component, whose length-scale and alpha parameter,
-  which determines the diffuseness of the length-scales, are to be determined.
-  According to [RW2006]_, these irregularities can better be explained by
-  a RationalQuadratic than an RBF kernel component, probably because it can
-  accommodate several length-scales.
-- a "noise" term, consisting of an RBF kernel contribution, which shall
-  explain the correlated noise components such as local weather phenomena,
-  and a WhiteKernel contribution for the white noise. The relative amplitudes
-  and the RBF's length scale are further free parameters.
+
+- a long term, smooth rising trend is to be explained by an RBF kernel. The
+  RBF kernel with a large length-scale enforces this component to be smooth;
+  it is not enforced that the trend is rising which leaves this choice to the
+  GP. The specific length-scale and the amplitude are free hyperparameters.
+
+- a seasonal component, which is to be explained by the periodic
+  ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
+  of this periodic component, controlling its smoothness, is a free parameter.
+  In order to allow decaying away from exact periodicity, the product with an
+  RBF kernel is taken. The length-scale of this RBF component controls the
+  decay time and is a further free parameter.
+
+- smaller, medium term irregularities are to be explained by a
+  RationalQuadratic kernel component, whose length-scale and alpha parameter,
+  which determines the diffuseness of the length-scales, are to be determined.
+  According to [RW2006]_, these irregularities can better be explained by
+  a RationalQuadratic than an RBF kernel component, probably because it can
+  accommodate several length-scales.
+
+- a "noise" term, consisting of an RBF kernel contribution, which shall
+  explain the correlated noise components such as local weather phenomena,
+  and a WhiteKernel contribution for the white noise. The relative amplitudes
+  and the RBF's length scale are further free parameters.

 Maximizing the log-marginal-likelihood after subtracting the target's mean
 yields the following kernel with an LML of -83.214:
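
A minimal sketch tying the two hunks above together: the three ``GaussianProcessRegressor`` extensions, plus a composite kernel in the spirit of the CO2 example (the hyperparameter values are illustrative placeholders, not the tuned ones from the example)::

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import (
        RBF, ExpSineSquared, RationalQuadratic, WhiteKernel)

    # Long-term trend + decaying seasonality + medium-term irregularities
    # + correlated and white noise, as in the bullet list above.
    kernel = (50.0**2 * RBF(length_scale=50.0)
              + 2.0**2 * RBF(length_scale=100.0)
                       * ExpSineSquared(length_scale=1.0, periodicity=1.0)
              + 0.5**2 * RationalQuadratic(length_scale=1.0, alpha=1.0)
              + 0.1**2 * RBF(length_scale=0.1)
              + WhiteKernel(noise_level=0.1**2))

    X = np.linspace(0, 10, 30)[:, np.newaxis]
    y = np.sin(X).ravel()

    gpr = GaussianProcessRegressor(kernel=kernel)

    # 1. Prediction without prior fitting draws on the GP prior.
    prior_mean, prior_std = gpr.predict(X, return_std=True)

    gpr.fit(X, y)

    # 2. sample_y evaluates draws from the (now posterior) GP at X.
    samples = gpr.sample_y(X, n_samples=3)

    # 3. log_marginal_likelihood at arbitrary (log-transformed) theta,
    #    usable from an external sampler, e.g. MCMC.
    lml = gpr.log_marginal_likelihood(gpr.kernel_.theta)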

doc/modules/multiclass.rst

Lines changed: 4 additions & 4 deletions
@@ -215,7 +215,7 @@ code book. The code size is the dimensionality of the aforementioned space.
 Intuitively, each class should be represented by a code as unique as
 possible and a good code book should be designed to optimize classification
 accuracy. In this implementation, we simply use a randomly-generated code
-book as advocated in [2]_ although more elaborate methods may be added in the
+book as advocated in [3]_ although more elaborate methods may be added in the
 future.

 At fitting time, one binary classifier per bit in the code book is fitted.

@@ -262,16 +262,16 @@ Below is an example of multiclass learning using Output-Codes::

 .. topic:: References:

-    .. [1] "Solving multiclass learning problems via error-correcting output codes",
+    .. [2] "Solving multiclass learning problems via error-correcting output codes",
        Dietterich T., Bakiri G.,
        Journal of Artificial Intelligence Research 2,
        1995.

-    .. [2] "The error coding method and PICTs",
+    .. [3] "The error coding method and PICTs",
        James G., Hastie T.,
        Journal of Computational and Graphical statistics 7,
        1998.

-    .. [3] "The Elements of Statistical Learning",
+    .. [4] "The Elements of Statistical Learning",
        Hastie T., Tibshirani R., Friedman J., page 606 (second-edition)
        2008.
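
A minimal sketch of the randomly-generated code book in use, via ``OutputCodeClassifier``::

    from sklearn.datasets import load_iris
    from sklearn.multiclass import OutputCodeClassifier
    from sklearn.svm import LinearSVC

    iris = load_iris()

    # code_size=2 asks for a code book with 2 * n_classes bits; the book
    # itself is drawn at random, as the paragraph above notes.
    clf = OutputCodeClassifier(LinearSVC(random_state=0),
                               code_size=2, random_state=0)
    y_pred = clf.fit(iris.data, iris.target).predict(iris.data)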

doc/modules/neural_networks_supervised.rst

Lines changed: 3 additions & 3 deletions
@@ -153,7 +153,7 @@ See the examples below and the doc string of

 .. topic:: Examples:

-  * :ref:`example_plot_mlp_alpha.py`
+  * :ref:`example_neural_networks_plot_mlp_alpha.py`


 Regression

@@ -175,7 +175,7 @@ Algorithms
 MLP trains using `Stochastic Gradient Descent
 <http://en.wikipedia.org/wiki/Stochastic_gradient_descent>`_,
 `Adam <http://arxiv.org/abs/1412.6980>`_, or
-`L-BFGS <http://en.wikipedia.org/wiki/Limited-memory_BFGS>`_.
+`L-BFGS <http://en.wikipedia.org/wiki/Limited-memory_BFGS>`__.
 Stochastic Gradient Descent (SGD) updates parameters using the gradient of the
 loss function with respect to a parameter that needs adaptation, i.e.

@@ -201,7 +201,7 @@ L-BFGS is a fast learning algorithm that approximates the Hessian matrix which
 represents the second-order partial derivative of a function. Further it
 approximates the inverse of the Hessian matrix to perform parameter updates.
 The implementation uses the Scipy version of
-`L-BFGS <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html>`_..
+`L-BFGS <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html>`__..

 If the selected algorithm is 'L-BFGS', training does not support online nor
 mini-batch learning.
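
A minimal sketch of choosing the training algorithm discussed above, using this development version's ``algorithm`` keyword (renamed ``solver`` in later releases)::

    from sklearn.neural_network import MLPClassifier

    X = [[0., 0.], [1., 1.]]
    y = [0, 1]

    # L-BFGS is a batch method: as noted above, it supports neither online
    # nor mini-batch learning; 'sgd' and 'adam' are the online options.
    clf = MLPClassifier(algorithm='l-bfgs', hidden_layer_sizes=(5, 2),
                        random_state=1)
    clf.fit(X, y)
    print(clf.predict([[2., 2.], [-1., -2.]]))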

doc/modules/outlier_detection.rst

Lines changed: 1 addition & 0 deletions
@@ -169,6 +169,7 @@ This strategy is illustrated below.
    :class:`covariance.MinCovDet`.

 .. topic:: References:
+
     .. [LTZ2008] Liu, Fei Tony, Ting, Kai Ming and Zhou, Zhi-Hua. "Isolation forest."
         Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on.
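
A minimal sketch of the isolation-forest strategy of [LTZ2008]_, using ``ensemble.IsolationForest``::

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(42)
    X_train = 0.3 * rng.randn(100, 2)                      # regular data
    X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))

    clf = IsolationForest(random_state=rng)
    clf.fit(X_train)
    print(clf.predict(X_outliers))   # +1 for inliers, -1 for outliers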

doc/whats_new.rst

Lines changed: 9 additions & 9 deletions
@@ -33,7 +33,7 @@ Enhancements
    that takes in the data and yields a generator for the different splits.
    This change makes it possible to do nested cross-validation with ease,
    facilitated by :class:`model_selection.GridSearchCV` and similar
-   utilities. (`#4294 https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by `Raghav R V`_.
+   utilities. (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by `Raghav R V`_.

 - The random forest, extra trees and decision tree estimators now has a
   method ``decision_path`` which returns the decision path of samples in

@@ -56,16 +56,16 @@ Bug fixes
 .........

 - :class:`RandomizedPCA` default number of `iterated_power` is 2 instead of 3.
-  This is a speed up with a minor precision decrease. (`#5141 https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.
+  This is a speed up with a minor precision decrease. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

 - :func:`randomized_svd` performs 2 power iterations by default, instead or 0.
   In practice this is often enough for obtaining a good approximation of the
-  true eigenvalues/vectors in the presence of noise. (`#5141 https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.
+  true eigenvalues/vectors in the presence of noise. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

 - :func:`randomized_range_finder` is more numerically stable when many
   power iterations are requested, since it applies LU normalization by default.
   If `n_iter<2` numerical issues are unlikely, thus no normalization is applied.
-  Other normalization options are available: 'none', 'LU' and 'QR'. (`#5141 https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.
+  Other normalization options are available: 'none', 'LU' and 'QR'. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

 - Fixed bug in :func:`manifold.spectral_embedding` where diagonal of unnormalized
   Laplacian matrix was incorrectly set to 1. By `Peter Fischer`_.

@@ -85,7 +85,7 @@ API changes summary
 - The :mod:`cross_validation`, :mod:`grid_search` and :mod:`learning_curve`
   have been deprecated and the classes and functions have been reorganized into
   the :mod:`model_selection` module.
-  (`#4294 https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by `Raghav R V`_.
+  (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by `Raghav R V`_.


 .. _changes_0_17:

@@ -366,7 +366,7 @@ Bug fixes

 - Fixed bug in :class:`cross_decomposition.PLS` that yielded unstable and
   platform dependent output, and failed on `fit_transform`.
-  By `Arthur Mensch`_.
+  By `Arthur Mensch`_.

 - Fixed a bug in :class:`linear_model.LogisticRegression` and
   :class:`linear_model.LogisticRegressionCV` when using

@@ -3403,8 +3403,8 @@ Changelog

 - New :ref:`gaussian_process` module by Vincent Dubourg. This module
   also has great documentation and some very neat examples. See
-  :ref:`example_gaussian_process_plot_gp_regression.py` or
-  :ref:`example_gaussian_process_plot_gp_probabilistic_classification_after_regression.py`
+  example_gaussian_process_plot_gp_regression.py or
+  example_gaussian_process_plot_gp_probabilistic_classification_after_regression.py
   for a taste of what can be done.

 - It is now possible to use liblinear’s Multi-class SVC (option

@@ -3866,4 +3866,4 @@ David Huard, Dave Morrill, Ed Schofield, Travis Oliphant, Pearu Peterson.
 .. _Graham Clenaghan: https://github.com/gclenaghan
 .. _Giorgio Patrini: https://github.com/giorgiop
 .. _Elvis Dohmatob: https://github.com/dohmatob
-.. _yelite https://github.com/yelite
+.. _yelite: https://github.com/yelite
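
A minimal sketch of the nested cross-validation workflow that the ``model_selection`` rework (#4294) enables::

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
    from sklearn.svm import SVC

    iris = load_iris()

    # The new splitters take the data in split(), so the same GridSearchCV
    # can be re-fit on every outer training fold.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=1)

    clf = GridSearchCV(SVC(), param_grid={'C': [1, 10]}, cv=inner_cv)
    nested_scores = cross_val_score(clf, iris.data, iris.target, cv=outer_cv)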

examples/applications/face_recognition.py

Lines changed: 3 additions & 1 deletion
@@ -12,8 +12,9 @@

 Expected results for the top 5 most represented people in the dataset::

+  ================== ============ ======= ========== =======
                      precision    recall  f1-score   support
-
+  ================== ============ ======= ========== =======
       Ariel Sharon       0.67      0.92      0.77       13
       Colin Powell       0.75      0.78      0.76       60
    Donald Rumsfeld       0.78      0.67      0.72       27

@@ -23,6 +24,7 @@
         Tony Blair       0.81      0.69      0.75       36

        avg / total       0.80      0.80      0.80      322
+  ================== ============ ======= ========== =======

 """
 from __future__ import print_function
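
A minimal sketch of how a table like the one above is produced, via ``metrics.classification_report`` (toy labels, not the LFW results)::

    from sklearn.metrics import classification_report

    y_true = [0, 1, 2, 2, 0]
    y_pred = [0, 0, 2, 2, 0]
    names = ['Ariel Sharon', 'Colin Powell', 'Donald Rumsfeld']

    # Prints per-class precision, recall, f1-score and support, plus the
    # avg / total row shown in the docstring table.
    print(classification_report(y_true, y_pred, target_names=names))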
