
Commit 29f038b

Pushing the docs to dev/ for branch: master, commit 84f5ee6529b2dd0d374b0ce0b172916d9b4e356b
1 parent b1ef906 commit 29f038b

File tree: 1,053 files changed, +3230 −3217 lines


dev/_downloads/plot_rbf_parameters.ipynb

+1 −1
@@ -15,7 +15,7 @@
 15  15    "cell_type": "markdown",
 16  16    "metadata": {},
 17  17    "source": [
 18      -  "\n# RBF SVM parameters\n\n\nThis example illustrates the effect of the parameters ``gamma`` and ``C`` of\nthe Radial Basis Function (RBF) kernel SVM.\n\nIntuitively, the ``gamma`` parameter defines how far the influence of a single\ntraining example reaches, with low values meaning 'far' and high values meaning\n'close'. The ``gamma`` parameters can be seen as the inverse of the radius of\ninfluence of samples selected by the model as support vectors.\n\nThe ``C`` parameter trades off misclassification of training examples against\nsimplicity of the decision surface. A low ``C`` makes the decision surface\nsmooth, while a high ``C`` aims at classifying all training examples correctly\nby giving the model freedom to select more samples as support vectors.\n\nThe first plot is a visualization of the decision function for a variety of\nparameter values on a simplified classification problem involving only 2 input\nfeatures and 2 possible target classes (binary classification). Note that this\nkind of plot is not possible to do for problems with more features or target\nclasses.\n\nThe second plot is a heatmap of the classifier's cross-validation accuracy as a\nfunction of ``C`` and ``gamma``. For this example we explore a relatively large\ngrid for illustration purposes. In practice, a logarithmic grid from\n$10^{-3}$ to $10^3$ is usually sufficient. If the best parameters\nlie on the boundaries of the grid, it can be extended in that direction in a\nsubsequent search.\n\nNote that the heat map plot has a special colorbar with a midpoint value close\nto the score values of the best performing models so as to make it easy to tell\nthem apart in the blink of an eye.\n\nThe behavior of the model is very sensitive to the ``gamma`` parameter. If\n``gamma`` is too large, the radius of the area of influence of the support\nvectors only includes the support vector itself and no amount of\nregularization with ``C`` will be able to prevent overfitting.\n\nWhen ``gamma`` is very small, the model is too constrained and cannot capture\nthe complexity or \"shape\" of the data. The region of influence of any selected\nsupport vector would include the whole training set. The resulting model will\nbehave similarly to a linear model with a set of hyperplanes that separate the\ncenters of high density of any pair of two classes.\n\nFor intermediate values, we can see on the second plot that good models can\nbe found on a diagonal of ``C`` and ``gamma``. Smooth models (lower ``gamma``\nvalues) can be made more complex by selecting a larger number of support\nvectors (larger ``C`` values) hence the diagonal of good performing models.\n\nFinally one can also observe that for some intermediate values of ``gamma`` we\nget equally performing models when ``C`` becomes very large: it is not\nnecessary to regularize by limiting the number of support vectors. The radius of\nthe RBF kernel alone acts as a good structural regularizer. In practice though\nit might still be interesting to limit the number of support vectors with a\nlower value of ``C`` so as to favor models that use less memory and that are\nfaster to predict.\n\nWe should also note that small differences in scores results from the random\nsplits of the cross-validation procedure. Those spurious variations can be\nsmoothed out by increasing the number of CV iterations ``n_splits`` at the\nexpense of compute time. Increasing the value number of ``C_range`` and\n``gamma_range`` steps will increase the resolution of the hyper-parameter heat\nmap.\n\n\n"
     18  +  "\n# RBF SVM parameters\n\n\nThis example illustrates the effect of the parameters ``gamma`` and ``C`` of\nthe Radial Basis Function (RBF) kernel SVM.\n\nIntuitively, the ``gamma`` parameter defines how far the influence of a single\ntraining example reaches, with low values meaning 'far' and high values meaning\n'close'. The ``gamma`` parameter can be seen as the inverse of the radius of\ninfluence of samples selected by the model as support vectors.\n\nThe ``C`` parameter trades off correct classification of training examples\nagainst maximization of the decision function's margin. For larger values of\n``C``, a smaller margin will be accepted if the decision function is better at\nclassifying all training points correctly. A lower ``C`` will encourage a larger\nmargin, therefore a simpler decision function, at the cost of training accuracy.\nIn other words, ``C`` behaves as a regularization parameter in the SVM.\n\nThe first plot is a visualization of the decision function for a variety of\nparameter values on a simplified classification problem involving only 2 input\nfeatures and 2 possible target classes (binary classification). Note that this\nkind of plot is not possible to do for problems with more features or target\nclasses.\n\nThe second plot is a heatmap of the classifier's cross-validation accuracy as a\nfunction of ``C`` and ``gamma``. For this example we explore a relatively large\ngrid for illustration purposes. In practice, a logarithmic grid from\n$10^{-3}$ to $10^3$ is usually sufficient. If the best parameters\nlie on the boundaries of the grid, it can be extended in that direction in a\nsubsequent search.\n\nNote that the heat map plot has a special colorbar with a midpoint value close\nto the score values of the best performing models so as to make it easy to tell\nthem apart in the blink of an eye.\n\nThe behavior of the model is very sensitive to the ``gamma`` parameter. If\n``gamma`` is too large, the radius of the area of influence of the support\nvectors only includes the support vector itself and no amount of\nregularization with ``C`` will be able to prevent overfitting.\n\nWhen ``gamma`` is very small, the model is too constrained and cannot capture\nthe complexity or \"shape\" of the data. The region of influence of any selected\nsupport vector would include the whole training set. The resulting model will\nbehave similarly to a linear model with a set of hyperplanes that separate the\ncenters of high density of any pair of two classes.\n\nFor intermediate values, we can see on the second plot that good models can\nbe found on a diagonal of ``C`` and ``gamma``. Smooth models (lower ``gamma``\nvalues) can be made more complex by increasing the importance of classifying\neach point correctly (larger ``C`` values), hence the diagonal of well-performing\nmodels.\n\nFinally one can also observe that for some intermediate values of ``gamma`` we\nget equally performing models when ``C`` becomes very large: it is not necessary\nto regularize by enforcing a larger margin. The radius of the RBF kernel alone\nacts as a good structural regularizer. In practice though it might still be\ninteresting to simplify the decision function with a lower value of ``C`` so as\nto favor models that use less memory and that are faster to predict.\n\nWe should also note that small differences in scores result from the random\nsplits of the cross-validation procedure. Those spurious variations can be\nsmoothed out by increasing the number of CV iterations ``n_splits`` at the\nexpense of compute time. Increasing the number of ``C_range`` and\n``gamma_range`` steps will increase the resolution of the hyper-parameter heat\nmap.\n\n\n"
 19  19    ]
 20  20    },
 21  21    {
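
The notebook cell above describes ``gamma`` as the inverse of the radius of influence of the support vectors. A minimal sketch of that effect (not part of this commit; the dataset and parameter values are illustrative assumptions):

# Sketch only: how ``gamma`` drives an RBF SVM between under- and overfitting.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)  # toy 2-class data

for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)
    # With a very large gamma each support vector's influence shrinks to
    # (almost) itself: training accuracy nears 1.0 and most samples end up
    # as support vectors, the overfitting regime the text warns about.
    print(gamma, int(clf.n_support_.sum()), clf.score(X, y))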

dev/_downloads/plot_rbf_parameters.py

+14 −12
@@ -11,10 +11,12 @@
 11  11   'close'. The ``gamma`` parameter can be seen as the inverse of the radius of
 12  12   influence of samples selected by the model as support vectors.
 13  13
 14      - The ``C`` parameter trades off misclassification of training examples against
 15      - simplicity of the decision surface. A low ``C`` makes the decision surface
 16      - smooth, while a high ``C`` aims at classifying all training examples correctly
 17      - by giving the model freedom to select more samples as support vectors.
     14  + The ``C`` parameter trades off correct classification of training examples
     15  + against maximization of the decision function's margin. For larger values of
     16  + ``C``, a smaller margin will be accepted if the decision function is better at
     17  + classifying all training points correctly. A lower ``C`` will encourage a larger
     18  + margin, therefore a simpler decision function, at the cost of training accuracy.
     19  + In other words, ``C`` behaves as a regularization parameter in the SVM.
 18  20
 19  21   The first plot is a visualization of the decision function for a variety of
 20  22   parameter values on a simplified classification problem involving only 2 input
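
The replacement paragraph recasts ``C`` as a regularization parameter: a larger value buys training accuracy at the cost of a narrower margin. A hedged sketch of that trade-off (not from the diff; the synthetic dataset and values are assumptions):

# Sketch only: ``C`` trades margin width against training accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=2, n_redundant=0,
                           n_clusters_per_class=1, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", gamma=1.0, C=C).fit(X_train, y_train)
    # Low C: wide margin, smoother decision function, lower training score.
    # High C: narrower margin accepted, training score climbs; the test
    # score shows whether the extra complexity actually generalizes.
    print(C, clf.score(X_train, y_train), clf.score(X_test, y_test))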
@@ -46,16 +48,16 @@
 46  48
 47  49   For intermediate values, we can see on the second plot that good models can
 48  50   be found on a diagonal of ``C`` and ``gamma``. Smooth models (lower ``gamma``
 49      - values) can be made more complex by selecting a larger number of support
 50      - vectors (larger ``C`` values) hence the diagonal of good performing models.
     51  + values) can be made more complex by increasing the importance of classifying
     52  + each point correctly (larger ``C`` values), hence the diagonal of well-performing
     53  + models.
 51  54
 52  55   Finally one can also observe that for some intermediate values of ``gamma`` we
 53      - get equally performing models when ``C`` becomes very large: it is not
 54      - necessary to regularize by limiting the number of support vectors. The radius of
 55      - the RBF kernel alone acts as a good structural regularizer. In practice though
 56      - it might still be interesting to limit the number of support vectors with a
 57      - lower value of ``C`` so as to favor models that use less memory and that are
 58      - faster to predict.
     56  + get equally performing models when ``C`` becomes very large: it is not necessary
     57  + to regularize by enforcing a larger margin. The radius of the RBF kernel alone
     58  + acts as a good structural regularizer. In practice though it might still be
     59  + interesting to simplify the decision function with a lower value of ``C`` so as
     60  + to favor models that use less memory and that are faster to predict.
 59  61
 60  62   We should also note that small differences in scores result from the random
 61  63   splits of the cross-validation procedure. Those spurious variations can be
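
The paragraph above refers to ``n_splits``, ``C_range`` and ``gamma_range`` from the example script. A condensed sketch of the search it describes, assuming the script's logarithmic grid and a StratifiedShuffleSplit cross-validator (iris is a stand-in dataset):

# Sketch only: grid-search C and gamma on a logarithmic grid with CV.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
C_range = np.logspace(-3, 3, 7)      # 10^-3 .. 10^3, as the text recommends
gamma_range = np.logspace(-3, 3, 7)  # more steps -> finer heatmap resolution

# Raising n_splits smooths out spurious CV-score variation at extra cost.
cv = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
grid = GridSearchCV(SVC(), {"C": C_range, "gamma": gamma_range}, cv=cv)
grid.fit(X, y)
print("best:", grid.best_params_, grid.best_score_)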


dev/_downloads/scikit-learn-docs.pdf: −10.9 KB (binary file not shown)

dev/_images/iris.png and other images under dev/_images/

Binary files not shown (size changes only).

dev/_sources/auto_examples/applications/plot_face_recognition.rst.txt

+21 −21

dev/_sources/auto_examples/applications/plot_model_complexity_influence.rst.txt

+14 −14

0 commit comments