Commit 974ceb5

chrisfosterelli authored and amueller committed
[WIP] Proposed C parameter clarifications in RBF SVM parameter example (scikit-learn#8957)
* Proposed rewrite
* Update plot_rbf_parameters.py: use phrasing suggested by @vene.
1 parent 6ade3f8 commit 974ceb5

1 file changed, +14 -12 lines changed


examples/svm/plot_rbf_parameters.py

Lines changed: 14 additions & 12 deletions
@@ -11,10 +11,12 @@
 'close'. The ``gamma`` parameters can be seen as the inverse of the radius of
 influence of samples selected by the model as support vectors.
 
-The ``C`` parameter trades off misclassification of training examples against
-simplicity of the decision surface. A low ``C`` makes the decision surface
-smooth, while a high ``C`` aims at classifying all training examples correctly
-by giving the model freedom to select more samples as support vectors.
+The ``C`` parameter trades off correct classification of training examples
+against maximization of the decision function's margin. For larger values of
+``C``, a smaller margin will be accepted if the decision function is better at
+classifying all training points correctly. A lower ``C`` will encourage a larger
+margin, therefore a simpler decision function, at the cost of training accuracy.
+In other words, ``C`` behaves as a regularization parameter in the SVM.
 
 The first plot is a visualization of the decision function for a variety of
 parameter values on a simplified classification problem involving only 2 input
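The trade-off the new paragraph describes is easy to observe directly. Below is a minimal sketch (not part of this commit; the dataset, the fixed ``gamma``, and the ``C`` values are arbitrary choices for illustration) that fits ``SVC`` with an RBF kernel at several values of ``C`` and reports training accuracy alongside the number of support vectors:

    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    # Toy two-class problem with enough noise that the classes overlap.
    X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

    for C in (0.01, 1.0, 100.0):
        clf = SVC(kernel="rbf", gamma=1.0, C=C).fit(X, y)
        # Larger C pushes the model to classify every training point
        # correctly (higher training accuracy, smaller margin); smaller C
        # accepts more training errors in exchange for a simpler surface.
        print(f"C={C:>6}: train accuracy={clf.score(X, y):.3f}, "
              f"support vectors={clf.n_support_.sum()}")

Watching the training accuracy climb with ``C`` makes the regularization reading in the last added line concrete.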
@@ -46,16 +48,16 @@
 
 For intermediate values, we can see on the second plot that good models can
 be found on a diagonal of ``C`` and ``gamma``. Smooth models (lower ``gamma``
-values) can be made more complex by selecting a larger number of support
-vectors (larger ``C`` values) hence the diagonal of good performing models.
+values) can be made more complex by increasing the importance of classifying
+each point correctly (larger ``C`` values), hence the diagonal of good-performing
+models.
 
 Finally one can also observe that for some intermediate values of ``gamma`` we
-get equally performing models when ``C`` becomes very large: it is not
-necessary to regularize by limiting the number of support vectors. The radius of
-the RBF kernel alone acts as a good structural regularizer. In practice though
-it might still be interesting to limit the number of support vectors with a
-lower value of ``C`` so as to favor models that use less memory and that are
-faster to predict.
+get equally performing models when ``C`` becomes very large: it is not necessary
+to regularize by enforcing a larger margin. The radius of the RBF kernel alone
+acts as a good structural regularizer. In practice though it might still be
+interesting to simplify the decision function with a lower value of ``C`` so as
+to favor models that use less memory and that are faster to predict.
 
 We should also note that small differences in scores result from the random
 splits of the cross-validation procedure. Those spurious variations can be
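The diagonal of good-performing models and the flat behaviour at very large ``C`` discussed in the second hunk both come out of a cross-validated grid search over ``C`` and ``gamma``. A condensed sketch of that kind of search (illustrative only, not the example file's actual code; the iris data, grid bounds, and CV settings here are assumptions):

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Logarithmic grids: the interesting structure spans many decades.
    C_range = np.logspace(-2, 10, 7)
    gamma_range = np.logspace(-9, 3, 7)

    cv = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
    grid = GridSearchCV(SVC(kernel="rbf"),
                        param_grid={"C": C_range, "gamma": gamma_range},
                        cv=cv)
    grid.fit(X, y)

    # Rows index C, columns index gamma (gamma varies fastest in
    # cv_results_). The band of high scores runs along a diagonal, and for
    # intermediate gamma it stays flat as C grows, as the text describes.
    scores = grid.cv_results_["mean_test_score"].reshape(len(C_range),
                                                         len(gamma_range))
    print("best parameters:", grid.best_params_)
    print(np.round(scores, 2))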

0 commit comments