@@ -70,6 +70,7 @@ Scoring                           Function
 'neg_log_loss'                    :func:`metrics.log_loss`                          requires ``predict_proba`` support
 'precision' etc.                  :func:`metrics.precision_score`                   suffixes apply as with 'f1'
 'recall' etc.                     :func:`metrics.recall_score`                      suffixes apply as with 'f1'
+'jaccard' etc.                    :func:`metrics.jaccard_score`                     suffixes apply as with 'f1'
 'roc_auc'                         :func:`metrics.roc_auc_score`

 **Clustering**
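
Because the suffixes work as with 'f1', a string such as ``'jaccard_macro'`` is
expected to select :func:`metrics.jaccard_score` with ``average='macro'``. A
minimal sketch of how such a scorer string would be used (assuming the suffixed
names are registered, as the table states; the estimator and dataset here are
only illustrative)::

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import LinearSVC

    X, y = load_iris(return_X_y=True)
    # 'jaccard_macro' mirrors 'f1_macro': macro-averaged Jaccard over the classes
    scores = cross_val_score(LinearSVC(random_state=0), X, y, cv=5,
                             scoring='jaccard_macro')
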
@@ -326,7 +327,7 @@ Some also work in the multilabel case:
    f1_score
    fbeta_score
    hamming_loss
-   jaccard_similarity_score
+   jaccard_score
    log_loss
    multilabel_confusion_matrix
    precision_recall_fscore_support
@@ -346,6 +347,8 @@ And some work with binary and multilabel (but not multiclass) problems:
 In the following sub-sections, we will describe each of those functions,
 preceded by some notes on common API and metric definition.

+.. _average:
+
 From binary to multiclass and multilabel
 ----------------------------------------

@@ -355,8 +358,6 @@ only the positive label is evaluated, assuming by default that the positive
 class is labelled ``1`` (though this may be configurable through the
 ``pos_label`` parameter).

-.. _average:
-
 In extending a binary metric to multiclass or multilabel problems, the data
 is treated as a collection of binary problems, one for each class.
 There are then a number of ways to average binary metric calculations across
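
As the context above notes, a binary metric scores only the positive class,
which defaults to the label ``1``; ``pos_label`` switches which class is
treated as positive. A small illustrative sketch (not part of the patch)::

    from sklearn.metrics import f1_score

    y_true = [0, 1, 0, 1]
    y_pred = [0, 1, 1, 1]
    f1_score(y_true, y_pred)               # scores the class labelled 1 (default)
    f1_score(y_true, y_pred, pos_label=0)  # scores the class labelled 0 instead
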
@@ -680,43 +681,6 @@ In the multilabel case with binary label indicators: ::
     or superset of the true labels will give a Hamming loss between
     zero and one, exclusive.

-.. _jaccard_similarity_score:
-
-Jaccard similarity coefficient score
-------------------------------------
-
-The :func:`jaccard_similarity_score` function computes the average (default)
-or sum of `Jaccard similarity coefficients
-<https://en.wikipedia.org/wiki/Jaccard_index>`_, also called the Jaccard index,
-between pairs of label sets.
-
-The Jaccard similarity coefficient of the :math:`i`-th samples,
-with a ground truth label set :math:`y_i` and predicted label set
-:math:`\hat{y}_i`, is defined as
-
-.. math::
-
-    J(y_i, \hat{y}_i) = \frac{|y_i \cap \hat{y}_i|}{|y_i \cup \hat{y}_i|}.
-
-In binary and multiclass classification, the Jaccard similarity coefficient
-score is equal to the classification accuracy.
-
-::
-
-  >>> import numpy as np
-  >>> from sklearn.metrics import jaccard_similarity_score
-  >>> y_pred = [0, 2, 1, 3]
-  >>> y_true = [0, 1, 2, 3]
-  >>> jaccard_similarity_score(y_true, y_pred)
-  0.5
-  >>> jaccard_similarity_score(y_true, y_pred, normalize=False)
-  2
-
-In the multilabel case with binary label indicators: ::
-
-  >>> jaccard_similarity_score(np.array([[0, 1], [1, 1]]), np.ones((2, 2)))
-  0.75
-
 .. _precision_recall_f_measure_metrics:

 Precision, recall and F-measures
@@ -957,6 +921,61 @@ Similarly, labels not present in the data sample may be accounted for in macro-a
   ... # doctest: +ELLIPSIS
   0.166...

+.. _jaccard_similarity_score:
+
+Jaccard similarity coefficient score
+------------------------------------
+
+The :func:`jaccard_score` function computes the average of `Jaccard similarity
+coefficients <https://en.wikipedia.org/wiki/Jaccard_index>`_, also called the
+Jaccard index, between pairs of label sets.
+
+The Jaccard similarity coefficient of the :math:`i`-th sample,
+with a ground truth label set :math:`y_i` and predicted label set
+:math:`\hat{y}_i`, is defined as
+
+.. math::
+
+    J(y_i, \hat{y}_i) = \frac{|y_i \cap \hat{y}_i|}{|y_i \cup \hat{y}_i|}.
+
+Like :func:`precision_recall_fscore_support`, :func:`jaccard_score` natively
+applies to binary targets; it is extended to multilabel and multiclass
+problems through the ``average`` parameter (see :ref:`above <average>`).
+
+In the binary case: ::
+
+  >>> import numpy as np
+  >>> from sklearn.metrics import jaccard_score
+  >>> y_true = np.array([[0, 1, 1],
+  ...                    [1, 1, 0]])
+  >>> y_pred = np.array([[1, 1, 1],
+  ...                    [1, 0, 0]])
+  >>> jaccard_score(y_true[0], y_pred[0])  # doctest: +ELLIPSIS
+  0.6666...
+
+In the multilabel case with binary label indicators: ::
+
+  >>> jaccard_score(y_true, y_pred, average='samples')  # doctest: +ELLIPSIS
+  0.5833...
+  >>> jaccard_score(y_true, y_pred, average='macro')  # doctest: +ELLIPSIS
+  0.6666...
+  >>> jaccard_score(y_true, y_pred, average=None)
+  array([0.5, 0.5, 1. ])
+
+Multiclass problems are binarized and treated like the corresponding
+multilabel problem: ::
+
+  >>> y_pred = [0, 2, 1, 2]
+  >>> y_true = [0, 1, 2, 2]
+  >>> jaccard_score(y_true, y_pred, average=None)
+  ... # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
+  array([1. , 0. , 0.33...])
+  >>> jaccard_score(y_true, y_pred, average='macro')
+  0.44...
+  >>> jaccard_score(y_true, y_pred, average='micro')
+  0.33...
+
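
The binarization described above can be made explicit. The following sketch is
not part of the patch; it assumes :class:`~sklearn.preprocessing.LabelBinarizer`
for the one-hot encoding and should reproduce the macro-averaged value shown in
the multiclass example::

    from sklearn.metrics import jaccard_score
    from sklearn.preprocessing import LabelBinarizer

    y_true = [0, 1, 2, 2]
    y_pred = [0, 2, 1, 2]
    lb = LabelBinarizer().fit(y_true)
    # Macro-averaged Jaccard on the one-hot indicator matrices matches the
    # direct multiclass call above (roughly 0.44).
    jaccard_score(lb.transform(y_true), lb.transform(y_pred), average='macro')
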
 .. _hinge_loss:

 Hinge loss