Commit f62a6d1

ndawe authored and arjoly committed
min_weight_fraction_leaf: narrative docs
1 parent 6f4b31a commit f62a6d1

1 file changed: +9 -2 lines changed

doc/modules/tree.rst

Lines changed: 9 additions & 2 deletions
@@ -320,8 +320,15 @@ Tips on practical use
 create arbitrary small leaves, though ``min_samples_split`` is more common
 in the literature.
 
-* Balance your dataset before training to prevent the tree from creating
-a tree biased toward the classes that are dominant.
+* Balance your dataset before training to prevent the tree from creating a
+tree biased toward the classes that are dominant. Balance the dataset by
+sampling an equal number of samples from each class, or preferably by
+normalizing the sum of the sample weights (``sample_weight``) for each
+class to the same value. Then use ``min_weight_fraction_leaf`` instead of
+``min_samples_leaf`` to control the leaf node sizes.
+``min_weight_fraction_leaf`` will ensure that leaf nodes contain at least
+some fraction of the overall sum of the sample weights and will not be
+biased toward the dominant classes like ``min_samples_leaf``.
 
 * All decision trees use ``np.float32`` arrays internally.
 If training data is not in this format, a copy of the dataset will be made.
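
The balancing advice added in this commit can be sketched as follows. This is a minimal illustration and not part of the commit: ``balanced_sample_weight`` and ``fit_balanced_tree`` are hypothetical helpers, and both the per-class target of 1.0 total weight and the default leaf fraction of 0.05 are arbitrary assumptions chosen for the example.

```python
from collections import Counter


def balanced_sample_weight(y):
    """Return one weight per sample so that every class sums to the same total.

    Hypothetical helper: each class ends up with a total weight of 1.0,
    which is an arbitrary normalization target.
    """
    counts = Counter(y)
    return [1.0 / counts[label] for label in y]


def fit_balanced_tree(X, y, min_weight_fraction_leaf=0.05):
    """Fit a decision tree on class-balanced sample weights (hypothetical helper)."""
    # Imported lazily so the weight helper above works without scikit-learn.
    from sklearn.tree import DecisionTreeClassifier

    clf = DecisionTreeClassifier(
        # Each leaf must hold at least this fraction of the total sample weight.
        min_weight_fraction_leaf=min_weight_fraction_leaf,
        random_state=0,
    )
    return clf.fit(X, y, sample_weight=balanced_sample_weight(y))


# Imbalanced toy labels: eight samples of class 0, two of class 1.
y = [0] * 8 + [1] * 2
weights = balanced_sample_weight(y)
```

With these weights, a leaf holding only minority-class samples counts for as much weight per class as one holding majority-class samples, so the leaf-size constraint no longer depends on raw sample counts. Recent scikit-learn versions also accept ``class_weight="balanced"`` on ``DecisionTreeClassifier``, which may be a simpler route to a similar effect.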
