1 file changed, +9 -2 lines changed
@@ -320,8 +320,15 @@ Tips on practical use
     create arbitrary small leaves, though ``min_samples_split`` is more common
     in the literature.

-  * Balance your dataset before training to prevent the tree from creating
-    a tree biased toward the classes that are dominant.
+  * Balance your dataset before training to prevent the tree from creating a
+    tree biased toward the classes that are dominant. Balance the dataset by
+    sampling an equal number of samples from each class, or preferably by
+    normalizing the sum of the sample weights (``sample_weight``) for each
+    class to the same value. Then use ``min_weight_fraction_leaf`` instead of
+    ``min_samples_leaf`` to control the leaf node sizes.
+    ``min_weight_fraction_leaf`` will ensure that leaf nodes contain at least
+    some fraction of the overall sum of the sample weights and will not be
+    biased toward the dominant classes like ``min_samples_leaf``.

   * All decision trees use ``np.float32`` arrays internally.
     If training data is not in this format, a copy of the dataset will be made.
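
The balancing tip added in this hunk can be illustrated with a minimal sketch. It is not part of the change itself: the helper names `balanced_sample_weights` and `fit_balanced_tree` and the default fraction of 0.05 are hypothetical, chosen only for illustration. The weights give every class the same total weight, and the tree is then constrained with `min_weight_fraction_leaf` rather than `min_samples_leaf`.

```python
from collections import Counter


def balanced_sample_weights(y):
    """Return one weight per sample so that every class has the same weight sum.

    Hypothetical helper: each sample of class c gets
    n_samples / (n_classes * count(c)), so each class sums to
    n_samples / n_classes and the overall sum stays n_samples.
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[label]) for label in y]


def fit_balanced_tree(X, y, min_weight_fraction_leaf=0.05):
    """Fit a decision tree on class-balanced sample weights.

    Hypothetical helper; scikit-learn is imported lazily so that the
    weight helper above can be used without it. The 0.05 default is an
    arbitrary illustrative value, not a recommendation.
    """
    from sklearn.tree import DecisionTreeClassifier

    clf = DecisionTreeClassifier(
        # Each leaf must hold at least this fraction of the total sample
        # weight, so the constraint follows the balanced weights rather
        # than raw sample counts.
        min_weight_fraction_leaf=min_weight_fraction_leaf,
        random_state=0,
    )
    clf.fit(X, y, sample_weight=balanced_sample_weights(y))
    return clf
```

With six samples of one class and two of another, for example, the helper assigns the minority samples three times the weight of the majority samples, so both classes contribute equally to the leaf-size constraint.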