Commit de31f19

DOC Some narrative documentation for Random Forest Hashing.
1 parent 6e3472c commit de31f19

1 file changed, +22 -0 lines changed

doc/modules/ensemble.rst

Lines changed: 22 additions & 0 deletions
@@ -216,6 +216,28 @@ the matching feature to the prediction function.
 * :ref:`example_ensemble_plot_forest_importances_faces.py`
 * :ref:`example_ensemble_plot_forest_importances.py`

.. _random_hashing:

Random Forest Hashing
---------------------
:class:`RandomForestHasher` implements an unsupervised transformation of the
data. Using a forest of completely random trees, :class:`RandomForestHasher`
encodes each data point by the indices of the leaves it ends up in. These
indices are then encoded in a one-of-K manner, leading to a high-dimensional,
sparse binary coding.
This coding can be computed very efficiently and can then be used as a basis
for other learning tasks.
The size and sparsity of the code can be influenced by choosing the number of
trees and the maximum depth per tree. For each tree in the ensemble, the coding
contains exactly one nonzero entry, equal to one. The size of the coding is at
most ``n_estimators * 2 ** max_depth``, the maximum number of leaves in the
forest.
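
The encoding described above can be illustrated with a small pure-Python
sketch. It is not the implementation behind :class:`RandomForestHasher`: the
helper names (``build_tree``, ``fit_forest``, ``transform``) are hypothetical,
the stopping rule is an assumption, and the codes are dense lists rather than
a sparse matrix. It only shows that each tree contributes exactly one nonzero
entry per sample and that the code width stays within
``n_estimators * 2 ** max_depth``.

```python
import random


def build_tree(rows, depth, max_depth, rng):
    """Grow one completely random tree: feature and threshold are drawn at random."""
    # Assumed stopping rule: depth limit reached or fewer than two samples left.
    if depth == max_depth or len(rows) < 2:
        return {"leaf": True}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # all values equal along this feature, nothing to split
        return {"leaf": True}
    threshold = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] <= threshold]
    right = [r for r in rows if r[feature] > threshold]
    return {
        "leaf": False,
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def number_leaves(node, next_index):
    """Give every leaf its own column index; return the next free index."""
    if node["leaf"]:
        node["column"] = next_index
        return next_index + 1
    next_index = number_leaves(node["left"], next_index)
    return number_leaves(node["right"], next_index)


def leaf_column(node, row):
    """Follow the splits from the root and return the column of the leaf reached."""
    while not node["leaf"]:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["column"]


def fit_forest(rows, n_estimators, max_depth, seed=0):
    """Build the forest and count the total number of leaves (code width)."""
    rng = random.Random(seed)
    forest, n_columns = [], 0
    for _ in range(n_estimators):
        tree = build_tree(rows, 0, max_depth, rng)
        n_columns = number_leaves(tree, n_columns)
        forest.append(tree)
    return forest, n_columns


def transform(forest, n_columns, rows):
    """One-of-K encode, per tree, the leaf each row falls into."""
    codes = []
    for row in rows:
        code = [0] * n_columns
        for tree in forest:
            code[leaf_column(tree, row)] = 1
        codes.append(code)
    return codes


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(50)]
n_estimators, max_depth = 10, 3
forest, n_columns = fit_forest(X, n_estimators, max_depth)
codes = transform(forest, n_columns, X)
```

Lowering ``max_depth`` in this sketch shrinks the code width, while raising
``n_estimators`` adds one more nonzero entry per sample.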

As neighboring data points are more likely to lie within the same leaf of a tree,
the transformation performs an implicit, non-parametric density estimation.
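
The neighborhood property can be checked on a toy model. The sketch below is a
deliberate simplification, not the library's algorithm: trees are assumed to
be data-independent random partitions of the unit square, and ``leaf_path``
and ``shared_leaves`` are hypothetical helpers. It counts the trees in which
two points fall into the same leaf, which equals the dot product of their
binary codes.

```python
import random


def leaf_path(point, tree_seed, max_depth):
    """Descend one implicit random tree and return the path to the leaf reached."""
    lo = [0.0] * len(point)
    hi = [1.0] * len(point)
    path = ""
    for _ in range(max_depth):
        # Seeding on (tree, path) gives every point that reaches this node
        # the same random split.
        rng = random.Random(f"{tree_seed}:{path}")
        feature = rng.randrange(len(point))
        threshold = rng.uniform(lo[feature], hi[feature])
        if point[feature] <= threshold:
            hi[feature] = threshold
            path += "L"
        else:
            lo[feature] = threshold
            path += "R"
    return path


def shared_leaves(a, b, n_estimators=200, max_depth=4):
    """Number of trees in which points a and b end up in the same leaf."""
    return sum(
        leaf_path(a, seed, max_depth) == leaf_path(b, seed, max_depth)
        for seed in range(n_estimators)
    )


# All points are assumed to lie in the unit square.
a, near, far = (0.20, 0.20), (0.25, 0.22), (0.90, 0.85)
near_count = shared_leaves(a, near)
far_count = shared_leaves(a, far)
```

Because every leaf is an axis-aligned box, any tree that places ``a`` and
``far`` in the same leaf also places ``near`` there, since ``near`` lies
between them along both axes. Hence ``near_count`` is never smaller than
``far_count``.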

.. topic:: Examples:

 * :ref:`example_ensemble_plot_random_forest_hasher.py`

.. _gradient_boosting:
