@@ -767,6 +767,20 @@ by black points below.
 The possibility to use custom metrics is retained;
 for details, see :class:`NearestNeighbors`.
 
+This implementation is not memory efficient by default: when kd-trees or
+ball-trees cannot be used (e.g. with sparse matrices), it constructs a full
+pairwise distance matrix, which holds ``n^2`` floats for ``n`` samples.
+Two mechanisms for getting around this are:
+
+- A sparse radius neighborhood graph (where missing entries are presumed
+  to be farther than ``eps`` apart) can be precomputed in a memory-efficient
+  way, and DBSCAN can be run over it with ``metric='precomputed'``.
+
+- The dataset can be compressed, either by removing exact duplicates if
+  these occur in your data, or by using BIRCH. This leaves a relatively
+  small number of representatives for a large number of points. You can
+  then provide a ``sample_weight`` when fitting DBSCAN.
+
 .. topic:: References:
 
  * "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases
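
The two workarounds described in the added paragraph can be sketched roughly as follows. This is a minimal illustration, not part of the change itself: it assumes a recent scikit-learn (the ``sort_results`` argument is not available in old releases) and NumPy, and the toy data, ``eps`` and ``min_samples`` values are made up for the example. BIRCH-based compression is not shown; only exact-duplicate removal is.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

# Illustrative toy data (an assumption, not from the docs): two tight blobs.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)), rng.normal(5.0, 0.1, (30, 2))])
eps, min_samples = 0.5, 5

# Reference run: plain DBSCAN on the raw points.
labels_ref = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)

# Mechanism 1: precompute a sparse radius neighborhood graph.
# Only pairs within `eps` are stored; missing entries count as "too far".
nn = NearestNeighbors(radius=eps).fit(X)
graph = nn.radius_neighbors_graph(X, mode="distance", sort_results=True)
labels_graph = DBSCAN(
    eps=eps, min_samples=min_samples, metric="precomputed"
).fit_predict(graph)

# Mechanism 2: collapse exact duplicates and pass their counts as weights.
# Here every point is repeated three times to create duplicates.
X_dup = np.repeat(X, 3, axis=0)
labels_dup_ref = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_dup)

X_unique, inverse, counts = np.unique(
    X_dup, axis=0, return_inverse=True, return_counts=True
)
inverse = np.ravel(inverse)  # guard against shape differences across NumPy versions
db = DBSCAN(eps=eps, min_samples=min_samples).fit(X_unique, sample_weight=counts)
# Map the labels of the representatives back onto the full dataset.
labels_weighted = db.labels_[inverse]
```

In the second part, cluster numbering may differ from the reference run because ``np.unique`` reorders the rows, so the results should be compared as partitions rather than label by label.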