@@ -24,23 +24,22 @@
 class IsolationForest(BaseBagging):
     """Isolation Forest Algorithm
 
-    Return the anomaly score of each sample with the IsolationForest algorithm
+    Return the anomaly score of each sample using the IsolationForest algorithm
 
-    IsolationForest consists in 'isolate' the observations by randomly
-    selecting a feature and then randomly selecting a split value
-    between the maximum and minimum values of the selected feature.
+    The IsolationForest 'isolates' observations by randomly selecting a feature
+    and then randomly selecting a split value between the maximum and minimum
+    values of the selected feature.
 
     Since recursive partitioning can be represented by a tree structure, the
-    number of splitting required to isolate a point is equivalent to the path
-    length from the root node to a terminating node.
+    number of splittings required to isolate a sample is equivalent to the path
+    length from the root node to the terminating node.
 
-    This path length, averaged among a forest of such random trees, is a
+    This path length, averaged over a forest of such random trees, is a
     measure of abnormality and our decision function.
 
-    Indeed random partitioning produces noticeable shorter paths for anomalies.
+    Random partitioning produces noticeably shorter paths for anomalies.
     Hence, when a forest of random trees collectively produce shorter path
-    lengths for some particular points, then they are highly likely to be
-    anomalies.
+    lengths for particular samples, they are highly likely to be anomalies.
 
 
     Parameters
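
The isolation idea in the class docstring can be illustrated with a small standalone sketch. It is not the code touched by this change: `isolation_path_length` and `mean_path_length` are hypothetical helpers, every "tree" here uses all samples rather than a subsample, and features that can no longer split the remaining samples are skipped, which is a simplification.

```python
import random


def isolation_path_length(samples, target, rng):
    """Count the random splits needed to isolate `target` within `samples`."""
    current = list(samples)
    depth = 0
    while len(current) > 1:
        # Simplification: only consider features that can still separate samples.
        splittable = [
            f for f in range(len(target))
            if min(s[f] for s in current) < max(s[f] for s in current)
        ]
        if not splittable:
            break  # remaining samples are identical, nothing left to split
        feature = rng.choice(splittable)
        values = [s[feature] for s in current]
        # Random split value between the minimum and maximum of the feature.
        split = rng.uniform(min(values), max(values))
        target_goes_left = target[feature] < split
        # Keep only the side of the split that contains the target.
        current = [s for s in current if (s[feature] < split) == target_goes_left]
        depth += 1
    return depth


def mean_path_length(samples, target, n_trees=100, seed=0):
    """Average the path length of `target` over `n_trees` random partitionings."""
    rng = random.Random(seed)
    total = sum(isolation_path_length(samples, target, rng) for _ in range(n_trees))
    return total / n_trees


if __name__ == "__main__":
    rng = random.Random(42)
    cluster = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
    outlier = (8.0, 8.0)
    data = cluster + [outlier]
    print("outlier:", mean_path_length(data, outlier))
    print("inlier: ", mean_path_length(data, cluster[0]))
```

A sample far from the cluster should need fewer splits on average than one inside it, which is what the docstring means by shorter paths for anomalies.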
@@ -52,8 +51,8 @@ class IsolationForest(BaseBagging):
         The number of samples to draw from X to train each base estimator.
             - If int, then draw `max_samples` samples.
             - If float, then draw `max_samples * X.shape[0]` samples.
-        If max_samples is larger than number of samples provided,
-        all samples with be used for all trees (no sampling).
+        If max_samples is larger than the number of samples provided,
+        all samples will be used for all trees (no sampling).
 
     max_features : int or float, optional (default=1.0)
         The number of features to draw from X to train each base estimator.
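
The `max_samples` rules described in this hunk can be sketched as a small function. `resolve_max_samples` is a hypothetical name used only for illustration, and truncating the float case with `int()` is an assumption; the estimator's actual validation code is not part of this diff.

```python
def resolve_max_samples(max_samples, n_samples):
    """Return how many samples each base estimator would draw.

    Follows the docstring: an int is an absolute count, a float is a
    fraction of the samples, and a request larger than the data falls
    back to using all samples (no sampling).
    """
    if isinstance(max_samples, float):
        # Assumption: the fractional count is truncated toward zero.
        requested = int(max_samples * n_samples)
    else:
        requested = int(max_samples)
    return min(requested, n_samples)


if __name__ == "__main__":
    print(resolve_max_samples(256, 100))  # more than available: all samples
    print(resolve_max_samples(0.5, 100))  # half of the samples
```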
@@ -169,12 +168,12 @@ def predict(self, X):
         """Predict anomaly score of X with the IsolationForest algorithm.
 
         The anomaly score of an input sample is computed as
-        the mean anomaly scores of the trees in the forest.
+        the mean anomaly score of the trees in the forest.
 
         The measure of normality of an observation given a tree is the depth
         of the leaf containing this observation, which is equivalent to
-        the number of splitting required to isolate this point. In case of
-        several observations n_left in the leaf, the average length path of
+        the number of splittings required to isolate this point. In case of
+        several observations n_left in the leaf, the average path length of
         a n_left samples isolation tree is added.
 
         Parameters
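
The leaf adjustment mentioned in the `predict` docstring can be sketched as follows. The formula is the average path length estimate from the Isolation Forest paper by Liu, Ting and Zhou, with the harmonic number approximated by `ln(i)` plus Euler's constant; the helper names are hypothetical, and the implementation in this file may use a slightly different expression.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def average_path_length(n):
    """Estimated average path length of an isolation tree built on n samples."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def adjusted_path_length(depth, n_left):
    """Leaf depth plus the expected extra depth for n_left unsplit samples."""
    return depth + average_path_length(n_left)


if __name__ == "__main__":
    # A leaf at depth 3 that still holds 5 samples counts as a deeper path.
    print(adjusted_path_length(3, 5))
```

Adding this term keeps a leaf that still holds several samples from looking as isolated as a leaf holding a single sample at the same depth.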