From cfbed88d10b179c08926deb52dbc56d451a5527a Mon Sep 17 00:00:00 2001 From: dclambert Date: Sun, 9 Jun 2013 19:55:36 -0500 Subject: [PATCH 01/11] LICENSE tweak --- LICENSE | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/LICENSE b/LICENSE index f088e96..800a790 100644 --- a/LICENSE +++ b/LICENSE @@ -8,14 +8,14 @@ * * Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. -* * Neither the name of the University of California, Berkeley nor the -* names of its contributors may be used to endorse or promote products -* derived from this software without specific prior written permission. +* * Neither the name of David C. Lambert nor the names of other contributors +* may be used to endorse or promote products derived from this software +* without specific prior written permission. * -* THIS SOFTWARE IS PROVIDED BY THE DAVID C LAMBERT "AS IS" AND ANY +* THIS SOFTWARE IS PROVIDED BY DAVID C. LAMBERT "AS IS" AND ANY * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -* DISCLAIMED. IN NO EVENT SHALL THE DAVID C LAMBERT BE LIABLE FOR ANY +* DISCLAIMED. IN NO EVENT SHALL DAVID C. LAMBERT BE LIABLE FOR ANY * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND From f8e6fb38cbfa1466dbd914e25c71c3004d499737 Mon Sep 17 00:00:00 2001 From: dclambert Date: Sun, 9 Jun 2013 19:59:10 -0500 Subject: [PATCH 02/11] README.md update --- README.md | 115 +++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 96 insertions(+), 19 deletions(-) diff --git a/README.md b/README.md index 7095e78..0009df7 100644 --- a/README.md +++ b/README.md @@ -2,35 +2,112 @@ Python-ELM ========== Extreme Learning Machine implementation in Python -Version 0.2 +Version 0.3 -This is an implementation of the Extreme Learning Machine in python, -based on the scikit-learn machine learning library. +This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org). -Distance and dot product based hidden layers are provided via the -RBFRandomLayer and SimpleRandomLayer classes respectively. +It's a work in progress, so things can/might/will change. -The SimpleRandomLayer provides the following activation functions: +__David C. Lambert__ +__dcl [at] panix [dot] com__ - tanh, sine, tribas, sigmoid, hardlim +__Copyright © 2013__ +__License: Simple BSD__ -The RBFRandomLayer provides the following activation functions: +Files +----- +####__random_layer.py__ - gaussian, multiquadric and polyharmonic spline ('poly_spline') +Contains the __RandomLayer__, __MLPRandomLayer__, __RBFRandomLayer__ and __GRBFRandomLayer__ classes. -In addition, each random hidden layer class can take a callable user +RandomLayer is a transformer that creates a feature mapping of the +inputs that corresponds to a layer of hidden units with randomly +generated components. 
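+
+A rough usage sketch (illustrative only; it simply exercises the
+scikit-learn transformer interface with the parameters described below,
+and is not taken verbatim from the package):
+
+```python
+import numpy as np
+from random_layer import RandomLayer
+
+X = np.random.randn(100, 5)               # 100 samples, 5 features
+rl = RandomLayer(n_hidden=20, alpha=0.5,  # 50/50 mix of MLP and RBF
+                 random_state=0)          # input activations (see below)
+H = rl.fit_transform(X)                   # hidden unit activations
+print(H.shape)                            # (100, 20)
+```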
+
+The transformed values are a specified function of input activations
+that are a weighted combination of dot product (multilayer perceptron)
+and distance (rbf) activations:
+
+      input_activation = alpha * mlp_activation + (1-alpha) * rbf_activation
+
+      mlp_activation(x) = dot(x, weights) + bias
+      rbf_activation(x) = rbf_width * ||x - center||/radius
+
+_mlp_activation_ is multi-layer perceptron input activation
+
+_rbf_activation_ is radial basis function input activation
+
+_alpha_ and _rbf_width_ are specified by the user
+
+_weights_ and _biases_ are taken from a normal distribution with
+mean 0 and standard deviation 1
+
+_centers_ are taken uniformly from the bounding hyperrectangle
+of the inputs, and
+
+      radius = max(||x-c||)/sqrt(n_centers*2)
+
+(All random components can be supplied by the user by providing entries in the dictionary given as the _user_components_ parameter.)
+
+The input activation is transformed by a transfer function that defaults
+to numpy.tanh if not specified, but can be any callable that returns an
+array of the same shape as its argument (the input activation array, of
+shape [n_samples, n_hidden]).
+
+Transfer functions provided are:
+
+* sine
+* tanh
+* tribas
+* inv_tribas
+* sigmoid
+* hardlim
+* softlim
+* gaussian
+* multiquadric
+* inv_multiquadric
+
+MLPRandomLayer and RBFRandomLayer classes are just wrappers around the RandomLayer class, with the _alpha_ mixing parameter set to 1.0 and 0.0 respectively (for 100% MLP input activation, or 100% RBF input activation).
+
+The RandomLayer, MLPRandomLayer, RBFRandomLayer classes can take a callable user
 provided transfer function.  See the docstrings and the example ipython
 notebook for details.
 
-There's a little demo in plot_elm_comparison.py (based on scikit-learn's
-plot_classifier_comparison).
+The GRBFRandomLayer implements the Generalized Radial Basis Function from [[3]](http://sci2s.ugr.es/keel/pdf/keel/articulo/2011-Neurocomputing1.pdf).
+
+####__elm.py__
+
+Contains the __ELMRegressor__, __ELMClassifier__, __GenELMRegressor__, and __GenELMClassifier__ classes.
+
+GenELMRegressor and GenELMClassifier both take *RandomLayer instances as part of their constructors, and an optional regressor (conforming to the sklearn API) for performing the fit (instead of the default linear fit using the pseudo-inverse from scipy.linalg.pinv2).
+GenELMClassifier is little more than a wrapper around GenELMRegressor that binarizes the target array before performing a regression, then unbinarizes the prediction of the regressor to make its own predictions.
+
+The ELMRegressor class is a wrapper around GenELMRegressor that uses a RandomLayer instance by default and exposes the RandomLayer parameters in the constructor.  ELMClassifier is similar for classification.
+
+####__plot_elm_comparison.py__
+
+A small demo ()based on scikit-learn's plot_classifier_comparison) that shows the decision functions of a couple of different instantiations of the GenELMClassifier on three different datasets.
+
+####__elm_notebook.py__
+
+An IPython notebook, illustrating several ways to use the __\*ELM*__ and __\*RandomLayer__ classes.
+
+Requirements
+------------
+
+Written using Python 2.7.3, numpy 1.6.1, scipy 0.10.1, scikit-learn 0.13.1 and ipython 0.12.1
 
-Requires that scikit-learn be installed, along with its usual prerequisites,
-and ipython to use elm_notebook.py (though it can be tweaked to run without
-it).
+References
+----------
+```
+[1] http://www.extreme-learning-machines.org
 
-This is a work in progress, it may be restructured as time goes by.
+
+[2] G.-B. Huang, Q.-Y. Zhu and C.-K. Siew, "Extreme Learning Machine:
+    Theory and Applications", Neurocomputing, vol. 70, pp. 489-501,
+    2006.
+
+[3] Fernandez-Navarro, et al, "MELM-GRBF: a modified version of the
+    extreme learning machine for generalized radial basis function
+    neural networks", Neurocomputing 74 (2011), 2502-2510
+```
--
-David C Lambert
-March, 2013
-[dcl -at- panix -dot- com]

From 3a2761ab437af1b2ca62950fe30683eb7dbd1f05 Mon Sep 17 00:00:00 2001
From: dclambert
Date: Sun, 9 Jun 2013 20:22:07 -0500
Subject: [PATCH 03/11] major changes to random_layer.py - main class is
 RandomLayer, and MLPRandomLayer and RBFRandomLayer are derived from it.

Main classes in elm.py changed to GenELMRegressor and GenELMClassifier,
with ELMRegressor derived from GenELMRegressor, and ELMClassifier derived
from ELMRegressor.  plot_elm_comparison.py and elm_notebook.py changed to
follow the API changes.

---
 elm.py                 | 207 +++++++++++--------
 elm_notebook.py        | 127 ++++++------
 plot_elm_comparison.py |  34 ++-
 random_layer.py        | 455 ++++++++++++++++++++++++-----------------
 4 files changed, 469 insertions(+), 354 deletions(-)

diff --git a/elm.py b/elm.py
index 2c13266..fbda9ee 100644
--- a/elm.py
+++ b/elm.py
@@ -31,12 +31,12 @@
 from sklearn.base import BaseEstimator, ClassifierMixin, RegressorMixin
 from sklearn.preprocessing import LabelBinarizer
 
-from random_layer import SimpleRandomLayer
+from random_layer import RandomLayer, MLPRandomLayer
 
 __all__ = ["ELMRegressor",
           "ELMClassifier",
-          "SimpleELMRegressor",
-          "SimpleELMClassifier"]
+          "GenELMRegressor",
+          "GenELMClassifier"]
 
 
 # BaseELM class, regressor and hidden_layer attributes
@@ -92,7 +92,7 @@ def predict(self, X):
         """
 
 
-class ELMRegressor(BaseELM, RegressorMixin):
+class GenELMRegressor(BaseELM, RegressorMixin):
     """
     GenELMRegressor is a regressor based on the Extreme Learning Machine.
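For orientation: when no regressor is supplied, the fit this class performs is ordinary linear least squares on the hidden activations via the pseudo-inverse (scipy's pinv2, per the README). A minimal numpy sketch of that idea follows; it assumes a tanh MLP hidden layer, and the names (W, b, H, coefs) are illustrative rather than taken from elm.py:

```python
import numpy as np
from scipy.linalg import pinv2  # the pseudo-inverse the README points to

rng = np.random.RandomState(0)
X = rng.randn(100, 5)                  # toy inputs
y = np.sin(X).sum(axis=1)              # toy regression target

# fixed random hidden layer (tanh MLP activations)
W, b = rng.randn(5, 20), rng.randn(20)
H = np.tanh(np.dot(X, W) + b)          # hidden activations, shape (100, 20)

# hidden->output weights by ordinary linear least squares
coefs = np.dot(pinv2(H), y)
y_fit = np.dot(H, coefs)               # in-sample predictions
```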
@@ -103,8 +103,8 @@
 
     Parameters
     ----------
-    `hidden_layer` : random_hidden_layer instance, optional
-                     (default=SimpleRandomLayer(random_state=0))
+    `hidden_layer` : random_layer instance, optional
+                     (default=MLPRandomLayer(random_state=0))
 
     `regressor`    : regressor instance, optional (default=None)
         If provided, this object is used to perform the regression from hidden
@@ -124,8 +124,7 @@
 
     See Also
     --------
-    RBFRandomLayer, SimpleRandomLayer, ELMRegressor, ELMClassifier
-    SimpleELMRegressor, SimpleELMClassifier
+    RBFRandomLayer, MLPRandomLayer, ELMRegressor, ELMClassifier
 
     References
     ----------
@@ -136,10 +135,10 @@
     """
 
     def __init__(self,
-                 hidden_layer=SimpleRandomLayer(random_state=0),
+                 hidden_layer=MLPRandomLayer(random_state=0),
                  regressor=None):
 
-        super(ELMRegressor, self).__init__(hidden_layer, regressor)
+        super(GenELMRegressor, self).__init__(hidden_layer, regressor)
 
         self.coefs_ = None
         self.fitted_ = False
@@ -147,7 +146,7 @@ def __init__(self,
 
     def _fit_regression(self, y):
         """
-        fit regression using internal linear regression
+        fit regression using Moore-Penrose pseudo-inverse
         or supplied regressor
         """
         if (self.regressor is None):
@@ -185,7 +184,7 @@ def fit(self, X, y):
 
         return self
 
-    def _get_predictions(self, X):
+    def _get_predictions(self):
         """get predictions using internal least squares/supplied regressor"""
         if (self.regressor is None):
             preds = safe_sparse_dot(self.hidden_activations_, self.coefs_)
@@ -214,14 +213,14 @@
         self.hidden_activations_ = self.hidden_layer.transform(X)
 
         # compute output predictions for new hidden activations
-        predictions = self._get_predictions(X)
+        predictions = self._get_predictions()
 
         return predictions
 
 
-class ELMClassifier(BaseELM, ClassifierMixin):
+class GenELMClassifier(BaseELM, ClassifierMixin):
     """
-    ELMClassifier is a classifier based on the Extreme Learning Machine.
+    GenELMClassifier is a classifier based on the Extreme Learning Machine.
 
     An Extreme Learning Machine (ELM) is a single layer feedforward
     network with random hidden layer components and ordinary linear
@@ -230,8 +229,11 @@
 
     Parameters
     ----------
-    `hidden_layer` : random_hidden_layer instance, optional
-                     (default=SimpleRandomLayer(random_state=0))
+    `hidden_layer` : random_layer instance, optional
+                     (default=MLPRandomLayer(random_state=0))
+
+    `binarizer` : LabelBinarizer, optional
+                  (default=LabelBinarizer(-1, 1))
 
     `regressor`    : regressor instance, optional (default=None)
         If provided, this object is used to perform the regression from hidden
@@ -243,16 +245,12 @@
 
     `classes_` : numpy array of shape [n_classes]
         Array of class labels
 
-    `binarizer_` : LabelBinarizer instance
-        Used to transform class labels
-
     `elm_regressor_` : ELMRegressor instance
         Performs actual fit of binarized values
 
     See Also
     --------
-    RBFRandomLayer, SimpleRandomLayer, ELMRegressor, ELMClassifier
-    SimpleELMRegressor, SimpleELMClassifier
+    RBFRandomLayer, MLPRandomLayer, ELMRegressor, ELMClassifier
 
     References
     ----------
@@ -262,14 +260,16 @@
         2006.
""" def __init__(self, - hidden_layer=SimpleRandomLayer(random_state=0), + hidden_layer=MLPRandomLayer(random_state=0), + binarizer=LabelBinarizer(-1, 1), regressor=None): - super(ELMClassifier, self).__init__(hidden_layer, regressor) + super(GenELMClassifier, self).__init__(hidden_layer, regressor) + + self.binarizer = binarizer self.classes_ = None - self.binarizer_ = LabelBinarizer(-1, 1) - self.elm_regressor_ = ELMRegressor(hidden_layer, regressor) + self.genelm_regressor_ = GenELMRegressor(hidden_layer, regressor) def decision_function(self, X): """ @@ -286,7 +286,7 @@ class on an array of test vectors X. Decision function values related to each class, per sample. In the two-class case, the shape is [n_samples,] """ - return self.elm_regressor_.predict(X) + return self.genelm_regressor_.predict(X) def fit(self, X, y): """ @@ -310,9 +310,9 @@ def fit(self, X, y): """ self.classes_ = np.unique(y) - y_bin = self.binarizer_.fit_transform(y) + y_bin = self.binarizer.fit_transform(y) - self.elm_regressor_.fit(X, y_bin) + self.genelm_regressor_.fit(X, y_bin) return self def predict(self, X): @@ -328,52 +328,74 @@ def predict(self, X): Predicted values. """ raw_predictions = self.decision_function(X) - class_predictions = self.binarizer_.inverse_transform(raw_predictions) + class_predictions = self.binarizer.inverse_transform(raw_predictions) return class_predictions -# ELMRegressor with default SimpleRandomLayer -class SimpleELMRegressor(BaseEstimator, RegressorMixin): +# ELMRegressor with default RandomLayer +class ELMRegressor(BaseEstimator, RegressorMixin): """ - SimpleELMRegressor is a regressor based on the Extreme Learning Machine. + ELMRegressor is a regressor based on the Extreme Learning Machine. An Extreme Learning Machine (ELM) is a single layer feedforward network with a random hidden layer components and ordinary linear least squares fitting of the hidden->output weights by default. [1][2] - SimpleELMRegressor is a wrapper for an ELMRegressor that uses a - SimpleRandomLayer and passes the __init__ parameters through - to the hidden layer generated by the fit() method. + ELMRegressor is a wrapper for an GenELMRegressor that uses a + RandomLayer and exposes the RandomLayer's parameters in its + own constructor. Parameters ---------- `n_hidden` : int, optional (default=20) Number of units to generate in the SimpleRandomLayer + `alpha` : float, optional (default=0.5) + Mixing coefficient for distance and dot product input activations: + activation = alpha*mlp_activation + (1-alpha)*rbf_width*rbf_activation + + `rbf_width` : float, optional (default=1.0) + multiplier on rbf_activation + `activation_func` : {callable, string} optional (default='tanh') Function used to transform input activation - It must be one of 'tanh', 'sine', 'tribas', 'sigmoid', 'hardlim' or + + It must be one of 'tanh', 'sine', 'tribas', 'inv_tribase', 'sigmoid', + 'hardlim', 'softlim', 'gaussian', 'multiquadric', 'inv_multiquadric' or a callable. If none is given, 'tanh' will be used. If a callable is given, it will be used to compute the hidden unit activations. `activation_args` : dictionary, optional (default=None) Supplies keyword arguments for a callable activation_func + `user_components`: dictionary, optional (default=None) + dictionary containing values for components that woud otherwise be + randomly generated. 
Valid key/value pairs are as follows:
+        'radii'  : array-like of shape [n_hidden]
+        'centers': array-like of shape [n_hidden, n_features]
+        'biases' : array-like of shape [n_hidden]
+        'weights': array-like of shape [n_features, n_hidden]
+
+    `regressor` : regressor instance, optional (default=None)
+        If provided, this object is used to perform the regression from hidden
+        unit activations to the outputs and subsequent predictions.  If not
+        present, an ordinary linear least squares fit is performed
+
     `random_state`  : int, RandomState instance or None (default=None)
         Control the pseudo random number generator used to generate the
         hidden unit weights at fit time.
 
     Attributes
     ----------
-    `elm_regressor_` : ELMRegressor object
+    `genelm_regressor_` : GenELMRegressor object
         Wrapped object that actually performs the fit.
 
     See Also
     --------
-    RBFRandomLayer, SimpleRandomLayer, ELMRegressor, ELMClassifier
-    SimpleELMRegressor, SimpleELMClassifier
+    RandomLayer, RBFRandomLayer, MLPRandomLayer,
+    GenELMRegressor, GenELMClassifier, ELMClassifier
 
     References
     ----------
@@ -383,16 +405,28 @@
         2006.
     """
 
-    def __init__(self, n_hidden=20,
+    def __init__(self, n_hidden=20, alpha=0.5, rbf_width=1.0,
                  activation_func='tanh', activation_args=None,
-                 random_state=None):
+                 user_components=None, regressor=None, random_state=None):
 
         self.n_hidden = n_hidden
+        self.alpha = alpha
+        self.random_state = random_state
         self.activation_func = activation_func
         self.activation_args = activation_args
-        self.random_state = random_state
+        self.user_components = user_components
+        self.rbf_width = rbf_width
+        self.regressor = regressor
 
-        self.elm_regressor_ = None
+        self._genelm_regressor_ = None
+
+    def _create_random_layer(self):
+        return RandomLayer(n_hidden=self.n_hidden,
+                           alpha=self.alpha, random_state=self.random_state,
+                           activation_func=self.activation_func,
+                           activation_args=self.activation_args,
+                           user_components=self.user_components,
+                           rbf_width=self.rbf_width)
 
     def fit(self, X, y):
         """
@@ -414,13 +448,10 @@ def fit(self, X, y):
             Returns an instance of self.
         """
-        rhl = SimpleRandomLayer(n_hidden=self.n_hidden,
-                                activation_func=self.activation_func,
-                                activation_args=self.activation_args,
-                                random_state=self.random_state)
-
-        self.elm_regressor_ = ELMRegressor(hidden_layer=rhl)
-        self.elm_regressor_.fit(X, y)
+        rhl = self._create_random_layer()
+        self.genelm_regressor_ = GenELMRegressor(hidden_layer=rhl,
+                                                 regressor=self.regressor)
+        self.genelm_regressor_.fit(X, y)
         return self
 
     def predict(self, X):
         """
         Predict values using the model
 
         Parameters
         ----------
         X : {array-like, sparse matrix} of shape [n_samples, n_features]
 
         Returns
         -------
         C : numpy array of shape [n_samples, n_outputs]
             Predicted values.
         """
-        if (self.elm_regressor_ is None):
+        if (self.genelm_regressor_ is None):
             raise ValueError("ELMRegressor not fitted")
 
-        return self.elm_regressor_.predict(X)
+        return self.genelm_regressor_.predict(X)
 
 
-# ELMClassifier with default SimpleRandomLayer
-class SimpleELMClassifier(BaseEstimator, ClassifierMixin):
+class ELMClassifier(ELMRegressor):
     """
-    SimpleELMClassifier is a classifier based on the Extreme Learning Machine.
+    ELMClassifier is a classifier based on the Extreme Learning Machine.
 
     An Extreme Learning Machine (ELM) is a single layer feedforward
     network with random hidden layer components and ordinary linear
     least squares fitting of the hidden->output weights by default.
    [1][2]
 
-    SimpleELMClassifier is a wrapper for an ELMClassifier that uses a
-    SimpleRandomLayer and passes the __init__ parameters through
-    to the hidden layer generated by the fit() method.
+    ELMClassifier is an ELMRegressor subclass that first binarizes the
+    data, then uses the superclass to compute the decision function that
+    is then unbinarized to yield the prediction.
+
+    The RandomLayer used for the input transform are exposed in the
+    ELMClassifier constructor.
 
     Parameters
     ----------
@@ -463,7 +496,9 @@
 
     `activation_func` : {callable, string} optional (default='tanh')
         Function used to transform input activation
-        It must be one of 'tanh', 'sine', 'tribas', 'sigmoid', 'hardlim' or
+
+        It must be one of 'tanh', 'sine', 'tribas', 'inv_tribas', 'sigmoid',
+        'hardlim', 'softlim', 'gaussian', 'multiquadric', 'inv_multiquadric' or
         a callable.  If none is given, 'tanh' will be used. If a callable
         is given, it will be used to compute the hidden unit activations.
 
@@ -479,13 +514,10 @@
 
     `classes_` : numpy array of shape [n_classes]
         Array of class labels
 
-    `elm_classifier_` : ELMClassifier object
-        Wrapped object that actually performs the fit
-
     See Also
     --------
-    RBFRandomLayer, SimpleRandomLayer, ELMRegressor, ELMClassifier
-    SimpleELMRegressor, SimpleELMClassifier
+    RandomLayer, RBFRandomLayer, MLPRandomLayer,
+    GenELMRegressor, GenELMClassifier, ELMClassifier
 
     References
     ----------
@@ -495,20 +527,23 @@
         2006.
     """
 
-    def __init__(self, n_hidden=20,
+    def __init__(self, n_hidden=20, alpha=0.5, rbf_width=1.0,
                  activation_func='tanh', activation_args=None,
+                 user_components=None, regressor=None,
+                 binarizer=LabelBinarizer(-1, 1),
                  random_state=None):
 
-        self.n_hidden = n_hidden
-        self.activation_func = activation_func
-        self.activation_args = activation_args
-        self.random_state = random_state
-
-        self.elm_classifier_ = None
+        super(ELMClassifier, self).__init__(n_hidden=n_hidden,
+                                            alpha=alpha,
+                                            random_state=random_state,
+                                            activation_func=activation_func,
+                                            activation_args=activation_args,
+                                            user_components=user_components,
+                                            rbf_width=rbf_width,
+                                            regressor=regressor)
 
-    @property
-    def classes_(self):
-        return self.elm_classifier_.classes_
+        self.classes_ = None
+        self.binarizer = binarizer
 
     def decision_function(self, X):
         """
@@ -525,7 +560,7 @@ class on an array of test vectors X.
             Decision function values related to each class, per sample.
             In the two-class case, the shape is [n_samples,]
         """
-        return self.elm_classifier_.decision_function(X)
+        return super(ELMClassifier, self).predict(X)
 
     def fit(self, X, y):
         """
@@ -547,13 +582,11 @@ def fit(self, X, y):
             Returns an instance of self.
         """
-        rhl = SimpleRandomLayer(n_hidden=self.n_hidden,
-                                activation_func=self.activation_func,
-                                activation_args=self.activation_args,
-                                random_state=self.random_state)
+        self.classes_ = np.unique(y)
 
-        self.elm_classifier_ = ELMClassifier(hidden_layer=rhl)
-        self.elm_classifier_.fit(X, y)
+        y_bin = self.binarizer.fit_transform(y)
+
+        super(ELMClassifier, self).fit(X, y_bin)
 
         return self
 
@@ -570,7 +603,11 @@ def predict(self, X):
         C : numpy array of shape [n_samples, n_outputs]
             Predicted values.
""" - if (self.elm_classifier_ is None): - raise ValueError("SimpleELMClassifier not fitted") + raw_predictions = self.decision_function(X) + class_predictions = self.binarizer.inverse_transform(raw_predictions) + + return class_predictions - return self.elm_classifier_.predict(X) + def score(self, X, y): + from sklearn.metrics import accuracy_score + return accuracy_score(y, self.predict(X)) diff --git a/elm_notebook.py b/elm_notebook.py index 8e27fa6..ff05acf 100644 --- a/elm_notebook.py +++ b/elm_notebook.py @@ -3,7 +3,7 @@ # -# Demo python notebook for elm and random_hidden_layer modules +# Demo python notebook for sklearn elm and random_hidden_layer modules # # Author: David C. Lambert [dcl -at- panix -dot- com] # Copyright(c) 2013 @@ -13,14 +13,15 @@ from time import time from sklearn.cluster import k_means -from elm import ELMClassifier, ELMRegressor, SimpleELMClassifier, SimpleELMRegressor -from random_layer import SimpleRandomLayer, RBFRandomLayer + +from elm import ELMClassifier, ELMRegressor, GenELMClassifier, GenELMRegressor +from random_layer import RandomLayer, MLPRandomLayer, RBFRandomLayer, GRBFRandomLayer # def make_toy(): x = np.arange(0.25,20,0.1) - y = x*np.cos(x)+np.random.randn(x.shape[0]) + y = x*np.cos(x)+0.5*sqrt(x)*np.random.randn(x.shape[0]) x = x.reshape(-1,1) y = y.reshape(-1,1) return x, y @@ -28,7 +29,7 @@ def make_toy(): # def res_dist(x, y, e, n_runs=100, random_state=None): - x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=random_state) + x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=random_state) test_res = [] train_res = [] @@ -38,7 +39,7 @@ def res_dist(x, y, e, n_runs=100, random_state=None): e.fit(x_train, y_train) train_res.append(e.score(x_train, y_train)) test_res.append(e.score(x_test, y_test)) - if (i%5 == 0): print "%d"%i, + if (i%(n_runs/10) == 0): print "%d"%i, print "\nTime: %.3f secs" % (time() - start_time) @@ -60,14 +61,14 @@ def res_dist(x, y, e, n_runs=100, random_state=None): irx_train, irx_test, iry_train, iry_test = train_test_split(irx, iry, test_size=0.2) digits = load_digits() -dgx, dgy = stdsc.fit_transform(digits.data), digits.target +dgx, dgy = stdsc.fit_transform(digits.data/16.0), digits.target dgx_train, dgx_test, dgy_train, dgy_test = train_test_split(dgx, dgy, test_size=0.2) diabetes = load_diabetes() dbx, dby = stdsc.fit_transform(diabetes.data), diabetes.target dbx_train, dbx_test, dby_train, dby_test = train_test_split(dbx, dby, test_size=0.2) -mrx, mry = make_regression(n_samples=2000, n_targets=2) +mrx, mry = make_regression(n_samples=2000, n_targets=4) mrx_train, mrx_test, mry_train, mry_test = train_test_split(mrx, mry, test_size=0.2) xtoy, ytoy = make_toy() @@ -77,101 +78,94 @@ def res_dist(x, y, e, n_runs=100, random_state=None): # -# SimpleELMClassifier test -elmc = SimpleELMClassifier(n_hidden=500) -elmc.fit(dgx_train, dgy_train) -print elmc.score(dgx_train, dgy_train), elmc.score(dgx_test, dgy_test) +# RBFRandomLayer tests +for af in RandomLayer.activation_func_names(): + print af, + elmc = ELMClassifier(activation_func=af) + tr,ts = res_dist(irx, iry, elmc, n_runs=200, random_state=0) # -# SimpleELMRegressor test -elmr = SimpleELMRegressor() -elmr.fit(xtoy_train, ytoy_train) -print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) -plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) +elmc.classes_ # -# RBF tests -elmc = ELMClassifier(RBFRandomLayer(activation_func='gaussian')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, 
random_state=0) - -elmc = ELMClassifier(RBFRandomLayer(activation_func='poly_spline', gamma=2)) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) +for af in RandomLayer.activation_func_names(): + print af + elmc = ELMClassifier(activation_func=af, random_state=0) + tr,ts = res_dist(dgx, dgy, elmc, n_runs=100, random_state=0) -elmc = ELMClassifier(RBFRandomLayer(activation_func='multiquadric')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) - -# Simple tests -elmc = ELMClassifier(SimpleRandomLayer(activation_func='sine')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) - -elmc = ELMClassifier(SimpleRandomLayer(activation_func='tanh')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) +# -elmc = ELMClassifier(SimpleRandomLayer(activation_func='tribas')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) +elmc = ELMClassifier(n_hidden=500, activation_func='multiquadric') +tr,ts = res_dist(dgx, dgy, elmc, n_runs=100, random_state=0) +scatter(tr, ts, alpha=0.1, marker='D', c='r') -elmc = ELMClassifier(SimpleRandomLayer(activation_func='sigmoid')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) +# -elmc = ELMClassifier(SimpleRandomLayer(activation_func='hardlim')) -tr,ts = res_dist(irx, iry, elmc, n_runs=100, random_state=0) +elmr = ELMRegressor(random_state=0, activation_func='gaussian', alpha=0.0) +elmr.fit(xtoy_train, ytoy_train) +print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) +plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) # -elmr = ELMRegressor(SimpleRandomLayer(random_state=0, activation_func='tribas')) +from sklearn import pipeline +from sklearn.linear_model import LinearRegression +elmr = pipeline.Pipeline([('rhl', RandomLayer(random_state=0, activation_func='multiquadric')), + ('lr', LinearRegression(fit_intercept=False))]) elmr.fit(xtoy_train, ytoy_train) print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) # -rhl = SimpleRandomLayer(n_hidden=200) -elmr = ELMRegressor(hidden_layer=rhl) -tr, ts = res_dist(mrx, mry, elmr, n_runs=20, random_state=0) +rhl = RandomLayer(n_hidden=200, alpha=1.0) +elmr = GenELMRegressor(hidden_layer=rhl) +tr, ts = res_dist(mrx, mry, elmr, n_runs=200, random_state=0) +scatter(tr, ts, alpha=0.1, marker='D', c='r') # -rhl = RBFRandomLayer(n_hidden=15, gamma=0.25) -elmr = ELMRegressor(hidden_layer=rhl) +rhl = RBFRandomLayer(n_hidden=15, rbf_width=0.8) +elmr = GenELMRegressor(hidden_layer=rhl) elmr.fit(xtoy_train, ytoy_train) print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) # -nh = 10 +nh = 15 (ctrs, _, _) = k_means(xtoy_train, nh) unit_rs = np.ones(nh) -#rhl = RBFRandomLayer(n_hidden=nh, activation_func='poly_spline', gamma=3) -#rhl = RBFRandomLayer(n_hidden=nh, activation_func='multiquadric', gamma=1) -rhl = RBFRandomLayer(n_hidden=nh, centers=ctrs, radii=unit_rs) -elmr = ELMRegressor(hidden_layer=rhl) + +#rhl = RBFRandomLayer(n_hidden=nh, activation_func='inv_multiquadric') +#rhl = RBFRandomLayer(n_hidden=nh, centers=ctrs, radii=unit_rs) +rhl = GRBFRandomLayer(n_hidden=nh, grbf_lambda=.0001, centers=ctrs) +elmr = GenELMRegressor(hidden_layer=rhl) elmr.fit(xtoy_train, ytoy_train) print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) # -rbf_rhl = RBFRandomLayer(n_hidden=100, random_state=0, gamma=0.1) -elmc_rbf = ELMClassifier(hidden_layer=rbf_rhl) +rbf_rhl = 
RBFRandomLayer(n_hidden=100, random_state=0, rbf_width=0.01) +elmc_rbf = GenELMClassifier(hidden_layer=rbf_rhl) elmc_rbf.fit(dgx_train, dgy_train) print elmc_rbf.score(dgx_train, dgy_train), elmc_rbf.score(dgx_test, dgy_test) def powtanh_xfer(activations, power=1.0): return pow(np.tanh(activations), power) -#tanh_rhl = SimpleRandomLayer(n_hidden=5000, random_state=0) -tanh_rhl = SimpleRandomLayer(n_hidden=5000, activation_func=powtanh_xfer, activation_args={'power':2.0}) -elmc_tanh = ELMClassifier(hidden_layer=tanh_rhl) +tanh_rhl = MLPRandomLayer(n_hidden=100, activation_func=powtanh_xfer, activation_args={'power':3.0}) +elmc_tanh = GenELMClassifier(hidden_layer=tanh_rhl) elmc_tanh.fit(dgx_train, dgy_train) print elmc_tanh.score(dgx_train, dgy_train), elmc_tanh.score(dgx_test, dgy_test) # -rbf_rhl = RBFRandomLayer(n_hidden=100, gamma=0.1) -tr, ts = res_dist(dgx, dgy, ELMClassifier(hidden_layer=rbf_rhl), n_runs=100, random_state=0) +rbf_rhl = RBFRandomLayer(n_hidden=100, rbf_width=0.01) +tr, ts = res_dist(dgx, dgy, GenELMClassifier(hidden_layer=rbf_rhl), n_runs=100, random_state=0) # @@ -180,30 +174,39 @@ def powtanh_xfer(activations, power=1.0): # +from sklearn.svm import SVR from sklearn.ensemble import RandomForestRegressor tr, ts = res_dist(dbx, dby, RandomForestRegressor(n_estimators=15), n_runs=100, random_state=0) hist(tr), hist(ts) print -rhl = RBFRandomLayer(n_hidden=15, gamma=0.01) -tr,ts = res_dist(dbx, dby, ELMRegressor(rhl), n_runs=100, random_state=0) +rhl = RBFRandomLayer(n_hidden=15, rbf_width=0.1) +tr,ts = res_dist(dbx, dby, GenELMRegressor(rhl), n_runs=100, random_state=0) hist(tr), hist(ts) print # -hist(ts), hist(tr) -print +elmc = ELMClassifier(n_hidden=1000, activation_func='gaussian', alpha=0.0, random_state=0) +elmc.fit(dgx_train, dgy_train) +print elmc.score(dgx_train, dgy_train), elmc.score(dgx_test, dgy_test) # -elmc = SimpleELMClassifier(n_hidden=500) +elmc = ELMClassifier(n_hidden=500, activation_func='hardlim', alpha=1.0, random_state=0) elmc.fit(dgx_train, dgy_train) print elmc.score(dgx_train, dgy_train), elmc.score(dgx_test, dgy_test) # -elmr = SimpleELMRegressor(activation_func='tribas') +elmr = ELMRegressor(random_state=0) +elmr.fit(xtoy_train, ytoy_train) +print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) +plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) + +# + +elmr = ELMRegressor(activation_func='inv_tribas', random_state=0) elmr.fit(xtoy_train, ytoy_train) print elmr.score(xtoy_train, ytoy_train), elmr.score(xtoy_test, ytoy_test) plot(xtoy, ytoy, xtoy, elmr.predict(xtoy)) diff --git a/plot_elm_comparison.py b/plot_elm_comparison.py index 70a605a..e8762d5 100644 --- a/plot_elm_comparison.py +++ b/plot_elm_comparison.py @@ -72,8 +72,8 @@ from sklearn.cross_validation import train_test_split from sklearn.linear_model import LogisticRegression -from elm import ELMClassifier -from random_layer import RBFRandomLayer, SimpleRandomLayer +from elm import GenELMClassifier +from random_layer import RBFRandomLayer, MLPRandomLayer def get_data_bounds(X): @@ -136,34 +136,26 @@ def make_classifiers(): # pass user defined transfer func sinsq = (lambda x: np.power(np.sin(x), 2.0)) - srhl_sinsq = SimpleRandomLayer(n_hidden=nh, - activation_func=sinsq, - random_state=0) + srhl_sinsq = MLPRandomLayer(n_hidden=nh, activation_func=sinsq) # use internal transfer funcs - srhl_tanh = SimpleRandomLayer(n_hidden=nh, - activation_func='tanh', - random_state=0) + srhl_tanh = MLPRandomLayer(n_hidden=nh, activation_func='tanh') - srhl_tribas = 
SimpleRandomLayer(n_hidden=nh, - activation_func='tribas', - random_state=0) + srhl_tribas = MLPRandomLayer(n_hidden=nh, activation_func='tribas') - srhl_hardlim = SimpleRandomLayer(n_hidden=nh, - activation_func='hardlim', - random_state=0) + srhl_hardlim = MLPRandomLayer(n_hidden=nh, activation_func='hardlim') # use gaussian RBF - srhl_rbf = RBFRandomLayer(n_hidden=nh*2, gamma=0.1, random_state=0) + srhl_rbf = RBFRandomLayer(n_hidden=nh*2, rbf_width=0.1, random_state=0) log_reg = LogisticRegression() - classifiers = [ELMClassifier(srhl_tanh), - ELMClassifier(srhl_tanh, regressor=log_reg), - ELMClassifier(srhl_sinsq), - ELMClassifier(srhl_tribas), - ELMClassifier(srhl_hardlim), - ELMClassifier(srhl_rbf)] + classifiers = [GenELMClassifier(hidden_layer=srhl_tanh), + GenELMClassifier(hidden_layer=srhl_tanh, regressor=log_reg), + GenELMClassifier(hidden_layer=srhl_sinsq), + GenELMClassifier(hidden_layer=srhl_tribas), + GenELMClassifier(hidden_layer=srhl_hardlim), + GenELMClassifier(hidden_layer=srhl_rbf)] return names, classifiers diff --git a/random_layer.py b/random_layer.py index f38007e..5a25177 100644 --- a/random_layer.py +++ b/random_layer.py @@ -3,10 +3,10 @@ # Copyright(c) 2013 # License: Simple BSD -"""The :mod:`random_hidden_layer` module -implements Random Hidden Layer transformers. +"""The :mod:`random_layer` module +implements Random Layer transformers. -Random hidden layers are arrays of hidden unit activations that are +Random layers are arrays of hidden unit activations that are random functions of input activation values (dot products for simple activation functions, distances from prototypes for radial basis functions). @@ -21,15 +21,18 @@ import numpy as np import scipy.sparse as sp -from scipy.spatial.distance import cdist +from scipy.spatial.distance import cdist, pdist, squareform from sklearn.metrics import pairwise_distances from sklearn.utils import check_random_state, atleast2d_or_csr from sklearn.utils.extmath import safe_sparse_dot from sklearn.base import BaseEstimator, TransformerMixin -__all__ = ['SimpleRandomLayer', - 'RBFRandomLayer'] +__all__ = ['RandomLayer', + 'MLPRandomLayer', + 'RBFRandomLayer', + 'GRBFRandomLayer', + ] # Abstract Base Class for random hidden layers @@ -38,6 +41,10 @@ class BaseRandomLayer(BaseEstimator, TransformerMixin): _internal_activation_funcs = dict() + @classmethod + def activation_func_names(cls): + return cls._internal_activation_funcs.keys() + # take n_hidden and random_state, init components_ and # input_activations_ def __init__(self, n_hidden=20, random_state=0, activation_func=None, @@ -130,29 +137,64 @@ def transform(self, X, y=None): return self._compute_hidden_activations(X) -class SimpleRandomLayer(BaseRandomLayer): - """Simple Random Hidden Layer transformer +class RandomLayer(BaseRandomLayer): + """RandomLayer is a transformer that creates a feature mapping of the + inputs that corresponds to a layer of hidden units with randomly + generated components. 
+
+    The transformed values are a specified function of input activations
+    that are a weighted combination of dot product (multilayer perceptron)
+    and distance (rbf) activations:
+
+      input_activation = alpha * mlp_activation + (1-alpha) * rbf_activation
+
+      mlp_activation(x) = dot(x, weights) + bias
+      rbf_activation(x) = rbf_width * ||x - center||/radius
 
-    Creates a layer of units as a specified functions of an activation
-    value determined by the dot product of the input and a random vector
-    plus a random bias term:
+    alpha and rbf_width are specified by the user
 
-    f(a), s.t. a = dot(x, hidden_weights) + bias
+    weights and biases are taken from a normal distribution with
+    mean 0 and standard deviation 1
 
-    and transfer function f() which defaults to numpy.tanh if not supplied
-    but can be any callable that returns an array of the same shape as
-    its argument (input activation array, shape [n_samples, n_hidden])
+    centers are taken uniformly from the bounding hyperrectangle
+    of the inputs, and radii are max(||x-c||)/sqrt(n_centers*2)
+
+    The input activation is transformed by a transfer function that defaults
+    to numpy.tanh if not specified, but can be any callable that returns an
+    array of the same shape as its argument (the input activation array, of
+    shape [n_samples, n_hidden]). Functions provided are 'sine', 'tanh',
+    'tribas', 'inv_tribas', 'sigmoid', 'hardlim', 'softlim', 'gaussian',
+    'multiquadric', or 'inv_multiquadric'.
 
     Parameters
     ----------
     `n_hidden` : int, optional (default=20)
         Number of units to generate
 
+    `alpha` : float, optional (default=0.5)
+        Mixing coefficient for distance and dot product input activations:
+        activation = alpha*mlp_activation + (1-alpha)*rbf_width*rbf_activation
+
+    `rbf_width` : float, optional (default=1.0)
+        multiplier on rbf_activation
+
+    `user_components`: dictionary, optional (default=None)
+        dictionary containing values for components that would otherwise be
+        randomly generated.  Valid key/value pairs are as follows:
+        'radii'  : array-like of shape [n_hidden]
+        'centers': array-like of shape [n_hidden, n_features]
+        'biases' : array-like of shape [n_hidden]
+        'weights': array-like of shape [n_features, n_hidden]
+
     `activation_func` : {callable, string} optional (default='tanh')
         Function used to transform input activation
-        It must be one of 'tanh', 'sine', 'tribas', 'sigmoid', 'hardlim' or
-        a callable. If none is given, 'tanh' will be used. If a callable
-        is given, it will be used to compute the hidden unit activations.
+
+        It must be one of 'tanh', 'sine', 'tribas', 'inv_tribas',
+        'sigmoid', 'hardlim', 'softlim', 'gaussian', 'multiquadric',
+        'inv_multiquadric' or a callable. If None is given, 'tanh'
+        will be used.
+
+        If a callable is given, it will be used to compute the activations.
`activation_args` : dictionary, optional (default=None) Supplies keyword arguments for a callable activation_func @@ -172,103 +214,223 @@ class SimpleRandomLayer(BaseRandomLayer): See Also -------- - ELMRegressor, ELMClassifier, SimpleELMRegressor, SimpleELMClassifier, - RBFRandomLayer """ - - # - # internal transfer function (RBF) definitions - # - - # triangular transfer function + # triangular activation function _tribas = (lambda x: np.clip(1.0 - np.fabs(x), 0.0, 1.0)) - # sigmoid transfer function + # inverse triangular activation function + _inv_tribas = (lambda x: np.clip(np.fabs(x), 0.0, 1.0)) + + # sigmoid activation function _sigmoid = (lambda x: 1.0/(1.0 + np.exp(-x))) - # hard limit transfer function + # hard limit activation function _hardlim = (lambda x: np.array(x > 0.0, dtype=float)) - # internal transfer function table + _softlim = (lambda x: np.clip(x, 0.0, 1.0)) + + # gaussian RBF + _gaussian = (lambda x: np.exp(-pow(x, 2.0))) + + # multiquadric RBF + _multiquadric = (lambda x: + np.sqrt(1.0 + pow(x, 2.0))) + + # inverse multiquadric RBF + _inv_multiquadric = (lambda x: + 1.0/(np.sqrt(1.0 + pow(x, 2.0)))) + + # internal activation function table _internal_activation_funcs = {'sine': np.sin, 'tanh': np.tanh, 'tribas': _tribas, + 'inv_tribas': _inv_tribas, 'sigmoid': _sigmoid, - 'hardlim': _hardlim + 'softlim': _softlim, + 'hardlim': _hardlim, + 'inv_tribas': _inv_tribas, + 'gaussian': _gaussian, + 'multiquadric': _multiquadric, + 'inv_multiquadric': _inv_multiquadric, } - # default setup, plus initialization of activation_func - def __init__(self, n_hidden=20, random_state=None, - activation_func='tanh', activation_args=None): + def __init__(self, n_hidden=20, alpha=0.5, random_state=None, + activation_func='tanh', activation_args=None, + user_components=None, rbf_width=1.0): - super(SimpleRandomLayer, self).__init__(n_hidden, - random_state, - activation_func, - activation_args) + super(RandomLayer, self).__init__(n_hidden=n_hidden, + random_state=random_state, + activation_func=activation_func, + activation_args=activation_args) if (isinstance(self.activation_func, str)): func_names = self._internal_activation_funcs.keys() if (self.activation_func not in func_names): - msg = "unknown transfer function '%s'" % self.activation_func + msg = "unknown activation function '%s'" % self.activation_func raise ValueError(msg) + self.alpha = alpha + self.rbf_width = rbf_width + self.user_components = user_components + + self._use_mlp_input = (self.alpha != 0.0) + self._use_rbf_input = (self.alpha != 1.0) + + def _get_user_components(self, key): + try: + return self.user_components[key] + except (TypeError, KeyError): + return None + + # compute radii + def _compute_radii(self, X): + # use supplied radii if present + radii = self._get_user_components('radii') + + # compute radii + if (radii is None): + centers = self.components_['centers'] + + n_centers = centers.shape[0] + max_dist = np.max(pairwise_distances(centers)) + radii = np.ones(n_centers) * max_dist/sqrt(2.0 * n_centers) + + self.components_['radii'] = radii + + # compute centers + def _compute_centers(self, X, sparse, rs): + # use supplied centers if present + centers = self._get_user_components('centers') + + # use points taken uniformly from the bounding + # hyperrectangle + if (centers is None): + n_samples, n_features = X.shape + + if (sparse): + fxr = xrange(n_features) + cols = [X.getcol(i) for i in fxr] + + min_dtype = X.dtype.type(1.0e10) + sp_min = lambda col: np.minimum(min_dtype, np.min(col.data)) + min_Xs = 
np.array(map(sp_min, cols)) + + max_dtype = X.dtype.type(-1.0e10) + sp_max = lambda col: np.maximum(max_dtype, np.max(col.data)) + max_Xs = np.array(map(sp_max, cols)) + else: + min_Xs = X.min(axis=0) + max_Xs = X.max(axis=0) + + spans = max_Xs - min_Xs + ctrs_size = (self.n_hidden, n_features) + centers = min_Xs + spans * rs.uniform(0.0, 1.0, ctrs_size) + + self.components_['centers'] = centers + + def _compute_biases(self, X, rs): + # use supplied biases if present + biases = self._get_user_components('biases') + if (biases is None): + b_size = self.n_hidden + biases = rs.normal(size=b_size) + + self.components_['biases'] = biases + + def _compute_weights(self, X, rs): + # use supplied weights if present + weights = self._get_user_components('weights') + if (weights is None): + n_features = X.shape[1] + hw_size = (n_features, self.n_hidden) + weights = rs.normal(size=hw_size) + + self.components_['weights'] = weights + def _generate_components(self, X): """Generate components of hidden layer given X""" - rand_state = check_random_state(self.random_state) - n_features = X.shape[1] - - b_size = self.n_hidden - hw_size = (n_features, self.n_hidden) + rs = check_random_state(self.random_state) + if (self._use_mlp_input): + self._compute_biases(X, rs) + self._compute_weights(X, rs) - self.components_['biases'] = rand_state.normal(size=b_size) - self.components_['weights'] = rand_state.normal(size=hw_size) + if (self._use_rbf_input): + self._compute_centers(X, sp.issparse(X), rs) + self._compute_radii(X) def _compute_input_activations(self, X): """Compute input activations given X""" - b = self.components_['biases'] - w = self.components_['weights'] + n_samples = X.shape[0] - self.input_activations_ = safe_sparse_dot(X, w) - self.input_activations_ += b + mlp_acts = np.zeros((n_samples, self.n_hidden)) + if (self._use_mlp_input): + b = self.components_['biases'] + w = self.components_['weights'] + mlp_acts = self.alpha * (safe_sparse_dot(X, w) + b) + rbf_acts = np.zeros((n_samples, self.n_hidden)) + if (self._use_rbf_input): + radii = self.components_['radii'] + centers = self.components_['centers'] + scale = self.rbf_width * (1.0 - self.alpha) + rbf_acts = scale * cdist(X, centers)/radii -# Random Hidden Layer of radial basis function units -class RBFRandomLayer(BaseRandomLayer): - """Random RBF Hidden Layer transformer + #print rbf_acts.shape, mlp_acts.shape, self.alpha + self.input_activations_ = mlp_acts + rbf_acts - Creates a layer of radial basis function units where: - f(a), s.t. a = ||x-c||/r +class MLPRandomLayer(RandomLayer): + def __init__(self, n_hidden=20, random_state=None, + activation_func='tanh', activation_args=None, + weights=None, biases=None): - with c the unit center and r = max(||x-c||)/sqrt(n_centers*2). + user_components = {'weights': weights, 'biases': biases} + super(MLPRandomLayer, self).__init__(n_hidden=n_hidden, + random_state=random_state, + activation_func=activation_func, + activation_args=activation_args, + user_components=user_components, + alpha=1.0) - f() defaults to exp(-gamma * a^2) (gaussian rbf) - gamma defaults to 1.0 - If centers are not provided and use_exemplars is False (see below), - then centers are uniformly distributed over the input space. 
+class RBFRandomLayer(RandomLayer): + def __init__(self, n_hidden=20, random_state=None, + activation_func='gaussian', activation_args=None, + centers=None, radii=None, rbf_width=1.0): + + user_components = {'centers': centers, 'radii': radii} + super(RBFRandomLayer, self).__init__(n_hidden=n_hidden, + random_state=random_state, + activation_func=activation_func, + activation_args=activation_args, + user_components=user_components, + rbf_width=rbf_width, + alpha=0.0) + + +class GRBFRandomLayer(RBFRandomLayer): + """Random Generalized RBF Hidden Layer transformer + + Creates a layer of radial basis function units where: + + f(a), s.t. a = ||x-c||/r + + with c the unit center + and f() is exp(-gamma * a^tau) where tau and r are computed + based on [1] Parameters ---------- `n_hidden` : int, optional (default=20) Number of units to generate, ignored if centers are provided - `activation_func` : {callable, string} optional (default='gaussian') - Function used to transform input activation. - It must be one of 'gaussian', 'poly_spline', 'multiquadric' or - a callable. If none is given, 'gaussian' will be used. If a - callable is given, it will be used to compute the hidden unit - activations. - - `activation_args` : dictionary, optional (default=None) - Supplies keyword arguments for a callable activation_func + `grbf_lambda` : float, optional (default=0.05) + GRBF shape parameter `gamma` : {int, float} optional (default=1.0) - Width multiplier for RBF distance argument, ignored if callable - activation_func is provided. Must be an int > 0 when activation_func - is 'poly_spline'. + Width multiplier for GRBF distance argument `centers` : array of shape (n_hidden, n_features), optional (default=None) If provided, overrides internal computation of the centers @@ -297,132 +459,53 @@ class RBFRandomLayer(BaseRandomLayer): -------- ELMRegressor, ELMClassifier, SimpleELMRegressor, SimpleELMClassifier, SimpleRandomLayer - """ - - # - # internal transfer function (RBF) definitions - # - - # gaussian RBF - _gaussian = (lambda x, gamma: np.exp(-gamma * pow(x, 2.0))) - # multiquadric spline RBF - _multiquadric = (lambda x, gamma: - np.sqrt(1.0 + pow(gamma * x, 2.0))) - - # polyharmonic spline RBF - def _poly_spline(acts, gamma): - if (not isinstance(gamma, int) or gamma < 1): - msg = 'Gamma must be integer > 0 for poly_spline' - raise ValueError(msg) - - # add epsilon to avoid log(0) exception - epsilon = 1.0e-8 - acts += epsilon - - X_new = pow(acts, gamma) - if ((gamma % 2) == 0): - X_new *= np.log(acts) - - return X_new - - # internal RBF table - _internal_activation_funcs = {'gaussian': _gaussian, - 'poly_spline': _poly_spline, - 'multiquadric': _multiquadric - } - - def __init__(self, n_hidden=20, random_state=None, - activation_func='gaussian', activation_args=None, - gamma=1.0, centers=None, radii=None, - use_exemplars=False): - - super(RBFRandomLayer, self).__init__(n_hidden, - random_state, - activation_func, - activation_args) - - if (isinstance(self.activation_func, str)): - func_names = self._internal_activation_funcs.keys() - if (self.activation_func not in func_names): - msg = "unknown transfer function '%s'" % self.activation_func - raise ValueError(msg) + References + ---------- + .. 
[1] Fernandez-Navarro, et al, "MELM-GRBF: a modified version of the + extreme learning machine for generalized radial basis function + neural networks", Neurocomputing 74 (2011), 2502-2510 - self.radii = radii - self.centers = centers - self.gamma = gamma - self.use_exemplars = use_exemplars + """ + def _grbf(acts, taus): + return np.exp(np.exp(-pow(acts, taus))) - # property methods for 'gamma' arg, use - # self._extra_args dictionary - @property - def gamma(self): - return self._extra_args['gamma'] + _internal_activation_funcs = {'grbf': _grbf} - @gamma.setter - def gamma(self, value): - self._extra_args['gamma'] = value + def __init__(self, n_hidden=20, grbf_lambda=0.001, + centers=None, radii=None, random_state=None): - def _generate_components(self, X): - """Generate components of hidden layer given X""" + super(GRBFRandomLayer, self).__init__(n_hidden=n_hidden, + activation_func='grbf', + centers=centers, radii=radii, + random_state=random_state) - sparse = sp.issparse(X) - self._compute_centers(X, sparse) - self._compute_radii(X, sparse) + self.grbf_lambda = grbf_lambda + self.dN_vals = None + self.dF_vals = None + self.tau_vals = None - def _compute_input_activations(self, X): - """Compute input activations given X""" + # get centers from superclass, then calculate tau_vals + # according to ref [1] + def _compute_centers(self, X, sparse, rs): + super(GRBFRandomLayer, self)._compute_centers(X, sparse, rs) - radii = self.components_['radii'] centers = self.components_['centers'] + sorted_distances = np.sort(squareform(pdist(centers))) + self.dF_vals = sorted_distances[:, -1] + self.dN_vals = sorted_distances[:, 1]/100.0 + #self.dN_vals = 0.0002 * np.ones(self.dF_vals.shape) - self.input_activations_ = cdist(X, centers)/radii + tauNum = np.log(np.log(self.grbf_lambda) / + np.log(1.0 - self.grbf_lambda)) - # determine centers - def _compute_centers(self, X, sparse): - # use supplied centers - if (self.centers is not None): - centers = self.centers + tauDenom = np.log(self.dF_vals/self.dN_vals) - else: - n_samples, n_features = X.shape - rs = check_random_state(self.random_state) + self.tau_vals = tauNum/tauDenom - # use examples from the data as centers - if (self.use_exemplars): - if (n_samples < self.n_hidden): - msg = "n_hidden must be <= n_samples when using exemplars" - raise ValueError(msg) - - max_index = n_samples - 1 - indices = rs.permutation(max_index)[:self.n_hidden] - centers = X[indices, :] - - # use uniformly distributed points from the input space as centers - else: - if (sparse): - X_dtype = X.dtype.type(0) - min_X = np.minimum(X_dtype, np.min(X.data)) - max_X = np.maximum(X_dtype, np.max(X.data)) - else: - min_X, max_X = np.min(X), np.max(X) + self._extra_args['taus'] = self.tau_vals - ctrs_size = (self.n_hidden, n_features) - centers = rs.uniform(min_X, max_X, ctrs_size) - - self.components_['centers'] = centers - - # compute radii - def _compute_radii(self, X, sparse): - # use supplied radii - if (self.radii is not None): - radii = self.radii - - else: - centers = self.components_['centers'] - - n_centers = centers.shape[0] - max_dist = np.max(pairwise_distances(centers)) - radii = np.ones(n_centers) * max_dist/sqrt(2.0 * n_centers) - - self.components_['radii'] = radii + # get radii according to ref [1] + def _compute_radii(self, X): + denom = pow(-np.log(self.grbf_lambda), 1.0/self.tau_vals) + self.components_['radii'] = self.dF_vals/denom From e6a876bd374ac83d5d8e5ae60cdde5351d5fff8f Mon Sep 17 00:00:00 2001 From: dclambert Date: Sun, 9 Jun 2013 20:33:23 -0500 
Subject: [PATCH 04/11] README tweaks

---
 README.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 0009df7..1004e04 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,11 @@
-Python-ELM
-==========
+Python-ELM v0.3
+===============
 
-Extreme Learning Machine implementation in Python
-Version 0.3
+######This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org).
 
-This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org).
+######From the abstract:
+
+> It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: 1) the slow gradient-based learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden layer feedforward neural networks (SLFNs) which randomly chooses the input weights and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed. The experimental results based on real-world benchmarking function approximation and classification problems including large complex applications show that the new algorithm can produce best generalization performance in some cases and can learn much faster than traditional popular learning algorithms for feedforward neural networks.
 
 It's a work in progress, so things can/might/will change.

From d69c1f2efb3956c5b16ba372964d7ab465e3d8f9 Mon Sep 17 00:00:00 2001
From: dclambert
Date: Mon, 10 Jun 2013 10:58:28 -0500
Subject: [PATCH 05/11] comment tweak

---
 elm.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/elm.py b/elm.py
index fbda9ee..6f8df27 100644
--- a/elm.py
+++ b/elm.py
@@ -146,7 +146,7 @@ def __init__(self,
 
     def _fit_regression(self, y):
         """
-        fit regression using Moore-Penrose pseudo-inverse
+        fit regression using pseudo-inverse
         or supplied regressor
         """
         if (self.regressor is None):

From ea10aed103fb8b0ad0868eb4f8eeb0a2c5e20cf3 Mon Sep 17 00:00:00 2001
From: dclambert
Date: Fri, 28 Jun 2013 16:21:39 -0400
Subject: [PATCH 06/11] Markdown tweak

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1004e04..bf17d9e 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ A small demo ()based on scikit-learn's plot_classifier_comparison) that shows th
 
 ####__elm_notebook.py__
 
-An IPython notebook, illustrating several ways to use the __\*ELM*__ and __\*RandomLayer__ classes.
+An IPython notebook, illustrating several ways to use the __ELM__ and __RandomLayer__ classes.
Requirements ------------ From 71c437675db24f046eae54f20baccde32f150e48 Mon Sep 17 00:00:00 2001 From: dclambert Date: Wed, 3 Jul 2013 11:12:46 -0500 Subject: [PATCH 07/11] added some comments, removed X from _compute_radii and _compute_biases, removed dup entry for inv_tribas --- random_layer.py | 48 ++++++++++++++++++++++++++++++++++-------------- 1 file changed, 34 insertions(+), 14 deletions(-) diff --git a/random_layer.py b/random_layer.py index 5a25177..713a795 100644 --- a/random_layer.py +++ b/random_layer.py @@ -1,4 +1,4 @@ -# -*- coding: utf8 +#-*- coding: utf8 # Author: David C. Lambert [dcl -at- panix -dot- com] # Copyright(c) 2013 # License: Simple BSD @@ -35,14 +35,15 @@ ] -# Abstract Base Class for random hidden layers class BaseRandomLayer(BaseEstimator, TransformerMixin): + """Abstract Base Class for random layers""" __metaclass__ = ABCMeta _internal_activation_funcs = dict() @classmethod def activation_func_names(cls): + """Get list of internal activation function names""" return cls._internal_activation_funcs.keys() # take n_hidden and random_state, init components_ and @@ -248,7 +249,6 @@ class RandomLayer(BaseRandomLayer): 'sigmoid': _sigmoid, 'softlim': _softlim, 'hardlim': _hardlim, - 'inv_tribas': _inv_tribas, 'gaussian': _gaussian, 'multiquadric': _multiquadric, 'inv_multiquadric': _inv_multiquadric, @@ -277,13 +277,15 @@ def __init__(self, n_hidden=20, alpha=0.5, random_state=None, self._use_rbf_input = (self.alpha != 1.0) def _get_user_components(self, key): + """Look for given user component""" try: return self.user_components[key] except (TypeError, KeyError): return None - # compute radii - def _compute_radii(self, X): + def _compute_radii(self): + """Generate RBF radii""" + # use supplied radii if present radii = self._get_user_components('radii') @@ -297,15 +299,16 @@ def _compute_radii(self, X): self.components_['radii'] = radii - # compute centers def _compute_centers(self, X, sparse, rs): + """Generate RBF centers""" + # use supplied centers if present centers = self._get_user_components('centers') # use points taken uniformly from the bounding # hyperrectangle if (centers is None): - n_samples, n_features = X.shape + n_features = X.shape[1] if (sparse): fxr = xrange(n_features) @@ -328,7 +331,9 @@ def _compute_centers(self, X, sparse, rs): self.components_['centers'] = centers - def _compute_biases(self, X, rs): + def _compute_biases(self, rs): + """Generate MLP biases""" + # use supplied biases if present biases = self._get_user_components('biases') if (biases is None): @@ -338,6 +343,8 @@ def _compute_biases(self, X, rs): self.components_['biases'] = biases def _compute_weights(self, X, rs): + """Generate MLP weights""" + # use supplied weights if present weights = self._get_user_components('weights') if (weights is None): @@ -352,12 +359,12 @@ def _generate_components(self, X): rs = check_random_state(self.random_state) if (self._use_mlp_input): - self._compute_biases(X, rs) + self._compute_biases(rs) self._compute_weights(X, rs) if (self._use_rbf_input): self._compute_centers(X, sp.issparse(X), rs) - self._compute_radii(X) + self._compute_radii() def _compute_input_activations(self, X): """Compute input activations given X""" @@ -377,11 +384,13 @@ def _compute_input_activations(self, X): scale = self.rbf_width * (1.0 - self.alpha) rbf_acts = scale * cdist(X, centers)/radii - #print rbf_acts.shape, mlp_acts.shape, self.alpha self.input_activations_ = mlp_acts + rbf_acts class MLPRandomLayer(RandomLayer): + """Wrapper for RandomLayer with alpha (mixing 
From 25a7eae634caf433b94aedaea005f2c1a2083590 Mon Sep 17 00:00:00 2001
From: dclambert
Date: Wed, 3 Jul 2013 11:32:31 -0500
Subject: [PATCH 08/11] cosmetic and comment changes

---
 elm.py | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/elm.py b/elm.py
index 6f8df27..2c33336 100644
--- a/elm.py
+++ b/elm.py
@@ -245,7 +245,7 @@ class GenELMClassifier(BaseELM, ClassifierMixin):
     `classes_` : numpy array of shape [n_classes]
         Array of class labels

-    `elm_regressor_` : ELMRegressor instance
+    `genelm_regressor_` : GenELMRegressor instance
         Performs actual fit of binarized values

     See Also
@@ -344,8 +344,8 @@ class ELMRegressor(BaseEstimator, RegressorMixin):
     [1][2]

     ELMRegressor is a wrapper for a GenELMRegressor that uses a
-    RandomLayer and exposes the RandomLayer's parameters in its
-    own constructor.
+    RandomLayer and passes the __init__ parameters through
+    to the hidden layer generated by the fit() method.

     Parameters
     ----------
@@ -418,9 +418,11 @@ def __init__(self, n_hidden=20, alpha=0.5, rbf_width=1.0,
         self.rbf_width = rbf_width
         self.regressor = regressor

-        self._genelm_regressor_ = None
+        self._genelm_regressor = None

     def _create_random_layer(self):
+        """Pass init params to RandomLayer"""
+
         return RandomLayer(n_hidden=self.n_hidden, alpha=self.alpha,
                            random_state=self.random_state,
                            activation_func=self.activation_func,
@@ -449,9 +451,9 @@ def fit(self, X, y):
             Returns an instance of self.
         """
         rhl = self._create_random_layer()
-        self.genelm_regressor_ = GenELMRegressor(hidden_layer=rhl,
+        self._genelm_regressor = GenELMRegressor(hidden_layer=rhl,
                                                  regressor=self.regressor)
-        self.genelm_regressor_.fit(X, y)
+        self._genelm_regressor.fit(X, y)
         return self

     def predict(self, X):
         """Predict values using the model

         Parameters
         ----------
         X : {array-like, sparse matrix} of shape [n_samples, n_features]

         Returns
         -------
         C : numpy array of shape [n_samples, n_outputs]
             Predicted values.
         """
-        if (self.genelm_regressor_ is None):
+        if (self._genelm_regressor is None):
             raise ValueError("ELMRegressor not fitted")

-        return self.genelm_regressor_.predict(X)
+        return self._genelm_regressor.predict(X)


 class ELMClassifier(ELMRegressor):
@@ -486,8 +488,8 @@ class ELMClassifier(ELMRegressor):
     data, then uses the superclass to compute the decision function
     that is then unbinarized to yield the prediction.

-    The RandomLayer used for the input transform are exposed in the
-    ELMClassifier constructor.
+    The params for the RandomLayer used in the input transform are
+    exposed in the ELMClassifier constructor.

     Parameters
     ----------
@@ -609,5 +611,8 @@ def predict(self, X):
         return class_predictions

     def score(self, X, y):
+        """Force use of accuracy score since we don't inherit
+           from ClassifierMixin"""
+
         from sklearn.metrics import accuracy_score
         return accuracy_score(y, self.predict(X))
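After the renames above, the fitted sub-model lives in the private `_genelm_regressor` attribute: `fit()` creates it and `predict()` checks it. A hedged usage sketch, not part of the patches, assuming this repo's `elm.py` together with scikit-learn's `make_regression` helper:

```python
import numpy as np

from sklearn.datasets import make_regression
from elm import ELMRegressor

X, y = make_regression(n_samples=200, n_features=10, random_state=0)

reg = ELMRegressor(n_hidden=50, alpha=0.5, random_state=0)
reg.fit(X, y)             # builds the RandomLayer and GenELMRegressor internally
y_hat = reg.predict(X)    # predict() before fit() raises ValueError instead

print(np.mean((y - y_hat) ** 2))    # training MSE
```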
""" - if (self.genelm_regressor_ is None): + if (self._genelm_regressor is None): raise ValueError("SimpleELMRegressor not fitted") - return self.genelm_regressor_.predict(X) + return self._genelm_regressor.predict(X) class ELMClassifier(ELMRegressor): @@ -486,8 +488,8 @@ class ELMClassifier(ELMRegressor): data, then uses the superclass to compute the decision function that is then unbinarized to yield the prediction. - The RandomLayer used for the input transform are exposed in the - ELMClassifier constructor. + The params for the RandomLayer used in the input transform are + exposed in the ELMClassifier constructor. Parameters ---------- @@ -609,5 +611,8 @@ def predict(self, X): return class_predictions def score(self, X, y): + """Force use of accuracy score since we don't inherit + from ClassifierMixin""" + from sklearn.metrics import accuracy_score return accuracy_score(y, self.predict(X)) From 2f4ffa5cf36fe55dce4a9fd3f01c312bc308b819 Mon Sep 17 00:00:00 2001 From: dclambert Date: Thu, 4 Jul 2013 09:33:29 -0500 Subject: [PATCH 09/11] formatting --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index bf17d9e..2f37fd4 100644 --- a/README.md +++ b/README.md @@ -87,11 +87,11 @@ The ELMRegressor class is a wrapper around GenELMRegressor that uses a RandomLay ####__plot_elm_comparison.py__ -A small demo ()based on scikit-learn's plot_classifier_comparison) that shows the decision functions of a couple of different instantiations of the GenELMClassifier on three different datasets. +A small demo (based on scikit-learn's plot_classifier_comparison) that shows the decision functions of a couple of different instantiations of the GenELMClassifier on three different datasets. ####__elm_notebook.py__ -An IPython notebook, illustrating several ways to use the __ELM__ and __RandomLayer__ classes. +An IPython notebook, illustrating several ways to use the __\*ELM\*__ and __\*RandomLayer__ classes. Requirements ------------ From e37fdbf03a4d69394dd870676e6d7744aacf8be0 Mon Sep 17 00:00:00 2001 From: Santiago Castro Date: Mon, 17 Apr 2017 22:01:31 -0300 Subject: [PATCH 10/11] Fix broken Markdown headings --- README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 2f37fd4..3370cdf 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,9 @@ Python-ELM v0.3 =============== -######This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org). +###### This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org). -######From the abstract: +###### From the abstract: > It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: 1) the slow gradient- based learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single- hidden layer feedforward neural networks (SLFNs) which ran- domly chooses the input weights and analytically determines the output weights of SLFNs. 
From 00a231e5b917d893fedb9f1747446ba0ac032b8e Mon Sep 17 00:00:00 2001
From: "David C. Lambert" <3647567+dclambert@users.noreply.github.com>
Date: Fri, 5 Mar 2021 10:41:38 -0600
Subject: [PATCH 11/11] Update README.md

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 3370cdf..39e2cea 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,8 @@
 Python-ELM v0.3
 ===============

+__---> ARCHIVED March 2021 <---__
+
 ###### This is an implementation of the [Extreme Learning Machine](http://www.extreme-learning-machines.org) [1][2] in Python, based on [scikit-learn](http://scikit-learn.org).

 ###### From the abstract:
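The abstract's recipe, random hidden weights plus analytically determined output weights, fits in a few lines of numpy. A self-contained sketch of the idea on toy data, independent of the repo's classes:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = np.sin(X).sum(axis=1)             # toy regression target

n_hidden = 50
W = rng.randn(5, n_hidden)            # random input weights, never trained
b = rng.randn(n_hidden)               # random biases

H = np.tanh(np.dot(X, W) + b)         # hidden-layer activations
beta = np.dot(np.linalg.pinv(H), y)   # analytic least-squares output weights

mse = np.mean((y - np.dot(H, beta)) ** 2)
print(mse)    # small training error, with no iterative tuning
```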