Experiments
===========

Baseline
********
For all experiments, we used 5 popular tree-based ensemble methods as baselines. Details on the baselines are listed in the following table:

+------------------+---------------------------------------------------------------+
| Name             | Introduction                                                  |
+==================+===============================================================+
| `Random Forest`_ | An efficient implementation of Random Forest in Scikit-Learn  |
+------------------+---------------------------------------------------------------+
| `HGBDT`_         | Histogram-based GBDT in Scikit-Learn                          |
+------------------+---------------------------------------------------------------+
| `XGBoost EXACT`_ | The vanilla version of XGBoost                                |
+------------------+---------------------------------------------------------------+
| `XGBoost HIST`_  | The histogram optimized version of XGBoost                    |
+------------------+---------------------------------------------------------------+
| `LightGBM`_      | Light Gradient Boosting Machine                               |
+------------------+---------------------------------------------------------------+

Environment
***********
For all experiments, we used a single Linux server. Details on its specifications are listed in the table below. All processor cores were used for training and evaluation.

+------------------+-----------------+--------+
| OS               | CPU             | Memory |
+==================+=================+========+
| Ubuntu 18.04 LTS | Xeon E-2288G    | 128GB  |
+------------------+-----------------+--------+

Setting
*******
We kept the number of decision trees the same across all baselines, while the remaining hyper-parameters were set to their default values. Scripts for reproducing all experiment results are available; please refer to this `Repo`_.

Dataset
*******

We have collected a number of datasets for both binary and multi-class classification, as listed in the table below. They were selected based on the following criteria:

- Publicly available and easy to use;
- Cover different application areas;
- Reflect high diversity in terms of the number of samples, features, and classes.

As a result, some baselines may fail on datasets with too many samples or features. Such cases are indicated by ``N/A`` in all tables below.

+------------------+------------+-----------+------------+-----------+
| Name             | # Training | # Testing | # Features | # Classes |
+==================+============+===========+============+===========+
| `ijcnn1`_        | 49,990     | 91,701    | 22         | 2         |
+------------------+------------+-----------+------------+-----------+
| `pendigits`_     | 7,494      | 3,498     | 16         | 10        |
+------------------+------------+-----------+------------+-----------+
| `letter`_        | 15,000     | 5,000     | 16         | 26        |
+------------------+------------+-----------+------------+-----------+
| `connect-4`_     | 67,557     | 20,267    | 126        | 3         |
+------------------+------------+-----------+------------+-----------+
| `sector`_        | 6,412      | 3,207     | 55,197     | 105       |
+------------------+------------+-----------+------------+-----------+
| `covtype`_       | 406,708    | 174,304   | 54         | 7         |
+------------------+------------+-----------+------------+-----------+
| `susy`_          | 4,500,000  | 500,000   | 18         | 2         |
+------------------+------------+-----------+------------+-----------+
| `higgs`_         | 10,500,000 | 500,000   | 28         | 2         |
+------------------+------------+-----------+------------+-----------+
| `usps`_          | 7,291      | 2,007     | 256        | 10        |
+------------------+------------+-----------+------------+-----------+
| `mnist`_         | 60,000     | 10,000    | 784        | 10        |
+------------------+------------+-----------+------------+-----------+
| `fashion mnist`_ | 60,000     | 10,000    | 784        | 10        |
+------------------+------------+-----------+------------+-----------+

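Most of these datasets are distributed in the sparse LIBSVM text format. As a sketch (not part of the benchmark scripts), such files can be parsed with Scikit-Learn's ``load_svmlight_file``; the two inline rows below are made-up examples of the format:

```python
import io

from sklearn.datasets import load_svmlight_file

# Two made-up rows in the LIBSVM format: "<label> <index>:<value> ...",
# with one-based feature indices and only non-zero features stored.
raw = b"1 1:0.5 3:1.2\n-1 2:0.8 3:0.1\n"

# Returns a scipy.sparse feature matrix and a dense label vector.
X, y = load_svmlight_file(io.BytesIO(raw))
print(X.shape)  # (2, 3)
```

The sparse representation is what makes very high-dimensional datasets such as sector (55,197 features) practical to store and load.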
Classification Accuracy
***********************

The table below shows the testing accuracy of each method, with the best result on each dataset **bolded**. Each experiment was conducted over 5 independent trials, and the average result is reported.

+---------------+-------+-------+-----------+-----------+-----------+-------------+
| Name          | RF    | HGBDT | XGB EXACT | XGB HIST  | LightGBM  | Deep Forest |
+===============+=======+=======+===========+===========+===========+=============+
| ijcnn1        | 98.07 | 98.43 | 98.20     | 98.23     | **98.61** | 98.16       |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| pendigits     | 96.54 | 96.34 | 96.60     | 96.60     | 96.17     | **97.50**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| letter        | 95.39 | 91.56 | 90.80     | 90.82     | 88.94     | **95.92**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| connect-4     | 70.18 | 70.88 | 71.57     | 71.57     | 70.31     | **72.05**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| sector        | 85.62 | N/A   | 66.01     | 65.61     | 63.24     | **86.74**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| covtype       | 73.73 | 64.22 | 66.15     | 66.70     | 65.00     | **74.27**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| susy          | 80.19 | 80.31 | 80.32     | **80.35** | 80.33     | 80.18       |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| higgs         | N/A   | 74.95 | 75.85     | 76.00     | 74.97     | **76.46**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| usps          | 93.79 | 94.32 | 93.77     | 93.37     | 93.97     | **94.67**   |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| mnist         | 97.20 | 98.35 | 98.07     | 98.14     | **98.42** | 98.11       |
+---------------+-------+-------+-----------+-----------+-----------+-------------+
| fashion mnist | 87.87 | 87.02 | 90.74     | 90.80     | **90.81** | 89.66       |
+---------------+-------+-------+-----------+-----------+-----------+-------------+

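The 5-trial averaging can be sketched as follows; the synthetic dataset and the seed range here are assumptions for illustration, not the actual benchmark protocol:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative data; the fixed train/test split mirrors the datasets above.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One testing accuracy per trial; only the random seed differs between trials.
accs = [
    RandomForestClassifier(random_state=seed).fit(X_tr, y_tr).score(X_te, y_te)
    for seed in range(5)
]
print(f"average accuracy: {100 * np.mean(accs):.2f}")
```

Averaging over independent trials smooths out the randomness inside each learner (e.g., bootstrap sampling and feature subsampling in random forest).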
Runtime
*******

The runtime in seconds reported in the table below covers both the training and evaluation stages.

+---------------+---------+--------+-----------+----------+----------+-------------+
| Name          | RF      | HGBDT  | XGB EXACT | XGB HIST | LightGBM | Deep Forest |
+===============+=========+========+===========+==========+==========+=============+
| ijcnn1        | 9.60    | 6.84   | 11.24     | 1.90     | 1.99     | 8.37        |
+---------------+---------+--------+-----------+----------+----------+-------------+
| pendigits     | 1.26    | 5.12   | 0.39      | 0.26     | 0.46     | 2.21        |
+---------------+---------+--------+-----------+----------+----------+-------------+
| letter        | 0.76    | 1.30   | 0.34      | 0.17     | 0.19     | 2.84        |
+---------------+---------+--------+-----------+----------+----------+-------------+
| connect-4     | 5.17    | 7.54   | 13.26     | 3.19     | 1.12     | 10.73       |
+---------------+---------+--------+-----------+----------+----------+-------------+
| sector        | 292.15  | N/A    | 632.27    | 593.35   | 18.83    | 521.68      |
+---------------+---------+--------+-----------+----------+----------+-------------+
| covtype       | 84.00   | 2.56   | 58.43     | 11.62    | 3.96     | 164.18      |
+---------------+---------+--------+-----------+----------+----------+-------------+
| susy          | 1429.85 | 59.09  | 1051.54   | 44.85    | 34.40    | 1866.48     |
+---------------+---------+--------+-----------+----------+----------+-------------+
| higgs         | N/A     | 523.74 | 7532.70   | 267.64   | 209.65   | 7307.44     |
+---------------+---------+--------+-----------+----------+----------+-------------+
| usps          | 9.28    | 8.73   | 9.43      | 5.78     | 9.81     | 6.08        |
+---------------+---------+--------+-----------+----------+----------+-------------+
| mnist         | 590.81  | 229.91 | 1156.64   | 762.40   | 233.94   | 599.55      |
+---------------+---------+--------+-----------+----------+----------+-------------+
| fashion mnist | 735.47  | 32.86  | 1403.44   | 2061.80  | 428.37   | 661.05      |
+---------------+---------+--------+-----------+----------+----------+-------------+

Some observations are listed as follows:

* Histogram-based GBDT implementations (e.g., ``HGBDT``, ``XGB HIST``, ``LightGBM``) are typically faster, mainly because decision trees in GBDT tend to have much smaller depths;
* As the number of input dimensions increases (e.g., on mnist and fashion mnist), random forest and deep forest can be faster.

.. _`Random Forest`: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

.. _`HGBDT`: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingClassifier.html

.. _`XGBoost EXACT`: https://xgboost.readthedocs.io/en/latest/index.html

.. _`XGBoost HIST`: https://xgboost.readthedocs.io/en/latest/index.html

.. _`LightGBM`: https://lightgbm.readthedocs.io/en/latest/

.. _`Repo`: https://github.com/xuyxu/deep_forest_benchmarks

.. _`ijcnn1`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#ijcnn1

.. _`pendigits`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#pendigits

.. _`letter`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#letter

.. _`connect-4`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#connect-4

.. _`sector`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#sector

.. _`covtype`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#covtype

.. _`susy`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#SUSY

.. _`higgs`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#HIGGS

.. _`usps`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#usps

.. _`mnist`: https://keras.io/api/datasets/mnist/

.. _`fashion mnist`: https://keras.io/api/datasets/fashion_mnist/