
AutoML: taking the human expert out of the loop

 


Benchmarks Overview

To run these algorithms and datasets with hyperparameter optimizers, you need to install

  1. the HPOlib software from here
  2. the benchmark data: an algorithm and, depending on the benchmark, a wrapper and/or data

The benchmarks can then be used as described here.
Our software also lets you integrate your own benchmarks; here is the HowTo.

NOTE: Cross-validation is possible for all benchmarks, although this is not listed separately. It is technically possible but pointless for functions like Branin and for pre-computed results like LDA ongrid. The CV column indicates whether cross-validation makes sense for a benchmark.

Available Benchmarks
Algorithm | # hyperparams (conditional) | contin./discr. | Dataset | Size (Train/Valid/Test) | Runtime | Programming language | CV
Branin | 2(-) | 2/- | - | - | < 1s | Python | no
Camelback function | 2(-) | 2/- | - | - | < 1s | Python | no
Hartmann 6d | 6(-) | 6/- | - | - | < 1s | Python | no
LDA ongrid | 3(-) | -/3 | wikipedia articles | - | < 1s | Python | no
SVM ongrid | 3(-) | -/3 | UniPROBE | - | < 1s | Python | no
Logistic Regression | 4(-) | 4/- | MNIST | 50k/10k/10k | < 1m (Intel Xeon E5-2650 v2; OpenBlas@2cores) | Python | yes
hp-nnet | 14(4) | 7/7 | MRBI | 10k/2k/50k | ~25m (GPU, NVIDIA Tesla M2070) | Python | yes
hp-nnet | 14(4) | 7/7 | convex | 6.5k/1.5k/50k | ~6m (GPU, NVIDIA Tesla M2070) | Python | yes
hp-dbnet | 36(27) | 19/17 | MRBI | 10k/2k/50k | ~15m (GPU, GeForce GTX780) | Python | yes
hp-dbnet | 36(27) | 19/17 | convex | 6.5k/1.5k/50k | ~10m (GPU, GeForce GTX780) | Python | yes
autoweka | 786(784) | 296/490 | convex | 6.5k/1.5k/50k | ~15m | Python/Java | yes
Surrogate Benchmarks | as original | as original | as original | - | < 1s | Python | sometimes

Description

Branin, Hartmann 6d and Camelback Function

This benchmark already comes with the basic HPOlib bundle.

Dependencies: None
Recommended: None

Branin, Camelback and Hartmann 6d are three simple test functions that are easy and cheap to evaluate. More test functions can be found here.
Branin has three global minima at (-pi, 12.275), (pi, 2.275) and (9.42478, 2.475), where f(x) = 0.397887.
Camelback has two global minima at (0.0898, -0.7126) and (-0.0898, 0.7126), where f(x) = -1.0316.
Hartmann 6d is more difficult, with 6 local minima and one global optimum at (0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573), where f(x) = -3.32237.
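
The optima listed above can be checked with a short script. The following is a minimal sketch that assumes the standard textbook definitions of the three functions (with the commonly published Hartmann 6d coefficients); it is not the code shipped with HPOlib, and the function names are chosen for illustration only. Under these definitions the Hartmann 6d optimum has the value -3.32237.

```python
import math


def branin(x1, x2):
    """Branin function with the usual constants (assumed, not taken from HPOlib)."""
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    t = 1.0 / (8.0 * math.pi)
    return (x2 - b * x1 ** 2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * math.cos(x1) + 10.0


def camelback(x1, x2):
    """Six-hump camelback function."""
    return ((4.0 - 2.1 * x1 ** 2 + x1 ** 4 / 3.0) * x1 ** 2
            + x1 * x2
            + (-4.0 + 4.0 * x2 ** 2) * x2 ** 2)


# Commonly published Hartmann 6d coefficients.
HARTMANN_ALPHA = [1.0, 1.2, 3.0, 3.2]
HARTMANN_A = [
    [10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
    [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
    [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
    [17.0, 8.0, 0.05, 10.0, 0.1, 14.0],
]
HARTMANN_P = [
    [0.1312, 0.1696, 0.5569, 0.0124, 0.8283, 0.5886],
    [0.2329, 0.4135, 0.8307, 0.3736, 0.1004, 0.9991],
    [0.2348, 0.1451, 0.3522, 0.2883, 0.3047, 0.6650],
    [0.4047, 0.8828, 0.8732, 0.5743, 0.1091, 0.0381],
]


def hartmann6(x):
    """Hartmann 6d function; x is a sequence of six values in [0, 1]."""
    total = 0.0
    for alpha, a_row, p_row in zip(HARTMANN_ALPHA, HARTMANN_A, HARTMANN_P):
        inner = sum(a * (xj - p) ** 2 for a, xj, p in zip(a_row, x, p_row))
        total += alpha * math.exp(-inner)
    return -total


# Evaluate each function at the optima listed in the text.
for point in [(-math.pi, 12.275), (math.pi, 2.275), (9.42478, 2.475)]:
    print("branin", point, branin(*point))
for point in [(0.0898, -0.7126), (-0.0898, 0.7126)]:
    print("camelback", point, camelback(*point))
x_opt = (0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573)
print("hartmann6", x_opt, hartmann6(x_opt))
```

Because each call is a handful of arithmetic operations, these functions are convenient for smoke-testing an optimizer before moving to the expensive benchmarks.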

LDA ongrid/SVM ongrid

This benchmark already comes with the basic HPOlib bundle.

Dependencies: None
Recommended: None

Online Latent Dirichlet Allocation (LDA) is a very expensive algorithm to evaluate. To make this less time-consuming, a 6x6x8 grid of hyperparameter configurations, resulting in 288 data points, was pre-evaluated. This grid forms the search space.

The same holds for the Support Vector Machine task, which has 1400 evaluated configurations.

The Online LDA code was written by Hoffman et al. and the procedure is explained in Online Learning for Latent Dirichlet Allocation. The Latent Structured Support Vector Machine code was written by Kevin Miller et al. and is explained in the paper Max-Margin Min-Entropy Models. The grid search was performed by Jasper Snoek and previously used in Practical Bayesian Optimization of Machine Learning Algorithms.
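
The idea of a pre-evaluated grid can be sketched as a plain lookup table. The example below is illustrative only: the 6x6x8 shape comes from the text, but the grid indices, the `toy_loss` function and the names `LOOKUP` and `evaluate` are hypothetical stand-ins and do not reproduce the real Online LDA results or the HPOlib interface.

```python
import itertools

# Shape of the grid as stated in the text; the actual hyperparameter values
# on each axis are not given there, so integer indices are used instead.
AXIS_SIZES = (6, 6, 8)


def toy_loss(i, j, k):
    """Hypothetical stand-in for one pre-evaluated result (NOT real LDA data)."""
    return (i - 2) ** 2 + (j - 3) ** 2 + 0.5 * (k - 5) ** 2


# Evaluate every grid point once, up front, and store the results.
LOOKUP = {
    idx: toy_loss(*idx)
    for idx in itertools.product(*(range(n) for n in AXIS_SIZES))
}


def evaluate(config):
    """Return the stored result for a grid index triple; nothing is trained."""
    return LOOKUP[tuple(config)]


# An optimizer only ever queries the table, so each evaluation is a dict lookup.
best = min(LOOKUP, key=LOOKUP.get)
print(len(LOOKUP), best, evaluate(best))
```

Since the table is the search space, an optimizer can only propose configurations that lie on the grid, which is why all three hyperparameters of these benchmarks are discrete.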

Logistic Regression

Dependencies: theano, scikit-data
Recommended: CUDA

NOTE: scikit-data downloads the dataset from the internet when using the benchmark for the first time.
NOTE: This benchmark can use a GPU, but this feature is switched off so that it runs off-the-shelf. To use a GPU, change the THEANO flags in config.cfg. See here for switching to the GPU and here for further information about the THEANO configuration.
NOTE: In order to run the benchmark you must adjust the paths in the config files.

You can download this benchmark by clicking here or running this command from a shell:

wget www.automl.org/logistic.tar.gz
tar -xf logistic.tar.gz

This benchmark performs a logistic regression to classify the popular MNIST dataset. The implementation is based on Theano, so a GPU can be used. The software was written by Jasper Snoek and was first used in the paper Practical Bayesian Optimization of Machine Learning Algorithms.

NOTE: This benchmark comes with the version of hyperopt-nnet which we used for our experiments. There might be a newer version with improvements.

HP-NNet and HP-DBNet

Dependencies: theano, scikit-data (GitHub version, not PyPI), hyperopt-nnet
Recommended: CUDA

NOTE: scikit-data downloads the dataset from the internet when using the benchmark for the first time.
NOTE: In order to run the benchmark you must adjust the paths in the config files.

You can download this benchmark by clicking here or running this command from a shell:

wget www.automl.org/hpnnet.tar.gz
tar -xf hpnnet.tar.gz

HP-NNet (HP-DBNet) is a Theano-based implementation of a (deep) neural network. It can be run on a CPU, but is drastically faster on a GPU (please follow the Theano flags instructions of the logistic regression example). Both were written by James Bergstra and were used in the papers Random Search for Hyper-Parameter Optimization and Algorithms for Hyper-Parameter Optimization.

AutoWEKA

NOTE: AutoWEKA is not yet available for download!

AutoWEKA is a software package which combines the machine learning toolbox WEKA with hyperparameter optimization software. But AutoWEKA goes one step further and also includes model selection inside the hyperparameter optimization. It can choose from 27 classifiers which are implemented in the WEKA toolbox.

Surrogate Benchmarks

Our surrogate benchmarks mimic the behaviour of the corresponding real benchmark but need far less time (<1sec) to return a performance value. While we also provide two table-lookup benchmarks (onlineLDAongrid and SVMongrid), our surrogate benchmarks consist of regression models that are trained on data obtained from previous optimization runs. Based on this training data, they can predict the performance of new configurations. Here you can download most of our surrogate benchmarks, which (in combination with the surrogateBenchmark library) should work out of the box.

For further information have a look at our AAAI paper introducing surrogate benchmarks:

NOTE: You may be unable to load the surrogate model (for example, because you do not have a 64-bit system or because you use a different version of numpy/scikit-learn). In this case you can retrain the surrogate yourself. We will soon provide further information on how to do this.
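
To illustrate the principle of a surrogate benchmark, the sketch below fits a very simple regression model to (configuration, performance) pairs and then answers queries from the model instead of running the benchmark. It is an assumption-laden toy: a k-nearest-neighbour average stands in for the regression model, `true_benchmark` is a hypothetical cheap function playing the role of an expensive benchmark, and none of the names correspond to the surrogateBenchmark library or to the models in the downloadable archives.

```python
import math
import random


def true_benchmark(x, y):
    """Hypothetical 'expensive' benchmark (a cheap toy function here)."""
    return (x - 0.3) ** 2 + (y - 0.7) ** 2


# Training data standing in for configurations visited by earlier optimizer runs.
rng = random.Random(0)
train_X = [(rng.random(), rng.random()) for _ in range(500)]
train_y = [true_benchmark(x, y) for x, y in train_X]


def squared_distance(p, q):
    """Squared Euclidean distance between two configurations."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def surrogate_predict(config, k=5):
    """Predict performance as the mean over the k nearest training configurations."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: squared_distance(pair[0], config),
    )[:k]
    return sum(y for _, y in nearest) / k


# The surrogate answers without ever calling the real benchmark.
for query in [(0.3, 0.7), (0.9, 0.1)]:
    print(query, surrogate_predict(query), true_benchmark(*query))
```

The prediction quality depends on how densely the earlier runs covered the region being queried, which is one reason the number of training points is listed for each surrogate.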

Available Surrogates
Algorithm | # hyperparams (conditional) | # training data | Click to download | Size
Online LDA | 3(-) | 1 999 | onlineLDA_surrogate.tar.gz | 233M
Logistic Regression | 4(-) | 4 000 | logreg_surrogate.tar.gz | 310M
HP-NNET mrbi | 14(4) | 8 000 | hpnnet_surrogate.tar.gz | 639M
HP-NNET convex | | 8 000 | |
HP-NNET mrbi 5CV | | 20 000 | |
HP-NNET convex 5CV | | 19 998 | |
HP-DBNET mrbi | 36(27) | 7 997 | hpdbnet_surrogate.tar.gz | 306M
HP-DBNET convex | | 7 916 | |