In the context of multi-fidelity HPO, each fidelity yields a different learning curve. An example of a learning curve is the progression of validation accuracy over training epochs. In practice, human experts commonly inspect the observed learning curve to stop poorly performing configurations early. This line of research aims to exploit the same information to speed up HPO methods. Concretely, we propose Bayesian models that extrapolate the observed learning curve to higher fidelities. By estimating the performance of configurations at higher fidelities, we can effectively discard unpromising runs.
- Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves introduces, to the best of our knowledge, the first practical Bayesian learning curve predictor, based on a set of parametric, monotonically increasing functions whose posterior is inferred via Markov Chain Monte Carlo.
- LCNet extends this approach by additionally conditioning the learning curve model on the hyperparameter configuration.
- LC-PFN adopts a meta-learning perspective, using a Transformer pre-trained on synthetic learning curves to perform extrapolation in a single forward pass.
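To make the core idea concrete, here is a minimal sketch of learning curve extrapolation with a single parametric family. This is not the Bayesian MCMC ensemble from the papers above; it is a simple least-squares fit of the `pow3` family (y = c − a·x^(−alpha), one of the increasing saturating curves commonly used in this literature) to a few observed epochs, then an extrapolation to a higher epoch budget. The data and epoch budget are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def pow3(x, c, a, alpha):
    # pow3 family: a monotonically increasing, saturating curve
    # that approaches the asymptote c as x grows.
    return c - a * np.power(x, -alpha)

# Hypothetical observed accuracies over the first 10 epochs
# (synthetic: true asymptote 0.9, plus small noise).
epochs = np.arange(1, 11, dtype=float)
rng = np.random.default_rng(0)
observed = 0.9 - 0.5 * epochs ** -0.7 + rng.normal(0.0, 0.005, size=10)

# Fit the parametric curve to the partial learning curve.
params, _ = curve_fit(
    pow3, epochs, observed,
    p0=[0.8, 0.5, 0.5],
    bounds=([0.0, 0.0, 0.0], [1.0, np.inf, np.inf]),
    maxfev=10000,
)

# Extrapolate to a larger budget; a poor configuration can be
# discarded if its predicted final accuracy is too low.
predicted_final = pow3(100.0, *params)
print(round(float(predicted_final), 3))
```

A Bayesian treatment, as in the works above, would instead place priors on the curve parameters (and mix several such families), yielding a predictive distribution rather than a point estimate, so that stopping decisions can account for extrapolation uncertainty.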