In practical Bayesian optimization, we must often search over structures with differing numbers of parameters. For instance, we may wish to search over neural network architectures with an unknown number of layers. To relate performance data gathered for different architectures, we define a new kernel for conditional parameter spaces that explicitly includes information about which parameters are relevant in a given structure. We show that this kernel improves model quality and Bayesian optimization results over several simpler baseline kernels.
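The kernel described above can be illustrated with a small sketch. The code below is an illustrative toy, not the authors' exact construction: it embeds each parameter into a space where inactive (irrelevant) parameters collapse to a single fixed point, so two structures that both lack a parameter agree exactly on that dimension; a standard squared-exponential kernel is then applied to the embedded points. The embedding form and the hyperparameter names `omega`, `rho`, and `lengthscale` are assumptions for illustration.

```python
import numpy as np

def embed(x, active, omega=1.0, rho=1.0):
    # Map each parameter to 2-D. Active parameters land on an arc
    # parameterized by their value; inactive parameters collapse to a
    # fixed reference point, so configurations that both lack a
    # parameter are identical in that dimension regardless of its value.
    # (omega and rho are illustrative hyperparameters.)
    out = []
    for xi, ai in zip(x, active):
        if ai:
            out.extend([omega * np.sin(np.pi * rho * xi),
                        omega * np.cos(np.pi * rho * xi)])
        else:
            out.extend([0.0, omega])
    return np.array(out)

def k(x1, a1, x2, a2, lengthscale=1.0):
    # Squared-exponential kernel evaluated on the embedded representation.
    d = embed(x1, a1) - embed(x2, a2)
    return float(np.exp(-0.5 * np.dot(d, d) / lengthscale**2))

# Two networks where the second parameter (e.g. layer-2 width) is
# inactive: its stored value does not affect the kernel.
print(k([0.5, 0.3], [True, False], [0.5, 0.9], [True, False]))
```

In this sketch the two configurations compare as identical because the only dimension in which they differ is inactive in both, which is exactly the behavior a conditional-parameter-space kernel needs.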
- Kevin Swersky - University of Toronto - ([email protected])
- David Duvenaud - University of Cambridge - ([email protected])
- Jasper Snoek - Harvard University - ([email protected])
- Frank Hutter - University of Freiburg - ([email protected])
- Michael A. Osborne - University of Oxford - ([email protected])