Coding: Bayesian Optimization from Scratch

Bayesian optimization is a model-based method for finding the minimum of a function that is expensive to evaluate. It involves constructing a probabilistic model for the function and then exploiting this model to determine where to sample next.

General steps of Bayesian optimization

The general steps to implement Bayesian optimization are as follows (see the code sketch after this list):

  1. Specifying a surrogate model (usually a GP).

  2. Defining an acquisition function based on this model.

  3. Iterating the following steps for a number of rounds:

    1. Using the acquisition function to decide where to sample.

    2. Updating the surrogate model to incorporate the new sample.
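
The following sketch ties these steps together, assuming NumPy, SciPy, and scikit-learn are available. The one-dimensional objective, search interval, kernel choice, and number of rounds are illustrative placeholders, not part of the lesson. It uses a GP surrogate (step 1), expected improvement as the acquisition function (step 2), and then alternates between choosing the most promising point and refitting the GP (step 3):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Hypothetical expensive black-box function (1-D, to be minimized).
    return np.sin(3 * x) + 0.5 * x ** 2

def expected_improvement(X_cand, gp, f_best, xi=0.01):
    # Acquisition function: expected improvement for minimization.
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)           # avoid division by zero
    z = (f_best - mu - xi) / sigma
    return (f_best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)

# Step 1: the surrogate model -- a GP with a Matern kernel.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

# A few random initial samples on the search interval [-2, 2].
X = rng.uniform(-2, 2, size=(3, 1))
y = objective(X).ravel()

for _ in range(15):
    gp.fit(X, y)                              # step 3.2: update the surrogate
    X_cand = np.linspace(-2, 2, 500).reshape(-1, 1)
    ei = expected_improvement(X_cand, gp, y.min())
    x_next = X_cand[np.argmax(ei)]            # step 3.1: decide where to sample
    X = np.vstack([X, [x_next]])
    y = np.append(y, objective(x_next))

print("best x:", X[np.argmin(y)].item(), "best f(x):", y.min())
```

Here the acquisition function is maximized by a simple grid search over candidate points, which is enough in one dimension; in higher dimensions, a random-restart or gradient-based optimizer over the acquisition function is more common.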

A machine learning example

In practical machine learning, we often use a function f(x) to control model complexity during training. Suppose we are training a model, such as a neural network, to make predictions, and we want to keep it from becoming overly complex or memorizing the training data. As we train, we adjust the model's internal settings, its parameters, to make accurate predictions, but we also want those parameters to stay simple. The function f(x) measures their complexity: higher values of f(x) indicate a more intricate model.

Our ultimate goal is to find the best parameters for the model. To do so, we build an objective function that combines the MSE, which tells us how well the model predicts, with f(x), which tells us how complex the model is. This combined objective guides the optimization like a compass, steering us toward parameter values that deliver accurate predictions while keeping the model simple.
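
As a concrete illustration of this combined objective, here is a minimal sketch assuming a linear model, with a squared L2 norm standing in for the complexity measure f(x); the data, weights, and penalty strength lam are made up for the example:

```python
import numpy as np

def regularized_objective(w, X, y, lam=0.1):
    # MSE term: how well the linear model X @ w predicts y.
    mse = np.mean((X @ w - y) ** 2)
    # f(w): a complexity penalty -- here the squared L2 norm of the
    # weights, so larger (more intricate) weights raise the objective.
    complexity = np.sum(w ** 2)
    return mse + lam * complexity

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)

print(regularized_objective(true_w, X, y))
```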
