While the last lesson reviewed some common loss functions and optimizers, Optax has much more to offer. More than we can reasonably cover in this lesson, actually, so we’ll restrict ourselves to just a handful of functionalities here.

Learning-rate scheduling

Not content with the default setting of a constant learning rate, the deep learning community has been experimenting with variable learning rates. Optax offers more than a dozen versions of this technique, which is known as learning-rate scheduling. Let’s review a few:

  • Exponential decay
  • Cosine decay
  • Combining (multiple, existing) schedules
  • Injecting hyperparameters

Exponential decay

This scheduling scheme decays the learning rate exponentially over time:

\eta_t = \eta_{0}e^{-kt}

We can switch between continuous and discrete (step-wise) decay by setting the staircase parameter to False or True, respectively.
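
As a quick illustration, here is a minimal sketch of building such a schedule with optax.exponential_decay() and handing it to an optimizer; the hyperparameter values are arbitrary placeholders, not recommendations from the lesson:

import optax

# Exponentially decaying schedule: the learning rate is multiplied by
# decay_rate once every transition_steps steps (continuously interpolated
# unless staircase=True).
schedule = optax.exponential_decay(
    init_value=1e-2,         # eta_0, the starting learning rate
    transition_steps=1_000,  # steps over which one factor of decay_rate is applied
    decay_rate=0.5,          # multiplicative decay factor
    staircase=False,         # True -> discrete (step-wise) decay
)

# A schedule is just a function from the step count to a learning rate.
print(schedule(0), schedule(1_000), schedule(2_000))

# Any Optax optimizer accepts a schedule in place of a constant learning rate.
optimizer = optax.sgd(learning_rate=schedule)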

Cosine decay

In 2017, Ilya Loshchilov and Frank Hutter proposed Stochastic Gradient Descent with Warm Restarts (SGDR). It uses a cosine decay schedule, which can be represented as:

\eta_t = \eta_{\min}^{i} + \frac{1}{2}\left(\eta_{\max}^{i} - \eta_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_{\text{cur}}}{T_i}\pi\right)\right)

We can define this with cosine_decay_schedule(), which takes the following parameters (see the sketch after the list):

  • init_value
  • decay_steps
  • alpha
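
Here is a small sketch, with illustrative values only, of creating such a schedule and attaching it to an optimizer:

import optax

# Cosine decay from init_value down to alpha * init_value over decay_steps.
schedule = optax.cosine_decay_schedule(
    init_value=1e-3,     # starting learning rate (eta_max)
    decay_steps=10_000,  # length of the decay, T_i in the formula above
    alpha=0.0,           # final multiplier, i.e. eta_min = alpha * init_value
)

optimizer = optax.adam(learning_rate=schedule)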

Combining schedules

We can even chain two or more schedules using join_schedules().
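
For example, a common pattern is a linear warm-up followed by cosine decay. The sketch below (with placeholder step counts of my choosing) pieces the two together; boundaries lists the global step counts at which we move to the next schedule:

import optax

# Linear warm-up for the first 500 steps...
warmup = optax.linear_schedule(init_value=0.0, end_value=1e-3, transition_steps=500)

# ...followed by cosine decay for the remaining steps.
cosine = optax.cosine_decay_schedule(init_value=1e-3, decay_steps=9_500)

# Switch from warmup to cosine once the global step count reaches 500.
schedule = optax.join_schedules(schedules=[warmup, cosine], boundaries=[500])

optimizer = optax.adam(learning_rate=schedule)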

Injecting hyperparameters

It’s common practice to update the learning rate during training, and often we’ll want to update other hyperparameters as well. We can do this easily using optax.inject_hyperparams().
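
A minimal sketch of this (the optimizer and values here are arbitrary) could look like the following; the wrapped optimizer exposes its hyperparameters through its state, where they can be modified between updates:

import jax.numpy as jnp
import optax

# Wrap the optimizer factory so its hyperparameters become part of the state.
optimizer = optax.inject_hyperparams(optax.adam)(learning_rate=1e-3, b1=0.9)

params = {"w": jnp.zeros(3)}
state = optimizer.init(params)

# The hyperparameters now live in state.hyperparams and can be changed on the fly.
state.hyperparams["learning_rate"] = 1e-4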
