
MLOps with Optuna – Save Yourself Time



Generated with DALL·E 2, hyperparameter optimization (image by author)

For anyone familiar with the arduous process of hyperparameter tuning, Optuna is a lifesaver.

The ability to tune a range of models using different hyperparameter optimization techniques is nothing short of amazing.

If you’re still tuning your models through grid search, you need to change your approach: you’re losing performance.

This article contains ready-to-use code you can implement right away. Fast forward to the end of the article if you want to experiment yourself. Don’t forget to load the functions throughout the article. This post will discuss hyperparameter tuning at a high level before diving into Optuna.

  • It will outline how you can create new studies with custom parameter grids and metrics.
  • It will showcase how to save and load studies and gather the best trials and models.
  • And finally, it will show how to fork studies to continue searches with updated search spaces.

When learning machine learning, you will be confronted with hyperparameter tuning, which dictates the structure of the model you optimize. The most common method is grid search, where permutations of parameters are used to train and test models.

Grid search is wildly inefficient: it wastes time and explores less of your hyperparameter space.

The result is a worse-performing model.

There are multiple ways to improve over brute-force grid searches. I’ve outlined exactly why different search methods, including random search, outperform grid searches.

Essentially, don’t use grid search. It takes too long to cover your hyperparameter search space.

What matters more is how to manage the different models you create with more effective Bayesian techniques like ‘Tree-structured Parzen Estimators’.

But you’ll quickly find that you’re rapidly creating and saving more and more models. Thus, you’ll find yourself needing to track, store, and monitor the different models you’ve optimized.

Optuna is a hyperparameter optimization framework for tuning models. It lets you understand how hyperparameters affect your model and improves your model performance.

I previously wrote about how you can use this library to quickly optimize models with incredibly large hyperparameter spaces with some ease.

There are many samplers available to tune your models. Optuna still includes the standard grid search and random search samplers. But, in addition, you can also choose:

  • Tree-structured Parzen Estimator (used in this article)
  • A Quasi-Monte Carlo Sampler
  • An Intersection Search Space Sampler

And a half dozen more options, all of which more systematically search through your hyperparameter space.
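For example, selecting a sampler is a single argument when creating a study:

```python
import optuna

# Swapping samplers is a one-line change when creating a study.
# TPESampler is the default Bayesian sampler; RandomSampler and
# (in recent Optuna versions) QMCSampler are drop-in replacements.
study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.TPESampler(seed=42),
)
```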

Each optimization within Optuna takes the form of a study. These studies track many different components of the hyperparameter optimization process. They let you view performance at different steps, view the effects of certain hyperparameters, or select models from the best trials.

Model parameters

One issue with studies is the fixed parameter grids. This is a limitation of the optimization function required for the study. In Optuna tutorials you’ll see this function must follow a standard format to track trials.

The optimization function does not allow a user to pass in different models on the fly. It also does not allow passing different parameter grids to the optimization function.

But there is a workaround which allows variable parameter grids and variable models: lambda functions.

By defining our optimization functions as lambdas, we can pass in multiple values. This lambda then calls an underlying function. The result is a much more robust and flexible study setup.
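Here is a minimal, runnable illustration of the pattern; the toy objective and its extra scale argument are purely illustrative:

```python
import optuna

def objective(trial, scale):
    # Toy objective: the extra 'scale' argument arrives via the lambda,
    # not from Optuna itself.
    x = trial.suggest_float("x", -10, 10)
    return scale * (x - 2) ** 2

study = optuna.create_study(direction="minimize")
# The lambda adapts the two-argument objective to Optuna's
# single-argument interface.
study.optimize(lambda trial: objective(trial, scale=0.5), n_trials=20)
```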

Dynamic Model Optimization

Below I’ve defined a few functions that act as wrappers for the underlying Optuna functions. These wrappers allow you to quickly pass in different models and parameter grids without having to define whole new optimization functions each time.

Now, you can simply create different parameter grids for different experiments in Python dictionaries and pass in whatever models you like.

Dynamic Optuna Study Initialization (Code by Author)
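The gist itself isn’t reproduced here, but a minimal sketch of what such wrappers might look like follows. The plain-dictionary grid schema ('float', 'logfloat', 'int', 'categorical'), the function names, and the cross-validation scoring are my assumptions, not the author’s exact code:

```python
import uuid

import optuna
from sklearn.model_selection import cross_val_score

def suggest_from_grid(trial, param_grid):
    """Translate a plain-dict grid into Optuna suggest_* calls."""
    params = {}
    for name, spec in param_grid.items():
        kind, *args = spec
        if kind == "float":
            params[name] = trial.suggest_float(name, args[0], args[1])
        elif kind == "logfloat":
            params[name] = trial.suggest_float(name, args[0], args[1], log=True)
        elif kind == "int":
            params[name] = trial.suggest_int(name, args[0], args[1])
        elif kind == "categorical":
            params[name] = trial.suggest_categorical(name, args[0])
    return params

def objective(trial, model_class, param_grid, X, y, scoring):
    # Build a model from the sampled parameters and score it with
    # cross-validation; the mean score is what Optuna optimizes.
    params = suggest_from_grid(trial, param_grid)
    model = model_class(**params)
    return cross_val_score(model, X, y, cv=5, scoring=scoring).mean()

def run_study(model_class, param_grid, X, y,
              scoring="neg_mean_squared_error",
              study_name=None, n_trials=100):
    # Generate a unique study (and .db file) name when none is passed in.
    study_name = study_name or f"{model_class.__name__}_{uuid.uuid4().hex[:8]}"
    study = optuna.create_study(
        study_name=study_name,
        storage=f"sqlite:///{study_name}.db",
        direction="maximize",  # neg_mean_squared_error: higher is better
        sampler=optuna.samplers.TPESampler(),
        load_if_exists=True,
    )
    study.optimize(
        lambda trial: objective(trial, model_class, param_grid, X, y, scoring),
        n_trials=n_trials,
    )
    return study
```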

Here, you can also see that the studies will be uniquely named and saved whenever a name isn’t explicitly passed in.

Since you can quickly update your search spaces and models, the number of studies rapidly increases. Therefore, loading, renaming, forking and otherwise tracking models become more problematic.

Load Study

Optuna studies are stored in a local database (.db) file, which can be loaded using Optuna’s load_study function. This function also provides the opportunity to change the sampler used in the underlying optimization.

Load Optuna Studies from Local Database (Code by Author)
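A sketch using Optuna’s actual load_study API, assuming the one-file-per-study naming convention from the sketch above:

```python
import optuna

def load_local_study(study_name, sampler=None):
    # load_study reattaches to the study stored in the local .db file;
    # passing a sampler lets you switch optimization strategies mid-stream.
    return optuna.load_study(
        study_name=study_name,
        storage=f"sqlite:///{study_name}.db",
        sampler=sampler or optuna.samplers.TPESampler(),
    )
```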

Analyze Studies

After generating a slew of models from your hyperparameter samples, what’s next is to analyze your results.

Below I’ve defined some more functions to help.

The best model isn’t always the top performer. For various reasons, your target metrics and problem goals may add layers of complexity that require handling.

With the function below you can grab the top n models to review.

Get Top Trials from an Optuna Study (Code by Author)
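A minimal sketch of such a helper, using only standard Optuna study and trial attributes:

```python
import optuna
from optuna.trial import TrialState

def get_top_trials(study, n=5):
    # Keep only trials that finished successfully.
    completed = [t for t in study.trials if t.state == TrialState.COMPLETE]
    # Sort best-first, respecting the study's optimization direction.
    reverse = study.direction == optuna.study.StudyDirection.MAXIMIZE
    return sorted(completed, key=lambda t: t.value, reverse=reverse)[:n]
```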

After running studies and determining your best models, you’ll need to put a model into production.

To accomplish this you can identify your ideal study and retrieve the parameters of the best underlying model with the following function.

Get Best Parameters from Optuna Study (Code by Author)
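A sketch of such a function, built on Optuna’s actual best_trial attribute (the study-name-to-file convention is my assumption):

```python
import optuna

def get_best_params(study_name):
    study = optuna.load_study(
        study_name=study_name,
        storage=f"sqlite:///{study_name}.db",
    )
    # best_trial carries the winning trial's value and parameter dict.
    print(f"Best value: {study.best_value:.4f}")
    return study.best_trial.params
```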

These best trial parameters come back as a dictionary. They can then be loaded into the model you’re using with Python’s double-star ‘**’ unpacking operator.
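For example, with the LGBMRegressor used in the experiment below (the study name is illustrative):

```python
from lightgbm import LGBMRegressor

best_params = get_best_params("lgbm_diabetes")  # hypothetical study name
model = LGBMRegressor(**best_params)            # dict unpacked into kwargs
```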

Fork Study

Sometimes your model is making progress, but you haven’t run the search long enough. That’s an easy fix when training a single model, but a more difficult problem for hyperparameter optimization.

Fortunately, with limited adjustment, you can load old studies and continue your search. Moreover, upon analysis, you may find that your best models are within a certain range of hyperparameters.

With the fork function, you can split off your studies and explore different hyper-parameter grids.

Think your learning rate isn’t quite low enough? Adjust the parameter grid and keep running. The underlying objective for the hyperparameter search continues to optimize from where the study left off.

Fork Existing Optuna Study (Code by Author)
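Optuna has no single built-in “fork” call; a sketch of such a function using optuna.copy_study (available in recent Optuna versions) might look like this:

```python
import optuna

def fork_study(study_name, fork_name):
    # Copy the trial history into a new study so the original stays intact.
    optuna.copy_study(
        from_study_name=study_name,
        from_storage=f"sqlite:///{study_name}.db",
        to_storage=f"sqlite:///{fork_name}.db",
        to_study_name=fork_name,
    )
    # Reattach to the copy and continue optimizing from its history.
    return optuna.load_study(
        study_name=fork_name,
        storage=f"sqlite:///{fork_name}.db",
    )
```

From there, resuming is the same optimize call as before, pointed at the forked study with a new parameter grid via the lambda pattern shown earlier.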

Rename Study

To round out some other utilities which might come in handy, I’ve also created a study-rename function. With so many studies, the best models may get lost in the mix.

Using the parse-studies function defined above along with the rename-study function below, you can easily run a bulk search across different models and parameter grids.

Then once you’ve found a great model you can quickly rename these and their underlying data storage to keep track of your progress.

Rename Existing Optuna Study and Local Database (Code by Author)
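There is also no built-in rename; one plausible sketch combines copy_study, delete_study, and an OS-level file rename, assuming the one-study-per-.db-file convention used above:

```python
import os

import optuna

def rename_study(old_name, new_name):
    old_storage = f"sqlite:///{old_name}.db"
    # Copy trials under the new study name inside the same .db file ...
    optuna.copy_study(
        from_study_name=old_name,
        from_storage=old_storage,
        to_storage=old_storage,
        to_study_name=new_name,
    )
    # ... drop the old study record ...
    optuna.delete_study(study_name=old_name, storage=old_storage)
    # ... and rename the database file to match.
    os.rename(f"{old_name}.db", f"{new_name}.db")
```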

Experiments

To showcase this dynamic optimization, I’ve set up a brief experiment which you can re-run and change with your own datasets. This code is intended to be used directly by modifying the dataset, after which you can adjust the model and its respective hyperparameters. The aim here was to use this code to support my own MLOps pipelines to build, optimize, and track many different models.

Data used

The dataset used is the open-source toy diabetes dataset from scikit-learn. This dataset comes standard with your installation. This is a regression dataset with numerical variables and a numerical target. Perfect to showcase how to set up these Optuna studies.
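Loading it takes two lines:

```python
from sklearn.datasets import load_diabetes

# Ships with scikit-learn: 442 samples, 10 numeric features,
# and a numeric disease-progression target.
X, y = load_diabetes(return_X_y=True)
```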

Setup

You’ll need to specify a name for the study and the number of trials that you want to optimize. For my brief example, I’m showcasing a light gradient boosting model. These models have a lot of hyperparameters, making them difficult to tune.

Take the time to review the parameter distributions used, as they highlight the different ways you can search over a hyperparameter space. There are various distributions you can use to fine-tune how new candidate values are sampled.

Run Dynamic Optuna Study (Code by Author)
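The exact grid from the gist isn’t shown; a plausible grid for LGBMRegressor under the dictionary schema assumed earlier, exercising log-scaled, uniform, and integer distributions, might be:

```python
from lightgbm import LGBMRegressor

# Mix of distribution types: log-scaled floats for the learning rate and
# regularization terms, plain floats and ints elsewhere. Ranges are
# illustrative, not the author's exact values.
lgbm_grid = {
    "learning_rate":     ("logfloat", 1e-3, 0.3),
    "n_estimators":      ("int", 50, 500),
    "num_leaves":        ("int", 15, 255),
    "min_child_samples": ("int", 5, 100),
    "subsample":         ("float", 0.5, 1.0),
    "colsample_bytree":  ("float", 0.5, 1.0),
    "reg_alpha":         ("logfloat", 1e-8, 10.0),
    "reg_lambda":        ("logfloat", 1e-8, 10.0),
}

study = run_study(LGBMRegressor, lgbm_grid, X, y,
                  study_name="lgbm_diabetes", n_trials=100)
```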

After completing your hyperparameter search with the study, you can view your results. Besides the functions I’ve defined, you may also find the trials_dataframe() method useful. It simply returns all the study details as a dataframe.

The next step is to load the best parameters, either from the best trial or from one of the top trials, and use them in a model.

This is a straightforward process using your model’s set_params() function.

Display Results of Dynamic Optuna Study (Code by Author)
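For example, reusing the study and helpers from the sketches above (trials_dataframe() and set_params() are real Optuna and scikit-learn APIs; the rest follows my assumed helpers):

```python
# Full trial history as a DataFrame (trials_dataframe is a real
# Optuna Study method).
print(study.trials_dataframe().head())

# Review the top trials before committing to one.
for t in get_top_trials(study, n=3):
    print(t.number, t.value, t.params)

# Push the winning parameters into a fresh model and refit.
model = LGBMRegressor()
model.set_params(**study.best_trial.params)
model.fit(X, y)
```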

Optuna is set up to create a parameter grid for a study and optimize over a series of trials.

Something you may have also done is reload your studies and continue to optimize them. But, if you’ve already exhausted your search space, your results may not improve much.

Yet, with my dynamic study setup, you can load an existing study, fork it, change your parameter grid, and continue your hyperparameter search.

Unfortunately, you can only update numerical parameter distributions in your hyper-parameter optimization. This issue appears to be a current limitation of Optuna.

But you can adjust your hyper-parameter setting to a range completely outside the start range. Thus, you can pick up right where you left off or explore a new distribution entirely.

Fork and Continue Optimization of Optuna Study (Code by Author)
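A sketch of that workflow, reusing fork_study, objective, and lgbm_grid from the earlier sketches (the lower learning-rate range is illustrative):

```python
# Fork the finished study, then continue with a grid that pushes the
# learning rate into a lower range (values illustrative).
forked = fork_study("lgbm_diabetes", "lgbm_diabetes_low_lr")

low_lr_grid = dict(lgbm_grid)
low_lr_grid["learning_rate"] = ("logfloat", 1e-4, 1e-2)

forked.optimize(
    lambda trial: objective(trial, LGBMRegressor, low_lr_grid,
                            X, y, "neg_mean_squared_error"),
    n_trials=50,
)
print(forked.best_value, forked.best_trial.params)
```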

