๐ŸƒExperiment 2 - Optimization with Specified Search Space

After experimenting with default HPO settings, now it's time to make improvements, and a straightforward way is to specify a better search space, that is, to suggest better value ranges for the hyperparameters. In this experiment, Lady H. uses the 30-class leaves dataset as an example.

FLAML Specified Search Space

To specify the search space in FLAML, users need to create a new learner class with customized hyperparameters for each learner they want to tune. For example, Lady H. wanted to adjust the hyperparameters of LGBM, so she created a new learner class like this:
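
Her full code is linked below; the following is a minimal sketch of the pattern, assuming a recent FLAML version (where LGBMEstimator lives in flaml.automl.model) and using illustrative value ranges rather than Lady H.'s exact choices:

```python
from flaml import tune
from flaml.automl.model import LGBMEstimator  # older FLAML versions: flaml.model


class MyLGBM(LGBMEstimator):
    """LGBM learner with a customized search space."""

    @classmethod
    def search_space(cls, data_size, task):
        # Each hyperparameter maps to a sampling domain plus an initial value;
        # the ranges below are illustrative, not Lady H.'s exact choices.
        return {
            "n_estimators": {
                "domain": tune.lograndint(lower=4, upper=1024),
                "init_value": 4,
                "low_cost_init_value": 4,
            },
            "num_leaves": {
                "domain": tune.lograndint(lower=4, upper=256),
                "init_value": 4,
                "low_cost_init_value": 4,
            },
            "learning_rate": {
                "domain": tune.loguniform(lower=0.01, upper=1.0),
                "init_value": 0.1,
            },
        }
```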

Each model has specific guidelines for defining an effective search space. Hyperparameters can be tuned to improve accuracy, increase speed, reduce overfitting, and more.

🌻 Check LGBM tips for search space specification >>

After creating the new learner class, users just need to register it with the AutoML instance and include it in the estimator_list:
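
A minimal sketch, continuing the MyLGBM class above; the data variables and budget values are placeholders rather than the experiment's exact settings:

```python
from flaml import AutoML

automl = AutoML()
# Register the custom learner under a name of our choice.
automl.add_learner(learner_name="my_lgbm", learner_class=MyLGBM)

# X_train / y_train are placeholders for the 30-class leaves data.
automl.fit(
    X_train=X_train,
    y_train=y_train,
    task="classification",
    estimator_list=["my_lgbm"],  # search only within the customized learner
    time_budget=240,             # illustrative budget, in seconds
    eval_method="cv",            # cross validation, as in the experiment
    n_splits=5,
)
```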

🌻 Look into FLAML experiment details >>

In the code, you'll also notice that Lady H. tried CFO and BlendSearch with a larger search space, but neither performed well. The key takeaway: a larger search space doesn't always mean a better search space.
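
For reference, the search algorithm in FLAML is selected through the hpo_method argument of fit(); a hedged sketch, reusing the placeholders above:

```python
# hpo_method picks the search algorithm: "cfo" for CFO, "bs" for BlendSearch.
automl.fit(
    X_train=X_train,
    y_train=y_train,
    task="classification",
    estimator_list=["my_lgbm"],
    time_budget=240,
    hpo_method="bs",  # try BlendSearch instead of the default
)
```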

Optuna Specified Search Space

As in Experiment 1, specifying a search space in Optuna means writing an objective function. Meanwhile, to apply cross validation as in the FLAML experiment above, LightGBMTunerCV() was used here:
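
Lady H.'s full code is linked below; as a minimal sketch of the LightGBMTunerCV pattern (the fixed params and fold setup here are illustrative, and in recent Optuna versions this integration lives in the separate optuna-integration package), note that LightGBMTunerCV builds the objective internally and tunes a set of LightGBM hyperparameters stepwise with TPE:

```python
import lightgbm as lgb
import optuna.integration.lightgbm as opt_lgb  # may require the optuna-integration package
from sklearn.model_selection import StratifiedKFold

# Fixed LightGBM parameters; X_train / y_train are placeholders for the leaves data.
params = {
    "objective": "multiclass",
    "num_class": 30,
    "metric": "multi_logloss",
    "verbosity": -1,
}
dtrain = lgb.Dataset(X_train, label=y_train)

# LightGBMTunerCV evaluates every trial with cross validation.
tuner = opt_lgb.LightGBMTunerCV(
    params,
    dtrain,
    num_boost_round=100,
    folds=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
)
tuner.run()

print("Best score:", tuner.best_score)
print("Best params:", tuner.best_params)
```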

The Optuna experiment took much longer to run than FLAML, so Lady H. wondered whether she should add pruning to skip unpromising trials and save time; it might also reduce overfitting. TPE was still used as the search strategy in the Optuna experiments, and Hyperband is the suggested pruner to pair with TPE for better performance. What's more, because the testing performance of the Optuna experiment above was unsatisfactory, Lady H. increased the number of trials to explore potential improvements.
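
A hedged sketch of wiring a Hyperband pruner alongside TPE; the objective below is a placeholder that only shows where trial.report() and trial.should_prune() hook in, not the actual CV training loop:

```python
import optuna


def objective(trial):
    # Placeholder objective: replace with the real LightGBM CV loop.
    x = trial.suggest_float("x", 0.0, 1.0)
    for step in range(100):
        intermediate = x * (step + 1) / 100  # stand-in for a per-round CV score
        trial.report(intermediate, step)     # report intermediate value to the pruner
        if trial.should_prune():             # Hyperband decides to cut the trial short
            raise optuna.TrialPruned()
    return intermediate


# TPE sampler plus Hyperband pruner; the resource bounds are illustrative.
study = optuna.create_study(
    direction="maximize",
    sampler=optuna.samplers.TPESampler(seed=42),
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=10, max_resource=100, reduction_factor=3
    ),
)
study.optimize(objective, n_trials=90)  # trial count increased from the earlier run
```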

🌻 Look into Optuna experiment details >>

In these experiments, FLAML and Optuna share the same specified search space. However, compared with their default settings, FLAML achieved an 11.87% improvement in testing performance without increasing the time cost, while Optuna saw an 8% decrease in testing performance and a threefold increase in time cost.

NOTE: The time cost shown in Table 1.5 for the Optuna experiments is based on the time for 10 trials.

Looking into Optuna's experiment code, you may have noticed that the time cost grew ninefold when the number of trials grew ninefold. However, this won't happen in FLAML, because FLAML's time cost is driven by the number of hyperparameters rather than the number of trials.

The output of the Optuna experiments didn't show that any pruning had happened at all. This made Lady H. wonder: was it because Optuna's pruner doesn't work well with cross validation? And how can Optuna's pruner be made to work? She decided to test more in the next experiment.
