Improved Automated CASH Optimization with Tree Parzen Estimators for Class Imbalance Problems

Duc Anh Nguyen, Jiawen Kong, Hao Wang, Stefan Menzel, Bernhard Sendhoff, Anna Kononova, Thomas Bäck, "Improved Automated CASH Optimization with Tree Parzen Estimators for Class Imbalance Problems", IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2021.


Imbalanced classification is highly relevant in both academic and industrial applications. Finding the best machine learning model for a specific imbalanced dataset is complicated by the large number of existing algorithms and their hyperparameters. The Combined Algorithm Selection and Hyperparameter optimization (CASH) problem was introduced to tackle both aspects at the same time. However, CASH has not been studied in detail in the class imbalance domain, where the search is for the best combination of resampling technique and classification algorithm together with their optimized hyperparameters. We therefore target the CASH problem for imbalanced classification. We experiment with a search space of five classification algorithms, twenty-one resampling approaches and sixty-four relevant hyperparameters in total. This paper investigates the performance of two well-known optimization approaches, namely Bayesian optimization and random search. Moreover, we perform a grid search over all combinations of resampling techniques and classification algorithms with their default hyperparameters, as a baseline for the CASH approaches. Our experimental results show that the Bayesian optimization approach outperforms the other approaches for CASH in this application domain.
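
To make the structure of such a search space concrete, the following is a minimal sketch of random search over a conditional CASH-style space, in which a resampler and a classifier are chosen first and only the hyperparameters of the chosen algorithms are sampled. It is not the paper's implementation: the algorithm names, hyperparameter ranges, and the `toy_score` objective are all hypothetical stand-ins (a real study would replace `toy_score` with a cross-validated imbalance-aware metric), and the space is far smaller than the one described above. Bayesian optimization is not shown here.

```python
import random

# Hypothetical, heavily reduced CASH search space: each algorithm choice
# carries its own conditional hyperparameters (names and ranges are illustrative).
SEARCH_SPACE = {
    "resampler": {
        "none": {},
        "random_oversampling": {"ratio": (0.5, 1.0)},
        "smote": {"ratio": (0.5, 1.0), "k_neighbors": (1, 10)},
    },
    "classifier": {
        "decision_tree": {"max_depth": (1, 20)},
        "knn": {"n_neighbors": (1, 30)},
        "svm": {"log10_C": (-3.0, 3.0)},
    },
}


def sample_value(bounds, rng):
    """Draw uniformly from (low, high); integer bounds give an integer."""
    low, high = bounds
    if isinstance(low, int) and isinstance(high, int):
        return rng.randint(low, high)
    return rng.uniform(low, high)


def sample_configuration(rng):
    """Pick one algorithm per stage, then only that algorithm's hyperparameters."""
    config = {}
    for stage, algorithms in SEARCH_SPACE.items():
        name = rng.choice(sorted(algorithms))
        params = {p: sample_value(b, rng) for p, b in algorithms[name].items()}
        config[stage] = (name, params)
    return config


def toy_score(config):
    """Placeholder objective (assumption), standing in for a cross-validated metric."""
    resampler, r_params = config["resampler"]
    classifier, c_params = config["classifier"]
    score = 0.5
    if resampler != "none":
        score += 0.2 * r_params["ratio"]
    if classifier == "decision_tree":
        score += 0.2 - 0.01 * abs(c_params["max_depth"] - 8)
    return score


def random_search(n_iter=50, seed=0):
    """Evaluate n_iter independently sampled configurations and keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iter):
        config = sample_configuration(rng)
        score = toy_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search(n_iter=100, seed=1)
    print(config, round(score, 3))
```

A Bayesian optimization approach would use the same conditional space but replace the independent draws in `sample_configuration` with proposals informed by previously evaluated configurations.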
