Duc Anh Nguyen, Jiawen Kong, Hao Wang, Stefan Menzel, Bernhard Sendhoff, Anna Kononova, Thomas Bäck,
"Improved Automated CASH Optimization with Tree Parzen Estimators for Class Imbalance Problems",
IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2021.
The imbalanced classification problem is very relevant in both academic and industrial applications. The task of finding the best machine learning model to use for a specific imbalanced dataset is complicated due to the large number of existing algorithms and their hyperparameters. The Combined Algorithm Selection and Hyperparameter optimization (CASH) was introduced to tackle both aspects at the same time. However, CASH has not been studied in detail in the class imbalance domain, where the best combination of resampling technique and classification algorithm is searched for, together with their optimized hyperparameters. Thus, we target the CASH problem for imbalanced classification. We experiment with a search space of five classification algorithms, twenty-one resampling approaches and sixty-four relevant hyperparameters in total. This paper investigates the performances of two well-known optimization approaches, namely Bayesian optimization and Random search. Moreover, we perform grid search on all combinations of resampling techniques and classification algorithms with their default hyperparameters, in order to compare with the performance of the CASH approaches. Our experimental results show that a Bayesian optimization approach outperforms the other approaches for CASH in this application domain.
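To make the structure of such a CASH search space concrete, the following is a minimal sketch of the random search baseline over a conditional space, in which each resampler and classifier carries only its own hyperparameters. It is not the authors' implementation: the heavily reduced `SEARCH_SPACE`, the hyperparameter ranges, and the names `sample_configuration`, `random_search` and `placeholder_objective` are all assumptions made for illustration, and the objective is an arbitrary stand-in for a real evaluation.

```python
import random

# Hypothetical, heavily reduced CASH search space (assumption for illustration).
# Each component maps to its own conditional hyperparameters: name -> (low, high).
SEARCH_SPACE = {
    "resampler": {
        "none": {},
        "smote": {"k_neighbors": (1, 10)},
        "random_undersampling": {},
    },
    "classifier": {
        "svm": {"C": (0.01, 100.0)},
        "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 20)},
    },
}


def sample_configuration(space, rng):
    """Pick one component per stage, then sample only that component's hyperparameters."""
    config = {}
    for stage, options in space.items():
        name = rng.choice(sorted(options))
        params = {}
        for param, (low, high) in options[name].items():
            if isinstance(low, int) and isinstance(high, int):
                params[param] = rng.randint(low, high)   # integer-valued range
            else:
                params[param] = rng.uniform(low, high)   # real-valued range
        config[stage] = {"name": name, "params": params}
    return config


def random_search(objective, space, n_iter=50, seed=0):
    """Evaluate n_iter random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iter):
        config = sample_configuration(space, rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def placeholder_objective(config):
    """Arbitrary stand-in score, NOT a real evaluation and not a result.

    In practice this would build the resampler and classifier described by
    `config`, fit them, and return a cross-validated score on the dataset.
    """
    return 1.0 if config["resampler"]["name"] != "none" else 0.0
```

Calling `random_search(placeholder_objective, SEARCH_SPACE, n_iter=20)` returns the best configuration found and its score. A Bayesian optimization approach such as the Tree Parzen Estimator would keep the same search space and objective but replace the uniform sampling in `sample_configuration` with proposals informed by previously evaluated configurations.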