
Towards time series feature engineering in automated machine learning for multi-step forecasting

Can Wang, Mitra Baratchi, Thomas Bäck, Holger Hoos, Steffen Limmer, Markus Olhofer, "Towards time series feature engineering in automated machine learning for multi-step forecasting", Proc. of International Conference on Time Series and Forecasting (ITISE2022), vol. 18, no. 1, 2022.


Feature engineering is an essential step in the pipelines used for many machine learning tasks, including time-series forecasting. Although existing AutoML approaches partly automate feature engineering, they do not support specialised approaches for time-series tasks such as multi-step forecasting. Multi-step forecasting is the task of predicting a sequence of future values in a time series. Two kinds of approaches are commonly used for this task. The first applies one model to predict the value for the next time step and then feeds this predicted value back into the model as an input to forecast the following time step. The second uses multi-output models to predict the values for multiple time steps of each time series directly. In this work, we demonstrate how automated machine learning can be enhanced with feature engineering techniques for multi-step time-series forecasting. Specifically, we combine a state-of-the-art automated machine learning system, auto-sklearn, with tsfresh, a library for feature extraction from time series. In addition to optimising machine learning pipelines, we propose to optimise the size of the window over which time-series data are used for predicting future time steps, an essential hyperparameter in time-series forecasting. We propose and compare (i) auto-sklearn with automated window size selection and (ii) auto-sklearn with tsfresh features. We compare these approaches against statistical techniques, machine learning techniques and state-of-the-art automated machine learning techniques on a diverse set of benchmarks for multi-step time-series forecasting, covering 20 synthetic and real-world problems. Our empirical results indicate a significant potential for improving the accuracy of multi-step time-series forecasting by using automated machine learning in combination with automatically optimised feature extraction techniques.
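The two multi-step strategies described in the abstract, and the role of the window size, can be illustrated with a minimal sketch. This is not the authors' implementation and uses neither auto-sklearn nor tsfresh. The function names (`make_windows`, `recursive_forecast`, `direct_forecast`) and the toy "drift" models are hypothetical stand-ins chosen only to make the data flow visible.

```python
from typing import Callable, List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window_size: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Pair each input window with the `horizon` values that follow it."""
    pairs = []
    for start in range(len(series) - window_size - horizon + 1):
        end = start + window_size
        pairs.append((list(series[start:end]), list(series[end:end + horizon])))
    return pairs


def recursive_forecast(
    one_step_model: Callable[[List[float]], float],
    history: Sequence[float],
    window_size: int,
    horizon: int,
) -> List[float]:
    """Predict one step at a time, feeding each prediction back as input."""
    window = list(history[-window_size:])
    predictions = []
    for _ in range(horizon):
        y_hat = one_step_model(window)
        predictions.append(y_hat)
        window = window[1:] + [y_hat]  # slide the window over the prediction
    return predictions


def direct_forecast(
    multi_output_model: Callable[[List[float]], List[float]],
    history: Sequence[float],
    window_size: int,
) -> List[float]:
    """Predict all future steps in a single call to a multi-output model."""
    return list(multi_output_model(list(history[-window_size:])))


# Hypothetical toy models: extend the last observed difference ("drift").
def drift_one_step(window: List[float]) -> float:
    return window[-1] + (window[-1] - window[-2])


def drift_three_steps(window: List[float]) -> List[float]:
    step = window[-1] - window[-2]
    return [window[-1] + step * (h + 1) for h in range(3)]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pairs = make_windows(series, window_size=3, horizon=2)
recursive = recursive_forecast(drift_one_step, series, window_size=3, horizon=3)
direct = direct_forecast(drift_three_steps, series, window_size=3)
```

In this sketch `window_size` plays the part of the hyperparameter that the abstract proposes to optimise: it determines how much history each training pair and each forecast call sees. In a real pipeline the toy models would be replaced by fitted regressors, and the raw windows could be replaced or augmented by features extracted from them.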
