Viktor Losing, Barbara Hammer, Heiko Wersing, "Tackling Heterogeneous Concept Drift with the Self Adjusting Memory (SAM)", Knowledge and Information Systems, vol. 54, no. 1, pp. 171-201, 2018.
Abstract: Data mining in non-stationary data streams has recently gained attention, especially in the context of the Internet of Things and Big Data. It is a highly challenging task, since the fundamentally different types of drift that may occur undermine classical assumptions such as data independence or stationary distributions. Available algorithms either struggle with certain forms of drift or require a priori knowledge in the form of a task-specific setting. We propose the Self Adjusting Memory (SAM) model for the k Nearest Neighbor (kNN) algorithm, since kNN is a proven classifier in the streaming setting. SAM-kNN can deal with heterogeneous concept drift, i.e., different drift types and rates, using biologically inspired memory models and their coordination. It can be easily applied in practice, since no optimization of the meta-parameters is necessary. The basic idea is to construct dedicated models for the current and former concepts and to apply them according to the demands of the given situation. The drift characteristics of real-world datasets are usually unknown. However, this information is not only crucial for a detailed evaluation of drift algorithms but also relevant in practice, since it facilitates the choice of an appropriate algorithm for a specific problem. Therefore, we first use two statistical tests to determine the type of drift as well as its prevailing strength in all utilized datasets. Afterwards, we conduct an extensive evaluation of SAM using artificial and real-world benchmarks, explicitly adding new benchmarks that enable a precise performance evaluation on multiple types of drift. The highly competitive results throughout all experiments, each with different drift characteristics, underline the robustness of SAM-kNN as well as its capability to handle heterogeneous concept drift.
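
The basic idea stated in the abstract, keeping dedicated models for the current and former concepts and applying whichever suits the situation, can be illustrated with a toy sketch. This is not the SAM-kNN algorithm itself: the class name `TwoMemoryKNN`, the fixed window size, the rule of archiving evicted samples, and the selection by recent hit count are all assumptions made for illustration, and the sketch omits any memory coordination the paper describes.

```python
from collections import deque
import math


def knn_predict(memory, x, k=3):
    """Majority vote among the k stored (point, label) pairs nearest to x."""
    if not memory:
        return None
    nearest = sorted(memory, key=lambda item: math.dist(item[0], x))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


class TwoMemoryKNN:
    """Toy two-memory kNN (hypothetical, for illustration only)."""

    def __init__(self, k=3, window=50, eval_span=20):
        self.k = k
        self.short = deque(maxlen=window)  # recent samples: current concept
        self.long = []                     # evicted samples: former concepts
        # Per-memory record of whether recent predictions were correct.
        self.hits = {"short": deque(maxlen=eval_span),
                     "long": deque(maxlen=eval_span)}

    def _votes(self, x):
        return {"short": knn_predict(self.short, x, self.k),
                "long": knn_predict(self.long, x, self.k)}

    def predict(self, x):
        votes = self._votes(x)
        # Use the memory with more recent hits; ties go to the short memory.
        best = max(("short", "long"), key=lambda m: sum(self.hits[m]))
        other = "long" if best == "short" else "short"
        return votes[best] if votes[best] is not None else votes[other]

    def learn(self, x, y):
        # Test-then-train: score both memories on (x, y) before storing it.
        for name, vote in self._votes(x).items():
            self.hits[name].append(vote == y)
        if len(self.short) == self.short.maxlen:
            self.long.append(self.short[0])  # archive the sample about to be evicted
        self.short.append((x, y))
```

On a stream whose labels flip abruptly, the short memory fills with the new concept while the archive still reflects the old one, so the recent hit counts steer predictions toward the short memory.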