
Dealing with Inaccurate and Incomplete Labels in Industrial Streaming Data - Talk at Uni Crete (27.09.2023)

Andrea Castellani, "Dealing with Inaccurate and Incomplete Labels in Industrial Streaming Data - Talk at Uni Crete (27.09.2023)", Crete University, 2023.

Abstract

Machine learning techniques are an essential option for processing large volumes of data and are capable of capturing complex relationships within it. However, obtaining meaningfully annotated data is a real challenge and typically incurs large costs, especially in industrial settings, where few labeled samples are available and drifting data features pose a severe challenge. In this talk, I will address (1) how to efficiently train models with only partially labeled data, and (2) how to train models when a sizable fraction of the label information is incorrect or changes over time. I will present several strategies for dealing with these questions and evaluate their performance on stationary and non-stationary benchmark data sets, as well as on real-world industrial application data. The core of the approaches is the use of constrained embedding representations of the raw input data. These representations are shown to be efficient for dealing with limited annotated data: labeled and unlabeled data are analyzed based on their similarities in the embedding space.


