Ingenieurpraxis/Bachelor’s thesis - Unsupervised Learning and Out-of-distribution Detection


Topic

A common obstacle to the successful adoption of machine learning (ML) techniques in real-world applications is the problem of out-of-distribution data. In realistic settings, ML models are likely to be confronted with data that differs significantly from the data they were trained on. This could, for example, be due to label noise, outliers, or a shift or drift in the underlying distribution of the model’s new input data. The goal of this Ingenieurpraxis or Bachelor’s thesis is to evaluate how useful selected unsupervised learning metrics are for detecting and handling such out-of-distribution data.
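
To make the idea of detecting a distribution shift more concrete, the following is a minimal sketch and not the project's actual method. This posting does not name the metrics to be evaluated, so the two-sample Kolmogorov-Smirnov statistic is used here purely as an illustrative assumption. The function name `ks_statistic` and the synthetic one-dimensional Gaussian samples are also hypothetical. The sketch compares a reference (training) sample with a new sample and returns the largest gap between their empirical distribution functions.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, new):
    """Two-sample Kolmogorov-Smirnov statistic (illustrative choice of metric).

    Returns the largest absolute gap between the empirical CDFs of the
    two samples; 0.0 means identical empirical distributions, 1.0 means
    the samples do not overlap at all.
    """
    ref_sorted = sorted(reference)
    new_sorted = sorted(new)
    n_ref, n_new = len(ref_sorted), len(new_sorted)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so it suffices to
    # evaluate them at every observed sample point.
    for x in ref_sorted + new_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / n_ref
        cdf_new = bisect_right(new_sorted, x) / n_new
        max_gap = max(max_gap, abs(cdf_ref - cdf_new))
    return max_gap


# Synthetic data (assumption): one feature drawn from Gaussians.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]    # training distribution
same = [rng.gauss(0.0, 1.0) for _ in range(500)]     # new data, no shift
shifted = [rng.gauss(1.5, 1.0) for _ in range(500)]  # new data, mean shift

print("no shift:  ", ks_statistic(train, same))
print("mean shift:", ks_statistic(train, shifted))
```

A clearly larger statistic for the shifted sample than for the unshifted one would indicate that the new data no longer follows the training distribution. In practice the input would more likely be high-dimensional image features than a single scalar, which is part of what makes the evaluation non-trivial.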

The student work will be carried out in conjunction with the CellFace project, which investigates the potential of computer vision and machine learning techniques for the task of automated blood cell diagnosis. You can read more about CellFace here.

Requirements

  • Python coding skills
  • Prior knowledge of machine learning techniques
  • Experience with Git
  • Ability to work in a team

For a more in-depth introduction to the relevant research topics, see:

Gama, João, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. 2014. “A Survey on Concept Drift Adaptation.” ACM Computing Surveys 46 (4): 1–37. https://doi.org/10.1145/2523813.

Supervisors

Alice Hein, M.Sc. and Stefan Röhrl, M.Sc.

Chair for Data Processing

Contact Information

Alice Hein, M.Sc.

Chair for Data Processing

TUM Department of Electrical and Computer Engineering

Technical University of Munich

Arcisstr. 21, 80333 Munich

Room Z942

alice.hein@tum.de