In this IDP, an existing Python package for extracting low-level audio, still-image, and motion features will be improved and extended. The goal is to provide a set of (sequential) features for a video that are interpretable and can be used for tasks such as affective video content analysis, video summarization, and action recognition. In particular, there is interest in using existing, pre-trained deep learning models to create a set of "high-level" features. Possible applications include Object Detection, Face Detection and Recognition, Facial Expression Recognition, Emotion Recognition from Speech, Image Captioning, and Speech-to-Text.
Furthermore, it will be evaluated how the results of this high-level feature extraction can be stored for further processing.
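As a rough illustration of what "storing high-level features for further processing" could look like, the sketch below serializes per-frame feature records to JSON. All field names (`frame`, `objects`, `faces`) and the JSON format itself are assumptions for this example, not part of the existing package:

```python
import json

# Hypothetical per-frame "high-level" features, e.g. the outputs of
# pre-trained object detection and face detection models.
# Field names are illustrative assumptions only.
features = [
    {"frame": 0, "objects": ["person", "car"], "faces": 1},
    {"frame": 1, "objects": ["person"], "faces": 1},
]

# Serialize the sequential features so downstream tasks
# (e.g. affective content analysis, summarization) can consume them.
serialized = json.dumps(features)

# A consumer restores the sequence losslessly from the serialized form.
restored = json.loads(serialized)
print(restored[0]["objects"])  # → ['person', 'car']
```

In practice, a columnar or binary format (e.g. HDF5 or Parquet) may be preferable for long videos; evaluating such options is part of the project.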
- Very good knowledge of Python and packaging
- Very good knowledge of at least one deep learning framework (TensorFlow/Keras/PyTorch)
- Good knowledge of software design
- Basic knowledge of UNIX/Linux
Data and machines/servers at the chair can be accessed from home via VPN. On-site work is currently not possible due to the Corona pandemic.
If you are interested, write an email with your motivation and a few words about your skills and experience to:
Philipp Paukner, M.Sc.
Technische Universität München
Lehrstuhl für Datenverarbeitung
Tel. +49 (0)89 289 23618