Stochastic Methods for Data-Driven Applications

Lecturer (assistant)
Number: 0000005389
Type: Seminar
Duration: 2 SWS
Term: Summer Semester 2020
Language of instruction: English
Position within curricula: See TUMonline
Dates: See TUMonline

Objectives

After successful completion of the module, participants are able to: 1) investigate the effect of randomness as a disturbance factor in data-driven algorithms and propose countermeasures, 2) recognize and discuss the potential of randomness as a factor that improves the performance of data-driven algorithms, and 3) become acquainted with a new mathematical field and prepare and present a scientific talk.

Description

Stochastic methods based on random variables find wide application in data-driven fields such as signal processing and machine learning. In particular, tools such as concentration inequalities have become essential ingredients of these applications. In this seminar, examples of technical results as well as recent applications of selected stochastic methods are discussed, with particular interest in a deeper reflection on the proof methods used to establish these results. An illustrative bound and a small numerical sketch follow the topic lists below.

Exemplary topics on the technical results:
• Concentration inequalities for scalar random variables, e.g., Chernoff's and Bernstein's tail bounds
• Concentration inequalities for martingales, e.g., Azuma's inequality
• Concentration inequalities for matrices
• Mixing properties of Markov chains

Exemplary topics on the applications:
• Effects of noise in first-order optimization methods (e.g., gradient descent)
• Investigations of the generalization ability of supervised learning
• Distributed algorithms and Markov chains
• Community detection and clustering
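As an illustration of the first topic group (an example chosen for this summary, not prescribed seminar material), a classical scalar concentration inequality is Hoeffding's bound: for independent random variables X_1, ..., X_n with X_i in [a_i, b_i] almost surely and S_n = X_1 + ... + X_n,

\[
\mathbb{P}\bigl(|S_n - \mathbb{E}[S_n]| \ge t\bigr) \le 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right), \qquad t > 0.
\]

The application topic on noise in first-order methods can likewise be previewed with a minimal Python sketch, assuming a simple quadratic objective f(x) = ½‖x‖² with additive Gaussian gradient noise; the objective, step size, and noise level are arbitrary choices of this summary, not of the module:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.ones(10)            # starting point
    eta, sigma = 0.1, 0.05     # step size and noise level (assumed values)
    for _ in range(100):
        noisy_grad = x + sigma * rng.standard_normal(x.shape)  # grad f(x) = x, plus Gaussian noise
        x = x - eta * noisy_grad                               # gradient descent update
    print(np.linalg.norm(x))   # distance to the minimizer x* = 0

Without noise the iterates converge geometrically to the minimizer; with noise they fluctuate around it, and concentration inequalities such as Azuma's can be used to bound how far the noisy trajectory deviates from the noiseless one.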

Prerequisites

Basic knowledge of analysis, linear algebra, and elementary stochastics. Basic knowledge of information theory and/or signal theory.

Teaching and learning methods

Seminar: Presentations by the lecturers and students, based on scientific publications.

Examination

The examination takes the form of a scientific report, consisting of a 2-hour presentation and a written summary of the presentation topic (at least 8 single-sided or 4 double-sided pages). The final grade is the average of the grades for the written report (50%) and the presentation (50%).

Recommended literature

R. Vershynin: “High-Dimensional Probability: An Introduction with Applications in Data Science”, Cambridge University Press, 2018.
J. Tropp: “An Introduction to Matrix Concentration Inequalities” (Foundations and Trends in Machine Learning), Now Publishers Inc., 2015.
T. Hastie, R. Tibshirani, and J. Friedman: “The Elements of Statistical Learning”, Springer-Verlag New York, 2009.
S. Boucheron, G. Lugosi, and P. Massart: “Concentration Inequalities: A Nonasymptotic Theory of Independence”, Oxford University Press, 2016.
M. Mitzenmacher and E. Upfal: “Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis”, Cambridge University Press, 2017.
