Source coding and big data : massive random access and learning over compressed data
Dr. Elsa Dupraz
Signal & Communication dpt, IMT-Atlantique, Brest, France. (formerly Telecom Bretagne)
The amount of data is growing to such an extent that 90% of the content available online has been created in the last two years. This data (pictures, videos, sensor measurements, etc.) is usually stored in huge databases which contain redundancies (e.g., videos of the same scene, sensors in the same neighborhood, 3D or multi-view television). In this context, the role of source coding is, as usual, to eliminate as much redundancy as possible in order to greatly reduce storage needs and the amount of transmitted data.
The storage of these huge databases raises questions about how the data is accessed and processed. In particular, it is not desirable to transmit and decode a whole piece of content (e.g., a multi-view video in which all views are jointly encoded) when a user is only interested in a small portion of the data (e.g., one of the views of the video). Nor is it desirable to decompress every single item before applying a learning process over the database. This talk addresses these two issues from an information-theoretic perspective and shows that it is possible to design flexible source coding techniques that meet the challenges raised by the volume of data.
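The random-access tension can be illustrated with a minimal sketch (not the techniques of the talk, which are information-theoretic): if every item is compressed independently and indexed by offset, a single record can be decoded without touching the rest of the database, at the cost of losing the cross-item redundancy that joint encoding would exploit. All names below are hypothetical.

```python
import zlib

# Hypothetical database: four "views" with heavily redundant content.
records = [f"view-{i} pixel data ".encode() * 50 for i in range(4)]

# Encode: compress each record independently and remember (offset, length)
# so any single record can later be located inside the packed blob.
blob, index = bytearray(), []
for rec in records:
    comp = zlib.compress(rec)
    index.append((len(blob), len(comp)))
    blob.extend(comp)

# Random access: decode only record 2, without decompressing the others.
off, length = index[2]
recovered = zlib.decompress(bytes(blob[off:off + length]))
assert recovered == records[2]
```

Joint compression of all four records would shrink the blob further (their contents are nearly identical), but would force a full decode for any access; the talk's theme is designing codes that avoid this trade-off.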
Elsa Dupraz has been an assistant professor at Telecom Bretagne since October 2015. She graduated from ENS Cachan and earned her Master's degree in Advanced Systems of Radiocommunications in 2010. She received her Ph.D. degree from University Paris-Sud in 2013. From January 2014 to September 2015, she held post-doctoral positions at ETIS (France) and in the ECE department of the University of Arizona (United States). Her current research interests include source coding for massive random access, distributed source coding, and LDPC decoders on unreliable hardware.