Some of the research internships (Forschungspraxis) or MSCE Internships on offer can also be carried out as a task within the Integrated Systems project lab (Projektpraktikum Integrated Systems). Where this applies, it is stated explicitly in the posting text.

Open Topics

Interested in a student research project or a final thesis? Our research groups often have topics in preparation that are not yet listed here. In some cases, it is also possible to define a topic that matches your specific interests. To do so, simply contact a staff member from the relevant research area. If you also have general questions about carrying out a thesis at LIS, please contact Dr. Thomas Wild.

Understanding what the system is doing and how long individual steps take is important for determining the performance of the system and identifying potential bottlenecks.

Software tracing tools, which are widely available, provide a good performance overview of the system, showing the different operations or functions each CPU is executing. However, extensive tracing in software results in substantial overhead for the system, meaning that it is inadequate for tracing short events in the range of a couple of clock cycles.

In full-system simulators like gem5, we can introduce custom hardware tracepoints for various components in the system. Hardware tracepoints can be generated directly by simulated components such as CPUs, accelerators, and caches.

The goal of this internship is to combine the software and hardware tracing options to obtain a better overview of the whole system.
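As a rough illustration of what combining the two trace sources could look like, the following minimal sketch merges a software trace and a hardware trace into one timeline. It assumes that both traces are already sorted by timestamp and share a common time base; the event names and the `merge_traces` helper are hypothetical and do not reflect gem5's actual tracing interface.

```python
import heapq


def merge_traces(sw_events, hw_events):
    """Merge two timestamp-sorted event lists into one ordered timeline.

    Each input event is a (timestamp, name) tuple. Each output event is a
    (timestamp, source, name) tuple, where source is "sw" or "hw".
    """
    tagged_sw = ((ts, "sw", name) for ts, name in sw_events)
    tagged_hw = ((ts, "hw", name) for ts, name in hw_events)
    # heapq.merge streams both inputs without loading them fully into memory.
    return list(heapq.merge(tagged_sw, tagged_hw, key=lambda event: event[0]))


# Hypothetical example data: timestamps in simulator ticks.
sw_trace = [(100, "sched_switch"), (450, "sys_enter_read")]
hw_trace = [(120, "l2_cache_miss"), (130, "dma_start"), (460, "l2_cache_miss")]

timeline = merge_traces(sw_trace, hw_trace)
for timestamp, source, name in timeline:
    print(f"{timestamp:>6} [{source}] {name}")
```

In a merged timeline like this, short hardware events can be read in the context of the software activity that surrounds them.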

Prerequisites

Proficient in:

Python

C/C++

Good knowledge of Linux

Supervisor:

Tim Twardzik

Feature Improvements for an MPSoC Demonstrator System

Description

Enabled by ever-decreasing structure sizes, modern Systems-on-Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems-on-Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks-on-Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel. To demonstrate the abilities of a hybrid NoC with protection switching for critical traffic, an MPSoC demonstrator system was developed at LIS.

Goal:

The goal of this work is to implement and integrate various improvements to the existing demonstrator system, particularly the GUI and the software backend, in order to improve performance and stability and to enable new features.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

Good Python programming skills and knowledge in OOP

At least basic JavaScript programming skills

Basic knowledge of embedded systems architecture

Self-motivated and structured work style

Learning Objectives:

By completing this project, you will be able to:

Comfortably develop applications on different layers of the software stack

Adapt software to interface with and utilize hardware modules

Document your work in the form of a scientific report and a presentation

Runtime Reconfigurable Winograd-based FPGA Accelerator for CNN Inference

Description

Convolutional neural networks have proven their success in extracting features from images and producing predictions for different tasks such as classification, segmentation and object detection. However, the superior performance of modern deep neural networks can mostly be traced back to high model complexity and extensive hardware requirements. In this research internship, the complexity of convolution is reduced by quantization and Winograd minimal filtering algorithms. The prediction quality is regulated using dynamically reconfigurable Winograd acceleration.
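To make the idea of Winograd minimal filtering more concrete, here is a minimal sketch of the one-dimensional F(2,3) case: two outputs of a 3-tap filter are computed from four inputs with four multiplications instead of six. This is the textbook floating-point formulation and is not taken from the project itself; quantization and the FPGA mapping are deliberately left out. The function names are chosen for illustration only.

```python
def winograd_f23(d, g):
    """Compute two outputs of a 3-tap 1D filter from four inputs using F(2,3).

    As is usual for CNNs, "convolution" here means correlation:
    y[i] = d[i]*g[0] + d[i+1]*g[1] + d[i+2]*g[2].
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform: depends only on the weights, so it can be precomputed.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Four element-wise multiplications (the direct method needs six).
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference implementation using six multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


inputs = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, -1.0, 2.0]
print(winograd_f23(inputs, weights))
print(direct_f23(inputs, weights))
```

The savings come at the cost of extra additions and of divisions by two in the filter transform, which is one reason why the interaction with quantization needs to be studied.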

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

Good programming skills in C/C++

Good knowledge of neural networks, particularly convolutional neural networks

Experience with VHDL/Verilog or OpenCL is a plus

The student is expected to be highly motivated and independent. By completing this project, you will be able to:

Understand the impact of quantization and Winograd convolution on task-specific accuracy

Implement run-time reconfigurable Winograd convolution on an FPGA using OpenCL

Evaluate trade-offs between flexibility, prediction accuracy and resource consumption

Implementation of an Approximated FIR Filter on FPGA for Laser Line Extraction from Pixel Data

Description

Current 3D laser line scanners have a precision in the range of a micrometer. These scanners work on the principle of laser triangulation and use a camera chip in the receive path. The captured pixel data is then processed on an FPGA to generate 3D profile data. To do this, the laser line, as seen by the camera, must be extracted from the pixel data. Several methods have been proposed for this task. One of these methods employs an FIR filter to calculate the derivative of the incoming pixel stream orthogonally to the laser line direction. Afterwards, the zero crossing of this derivative is detected. The position of the zero crossing marks the position of the laser line in the camera image. From this position, the distance of the laser scanner to the scanned object can be derived.
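The FIR-based extraction described above can be sketched in software as follows. This is only an illustrative model of a single pixel column, not the VHDL pipeline: the central-difference coefficients `(-1, 0, 1)` and the linear interpolation for a sub-pixel position are assumptions made for this example, and the function names are hypothetical.

```python
def fir_derivative(pixels, coeffs=(-1, 0, 1)):
    """Apply an FIR filter to a pixel column: y[i] = sum_k coeffs[k] * pixels[i + k]."""
    taps = len(coeffs)
    return [
        sum(c * pixels[i + k] for k, c in enumerate(coeffs))
        for i in range(len(pixels) - taps + 1)
    ]


def laser_line_position(pixels, coeffs=(-1, 0, 1)):
    """Return the sub-pixel position of the intensity peak, or None if not found."""
    deriv = fir_derivative(pixels, coeffs)
    offset = len(coeffs) // 2  # deriv[i] is centered on pixels[i + offset]
    for i in range(len(deriv) - 1):
        # A peak shows up as a positive-to-non-positive zero crossing.
        if deriv[i] > 0 and deriv[i + 1] <= 0:
            # Linear interpolation between the two samples around the crossing.
            frac = deriv[i] / (deriv[i] - deriv[i + 1])
            return i + offset + frac
    return None


column = [0, 2, 8, 8, 2, 0]  # hypothetical intensities across the laser line
print(laser_line_position(column))
```

For a symmetric profile whose peak lies between two pixels, as in the example column, the interpolated zero crossing lands halfway between them.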

Approximate computing is an emerging design paradigm that trades accuracy for reduced resource consumption, i.e. a certain inaccuracy of the calculations is allowed with the goal of reducing the overall resource consumption of the implemented design. In this thesis, such approximation methods should be integrated into the data processing pipeline and the results should be evaluated.
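One simple way to picture this trade-off is operand truncation in a multiplier: dropping the k least significant bits of each operand allows a narrower multiplier at the cost of a bounded error. The sketch below is a software model of that idea, chosen purely for illustration; the thesis does not prescribe this particular approximation, and `approx_mul` is a hypothetical helper.

```python
def approx_mul(a, b, k):
    """Multiply two non-negative integers after dropping k LSBs of each operand.

    The narrower product is shifted back so that it has the same scale as a * b.
    """
    return ((a >> k) * (b >> k)) << (2 * k)


a, b = 201, 77  # hypothetical 8-bit pixel value and filter coefficient
exact = a * b
for k in range(4):
    approx = approx_mul(a, b, k)
    rel_error = abs(exact - approx) / exact
    print(f"k={k}: approx={approx}, relative error={rel_error:.2%}")
```

Sweeping k in this way gives a first impression of how quickly the error grows as the multiplier shrinks.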

This thesis includes the implementation of a simple data processing pipeline for the extraction of the laser line from pixel data using an FIR filter-based approach. The implementation should be done in VHDL. Furthermore, the necessity for prefiltering (e.g. smoothing) of the pixel data should be assessed and implemented if necessary. Finally, the potential for the integration of approximate computing methods into the data processing pipeline should be evaluated.

Prerequisites

The student should have the following skills in order to successfully complete the thesis:

Good programming skills in VHDL

A basic understanding of FIR filter design

A basic understanding of image processing

The ability to work independently

Previous experience with approximate computing is helpful, but not strictly required

The student can work on the thesis remotely from home.

Contact

Arne Kreddig, Doctoral Candidate and FPGA Design Engineer, SmartRay GmbH

arne.kreddig@smartray.com

Supervisor:

Arne Kreddig - (SmartRay GmbH)

Preference-Based Multi-objective Optimization using Genetic Algorithms

Description

Multi-objective Optimization (MOO) is essential in many real-world applications for effective trade-off analysis between competing objectives. MOO results in a set of Pareto-optimal points from which the decision-maker can select according to the desired trade-off in an application. One typical example of MOO is a Genetic Algorithm based on NSGA selection [1]. However, NSGA algorithms often lead to the exploration and optimization of the entire design space in each objective dimension. This is unnecessary in many cases, and significant computational effort is wasted on regions outside the threshold values the decision-maker has in mind.

This research work aims to investigate different genetic algorithm-based multi-objective optimization approaches which form Pareto Optimal solutions based on the preference given by the designer (e.g. [2]) and test an appropriate approach on benchmark problems.
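The two notions used above, Pareto optimality and a decision-maker's thresholds, can be illustrated with a minimal sketch. It assumes that all objectives are minimized and simply filters a fixed set of candidate points after the fact. Preference-based approaches such as [2] instead steer the search itself, so this is not an implementation of those methods; all names and values are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def within_preference(points, thresholds):
    """Keep only the points that meet the per-objective upper thresholds."""
    return [p for p in points if all(x <= t for x, t in zip(p, thresholds))]


# Hypothetical candidates with two objectives, e.g. (error, resource usage).
candidates = [(1, 9), (2, 7), (4, 4), (5, 5), (7, 2), (9, 1)]
front = pareto_front(candidates)
preferred = within_preference(front, thresholds=(6, 6))
print(front)
print(preferred)
```

In this example most of the front lies outside the thresholds, which hints at how much search effort a preference-aware algorithm could save by focusing on the region of interest.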

[1] K. Deb, S. Agrawal, A. Pratap and T. Meyarivan, "A Fast Elitist Nondominated Sorting Genetic Algorithm for Multi-objective Optimization: NSGA-II", in Parallel Problem Solving from Nature PPSN VI, ser. Lecture Notes in Computer Science, pp. 849-858, 2000.

[2] K. Deb and J. Sundar, "Reference Point Based Multi-objective Optimization Using Evolutionary Algorithms", in Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, GECCO '06, pp. 635-642, New York, NY, USA, 2006.

Prerequisites

Basic understanding of optimization techniques

Good programming skills in Python or Matlab

High motivation and ability to work independently

Contact

Manu Manuel, Department of Electrical and Computer Engineering, Chair of Integrated Systems

Email: manu.manuel@tum.de

Supervisor:

Manu Manuel

Investigation of state-of-the-art optimization approaches for approximate computing

Description

The increased demand for technological advancements in computing systems in application domains such as signal or image processing has led to the idea of approximate computing. Approximate computing provides a new design paradigm by performing inexact calculations instead of exact ones and exploiting the resilience of applications to this inexactness. As a result, fewer resources are used on the computing devices, more functions can be implemented, and the energy efficiency of the calculations is improved. Many approximation techniques have been proposed over the last decades, and combining several of them in a larger system can increase the resulting benefits. However, this leads to a design space exploration problem: determining the best parametrization across all the employed methods. Several studies have already proposed metaheuristic optimization approaches to determine the trade-off between the benefits of approximation and the loss of application quality, and thereby the best parametrization [1] [2] [3].

This research practice aims to implement state-of-the-art optimization approaches used for determining the trade-off between approximation benefits and loss of application quality with an image processing pipeline specified in [1].

[1] M. Manuel, A. Kreddig, S. Conrady, N. A. V. Doan, and W. Stechele, Model-Based Design Space Exploration for Approximate Image Processing on FPGA, DOI:10.1109/NorCAS51424.2020.9265138.

[2] B. S. Prabakaran, V. Mrazek, Z. Vasicek, L. Sekanina, M. Shafique, ApproxFPGAs: Embracing ASIC-Based Approximate Arithmetic Components for FPGA-Based Systems, DOI:10.1109/DAC18072.2020.9218533.

[3] J. Castro-Godínez, J. Mateus-Vargas, M. Shafique, J. Henkel, AxHLS: Design space exploration and high-level synthesis of approximate accelerators using approximate functional units and analytical models, DOI:10.1145/3400302.3415732.

Prerequisites

Basic understanding of metaheuristic optimization approaches such as genetic algorithms, tabu search, hill climbing, etc.

Good programming skills in Python or Matlab

High motivation and ability to work independently.

Convolutional neural networks (CNNs) are the de facto standard for many computer vision (CV) applications. These range from medical technology and robotics to autonomous driving. However, most modern CNNs are very memory- and compute-intensive, particularly when they are dimensioned for complex CV problems.

Compressing neural networks is essential for a variety of real-world applications. Pruning is a widely used technique for reducing the complexity of a neural network by removing redundant and superfluous parameters. One characteristic of this approach is the pruning granularity, which describes the substructures that should be removed from the neural network. Another aspect is the method for finding the redundant and unused structures, which plays a central role in effective pruning without loss of task-related accuracy. The optimization goal determines which elements (kernel, filter, channel) can be removed from the topology of the CNN.
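To illustrate filter-level pruning granularity and one possible way of finding redundant structures, the following sketch ranks the filters of a single layer by the L1 norm of their weights and keeps only the strongest ones. This magnitude-based criterion is a common baseline and is used here only as an example; it is not the method proposed in this work. The nested-list layer layout and the function names are assumptions made for the example.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one filter (a list of 2D kernels, one per channel)."""
    return sum(abs(w) for kernel in filter_weights for row in kernel for w in row)


def prune_filters(layer, keep_ratio):
    """Keep the filters with the largest L1 norm.

    Returns the indices of the kept filters (in original order) and the
    corresponding pruned layer.
    """
    n_keep = max(1, round(len(layer) * keep_ratio))
    ranked = sorted(range(len(layer)), key=lambda i: l1_norm(layer[i]), reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [layer[i] for i in kept]


# Hypothetical layer: four filters, one input channel, 2x2 kernels.
layer = [
    [[[0.9, -0.8], [0.7, 0.6]]],
    [[[0.01, 0.0], [-0.02, 0.01]]],
    [[[0.5, 0.5], [-0.5, 0.5]]],
    [[[0.1, 0.0], [0.0, -0.1]]],
]
kept, pruned_layer = prune_filters(layer, keep_ratio=0.5)
print(kept)
```

Removing a filter also removes the matching input channel of the following layer, which is why the relationships between structures matter when deciding what to prune.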

The goal of this work is to learn the internal relationships between the channels, filters, and kernels of the layers by means of a graph neural network, and to identify their relevance to the classification task of the CNN. The learned relationships are then used for pruning the neural network.

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

Good programming skills in Python and TensorFlow

Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent.

Contact

Nael Fasfous, Department of Electrical and Computer Engineering, Chair of Integrated Systems

Benchmarking CNNs on Hardware Accelerators for Embedded Applications (NVIDIA)

Description

Convolutional Neural Networks (CNNs) are the state of the art for most computer vision tasks. Although their accuracy is unrivaled when compared to classical segmentation and classification algorithms, they present many challenges for implementation on hardware platforms. The best-performing CNNs tend to be too computationally complex for low-power embedded applications. Finding a good trade-off between accuracy and efficiency can be critical when deciding on the network architecture and the target hardware.

This work focuses on benchmarking different CNNs on existing hardware accelerators in order to find solutions for different embedded application scenarios.

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

Very good programming skills in HDL/HLS

Basic programming skills in Python and TensorFlow

Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent. By completing this project, you will be able to:

Assess the feasibility of a CNN-Accelerator for a given application

Optimize CNNs and their target hardware accelerator to improve overall system performance

Test and evaluate solutions for correctness and applicability

Present your work in the form of a scientific report

Contact

Nael Fasfous, Department of Electrical and Computer Engineering, Chair of Integrated Systems, Arcisstr. 21, 80333 Munich, Germany