Student Projects and Theses at the Chair of Media Technology
Within our current research projects, we offer exciting topics for student projects (Ingenieurpraxis, Forschungspraxis, working student positions, IDPs) and final theses (Bachelor's and Master's theses).
Open Topics
Each topic below indicates the project types it is open for: BA (Bachelor's thesis), MA (Master's thesis), IDP (interdisciplinary project), FP (Forschungspraxis), IP (Ingenieurpraxis), SEM (seminar), SHK (student assistant).

Open for: FP, IP
How Well Do Today's Autonomous Driving Models Perform?
Autonomous Driving
Description
This work can be done in German or English
At the current stage of autonomous driving, failures in complex situations are inevitable. A learning-based method to predict such failures could prevent dangerous situations or crashes. However, collecting real-life training data of crashes caused by autonomous vehicles is not feasible. An alternative is to use data from realistic simulations of a self-driving car, such as CARLA [1].
In this project, the objective is to set up publicly available autonomous driving models such as [2, 3] and use our existing data logging pipeline to evaluate these models' failure cases. The whole process should be further improved by extending our logging pipeline.
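As a rough illustration, the sketch below shows how driving data and failure events could be logged with the CARLA 0.9.x Python API. The vehicle blueprint, file paths, and the collision-based failure criterion are illustrative assumptions and do not describe our actual pipeline.

import carla

# Connect to a CARLA server assumed to run on localhost:2000.
client = carla.Client('localhost', 2000)
client.set_timeout(10.0)
world = client.get_world()

# Spawn an ego vehicle at the first available spawn point.
blueprint_library = world.get_blueprint_library()
vehicle_bp = blueprint_library.filter('vehicle.tesla.model3')[0]
vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])
vehicle.set_autopilot(True)  # let the driving model / autopilot control it

# Attach an RGB camera and write every frame to disk for later evaluation.
camera_bp = blueprint_library.find('sensor.camera.rgb')
camera_tf = carla.Transform(carla.Location(x=1.5, z=2.4))
camera = world.spawn_actor(camera_bp, camera_tf, attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk('log/%06d.png' % image.frame))

# A collision sensor serves as a simple proxy for a model failure.
collision_bp = blueprint_library.find('sensor.other.collision')
collision = world.spawn_actor(collision_bp, carla.Transform(), attach_to=vehicle)
collision.listen(lambda event: print('Failure: collision at frame', event.frame))

for _ in range(1000):
    world.wait_for_tick()  # let the simulation run while the sensors log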
Tasks
- Improvement of the existing data logging pipeline
- Setup of existing autonomous driving models
- Collection of driving data with the implemented system
- Evaluation of autonomous driving model failures and collection of failure data
References
[1] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, "CARLA: An Open Urban Driving Simulator", Conference on Robot Learning (CoRL), 2017.
[2] https://github.com/erdos-project/pylot
[3] https://github.com/commaai/research
Prerequisites
- Experience with Python, ROS, and Linux
- Knowledge of Docker would be helpful
- General knowledge of Machine Learning
Supervisor:
Open for: BA, MA, FP, IP
Multimodal Object Detection with Introspective Experts
Description
Object detection can be performed with different sensor modalities, such as camera images, LIDAR point clouds, or a fusion thereof. Adverse weather conditions such as rain or fog cause different failure cases for different sensor types. In this work, an approach for finding the best-suited model for each condition is investigated. In training, each sensor configuration (camera, LIDAR, or fusion) is trained separately. Then, the models are fine-tuned using only the training images they performed well on, leading to a set of expert models, each designed to work best on a subset of the training images. Finally, a selection model needs to be designed and trained to select which expert model is most suitable for the current scene. The models can be trained and evaluated in the CARLA simulator, where different weather conditions can easily be generated.
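The selection model could, for instance, be a small gating network that scores the experts for the current scene, as in the following PyTorch sketch (the architecture and the expert ordering are illustrative assumptions, not a prescribed design):

import torch
import torch.nn as nn

class ExpertSelector(nn.Module):
    """Scores each expert (camera, LIDAR, fusion) for the current scene."""
    def __init__(self, num_experts=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_experts)

    def forward(self, image):
        return self.head(self.features(image))

selector = ExpertSelector()
scores = selector(torch.randn(1, 3, 224, 224))   # dummy camera frame
best_expert = scores.argmax(dim=1)               # e.g. 0: camera, 1: LIDAR, 2: fusion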
Prerequisites
Knowledge of Deep Learning and Linux
Supervisor:
Open for: BA
Reverse Synthesis for Object Detection Failure Prediction
Description
Bounding boxes proposed by object detection networks do not always contain the classified object. In this work, a prediction method for such failures is investigated. The idea is based on reverse synthesis: using the predicted class and the proposed bounding box, an autoencoder is trained for each class to reconstruct the input image patch. If the bounding box does not contain the predicted class, the reconstructed image ideally contains traces of the imagined or misclassified object, which can then be detected.
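A per-class patch autoencoder for this purpose could look like the following sketch (the architecture is an illustrative assumption; one such model would be trained per object class):

import torch.nn as nn

class PatchAutoencoder(nn.Module):
    """Reconstructs a bounding-box patch; trained on patches of one class only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, patch):
        return self.decoder(self.encoder(patch))

A patch whose reconstruction deviates strongly from the input (e.g., a high per-pixel error) would then be flagged as a potential detection failure.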
Prerequisites
Basic knowledge of deep learning and Linux
Supervisor:
Open for: IDP, FP, IP
Development of Interactive Segmentation Interface for Robotic Teleassistance
Description
In this project, we will use RGB-D sensor data coming from a robot and further develop our existing system, which has a server-client architecture. The client side is a web-based GUI, and the server side includes various interactive segmentation algorithms. We will conduct tests using the developed interface and an RGB-D sensor.
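To illustrate the request flow, the server-side handling of one user interaction could look like the following sketch, here using OpenCV's GrabCut as a stand-in segmentation algorithm (the actual system uses different algorithms; the rectangle-based interaction is an assumption):

import cv2
import numpy as np

def segment(image_bgr, rect):
    """rect = (x, y, w, h) drawn by the user in the web GUI."""
    mask = np.zeros(image_bgr.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, rect, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)
    # Definite and probable foreground pixels form the object mask
    # that is sent back to the client.
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)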
Prerequisites
- Basic knowledge of / experience with:
  - Python
  - ROS
  - the Vue.js JavaScript framework
- Motivation to learn the necessary tools and deliver a successful project
Contact
furkan.kaynar@tum.de
Please provide your CV and transcript of records with your application.
Supervisor:
Open for: BA, MA, IDP, FP, IP
Designing Augmented Reality Markers for Robotic Grasp Assistance
Description
Augmented reality (AR) technology is attracting interest in various fields such as entertainment and human-robot interface (HRI) design. An HRI can be used for giving commands or providing demonstrations to the robot for further processing. Robots operating in unstructured environments may require human help, since autonomous algorithms are more prone to failure in such environments. In this project, we will design AR-based tools for assisting robotic grasping. We will use the Unity game engine for designing, testing, and using the AR markers.
Prerequisites
- Basic knowledge of image processing / computer vision
- Experience with the Unity game engine, or motivation to learn it
- Basic coding experience (Unity requires C#)
- Motivation to deliver a successful project
Contact
furkan.kaynar@tum.de
Please provide your CV and transcript of records with your application.
Supervisor:
Open for: BA, MA, IDP, FP, IP
Unity-based Teleassistance Interface for Robotic Grasping
Description
Although robotics has been an area of intensive research for decades, autonomous grasping and manipulation still remain challenging under real-life conditions. Autonomous algorithms fail more often in unstructured environments such as households, which limits the practical use of robots in daily human life. In unstructured environments, perception gains importance, and there are often novel, previously unseen cases in which autonomous algorithms tend to fail. In such cases, human correction or demonstration is needed to increase task performance or to teach new abilities to robots. To this end, we will create a user interface that is intuitive to use on a mobile device and at the same time provides the necessary data to efficiently assist a robot in a daily home environment. The main application will be teleassistance for robotic grasping.
Prerequisites
- Basic knowledge of image processing / computer vision
- Basic coding experience, especially with C#
- Experience with the Unity game engine
- Basic experience with ROS
- Motivation to deliver a successful project
Contact
furkan.kaynar@tum.de
Supervisor:
Open for: FP
Failure Prediction for LIDAR-Based Semantic Segmentation
Failure Prediction, Semantic Segmentation, LIDAR, Autonomous Driving
Description
LIDAR sensors allow capturing a scene in 3D while being more robust than cameras to disturbances such as rain. They are therefore an important component of autonomous driving, where they can be used for semantic segmentation of the environment. For this, each point in the 3D point cloud is classified as belonging to a semantic class such as "car", "pedestrian", or "road". In a safety-critical application such as driving, it is important to know when such a classification can be trusted. To this end, failure prediction methods such as introspection [1] can be used to predict where the segmentation failed.
In this internship, a state-of-the-art neural network such as [2] will be implemented to perform semantic segmentation of LIDAR point clouds. Afterwards, a state-of-the-art failure prediction approach will be implemented to detect incorrect classifications. The evaluation will be done using the CARLA driving simulator [3]. A reference implementation based on camera input is available for comparison, covering both semantic segmentation and failure prediction.
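As a minimal illustration of the introspection idea, the training targets for the failure predictor can be derived by comparing the segmentation output with the ground truth (the array shapes and class indices below are assumptions):

import numpy as np

def failure_labels(predicted_classes, ground_truth_classes):
    """1 where the segmentation network failed, 0 where it was correct.
    These binary per-point labels supervise the failure-prediction model."""
    return (predicted_classes != ground_truth_classes).astype(np.int64)

pred = np.array([0, 1, 2, 2])    # e.g. 0 = road, 1 = car, 2 = pedestrian
gt   = np.array([0, 1, 1, 2])
print(failure_labels(pred, gt))  # [0 0 1 0]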
References:
[1] "Introspective Failure Prediction for Semantic Image Segmentation", Kuhn et al., IEEE ITSC 2020
[2] "RangeNet++: Fast and accurate LiDAR semantic segmentation", Milioto et al., IEEE IROS 2019
[3] https://carla.org/
Prerequisites
Basic knowledge of Machine Learning, Python and Linux
Supervisor:
Open for: BA, FP
Contact Skill Imitation Learning
LfD, imitation learning, haptics, contact skills
Description
Imitation learning, or Learning from Demonstration (LfD), is a framework that enables fast and efficient teaching of skills to robots by providing them with examples of the required tasks. Behavior cloning (BC) methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), and Dynamic Movement Primitives (DMPs) have already shown success in trajectory representation and learning. However, not only the positional trajectories but also the interaction forces play an important role in task success. In this work, we would like to study the impact of force generalization and hybrid position-force control on task reproduction for LfD.
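As one possible starting point, demonstrated position and force trajectories can be jointly encoded with a GMM and reproduced via Gaussian Mixture Regression (GMR), as in the following sketch (the synthetic demonstrations and the GMM/GMR choice are illustrative assumptions):

import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic demonstrations: time t, position x(t), and contact force f(t).
t = np.tile(np.linspace(0, 1, 100), 5)
x = np.sin(2 * np.pi * t) + 0.01 * np.random.randn(t.size)
f = np.clip(np.cos(2 * np.pi * t), 0, None) + 0.01 * np.random.randn(t.size)
gmm = GaussianMixture(n_components=5).fit(np.column_stack([t, x, f]))

def gmr(gmm, t_query):
    """Condition the joint model on time to get the expected [position, force]."""
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    h = np.array([w * np.exp(-0.5 * (t_query - m[0]) ** 2 / c[0, 0]) / np.sqrt(c[0, 0])
                  for m, c, w in zip(means, covs, weights)])
    h /= h.sum()
    out = np.zeros(2)
    for k in range(gmm.n_components):
        # Conditional mean of [x, f] given t for mixture component k.
        out += h[k] * (means[k][1:] + covs[k][1:, 0] / covs[k][0, 0] * (t_query - means[k][0]))
    return out

print(gmr(gmm, 0.5))  # expected position and force at t = 0.5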
Tasks:
- Literature research on learning from demonstration for in-contact tasks
- Design and implementation of in-contact tasks in CHAI3D simulation (such as push button, surface cleaning, sliding etc.)
- Collection of demonstrations via haptic interface
- Development of the learning algorithm
- Evaluation and comparison of position+force and position-only learning
References:
[1] F. Steinmetz, A. Montebelli, and V. Kyrki, "Simultaneous kinesthetic teaching of positional and force requirements for sequential in-contact tasks", IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), 2015.
[2] M. Racca, J. Pajarinen, A. Montebelli, and V. Kyrki, "Learning in-contact control strategies from demonstration", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016.
[3] C. Zeng, C. Yang, J. Zhong, and J. Zhang, "Encoding Multiple Sensor Data for Robotic Learning Skills from Multimodal Demonstration", IEEE Access, vol. 7, pp. 145604-145613, 2019.
Prerequisites
- C/C++ background
- CHAI3D knowledge is a plus
- Knowledge of LfD is a plus
Supervisor:
Open for: MA
A Blind/Referenceless Subjective Quality Assessment for Time-Delayed Teleoperation
quality metrics, teleoperation, time delay, haptics
Description
Using a teleoperation system with haptic feedback, users can truly immerse themselves into a distant environment, i.e., modify it and execute tasks without being physically present, but with the feeling of being there. A typical teleoperation system with haptic feedback comprises three main parts: the human operator (OP) / master system, the teleoperator (TOP) / slave system, and the communication link/network in between. During teleoperation, the slave and master devices exchange multimodal sensor information over the communication link. This work aims to develop a referenceless subjective quality index for time-delayed teleoperation systems. This index should describe the subjective quality of experience based on a series of objective metrics.
Your work:
(1) Build up a teleoperation system.
(2) Design a subjective test and collect sufficient subjective evaluation data as ground truth.
(3) Collect objective metrics of the teleoperation system and design a training scheme based on the ground truth; machine learning approaches may be needed (see the sketch below).
(4) Evaluate the proposed quality index under different conditions.
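A minimal sketch of step (3), where the feature set (delay, packet loss, tracking errors) and the regression model are purely illustrative assumptions:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row: [round-trip delay in ms, packet loss in %, position error, force error]
X = np.array([[ 10, 0.0, 0.02, 0.1],
              [100, 1.0, 0.08, 0.4],
              [250, 5.0, 0.20, 0.9]])
y = np.array([4.5, 3.2, 1.8])  # ground-truth MOS from the subjective tests

model = RandomForestRegressor(n_estimators=100).fit(X, y)
print(model.predict([[50, 0.5, 0.05, 0.2]]))  # predicted quality index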
Prerequisites
C++
Supervisor:
Open for: IDP, IP, SHK
UDP-Based Haptic Communication in Linux
Description
In this topic, you will implement a UDP-based haptic communication framework in Linux. A Windows version is already implemented. Your task is to port all functions to Linux and add additional required functionality.
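The framework itself is written in C++; purely to illustrate the kind of packet flow involved, a haptic sample loop over UDP looks roughly like this Python sketch (address, port, packet layout, and update rate are assumptions):

import socket
import struct
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
addr = ('127.0.0.1', 9000)  # assumed address of the remote haptic endpoint

position = (0.0, 0.0, 0.0)
for seq in range(1000):
    # One datagram per sample: sequence number plus a 3-DoF position.
    sock.sendto(struct.pack('!I3f', seq, *position), addr)
    time.sleep(0.001)  # haptic control loops typically run at 1 kHz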
Prerequisites
C++, Linux, UDP socket programming in C++
Supervisor:
Open for: BA, MA, IDP, FP
Student Thesis & Internships at Hospital on Mobile (Extern)
Description
Do you want to do your thesis with Stanford professors and work with Siri co-founder Dr. Tom Gruber?
Hospital on Mobile is a Silicon Valley-based startup in California. Our vision is to make predictive and personalized healthcare accessible to every individual. We use smartphones and embedded sensors to monitor users' behavioral, environmental, fitness, and health data. We collaborate on research with the world's leading universities, including Stanford, Harvard, Oxford, and MIT. Achieving such a big vision also requires high-caliber expert support, such as the world-renowned German AI engineer and Chief Scientist of Salesforce Dr. Richard Socher, as well as Siri's co-founder and former Chief Technology Officer Dr. Tom Gruber, to name just two.
We specialize in vital signs monitoring, such as heart rate, respiratory rate, oxygen saturation, and blood pressure monitoring through smartphone sensors. Our technology uses computer vision, deep learning, and signal processing. Pain attacks or disease cycles can be predicted and treated in a personalized way at an early stage. Our mission is to make this accessible and affordable (even free) to everyone, regardless of wealth. Therefore, we decided to use smartphones rather than smartwatches or other wearables, as smartphones are widely used even in the most remote places in the world. This solution also needs no additional devices and does not break the flow of people's daily lives. Together with the world's top universities, we are in the process of running multiple clinical studies on neurodegenerative diseases like Parkinson's and Alzheimer's; mental health problems like anxiety and depression; migraine; and infectious diseases including COVID.
If you want to create huge impact, help people not only in your country but all over the world, and work with the world's leading universities like Stanford alongside top scientists, we are looking for you. Don't be shy: shoot us an email with your CV and a short paragraph about yourself!
www.hospitalonmobile.com www.migraine.ai www.virologic.io
684 Roble Avenue, Menlo Park, CA, USA
Prerequisites:
- Experience with Python
- Very good to excellent knowledge of signal processing and image/video processing
- Good to very good knowledge of linear algebra
- Logic and algorithms
- Very high motivation and commitment
- Motivation to learn the necessary skills and deliver successful work
Optional/Preferred Prerequisites:
- Knowledge of machine learning
- Knowledge of computer vision
- Knowledge of NLP
- Probability theory
- Experience in C/C++
- Background in iOS/Android programming
- Experience in visual computing and communication
- Experience in electronics, hardware, and firmware programming
What we offer:
- Cutting-edge topics; work that creates impact
- Highly motivated team with weekly group meetings
- Work with top engineers and scientists worldwide
- Remote work
- Compensation
Contact
tamaykut@hospitalonmobile.com
Supervisor:
Open for: BA, MA
Adaptive Camera Capture Controller for Autonomous Visual Inspection with Drones
Sensor Fusion, Image Processing, IMU, Robot
Description
In this work, the student is provided with a state-of-the-art global-shutter 4K camera with an additional LED flash driver, plus sensor data from an Inertial Measurement Unit (IMU). The task is to adaptively adjust the capture parameters, such as exposure time, brightness, and flash time, according to the motion information from the IMU. The aim is to reduce the motion blur that occurs during drone flight and minimize the distortion of the captured image. The final goal is to compute a metric of how well suited the image quality is for the inspection task. The work has a strong focus on ROS, Python, and C++, so students with corresponding experience will be preferred.
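A simple motion-adaptive exposure rule could look like the following sketch (the function name, thresholds, and the linear blur model are illustrative assumptions, not the required design):

def exposure_time_ms(angular_rate_dps, base_exposure_ms=8.0, max_blur_deg=0.05):
    """Limit exposure so the camera rotates at most max_blur_deg during it;
    the LED flash can compensate for the reduced light intake."""
    if angular_rate_dps <= 0:
        return base_exposure_ms
    limit_ms = 1000.0 * max_blur_deg / angular_rate_dps
    return min(base_exposure_ms, limit_ms)

print(exposure_time_ms(30.0))  # roughly 1.7 ms at 30 deg/s from the IMU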
Prerequisites
ROS, Python, C++
Supervisor:
Open for: MA
Multidimensional Vibrotactile Signal Acquisition
Description
The goal is to build a sensor array with which we can acquire vibrotactile signal data at many points simultaneously. Special consideration is to be given to the human hand. Using this array, a signal database will be built covering different materials and exploration patterns.
Prerequisites
MATLAB, FPGA programming recommended
Supervisor:
Open for: BA, MA, IDP, IP
Unity Simulation Environment for Human Activity Analysis
3D Simulation, Unity3D, Computer Vision, Machine Learning
Description
This topic is about 3D simulation for human activity analysis in indoor environments. The student(s) will use the Unity3D game engine to replicate human activity flows from daily life in a 3D simulator and will extend the simulator's capabilities to cover a large and complex spectrum of activity flows.
If time permits, the 3D data generated from the simulation will be processed using machine learning techniques for human intention recognition/anticipation.
This is a great opportunity to contribute to open-source software.
Prerequisites
Interest/experience in 3D game engines (esp. Unity3D), C#
Supervisor:
Open for: BA, MA, IDP, FP, IP
Open Topics in Computer Vision and Machine Learning / Deep Learning Based Human Activity Recognition
RGBD, Computer Vision, Machine Learning, Deep Learning, 3D Simulation, Unity3D, Unreal Engine
Take state-of-the-art machine learning algorithms for estimating human intentions and activities to the next level.
Description
Human Activities of Daily Living are driven by our underlying intentions. For example, the intention of "making pasta" spawns a sequence of activities: fetch the pasta, boil it, fetch and chop vegetables for the sauce, and clean up after cooking.
We develop Computer Vision / Machine Learning algorithms for RGBD video and other sensors to recognize and anticipate human intentions and activities in indoor environments.
We also work on collecting state-of-the-art sensor datasets and 3D simulation of activities using game engines.
Take a look at the currently running projects below to get a more detailed idea about the above topics.
Prerequisites
Ambition, motivation, interest, and first experience in Computer Vision, ML/DL, Python, and C++
Supervisor:
Open for: BA, FP, IP
Joint Bitrate Control for Teleoperated Vehicles
Autonomous Driving, Teleoperated Driving, Bitrate Control
Description
This work can be done in German or English.
Teleoperated driving can be a fallback strategy for autonomous driving failures. In order to control the vehicle safely, the operator requires low-delay video information of the vehicle's surroundings. This can be achieved with a setup of multiple cameras, which are individually encoded and streamed simultaneously over the same network to the operator. Currently, each camera stream obtains an equal share of the available transmission rate. To avoid congestion and adapt to the available transmission rate, a constant bitrate control algorithm calculates the encoding parameters for each camera stream.
As the image content of each camera can be very different, the encoded views can differ in quality: frames with more motion or more detail require a higher bitrate than others, and with a fixed bitrate this results in worse visual quality. With the overall goal of providing equal visual quality for each camera stream, an equal distribution of the available transmission rate is therefore not well suited.
A joint rate allocation algorithm is required that calculates the individual transmission rate for each camera stream so as to achieve minimum variance in visual quality among the different camera views. This could be done by estimating the complexity of the individual camera streams or by minimizing the overall rate-distortion cost.
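To make the idea concrete, a complexity-proportional allocation could look like the following toy sketch (the complexity model and the numbers are purely illustrative):

def allocate_rates(total_kbps, complexities):
    """Give each stream a bitrate share proportional to its content complexity,
    so that visual quality is balanced across the camera views."""
    total_complexity = sum(complexities)
    return [total_kbps * c / total_complexity for c in complexities]

# Four cameras; the front view is most complex, the rear view least.
print(allocate_rates(8000, [3.0, 2.0, 2.0, 1.0]))  # [3000.0, 2000.0, 2000.0, 1000.0]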
Tasks
- Analyze different state-of-the-art joint rate control algorithms
- Implement the most suitable joint rate control algorithm
- Integrate the implemented joint rate control algorithm into the existing TELECARLA [4] setup
- Evaluate the impact on camera stream quality and transmission rate distribution
References
[1] Y. Wang, L.-P. Chau, and K.-H. Yap, "Joint Rate Allocation for Multiprogram Video Coding Using FGS", IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 6, pp. 829-837, June 2010.
[2] H. Fan, L. Ding, H. Jia, and X. Xie, "A Novel Joint Rate Allocation Scheme of Multiple Streams", IEEE Trans. Circuits Syst. Video Technol., vol. 29, no. 3, pp. 854-867, March 2019.
[3] W. Yao, L.-P. Chau, and S. Rahardja, "Joint Rate Allocation for Statistical Multiplexing in Video Broadcast Applications", IEEE Trans. Broadcast., vol. 58, no. 3, pp. 417-427, Sep. 2012.
[4] M. Hofbauer, C. B. Kuhn, G. Petrovic, and E. Steinbach, "TELECARLA: An Open Source Extension of the CARLA Simulator for Teleoperated Driving Research Using Off-the-Shelf Components", IEEE Intelligent Vehicles Symposium (IV), 2020.
Prerequisites
- Good understanding of video compression and rate control
- Experience with ROS and C++
Supervisor:
Open for: SHK
Simulation of Autonomous Airplane Inspection using Drones
Autonomous drones, inspection, airplane, Gazebo, simulation
Description
In this project of the aviation research programme V (LuFo V) of the Federal Ministry for Economic Affairs and Energy, the goal is to develop an autonomous drone for airplane inspection inside a hangar. The drone is expected to be equipped with a variety of sensors, such as LiDAR, stereo cameras, IMUs, a compass, and optionally active RGB-D sensors. The challenge in this environment is the large metal structures of the airplane, and of the hangar itself, which strongly influence GPS signals and the compass. In this GPS-denied environment, the drone will collect sensor data and send it to a ground station, where SLAM algorithms map the environment and localize the drone precisely. The drone will follow preset inspection points at which it captures inspection images of the airplane's surface. The images are then sent to the ground station, which applies machine learning techniques using IBM Watson to retrieve a damage classification result. The results are collected, and an overall report documenting the current condition of the airplane is generated.
The LMT will develop the software for module communication (ROS-based), sensor data acquisition and encoding, as well as data processing in the ground station, such as SLAM and drone control. In order to test the algorithms, two test platforms are used at the LMT: our small platform, based on the "DJI Flame Wheel 550", for autonomous control and stability tests, and a bigger drone, based on the "DJI Spreading Wings S1000+", to mount all available sensors for data acquisition tests. During the development phase, this data is analyzed offline to adjust parameters and algorithms, leading to improved localization precision and better real-time control.
More information on the project can be found on the project's website:
https://www.ei.tum.de/en/lmt/research/bmwi-ki-inspektionsdrohne/
Tasks:
Working tasks can vary from week to week. Mainly, we are looking for a working student to improve our drone simulation in Gazebo, so students with knowledge of Gazebo simulation will be preferred.
Interested students should send their current grade sheet and CV to the contact below.
Prerequisites
The student is required to have excellent knowledge of C++/Python and experience with the ROS framework. We are looking to employ a student assistant for a time frame of 6+ months with at least 10 hours of working time per week.
Supervisor:
Open for: BA, FP, IP
Adaptive Region-of-Interest Masking for Teleoperated Driving
Autonomous Driving, Teleoperated Driving
Description
This work can be done in German or English
For safe remote control of a vehicle, the operator needs to be fully aware of the current traffic situation. The operator gets this information from camera data that is streamed from the vehicle over a constrained network to the operator's workplace. A low transmission rate results in low visual quality, which can be an issue for the operator when controlling the vehicle. However, not all parts of the image are important for the operator. [1] suggests applying a static mask to the camera data, which reduces the overall image size and results in better image quality for the important regions.
The objective of this thesis is to implement this mask in a block-based way as an optimal input for a video encoder. Furthermore, there should be multiple masks, which are applied according to the vehicle's lateral movement. The adaptive ROI mask implementation should be integrated into the existing teleoperation setup based on TELECARLA [2].
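The sketch below illustrates the block-based masking idea (the 16x16 block grid and the gray fill value are assumptions; the actual implementation will be a C++ ROS node):

import numpy as np

def apply_block_mask(frame, roi_mask, block=16):
    """frame: HxWx3 image; roi_mask: HxW boolean array marking important pixels.
    Blocks without any ROI pixel are flattened so the encoder spends few bits on them."""
    out = frame.copy()
    height, width = roi_mask.shape
    for y in range(0, height, block):
        for x in range(0, width, block):
            if not roi_mask[y:y + block, x:x + block].any():
                out[y:y + block, x:x + block] = 128  # flat gray encodes cheaply
    return out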
Tasks
- Implement adaptive block-based ROI masking as a C++ ROS node
- Integrate it into the TELECARLA setup
- Evaluate driving performance and image quality of ROI and non-ROI driving for an equal target bitrate
References
[1] V. Furman, A. H. Chatham, A. Ogale, and D. Dolgov, "Image and video compression for remote vehicle assistance", United States Patent US9767369B2, filed June 3, 2016, issued September 19, 2017.
[2] M. Hofbauer, C. B. Kuhn, G. Petrovic, and E. Steinbach, "TELECARLA: An Open Source Extension of the CARLA Simulator for Teleoperated Driving Research Using Off-the-Shelf Components", IEEE Intelligent Vehicles Symposium (IV), 2020.
Prerequisites
- Experience with ROS and C++
- Knowledge of Linux and Python
- Basic understanding of video compression
Supervisor:
Open for: BA, FP, IP
Evaluation of Point Cloud Compression Methods with a Focus on Teleoperated Driving
Point Cloud Compression, Autonomous Driving
Description
This work can be done in German or English
LIDAR is an important sensor type for the perception of autonomous vehicles. Human perception, in contrast, is mainly based on RGB data, which in the case of teleoperation is captured by RGB cameras and transmitted to the remote operator over a communication network. In some situations, a 3D representation of the scene might be helpful for the operator, which could be achieved using LIDAR data. To avoid high transmission rates, LIDAR data needs to be compressed. Existing methods for point cloud compression [1, 2, 3] do not focus on automotive LIDAR data.
The objective of this project is to set up existing point cloud compression implementations and compare them with a focus on automotive point clouds.
Tasks
- Set up the available point cloud compression implementations:
  * ROS PCL [1]
  * MPEG L-PCC [2] (available at the MPEG repository)
  * MPEG G-PCC [2] (available at the MPEG repository)
  * MPEG anchor implementation [3]
- Evaluate the implementations in terms of:
  * Encoding time/complexity
  * Compression rate
  * Compression quality (see the metric sketch below)
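For the quality evaluation, a simple geometric measure such as the symmetric point-to-point RMSE could serve as a starting point (sketch below; the metric choice is an assumption, and standardized point cloud quality metrics exist as well):

import numpy as np
from scipy.spatial import cKDTree

def p2p_rmse(original, decoded):
    """Symmetric point-to-point RMSE between an (N,3) and an (M,3) point cloud."""
    d_fwd, _ = cKDTree(decoded).query(original)   # original -> nearest decoded point
    d_bwd, _ = cKDTree(original).query(decoded)   # decoded -> nearest original point
    return np.sqrt((np.mean(d_fwd ** 2) + np.mean(d_bwd ** 2)) / 2.0)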
Prerequisites
- Experience with ROS and Point Clouds
- Basic knowledge of C++ and Linux
Supervisor:
We have summarized important information on preparing the written report and on giving talks at the LMT, as well as templates for PowerPoint and LaTeX, here.