Theses & Projects

Shape the future with your own research! We offer "wissenschaftliche Projekte" (scientific projects, or WissPro's) as well as Bachelor, Master and PhD theses with close supervision and exciting topics at the intersection of innovation and application in the following areas:

  • Medical Data Science

  • AI Ethics in Medicine

  • Research Data Management

  • Medical Informatics Technologies

  • Innovation and Entrepreneurship

If you are interested in pursuing a thesis in one of these areas, please reach out and, if possible, describe your relevant experience and interests.

Below you will find a list of currently offered theses. If nothing fits, please reach out anyway.

Open WissPro's

Overview of Blockchain Applications in Healthcare

Blockchain is an emerging technology that paves the way for new medical applications by providing a distributed network for transparent and accountable transactions. Potential application fields of blockchain in healthcare include patient-controlled health records, patient consent management (for instance for clinical trials or research purposes), distributed analytics and federated learning, and drug supply chain tracking. The student should investigate and present the concept of blockchain and provide either a broad overview of different blockchain applications in healthcare or a more focused perspective on a selected application. [Focus and application(s) will be chosen by the student]
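
To make the idea of transparent and accountable transactions more concrete, the following minimal sketch shows a hash-linked ledger in which every block stores the hash of its predecessor, so that later changes to earlier entries become detectable. It is an illustration only: it leaves out the distributed network and consensus mechanism of a real blockchain, and the function names and consent events are hypothetical.

```python
import hashlib
import json


def block_hash(index, payload, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that links to the current last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append(
        {
            "index": index,
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
    )
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the predecessor."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["payload"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical consent events, invented for illustration only.
ledger = []
append_block(ledger, {"patient": "P-001", "consent": "granted", "study": "trial-A"})
append_block(ledger, {"patient": "P-001", "consent": "withdrawn", "study": "trial-A"})
```

Calling `is_valid(ledger)` succeeds for the untouched ledger and fails as soon as the payload of an earlier block is modified, which is the property that makes such a record auditable.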

Physiology-informed models in intensive care/cardiology/glucose monitoring

Physiology-informed models combine machine learning with (differential or statistical) equations that model human physiology or broader biological systems. By combining the best of both worlds, they promise more explainable predictions, a reduced need for training data, and greater robustness to domain shifts. The aim of this WissPro is to find out how well established physiology-informed models are in a specific medical domain and which methods are the most popular and successful, and to give an overview of the advantages and disadvantages of physiology-informed models compared to plain machine learning models. Related terms that help to get an idea of the concept:

  • Physiology informed models

  • Domain informed AI

  • Hybrid modelling

  • Mechanistic learning

Depending on the student's interests, we can also shift the focus of the literature review from intensive care monitoring towards cardiology, pulmonology or glucose monitoring.

Bias detection and mitigation in AI models

How objective or how biased are AI-based predictors? Bias detection and mitigation in AI focuses on identifying and reducing unfairness in AI systems. Bias can emerge from data, algorithms, or model assumptions, leading to unequal treatment across groups. Detecting these biases and applying mitigation strategies helps ensure that AI is not only accurate but also fair, transparent, and socially responsible. The aim of this project is to learn

  • how to quantify bias using different kinds of bias metrics (Thirunagalingam, 2024),

  • which algorithms exist to mitigate biases and how they work (Hort et al., 2024), and

  • how metrics and algorithms are implemented in an LLM (Sallami & Aïmeur, 2025).
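
As a rough illustration of what quantifying bias can look like, the sketch below computes two commonly used group fairness metrics for a binary classifier: the demographic parity difference and the equal opportunity difference. These are examples chosen here and not necessarily the metrics covered in the cited references. The labels, predictions and group assignments are invented toy values.

```python
def selection_rate(y_pred, group, value):
    """Share of positive predictions within one group."""
    preds = [p for p, g in zip(y_pred, group) if g == value]
    return sum(preds) / len(preds)


def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rates between groups."""
    rates = [selection_rate(y_pred, group, v) for v in sorted(set(group))]
    return max(rates) - min(rates)


def true_positive_rate(y_true, y_pred, group, value):
    """Share of actual positives in one group that were predicted positive."""
    hits = [p for t, p, g in zip(y_true, y_pred, group) if g == value and t == 1]
    return sum(hits) / len(hits)


def equal_opportunity_difference(y_true, y_pred, group):
    """Largest gap in true positive rates between groups."""
    rates = [
        true_positive_rate(y_true, y_pred, group, v) for v in sorted(set(group))
    ]
    return max(rates) - min(rates)


# Toy data, invented for illustration only.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["f", "f", "f", "f", "m", "m", "m", "m"]

dpd = demographic_parity_difference(y_pred, group)
eod = equal_opportunity_difference(y_true, y_pred, group)
```

A value of 0 for either metric means the groups are treated equally with respect to that criterion. Larger values indicate a larger gap, which mitigation algorithms then try to reduce.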

Practical cases of DiGA certifications for software products and services

DiGAs (‘Digitale Gesundheitsanwendungen’) offer a recently introduced way to bring healthcare and medical innovations closer to the market. The focus of this project is to provide practical information on the certification process and also provide example cases.

Representation of a patient in precision medicine and digital twins

This project explores how patients are represented in precision medicine and digital twin approaches, and how different representation strategies affect their usefulness for specific clinical tasks. In precision oncology, for example, only a small subset of a patient’s genes may be relevant for selecting an effective cancer treatment, while other information may be less important for that decision. Choosing the right way to represent the patient—what data to include, how detailed it should be, and how it is organized—can strongly influence clinical insights and outcomes. The goal of this project is to analyze and compare different ways of representing patients, from simple, task-focused summaries to more complex, dynamic models. By doing so, it aims to identify strategies for building patient representations that are both relevant and efficient for a given task, while remaining interpretable and clinically meaningful.

Seeing beyond the Scan - Methods to represent Medical Images in Data Analysis

Medical images contain rich information, but AI systems do not interpret images like humans do. Instead, they rely on image representations, i.e. numbers representing the image. This literature review project

  • explores different approaches to transform medical images such as CT and MRI into image representations,

  • identifies important aspects to evaluate representations and

  • analyzes how well individual representation learning approaches perform with regard to these aspects.

The ethical challenges of the influence of Big Tech in Health Care

The use of AI for health has been pushed by companies, especially Big Tech companies. The growing power of Big Tech is a cause for concern, as it is accompanied by a lack of transparency, aggressive data collection and use, and risks such as inequitable returns to the public sector in public-private medical partnerships or new dependencies on technology firms for the provision of health goods and services. The aim of this WissPro is to compile a literature review that maps the landscape of Big Tech in Health Care and the ethical challenges it poses.

xAI in Medicine - Explaining the Recommendations of AI to Medical Professionals

Artificial intelligence is increasingly used to support diagnosis, prognosis, and treatment decisions, but many AI models remain black boxes that simply provide a recommendation. In medicine, trust, accountability, and patient safety demand that AI systems can explain why they make a certain recommendation. This literature review project

  • explores the current state of Explainable Artificial Intelligence (xAI) in medicine,

  • suggests how xAI should explain its recommendations to meet the needs of medical professionals and

  • analyzes how well current xAI approaches meet these needs.

Open Bachelor Theses

Multi-Objective Optimization for RAG-based LLM

This project explores how conversational information retrieval systems can intelligently balance multiple objectives such as efficiency, correctness, and completeness. The goal is to design mechanisms that decide whether a system should retrieve new information, reuse previous conversation history, or treat a user’s message as the start of a new topic. Students will develop and evaluate optimization strategies that aim to provide the user with the most relevant and accurate responses while minimizing latency and computational cost. The project combines principles from information retrieval, dialogue systems, and multi-objective optimization to enhance both user experience and resource efficiency.
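
One possible starting point for balancing several objectives is a Pareto filter over candidate actions, sketched below. The action names, the three objectives and all scores are hypothetical placeholders. In the actual project they would have to be estimated by the system, and the optimization strategy itself is left open by the topic description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    action: str
    relevance: float   # higher is better
    latency_ms: float  # lower is better
    cost: float        # lower is better


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better once."""
    no_worse = (
        a.relevance >= b.relevance
        and a.latency_ms <= b.latency_ms
        and a.cost <= b.cost
    )
    strictly_better = (
        a.relevance > b.relevance
        or a.latency_ms < b.latency_ms
        or a.cost < b.cost
    )
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical estimates for one user message, invented for illustration.
candidates = [
    Candidate("retrieve_new", relevance=0.90, latency_ms=800, cost=3.0),
    Candidate("reuse_history", relevance=0.70, latency_ms=50, cost=0.5),
    Candidate("retrieve_and_rerank", relevance=0.85, latency_ms=1200, cost=4.0),
]

front = pareto_front(candidates)
```

In this toy example `retrieve_and_rerank` is dropped because `retrieve_new` is better on all three objectives. Choosing between the two remaining actions still requires a preference, for example a latency budget or a weighting of the objectives.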

Fair, unbiased prediction of Alzheimer’s Disease using a multimodal dataset

Artificial intelligence (AI) is becoming increasingly prevalent in the healthcare sector. Medical imaging in particular benefits from AI in computer-aided diagnosis and visualization of medical images, as the data-driven nature of radiology makes it well suited to AI techniques. At the same time, with the increasing application of AI, its biases have become visible, for example concerning religion, race and ethnicity (in their biological and socio-cultural meaning), as well as sex and gender. Therefore, methods for bias detection and bias mitigation have been developed to identify bias in existing AI models and to prevent bias during model development. Using the gender gap in the diagnosis of Alzheimer's disease as an example, this project focuses on predicting Alzheimer's disease using 3D MRI image data as well as 2D clinical and genetic data. In addition, a fair and unbiased predictor will be developed using methods for detecting and mitigating bias.

Federated Breast Cancer Detection with Flower

Artificial intelligence (AI) models can provide clinical decision support, but require vast and representative patient data to perform well. However, relevant patient data collected by healthcare institutions, such as hospitals, cannot be easily shared for computations due to privacy risks. One approach to enable AI model training on distributed medical data is federated learning, which enables privacy-preserving training of AI models. The goal of this project is to develop a federated learning approach for breast cancer detection using the established Breast Cancer Wisconsin dataset and the Flower framework. The dataset will be divided between two simulated hospital clients to mimic real-world distributed medical data. The simulated clients should collaboratively train a global model through iterative federated learning rounds, without sharing their local data. The student will test and compare different machine learning models and aggregation strategies, and document and present the results.
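
The iterative rounds described above can be sketched with plain Python, as shown below. This is a conceptual illustration of federated averaging (one common aggregation strategy) and deliberately does not use the Flower API or the Breast Cancer Wisconsin dataset: the one-feature logistic regression, the function names and the toy data for the two simulated hospitals are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, samples, lr=0.5, epochs=5):
    """Train logistic regression locally, starting from the global weights."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return [w, b]


def fed_avg(client_weights, client_sizes):
    """Average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def run_rounds(clients, rounds=10):
    """Only model parameters leave a client; the raw samples stay local."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = fed_avg(updates, sizes)
    return global_weights


# Toy (feature, label) pairs for two simulated hospitals, invented here.
hospital_a = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
hospital_b = [(-3.0, 0), (3.0, 1)]

model = run_rounds([hospital_a, hospital_b])
```

In a Flower-based implementation, the local training step and the aggregation step would be handled by the framework's client and strategy abstractions instead of these hand-written functions.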

Medical data mining on Reddit

Patients share a huge amount of health information online that rarely reaches clinicians or researchers. Reddit hosts thousands of personal accounts of symptoms, treatment challenges and day-to-day disease management that can reveal unmet needs long before they appear in clinical studies. This thesis investigates how such real-world insights can be responsibly mined and transformed into medically useful knowledge. The student will build an end-to-end pipeline for collecting and preprocessing health-related posts, then design methods to structure discussions and represent clinically relevant information such as symptoms, medication effects and quality-of-life concerns. A focused case study, for example diabetes or neuropathic pain, will demonstrate the approach and compare patient-reported experiences with what is typically captured in healthcare. The long-term goal is to support better study design and patient-centered healthcare by highlighting what really matters to people living with chronic conditions.

Benchmarking of domain adaptation methods in medical AI

AI systems in medicine often lose performance when models are transferred to new hospitals, scanners or patient groups. Domain adaptation promises to improve generalization, yet choosing the right method for a given clinical setting remains difficult. This thesis focuses on developing a robust benchmarking pipeline to systematically evaluate domain adaptation in medical AI. The student will assemble suitable datasets representing domain shift, implement a curated set of adaptation strategies and core model architectures, and define relevant evaluation metrics including performance, robustness and fairness across subpopulations. The pipeline will be used to identify strengths, weaknesses and practical deployment considerations of each approach. The final outcome will be an open, reproducible toolkit that helps researchers and clinicians select reliable adaptation methods for real-world healthcare applications.

From Graph Structure to Vector Space: GNN-Based Knowledge Graph Embeddings

Knowledge graphs enable the structured representation of complex relational information. By explicitly modeling entities and their relationships, knowledge graphs enable reasoning on rich representations beyond isolated data points. However, to effectively use knowledge graphs in traditional downstream machine learning tasks, they must be transformed into vector representations that preserve both structural and semantic properties. Graph Neural Networks (GNNs) offer a powerful and flexible way for learning such embeddings directly from graph-structured data, while outperforming traditional knowledge graph embedding methods such as graph2vec. In this project, the student will investigate how different GNN architectures, message-passing strategies, and hyperparameter settings influence the quality of knowledge graph embeddings. The goal is to understand how embedding models can be adapted to the requirements of a specific knowledge graph and task, and how design choices impact performance and interpretability.
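
To give an impression of what message passing means, the sketch below performs one round of mean aggregation on a tiny graph: each node's new vector is the average of its own features and those of its neighbours. It is a strong simplification and not a full GNN, since it ignores relation types and has no trainable weights or non-linearities. The entities, edges and feature values are toy assumptions.

```python
def message_passing_layer(features, edges):
    """One round of mean-aggregation message passing on an undirected graph."""
    neighbours = {node: [] for node in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, own in features.items():
        # Include the node itself so its own features are not lost.
        group = [own] + [features[n] for n in neighbours[node]]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


# Toy knowledge graph with 2-dimensional input features, invented here.
features = {
    "aspirin": [1.0, 0.0],
    "headache": [0.0, 1.0],
    "COX1": [1.0, 1.0],
}
edges = [("aspirin", "headache"), ("aspirin", "COX1")]

embeddings = message_passing_layer(features, edges)
```

After one round, `headache` has moved towards `aspirin` in vector space (its embedding becomes `[0.5, 0.5]`). Stacking several such layers lets information travel across longer paths in the graph, which is one of the design choices the project would investigate.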

Open Master Theses

From Graph Structure to Vector Space: GNN-Based Knowledge Graph Embeddings

Knowledge graphs enable the structured representation of complex relational information. By explicitly modeling entities and their relationships, knowledge graphs enable reasoning on rich representations beyond isolated data points. However, to effectively use knowledge graphs in traditional downstream machine learning tasks, they must be transformed into vector representations that preserve both structural and semantic properties. Graph Neural Networks (GNNs) offer a powerful and flexible way for learning such embeddings directly from graph-structured data, while outperforming traditional knowledge graph embedding methods such as graph2vec. In this project, the student will investigate how different GNN architectures, message-passing strategies, and hyperparameter settings influence the quality of knowledge graph embeddings. The goal is to understand how embedding models can be adapted to the requirements of a specific knowledge graph and task, and how design choices impact performance and interpretability.

Open PhD Theses

No entries found

Does this sound interesting?

We look forward to receiving your application! Simply send an email with your CV and a brief description of your relevant experience and interests to the contact person for the respective project.

Don’t see an officially advertised thesis?

No problem! We’re always working on exciting projects and are happy to discuss individual thesis ideas supporting these projects - just get in touch and we will surely find something fitting.

Ongoing Theses & Projects

In addition to our open thesis topics, our students are already exploring a variety of exciting research questions. These ongoing projects highlight innovative approaches and the diverse interests within BI-K.

PhD Theses

Integrating Computational Biosignal Analytics into Data-Driven Multimodal Approaches in Modern Critical Care in Cardiac Surgery
Karen Anette Hornung,
Program:
Epidemiology and Clinical Research
Visualization for Data Communication
Daniel Braun,
Program:
Computer Science PhD
Perceptually Driven Visual Encoding
Laura Pelchmann,
Program:
Computer Science PhD
Leveraging data integration architectures for patient care: case of multimodal sensor data
Mayra Elwes,
Program:
Computer Science PhD
Objectivization of malignant tumor pathology by artificial intelligence-based analysis and development of new diagnostic tools
Christian Lennard Harder,
Program:
Computer Science PhD
Enhancing the added value of a medical data platform for diverse stakeholders in a multi-actor healthcare environment
Henrike Oberlack,
Program:
Interdisciplinary Program Health Sciences (IPHS)
IN TAIM for Clinical Healthcare: A digitalized task and information management system for professionals in clinical healthcare
Maria Beyer,
Program:
Medical PhD
Generative deep learning modeling for medical image transformation application in Federated Learning framework
Feifei Li,
Program:
Interdisciplinary Program Health Sciences (IPHS)
Development of graph database for distributed machine-learning for the prediction of fertility; FAIR data management & exchange, quality control
Ahmad Abu Dayeh,
Program:
Interdisciplinary Program Health Sciences (IPHS)
Enhancing Infrastructure and Application Monitoring in the MeDIC and BIK through Tailored Dashboards: A Transformative Research Proposal
Md. Mostafa Kamal,
Program:
Interdisciplinary Program Health Sciences (IPHS)
Determinants of the use of digital patient portals in German hospitals
Nina Goldberg,
Program:
Interdisciplinary Program Health Sciences (IPHS)
Defining and Enhancing the Data-Driven Innovation Cycle in the Clinical Domain
Oliver Diekmeier
Program:
Computer Science PhD

Master Theses

Distributed Analytics Operations (DAOps): Applying State of the Art Development Workflows in Distributed Analytics
Muhammad Hamza Akhtar,
Program:
Computer Science Master
An intrinsically explainable pipeline for MRI classification
Lars Quakulinski
Program:
Computer Science Master

Completed Theses

From its very beginning, the BI-K was invested in supporting the next generation of researchers. Below you can see theses which have already been completed.

PhD Theses

Classification of CUP Patients with Retrospective Data Analyzed by Using Machine Learning
Devina Johanna Soenarto
Program:
Medical PhD

Master Theses

Towards a Comprehensive Process Model for Federated Learning
Sabith Haneef,
Program:
Computer Science Master
Improved Bottom-Up Deep Learning Approach for Neuronal Cell Instance Segmentation
Xuefeng Yin,
Program:
Computer Science Master
Fostering interoperability of unstructured radiology reports by designing and implementing an annotation guideline in oncology
Matthias Thelen,
Program:
Medical Data Science Master
Decentralized Identity and Access Management for Distributed Machine Learning System
Wie Lu,
Program:
Computer Science Master
Current state, challenges and recommendations for the use of data science and implementation of digital processes in psychiatry
Tanja Veselinović,
Program:
Computer Science Master
Comparison of Human- and AI-Created Dashboard Layouts, Based on Design Guidelines and Expert Evaluation in Visualization and Data Analytics
Luca Theile,
Program:
Computer Science Master
Bridging the gap between design and deployment of statistical analyses in Distributed Analytics
Sven Weber,
Program:
Computer Science Master
Adversarial Attacks against Machine Learning Models
Ying-Hsuan Chiang,
Program:
Computer Science Master
A Containerized Pipeline for Distributed Analytics Algorithms: The Algorithm Assembly Line
Shubham Balyan,
Program:
Computer Science Master
CosDefence: A defence against Data Poisoning Attacks in Federated Learning
Program:
Computer Science Master
Design and Evaluation of a Federated Machine Learning Model on Vertically Partitioned Data
Mehrshad Jaberansary,
Program:
Computer Science Master
Designing a Data Access and Integration Workflow for Medical Data Science: a use case of compiling a reusable data set for primary tumor discovery at MeDIC Cologne.

[...] Medical Data Integration Centers (MeDICs) like the MeDIC Cologne are currently being established all over Germany to ease the Data Access and Integration (DAI) process for researchers by making medical data available for research. However, these MeDICs still face technical, legal, ethical, and organizational problems. This thesis aimed at supporting the definition of concrete DAI processes by proposing a DAI workflow that serves as a basis for discussing concrete DAI project designs as well as optimizing DAI processes at university hospitals[...]

Julia Gehrmann
Program:
Computer Science Master

Bachelor Theses

Membership Inference Attacks against Generative Models and Differential Privacy Defense
Ege Beysel,
Program:
Computer Science Bachelor
Investigating the Effect of Model Averaging on Catastrophic Forgetting in Distributed Analytics
Program:
Computer Science Bachelor
Disentangled Variational Representation Learning-based Brain Tumor Segmentation
Yuanbin Wang,
Program:
Computer Science Bachelor
Customizable Data Safes for a Distributed Analytics Platform
Jannik Esser,
Program:
Computer Science Bachelor
Applying Anonymization Methodologies to Distributed Analytics
Hauke Heidemeyer,
Program:
Computer Science Bachelor
Personal Health Train: An Assessment of Automated Pipelines' Contribution to Infrastructure's Privacy & Security
Ahmet Polat
Program:
Computer Science