Oya Beyan (Univ. Prof. Dr.)

Head of the Institute for Biomedical Informatics
ORCID: 0000-0001-7611-3501

Biography

I lead the BI-K in learning, developing, and applying technologies that enable data-driven medicine.
My research focuses on the reusability of health data, semantic interoperability, and clinical data science, with the aim of continuously improving healthcare through innovation and the creation of new knowledge.
Partnering with national and international data spaces, we enable real data use without compromising the fairness, equity, privacy, and confidentiality of individuals, as well as social groups and communities.

Areas of Expertise

Research Focus

  • Data Driven Medicine
  • Electronic Health Records
  • Research and Registry Data Repositories
  • Health Care Standards
  • Knowledge Graphs for Biomedicine

Current Teachings

Ethical aspects of Medical AI applications

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters SoSe & WiSe

A short (two-session) introduction to the ethical issues of medical AI applications. We identify the ethical implications of AI applications in medicine, focusing on data collection and use, the risks of artificial intelligence technologies, and bias and discrimination, and address the impact of automated decision-making in medical AI applications. The lectures are held online in English.

Semantic Interoperability in Health: Data stewardship - Part 2

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe

This comprehensive session on data stewardship in medical informatics provides an in-depth exploration of the fundamental principles, best practices, and practical examples in the field.

Medical Image processing

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe & WiSe + SoSe SoSe

This course provides an introduction to the most important concepts in medical image processing. The aim of the course is to impart practical basic knowledge required for the evaluation of medical image data.

Seminar "MedTech: Medical Technology-based entrepreneurship and innovation"

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe & SoSe WiSe + SoSe

This is an interactive, experience-based course designed to help students acquire fundamental skills and to meet the challenges and opportunities that entrepreneurs face when founding or growing a MedTech startup. Students are supported in exploring case studies and in developing the content and strategies that MedTech entrepreneurs need to be familiar with. They receive the support they need to gain hands-on experience, in groups or independently, with the various aspects of founding a MedTech company and to develop their initial ideas into viable business opportunities.

Introduction to Computer-Aided Medical Signal Analysis

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe & SoSe WiSe + SoSe

The human body continuously emits biosignals that provide valuable insights into physiological processes. In medicine, these signals are used both for research and to support the diagnosis and monitoring of diseases and patients. This course offers a hands-on introduction to computer-aided biosignal analysis. A short theoretical introduction to the fundamentals of signal processing, including definition, acquisition, and possible applications, is followed by a practical introduction to data analysis. Using a realistic example from patient monitoring in an intensive care setting, the course teaches the essential steps: data preparation, feature engineering, and prediction of the signal with modern machine learning methods. Participants are expected to have attended the preceding course "Coding Basics" or to have equivalent prior knowledge of programming in Python.

Intro to Data Analysis in Python

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe & WiSe + SoSe SoSe

[This course is only offered in English] The objective of this course is to provide the basics of exploratory data analysis techniques in Python, including data exploration, data visualization, and data quality. Throughout this course, you will gain hands-on experience with available tools and libraries that help you to conduct initial investigations on your data in order to discover patterns, identify anomalies, test hypotheses, and verify assumptions using summary statistics and graphical representations.

Studium Integrale: Hands-On Data Science

Bachelor Studies Master Studies WiSe

[This course is offered in English] Generating knowledge from data using machine learning (ML) is becoming increasingly important in every conceivable scientific field. To provide an introduction to data science, this course covers various ML methods, including supervised and unsupervised methods, as well as techniques for evaluating and visualising the results. With a focus on practical implementation, each approach is briefly introduced theoretically and then implemented in the programming language Python. Prior knowledge of programming is not required. The first lecture includes a Python demo. To pass the course, students have to apply the introduced methods in their own data science project and present their results in a 5-10 minute presentation (depending on the number of participants). The projects and presentations are not graded but have to meet the requirements presented in the lecture.

Studium Integrale: AI Ethics

Bachelor Studies Master Studies WiSe

The course aims to familiarise students with the basic concepts of the domain and to highlight the challenges posed by the explosion of data-driven applications using Artificial Intelligence (AI) in recent years. Topics to be covered include an introduction to data-driven AI and Machine Learning, artificial agents, privacy and consent, bias and discrimination. The course is entirely online, held in English, and assessed through presentations based on scientific publications selected by the students in collaboration with the tutor.

Colloquium "Advances in Biomedical Informatics Research: Graduate Students"

Doctoral Studies WiSe + SoSe & SoSe WiSe

[This course is only offered in English] The colloquium is aimed primarily at master's and doctoral students, researchers, and scientific personnel of the Institute for Biomedical Informatics. Co-supervised students from other clinics or research institutions in biomedical informatics and medical data science can also enroll. Students will be mentored and will either present the outcomes of their ongoing research or review an article from a peer-reviewed journal.

Studium Integrale: Interdisciplinary collaboration for digital solutions

Bachelor Studies Master Studies WiSe & SoSe WiSe + SoSe

Transdisciplinary collaboration in medical research

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe

The aim of this seminar is to bring the importance of transdisciplinary collaborative work into focus for students, to analyse typical pitfalls, and to provide a toolbox that facilitates such work under realistic conditions. Advances in medical research reflect the importance of integrating knowledge and data from different medical domains, as well as the expertise of non-clinical research groups, e.g. biologists, bioinformaticians, and neuroscientists. The complexity of the research topics requires that the knowledge of the individual groups is not isolated but reaches efficiently across domain borders. In practice, there are multiple obstacles to building these transdisciplinary connections. Different working cultures, limited time, inertia, and even active resistance to change contribute to the difficulties. Involvement in any collaborative research, whether in a leadership role or as a junior team member, can become much more efficient if we understand the architectures of collaborative teams, are aware of the typical issues expected during a project, and have a readily available toolbox of solutions to facilitate the work.

Medical AI - From Data Chaos to the Right Cancer Therapy - Data Preparation for AI in Oncology

Clinical Semesters Clinicians Doctoral Studies PostDoc Preclinical Semesters WiSe + SoSe & SoSe WiSe

Everyday clinical practice generates large amounts of data that can yield valuable insights for research, particularly for improving diagnoses and therapies. For these data to be usable in artificial intelligence (AI), they must be carefully prepared. In this course, participants receive an introduction to data quality and data preprocessing for AI. They work with a synthetic dataset modelled on real clinical data from oncology and learn about the challenges of data preparation first-hand. At the start of the course, participants are introduced to the fundamentals of data quality and data preparation for AI models. Then, using a current oncological research project as an example, typical challenges in preparing routine medical data for AI-based analyses are explained. In the practical part, participants apply what they have learned by working with Python and analysing a dataset on their own. They identify problems in the raw data, correct erroneous or incomplete entries, and prepare the data so that they can be used for AI-supported analysis. The course is divided into three parts. A two-hour introductory session, held at the institute, covers the theoretical foundations and explains the assignment. Participants then have one week to examine the data quality independently as a homework assignment and to prepare the dataset for the AI analysis. In a concluding three-hour exercise session, which can be attended either on site or online, the results are discussed together and challenges are reviewed. Students who attend both sessions receive a certificate of participation on request.

Medical AI - Explainable AI for Diabetes Prediction: A Hands-On Seminar with Python

Clinical Semesters Clinicians Doctoral Studies PostDoc Preclinical Semesters WiSe + SoSe & SoSe WiSe

In this seminar, medical students learn how artificial intelligence can be used to predict diabetes. Using a concrete example dataset, they develop a simple AI-based classification model with Python in an interactive, pre-installed programming environment. The seminar covers the underlying topic, loading and analysing the source data, classification, and evaluation of the results, and reinforces these through various exercises. Particular focus is placed on explainability, so that classifications can be traced and results can be examined more closely from a clinical perspective.

Seminar contents:

  • Introduction to the problem: how can AI support the diagnosis of diabetes?
  • Data exploration: understanding the features and their meaning
  • Building an AI-based classification model
  • Evaluating model performance
  • Insight into model interpretation: which features are decisive?

Students who attend both sessions receive a certificate of participation on request.

Medical AI - From basics to pro: Heart Rate Variability & AI symbiosis in personalized medicine

Clinical Semesters Clinicians Doctoral Studies PostDoc Preclinical Semesters WiSe + SoSe & SoSe WiSe

[This course is only offered in English] Heart rate variability (HRV) is widely used in clinical settings as a non-invasive marker of autonomic nervous system function. It helps assess cardiovascular health, stress levels, and overall well-being. Clinicians use HRV to monitor conditions like heart disease, hypertension, and diabetes, as well as to evaluate recovery in post-surgical and critically ill patients. HRV also plays a role in mental health, aiding in the diagnosis and management of anxiety, depression, or PTSD. Additionally, it is used in sports medicine and rehabilitation to track recovery and optimize training. Its broad applications make it a valuable tool in personalized medicine. The development of AI methods makes it possible to derive more complex predictions from multiple HRV parameters simultaneously. This has enabled successful decision support in domains where HRV was not previously prominent in clinical use, such as epileptology. In this lecture block, we will discuss the technical aspects of HRV assessment, such as different sensors, data quality control, and different HRV measures. We will review various types of clinical applications, as well as the "citizen science" approach and sports coaching. For the practical part, we take a dataset with precomputed R-to-R intervals and different labels (e.g. RR Interval Time Series Modeling: The PhysioNet/Computing in Cardiology Challenge 2002 v1.0.0). We will test different machine learning approaches to classify the data, e.g. to detect whether the data was recorded in a stressed or relaxed phase. This block lecture does not require any previous coding experience; we will use the graphical low-code platform KNIME. Students are required to bring their own laptop.
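As a rough illustration of what "HRV measures" computed from R-to-R intervals can look like, the sketch below calculates two common time-domain measures, SDNN and RMSSD, plus the mean heart rate. It is not taken from the course, which uses KNIME rather than code; the function names and the RR values are made up for the example, and the intervals are assumed to be in milliseconds.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals, in ms."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute (60,000 ms per minute)."""
    return 60000.0 / statistics.mean(rr_ms)


# Hypothetical RR intervals in ms (illustrative values, not real patient data).
example_rr = [800, 810, 790, 800]
print(sdnn(example_rr), rmssd(example_rr), mean_heart_rate(example_rr))
```

Measures like these could then serve as input features for a classifier, for instance one that separates stressed from relaxed recordings.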

LLM Journal Club

Doctoral Studies PostDoc WiSe & SoSe WiSe + SoSe

[This course is only offered in English] Each week, we review and discuss a recent research paper on Large Language Models (LLMs), with a focus on practical applications such as Retrieval-Augmented Generation (RAG) and LLM evaluation. The selected papers are drawn from top-tier conferences, including NAACL, ICML, and NeurIPS, ensuring exposure to cutting-edge developments in the field.

Studium integrale: Evolution of Data Analysis

Master Studies Bachelor Studies WiSe

From the abacus to ChatGPT, our ability to generate data, as well as the way we analyse it, has changed considerably over the last half century. The aim of this course is to explore the history of early data collection and use, to discuss some of the most important impacts of this science, and to give practical examples of how to find knowledge in the flood of information that inundates our world. Participants should expect a combination of lectures and practical exercises over the course of the seminar. Both the lectures and the exercises give an overview of the history of the tools used for data analysis. Open discussion is also welcome, as the lectures address important, society-changing topics relating to the information available at the time and how it was used.

AI in Medicine Series

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe + SoSe & SoSe WiSe

Artificial intelligence is already fundamentally changing medicine, but how do the underlying methods work, and what opportunities and challenges do they present? In this series of seminars, each session covers a new, practical topic, including the basics of some AI methods, ethical challenges, and possible solutions. Depending on the speaker, the lectures are held in German or English; they are thematically linked but self-contained.

Coding Basics in Python

Clinical Semesters Preclinical Semesters Doctoral Studies Clinicians PostDoc SoSe & WiSe WiSe + SoSe

An introduction to the basic programming concepts in Python that are needed to analyse medical and research data. In this interactive seminar, participants learn first-hand how to develop and run their own code.

Medical AI - Introduction to Deep Learning in Medicine and its Applications

Clinical Semesters PostDoc Preclinical Semesters Doctoral Studies WiSe + SoSe & SoSe WiSe

[This course is only offered in English] This course provides a structured introduction to deep learning with a focus on medical applications. It begins by clarifying key concepts in artificial intelligence, machine learning, and deep learning, emphasizing their relevance in modern medicine. Students will explore the basic structure of neural networks and understand how models are trained and evaluated. The course then introduces convolutional neural networks (CNNs), followed by an overview of transformers and foundation models used for analyzing clinical text, genomics, and multimodal data. Real-world case studies and clinical examples illustrate how deep learning is applied across radiology, pathology, dermatology, and beyond. The final sessions explore challenges in deploying AI in clinical settings, including issues of bias, explainability, and ethical use. Optional components may include hands-on demonstrations or guided review of influential research papers in the field.

Medical AI - Large Language Models and Knowledge Graphs for Medical Decision Support

Clinical Semesters PostDoc Doctoral Studies Preclinical Semesters WiSe + SoSe

[This course is only offered in English] Large Language Models (LLMs) and Knowledge Graphs are rapidly transforming clinical practice as powerful tools for medical decision support, documentation, and research. This course teaches medical students to understand and implement these technologies through hands-on coding with LLM endpoints and existing medical knowledge graphs. Students learn fundamental concepts of how LLMs process medical text and how knowledge graphs structure clinical information, then apply this knowledge by writing code to build Knowledge Graph-Retrieval Augmented Generation (KG-RAG) systems that combine both approaches. Students work with established medical knowledge bases like UMLS, SNOMED CT, and DrugBank, integrating them with LLM APIs to create robust clinical tools. Through coding exercises, participants build systems that leverage structured medical knowledge to improve LLM accuracy and reduce hallucinations in clinical contexts. Students evaluate their implementations for medical reliability, learning to identify limitations and bias in AI-generated clinical content. This practical approach prepares future physicians to critically assess, implement, and optimize AI tools in their clinical practice, making it essential training for modern evidence-based medicine.

Effective Research Data Management: From FAIR Principles to Open Science

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe

[This course is only offered in English] The pace and unpredictability of research often leave little room for thorough documentation and metadata annotation. Yet, effective Research Data Management (RDM) is essential to ensure the reliability, reproducibility, and long-term value of research outputs. Central to RDM are the FAIR data principles (making data Findable, Accessible, Interoperable, and Reusable), which should be embedded across all stages of the data lifecycle. In parallel, RDM is a cornerstone of Open Science, a movement that fosters transparency, collaboration, and equitable access to scientific knowledge. By integrating FAIR and Open Science practices into everyday research, researchers not only increase the visibility and impact of their work but also contribute to a more robust and inclusive scientific ecosystem.

Folders and File Names: The Survival Guide

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe

[This course is offered only in English] Struggling to stay on top of your digital files? This lecture and hands-on workshop will help you create a file naming and folder system that suits your needs and evolves with you. Using guiding questions and simple strategies, you'll learn to organise your files in a way that makes sense, improves access, and reduces clutter. Rather than offering a rigid method, we encourage flexible thinking and regular review so your system stays useful over time. Ideal for anyone looking to take control of their digital workspace.

WissPro - Literature Research

Preclinical Semesters SoSe

This course is offered to medical students interested in WissPro 1 and 2, involving literature research. It includes an introductory lecture on literature search strategies and best practices, as well as presentations on the topics offered by different members of our institute. The work is organised according to a schedule with several checkpoints and concludes with on-site presentations by the participating students.

WissPro - Programming

Preclinical Semesters SoSe

Students work on a project in the field of medical data analysis. They acquire basic knowledge of Python in an introductory seminar and then receive an assignment to implement within six weeks. During this time, an optional weekly seminar is held in which problems are discussed. At the end, students present their results in a talk.

Seminar "Innovation Ecosystems in Health and Medical Technologies"

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe

Technological advances such as AI and big data will change healthcare for the better. These smart health and medical technologies typically involve breakthrough and disruptive innovations. Their adoption, however, is often hindered by a lack of awareness and readiness among other actors in the respective markets and industries. In this course, we examine examples of specific ecosystems and use open innovation methods and tools to identify their limits. The course is based on the results and ongoing research of the Horizon Europe SHIFT-HUB project, which aims to develop a patient-driven approach to the development and adoption of smart health solutions.

Innovation Ecosystems in Health and Medical Technologies

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc Preclinical Semesters WiSe & WiSe + SoSe SoSe

Technological advances such as AI and big data will change healthcare for the better. These smart health and medical technologies typically involve breakthrough and disruptive innovations. Their adoption, however, is often hindered by a lack of awareness and readiness among other actors in the respective markets and industries. In this course, we examine examples of specific ecosystems and use open innovation methods and tools to identify their limits. The course is based on the results and ongoing research of the Horizon Europe SHIFT-HUB project, which aims to develop a patient-driven approach to the development and adoption of smart health solutions.

RDM4Researchers: Designing Reproducible Life Science Research Across the Data Lifecycle

Doctoral Studies Master Studies PostDoc SoSe

This course is designed for life science researchers who want their data to stay understandable, usable, and trustworthy long after an experiment is finished. Instead of treating Research Data Management (RDM) as paperwork or compliance, the course approaches it as a practical part of doing good research. Working with examples from omics, imaging, microscopy, and molecular biology, participants explore how everyday decisions—how data are named, documented, analysed, and shared—shape reproducibility, reuse, and long-term value across the research data lifecycle.

Open Science Essentials: Tools, Principles, and Practices

Bachelor Studies Clinical Semesters Clinicians Doctoral Studies Master Studies PostDoc SoSe

This lecture introduces the principles and practice of Open Science and the FAIR (Findable, Accessible, Interoperable, Reusable) principles in a clear and accessible way for researchers across disciplines. It explores why transparency, collaboration, and responsible data sharing are becoming central to research, particularly in light of funder requirements under programmes such as Horizon Europe. Participants will gain an overview of key concepts, policy context, and practical steps to integrate Open Science and FAIR principles into their everyday research workflows.

Publications from Oya Beyan

Towards an ELSA Curriculum for Data Scientists

2024 - Open Access -

Abstract

The use of artificial intelligence (AI) applications in a growing number of domains in recent years has put into focus the ethical, legal, and societal aspects (ELSA) of these technologies and the relevant challenges they pose. In this paper, we propose an ELSA curriculum for data scientists aiming to raise awareness about ELSA challenges in their work, provide them with a common language with the relevant domain experts in order to cooperate to find appropriate solutions, and finally, incorporate ELSA in the data science workflow. ELSA should not be seen as an impediment or a superfluous artefact but rather as an integral part of the Data Science Project Lifecycle. The proposed curriculum uses the CRISP-DM (CRoss-Industry Standard Process for Data Mining) model as a backbone to define a vertical partition expressed in modules corresponding to the CRISP-DM phases. The horizontal partition includes knowledge units belonging to three strands that run through the phases, namely ethical and societal, legal and technical rendering knowledge units (KUs). In addition to the detailed description of the aforementioned KUs, we also discuss their implementation, issues such as duration, form, and evaluation of participants, as well as the variance of the knowledge level and needs of the target audience.

AI Ethics—A Bird’s Eye View

Abstract

The explosion of data-driven applications using Artificial Intelligence (AI) in recent years has given rise to a variety of ethical issues regarding data collection, annotation, and processing using mostly opaque algorithms, as well as the interpretation and employment of the results of the AI pipeline. The ubiquity of AI applications negatively impacts a variety of sensitive areas, ranging from discrimination against vulnerable populations to privacy invasion and the environmental cost that these algorithms entail, and brings into focus the ever-present domain of AI ethics. In this review article we present a bird’s eye view of the AI ethics landscape, starting from a historical point of view, examining the moral issues that were introduced by big datasets and the application of non-symbolic AI algorithms, the normative approaches (principles and guidelines) to these issues and the ensuing criticism, as well as the actualization of these principles within the proposed frameworks. Subsequently, we focus on the concept of responsibility, both as personal responsibility of the AI practitioners and sustainability, meaning the promotion of beneficence for both the society and the domain, and the role of professional certification and education in averting unethical choices. Finally, we conclude by indicating the multidisciplinary nature of AI ethics and suggesting future challenges.

Early Multimodal Data Integration for Data-Driven Medical Research - A Scoping Review

2024 - Open Access -

Introduction: Data-driven medical research (DDMR) needs multimodal data (MMD) to sufficiently capture the complexity of clinical cases. Methods for early multimodal data integration (MMDI), i.e. integration of the data before performing a data analysis, vary from basic concatenation to applying Deep Learning, each with distinct characteristics and challenges. Besides early MMDI, there exists late MMDI which performs modality-specific data analyses and then combines the analysis results.

Methods: We conducted a scoping review, following PRISMA guidelines, to find and analyze 21 reviews on methods for early MMDI between 2019 and 2024.

Results: Our analysis categorized these methods into four groups and summarized group-specific characteristics that are relevant for choosing the optimal method combination for MMDI pipelines in DDMR projects. Moreover, we found that early MMDI is often performed by executing several methods subsequently in a pipeline. This early MMDI pipeline is usually subject to manual optimization.

Discussion: Our focus was on structural integration in DDMR. The choice of MMDI method depends on the research setting, complexity, and the researcher team's expertise. Future research could focus on comparing early and late MMDI approaches as well as automating the optimization of MMDI pipelines to integrate vast amounts of real-world medical data effectively, facilitating holistic DDMR.
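The difference between early and late MMDI described in this abstract can be made concrete with a toy sketch. It assumes each modality is a mapping from patient ID to a feature vector; the function names, the `analyse` placeholder, and all values are hypothetical and not taken from the reviewed methods. Early integration here is the "basic concatenation" end of the spectrum the abstract mentions, and late integration simply averages per-modality results.

```python
from statistics import mean


def early_integration(modalities, patient_ids):
    """Concatenate each patient's feature vectors across modalities
    before any analysis is run."""
    return {pid: [x for m in modalities for x in m[pid]] for pid in patient_ids}


def late_integration(modalities, patient_ids, analyse):
    """Run a modality-specific analysis first, then combine the
    per-modality results (here: a plain average)."""
    return {
        pid: sum(analyse(m[pid]) for m in modalities) / len(modalities)
        for pid in patient_ids
    }


# Hypothetical toy data: two modalities for two patients.
labs = {"p1": [1.0, 2.0], "p2": [3.0, 4.0]}
imaging = {"p1": [0.5], "p2": [1.5]}
patients = ["p1", "p2"]

combined_features = early_integration([labs, imaging], patients)
combined_results = late_integration([labs, imaging], patients, analyse=mean)
print(combined_features)
print(combined_results)
```

In the early variant a single downstream model sees all features at once; in the late variant each modality is analysed on its own and only the results are merged.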

From normal to optimal: investigating metabolic and inflammatory parameters as predictors of survival in locally advanced cervical cancer

2025 - Open Access -
Mayra Elwes, Bhanu Prasanna Koppolu, Carminia Lapuz, Mark Tacey, Simone Marnitz, Oya Beyan, Maike Trommer, Ekaterina Kutafina

Cervical cancer is the third most common cancer in women, and recent studies have highlighted the importance of body composition markers in predicting patient outcomes. We build upon the data of 83 patients from the Uterus-11 study, to explore the relation of pairwise feature combinations to long-term progression-free survival. We propose a framework to identify the parameter combinations with pre-defined thresholds of “normal range” which provide good separation of the survival group. Further, we optimize the pair-wise thresholds to further improve the separation measured by F1 scores. This approach allowed us to improve the statistical significance of hazard ratios in comparison to the previous studies. The optimization results suggest that the normal ranges of well-established biomarkers such as body mass index could be shifted in the context of specific diseases to achieve optimal outcome.
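The idea of tuning a threshold so that it best separates the survival group, scored by F1, can be sketched as follows. This is a deliberately simplified, single-feature version and not the authors' pairwise framework; the function names, marker values, and labels are made up for illustration and do not come from the Uterus-11 data.

```python
def f1_score(y_true, y_pred):
    """F1 score for boolean labels; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(values, y_true):
    """Try every observed value as a cut-off (predict positive when
    value >= cut-off) and return the (cut-off, F1) pair with the highest F1."""
    best_cut, best_f1 = None, -1.0
    for cut in sorted(set(values)):
        predictions = [v >= cut for v in values]
        score = f1_score(y_true, predictions)
        if score > best_f1:
            best_cut, best_f1 = cut, score
    return best_cut, best_f1


# Hypothetical marker values and outcome labels (True = survival group).
marker = [18.0, 21.0, 24.0, 27.0, 30.0]
survived = [False, False, True, True, True]
print(best_threshold(marker, survived))
```

Extending this to pairs of features would mean searching over two cut-offs at once, which is the setting the abstract describes.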

Semi-automatic export of electrophysiological metadata to NFDI4Health Local Data Hubs: Use case of microneurography odML-tables: A technical Case Report

2024 - Open Access -
Mayra Roxana Elwes, Alina Troglio, Masoud Abedi, Martin Golebiewski, Frank Meineke, Toralf Kirsten, Barbara Namer, Oya Beyan, Ekaterina Kutafina

Introduction:

The Local Data Hub (LDH) is a platform for FAIR sharing of medical research (meta-)data. In order to promote the usage of LDH in different research communities, it is important to understand the domain-specific needs, solutions currently used for data organization and provide support for seamless uploads to a LDH. In this work, we analyze the use case of microneurography, which is an electrophysiological technique for analyzing neural activity.

Methods:

After performing a requirements analysis in dialogue with microneurography researchers, we propose a concept mapping and a workflow for the researchers to transform and upload their metadata. Further, we implemented a semi-automatic upload extension to odMLtables, a template-based tool for handling metadata in the electrophysiological community.

Results:

The open-source implementation enables the odML-to-LDH concept mapping, allows data anonymization from within the tool and the creation of custom-made summaries on the underlying data sets.

Discussion:

This concludes a first step towards integrating improved FAIR processes into the research laboratory’s daily workflow. In future work, we will extend this approach to other use cases to disseminate the usage of LDHs in a larger research community.

Harmonizing Microneurography Metadata with Local Data Hubs: A Concept

2026 - Open Access -
Mayra Roxana Elwes, Barbara Namer, Alina Troglio, Toralf Kirsten, Oya Beyan, Ekaterina Kutafina

This work aims to improve the FAIRness of microneurography research by integrating local (meta)data into existing research data infrastructures. In previous work, we developed an odML-based solution for local metadata storage of microneurography data. However, this solution is limited to a narrow community. As a next step, we propose integration into the Local Data Hubs, data-sharing services within the NFDI4Health infrastructure. We outline a first concept that streams chosen data from the established odMLtables GUI.

Linking international registries to FHIR and Phenopackets with RareLink: a scalable REDCap-based framework for rare disease data interoperability

2025 - Open Access -
Adam S.L. Graefe, Filip Rehburg, Samer Alkarkoukly, Daniel Danis, Ana Grönke, Miriam R. Hübner, Alexander Bartschke, Thomas Debertshäuser, Sophie A.I. Klopfenstein, Julian Saß, Julia Fleck, Mirko Rehberg, Jana Zschüntzsch, Elisabeth F. Nyoungui, Tatiana Kalashnikova, Luis Murguía-Favela, Beata Derfalvi, Nicola A.M. Wright, Shahida Moosa, Soichi Ogishima, Oliver Semler, Susanna Wiegand, Peter Kuehnen, Christopher J. Mungall, Melissa A. Haendel, Peter N. Robinson, Sylvia Thun, Oya Beyan

While Research Electronic Data Capture (REDCap) has been widely adopted in rare disease research, its unconstrained data format often leads to implementations that lack native interoperability with global health data standards, limiting secondary data use. To address this, we developed and validated RareLink, an open-source framework implementing our previously-published ontology-based rare disease common data model, enabling standardised data exchange between REDCap, international registries, and downstream analysis tools. Its preconfigured pipelines interact with the local REDCap application programming interface and enable semi-automatic import or export of data to the Global Alliance for Genomics and Health (GA4GH) Phenopackets and Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) instances, conforming to the HL7 International Patient Summary and Genomics Reporting profiles. The framework was developed in three iterative phases using retrospective and prospective clinical data from patients with various rare metabolic and neuromuscular disorders, as well as inborn errors of immunity. Phase one involved deployment across four German university hospitals for registry and data analysis purposes. Phase two integrated RareLink with the Canadian Inborn Errors of Immunity National Registry, enhancing extensibility. Phase three focuses on international implementation in South Africa and Japan to assess global scalability. Implementation feedback was continuously incorporated to validate outputs and improve usability. For evaluation purposes, we defined a simulated Kabuki syndrome cohort based on published cases and demonstrated data export to both Phenopackets and FHIR instances. RareLink can enhance the clinical utility of REDCap by enabling structured data analysis and interoperability. Its global applicability and open-source nature can support equitable rare disease research with the ultimate goal to improve patient care. 
Broader adoption and coordination with entities such as HL7 and the European Reference Networks are thus essential to realise its full potential. The framework and its documentation are freely available through GitHub and Read the Docs, respectively.

An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets

2025 - Open Access -
Adam S. L. Graefe, Miriam R. Hübner, Filip Rehburg, Steffen Sander, Sophie A. I. Klopfenstein, Samer Alkarkoukly, Ana Grönke, Annic Weyersberg, Daniel Danis, Jana Zschüntzsch, Elisabeth F. Nyoungui, Susanna Wiegand, Peter Kühnen, Peter N. Robinson, Oya Beyan & Sylvia Thun

Although rare diseases (RDs) affect over 260 million individuals worldwide, low data quality and scarcity challenge effective care and research. This work aims to harmonise the Common Data Set by European Rare Disease Registry Infrastructure, Health Level 7 Fast Healthcare Interoperability Base Resources, and the Global Alliance for Genomics and Health Phenopacket Schema into a novel rare disease common data model (RD-CDM), laying the foundation for developing international RD-CDMs aligned with these data standards. We developed a modular-based GitHub repository and documentation to account for flexibility, extensions and further development. Recommendations on the model’s cardinalities are given, inviting further refinement and international collaboration. An ontology-based approach was selected to find a common denominator between the semantic and syntactic data standards. Our RD-CDM version 2.0.0 comprises 78 data elements, extending the ERDRI-CDS by 62 elements with previous versions implemented in four German university hospitals capturing real world data for development and evaluation. We identified three categories for evaluation: Medical Data Granularity, Clinical Reasoning and Medical Relevance, and Interoperability and Harmonisation.

Seeing the primary tumor because of all the trees: Cancer type prediction on low-dimensional data

2024 - Open Access -
Julia Gehrmann, Devina Johanna Soenarto, Hidayat Kevin, Maria Beyer, Lars Quakulinski, Samer Alkarkoukly, Scarlett Berressem, Anna Gundert, Michael Butler, Ana Grönke, Simon Lennartz, Thorsten Persigehl, Thomas Zander, Oya Beyan

The Cancer of Unknown Primary (CUP) syndrome is characterized by identifiable metastases while the primary tumor remains hidden. In recent years, various data-driven approaches have been suggested to predict the location of the primary tumor (LOP) in CUP patients, promising improved diagnosis and outcomes. These LOP prediction approaches use high-dimensional input data like images or genetic data. However, leveraging such data is challenging, resource-intensive and therefore a potential translational barrier. Instead of using high-dimensional data, we analyzed the LOP prediction performance of low-dimensional data from routine medical care. With our findings, we show that such low-dimensional routine clinical information suffices as input data for tree-based LOP prediction models. The best model reached a mean accuracy of 94% and a mean Matthews correlation coefficient (MCC) score of 0.92 in 10-fold nested cross-validation (NCV) when distinguishing four types of cancer. When considering eight types of cancer, this model achieved a mean accuracy of 85% and a mean MCC score of 0.81. This is comparable to the performance achieved by approaches using high-dimensional input data. Additionally, the distribution pattern of metastases appears to be important information in predicting the LOP.
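For readers unfamiliar with the two metrics reported in this abstract, the following minimal sketch shows how accuracy and a multiclass MCC can be computed from predicted and true labels. It uses the common multiclass generalisation of the MCC; the abstract does not state which variant the authors used, and the class labels below are hypothetical.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    Computed from the number of samples s, the number of correct
    predictions c, and the per-class true and predicted counts.
    Returns 0.0 when the denominator is zero.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    numerator = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    var_true = s * s - sum(n * n for n in true_counts.values())
    var_pred = s * s - sum(n * n for n in pred_counts.values())
    denominator = sqrt(var_true * var_pred)
    return numerator / denominator if denominator else 0.0


# Hypothetical labels for illustration only.
y_true = ["lung", "lung", "breast", "colon"]
y_pred = ["lung", "breast", "breast", "colon"]
print(accuracy(y_true, y_pred), multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, the MCC takes the class distribution of both the true and the predicted labels into account, which is one reason it is often reported alongside accuracy for imbalanced multiclass problems.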

Large language models for literature reviews: an exemplary comparison of LLM-based approaches with manual methods

2025 - Open Access -

Large Language Models (LLMs) and LLM-based tools are increasingly popular for various tasks, including literature reviews. This trend holds significant potential in fields like healthcare and medical informatics, where timely updates on new research findings can have life-saving implications. However, the sensitive nature of these fields demands high reliability and trustworthiness. In this study, we assess the suitability of widely used LLM-based tools for conducting literature reviews in healthcare and medical informatics across two scenarios. First, we evaluated the tools’ performance and reliability in executing a systematic, scientific literature review by replicating the exact methodology of a recently accepted review we conducted. Second, we explored the tools’ effectiveness in quickly retrieving relevant information by testing their responses to differently phrased queries, focusing on the neutrality and balance of the information provided. Our findings indicate that while LLM-based tools can offer a useful initial overview of an unfamiliar topic, they are less effective for in-depth literature reviews. Furthermore, the choice of the specific tool is critical, as significant differences were observed in both the generated text and the references provided across tools. Additionally, our results suggest that prompts crafted in a scientific style with a negative connotation towards the research hypothesis tend to result in more balanced discussions compared to those framed in everyday language with a positive connotation towards the research hypothesis.

Tools & Services

Research Projects