Data Integration, Interoperability & Standards

High-quality medical AI depends on data that is not only accurate but also consistent, connected, and usable across systems. We therefore focus on integrating heterogeneous data sources, ensuring interoperability, and advancing the use of open standards. By harmonizing information from clinical records, we enable reliable analytics, reproducible research, and scalable AI solutions. This work lays the foundation for transforming fragmented healthcare data into meaningful insights that support better decision-making, improved patient outcomes, and more efficient workflows.

Publications

Linking international registries to FHIR and Phenopackets with RareLink: a scalable REDCap-based framework for rare disease data interoperability

2025 - Open Access
Adam S.L. Graefe, Filip Rehburg, Samer Alkarkoukly, Daniel Danis, Ana Grönke, Miriam R. Hübner, Alexander Bartschke, Thomas Debertshäuser, Sophie A.I. Klopfenstein, Julian Saß, Julia Fleck, Mirko Rehberg, Jana Zschüntzsch, Elisabeth F. Nyoungui, Tatiana Kalashnikova, Luis Murguía-Favela, Beata Derfalvi, Nicola A.M. Wright, Shahida Moosa, Soichi Ogishima, Oliver Semler, Susanna Wiegand, Peter Kuehnen, Christopher J. Mungall, Melissa A. Haendel, Peter N. Robinson, Sylvia Thun, Oya Beyan

While Research Electronic Data Capture (REDCap) has been widely adopted in rare disease research, its unconstrained data format often leads to implementations that lack native interoperability with global health data standards, limiting secondary data use. To address this, we developed and validated RareLink, an open-source framework implementing our previously-published ontology-based rare disease common data model, enabling standardised data exchange between REDCap, international registries, and downstream analysis tools. Its preconfigured pipelines interact with the local REDCap application programming interface and enable semi-automatic import or export of data to the Global Alliance for Genomics and Health (GA4GH) Phenopackets and Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) instances, conforming to the HL7 International Patient Summary and Genomics Reporting profiles. The framework was developed in three iterative phases using retrospective and prospective clinical data from patients with various rare metabolic and neuromuscular disorders, as well as inborn errors of immunity. Phase one involved deployment across four German university hospitals for registry and data analysis purposes. Phase two integrated RareLink with the Canadian Inborn Errors of Immunity National Registry, enhancing extensibility. Phase three focuses on international implementation in South Africa and Japan to assess global scalability. Implementation feedback was continuously incorporated to validate outputs and improve usability. For evaluation purposes, we defined a simulated Kabuki syndrome cohort based on published cases and demonstrated data export to both Phenopackets and FHIR instances. RareLink can enhance the clinical utility of REDCap by enabling structured data analysis and interoperability. Its global applicability and open-source nature can support equitable rare disease research with the ultimate goal to improve patient care. 
Broader adoption and coordination with entities such as HL7 and the European Reference Networks are thus essential to realise its full potential. The framework and its documentation are freely available through GitHub and Read the Docs, respectively.
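To make the idea of exporting REDCap data to a Phenopacket more concrete, here is a minimal sketch in Python. It is not RareLink's code or API. The REDCap field names (`record_id`, `sex`, `hpo_terms`), the `ID|label` encoding, and the function `record_to_phenopacket` are hypothetical and chosen only for illustration. The output uses a few top-level fields of the GA4GH Phenopacket Schema v2 JSON representation (`id`, `subject`, `phenotypicFeatures`, `metaData`) and omits `metaData.resources` for brevity, so it should not be read as a schema-valid Phenopacket.

```python
import json

# Hypothetical mapping from a free-text REDCap value to the Phenopacket Sex enum.
SEX_MAP = {"female": "FEMALE", "male": "MALE"}


def record_to_phenopacket(record: dict, created: str) -> dict:
    """Map one flat, REDCap-style record to a minimal Phenopacket-like dict.

    `record` is assumed to hold string values, as in a flat JSON export.
    `created` is an ISO 8601 timestamp supplied by the caller.
    """
    features = []
    # Assumed encoding: "HP:0001250|Seizure;HP:0001263|Global developmental delay"
    for item in record.get("hpo_terms", "").split(";"):
        if not item.strip():
            continue
        term_id, _, label = item.partition("|")
        features.append({"type": {"id": term_id.strip(), "label": label.strip()}})

    sex = SEX_MAP.get(record.get("sex", "").lower(), "UNKNOWN_SEX")
    return {
        "id": f"phenopacket-{record['record_id']}",
        "subject": {"id": record["record_id"], "sex": sex},
        "phenotypicFeatures": features,
        "metaData": {
            "created": created,
            "createdBy": "example-pipeline",  # placeholder value
            "phenopacketSchemaVersion": "2.0",
        },
    }


# Invented example record, not real patient data.
example_record = {
    "record_id": "1",
    "sex": "female",
    "hpo_terms": "HP:0001250|Seizure;HP:0001263|Global developmental delay",
}
packet = record_to_phenopacket(example_record, created="2025-01-01T00:00:00Z")
print(json.dumps(packet, indent=2))
```

A real pipeline would additionally need ontology resource metadata, validation against the schema, and a mapping driven by the data model rather than hard-coded field names.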

An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets

2025 - Open Access
Adam S. L. Graefe, Miriam R. Hübner, Filip Rehburg, Steffen Sander, Sophie A. I. Klopfenstein, Samer Alkarkoukly, Ana Grönke, Annic Weyersberg, Daniel Danis, Jana Zschüntzsch, Elisabeth F. Nyoungui, Susanna Wiegand, Peter Kühnen, Peter N. Robinson, Oya Beyan & Sylvia Thun

Although rare diseases (RDs) affect over 260 million individuals worldwide, low data quality and scarcity challenge effective care and research. This work aims to harmonise the Common Data Set by European Rare Disease Registry Infrastructure, Health Level 7 Fast Healthcare Interoperability Base Resources, and the Global Alliance for Genomics and Health Phenopacket Schema into a novel rare disease common data model (RD-CDM), laying the foundation for developing international RD-CDMs aligned with these data standards. We developed a modular-based GitHub repository and documentation to account for flexibility, extensions and further development. Recommendations on the model’s cardinalities are given, inviting further refinement and international collaboration. An ontology-based approach was selected to find a common denominator between the semantic and syntactic data standards. Our RD-CDM version 2.0.0 comprises 78 data elements, extending the ERDRI-CDS by 62 elements with previous versions implemented in four German university hospitals capturing real world data for development and evaluation. We identified three categories for evaluation: Medical Data Granularity, Clinical Reasoning and Medical Relevance, and Interoperability and Harmonisation.

Semi-automated approach to validate and enrich LOINC codes by FHIR Server

2021 - Open Access
Abdul Mateen Rajput, Ana Grönke, Wibke Johannis

Current efforts to modernize health systems are creating great possibilities for the secondary use of medical data. To further support these efforts, medical institutions worldwide are fostering the use of electronic terminologies for clinical care and data management. One of the challenges in sharing medical data between medical institutions is ensuring semantic interoperability among the exchanging information systems. To this end, we present a novel method for automated validation of locally used LOINC concepts. This semi-automated approach allows medical institutions to check whether their laboratory terms are correctly mapped to LOINC concepts, thus assuring the semantic interoperability required for secondary use of medical data.
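As an illustration of how a locally used LOINC code could be checked against a FHIR terminology server, the sketch below uses the standard FHIR `CodeSystem/$validate-code` operation. It is not necessarily the method described in the paper. The helper names `build_validate_url` and `parse_validation` and the server base URL are assumptions, and the sample response is hand-written for the example rather than taken from a real server.

```python
from typing import Optional
from urllib.parse import urlencode

LOINC_SYSTEM = "http://loinc.org"  # canonical URL of the LOINC code system in FHIR


def build_validate_url(base_url: str, code: str, display: Optional[str] = None) -> str:
    """Build a CodeSystem/$validate-code GET URL for one LOINC code."""
    params = {"url": LOINC_SYSTEM, "code": code}
    if display:
        params["display"] = display  # the local label to be checked
    return f"{base_url.rstrip('/')}/CodeSystem/$validate-code?{urlencode(params)}"


def parse_validation(parameters: dict) -> dict:
    """Flatten a FHIR Parameters response into result, display, and message."""
    out = {"result": None, "display": None, "message": None}
    for p in parameters.get("parameter", []):
        name = p.get("name")
        if name == "result":
            out["result"] = p.get("valueBoolean")
        elif name in ("display", "message"):
            out[name] = p.get("valueString")
    return out


# Hypothetical server base URL; replace with a real terminology server.
url = build_validate_url("https://example.org/fhir/", "2345-7", "Glucose")

# Hand-written sample of what a server might return when the local label
# does not match the official display name.
sample_response = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "result", "valueBoolean": False},
        {"name": "message", "valueString": "Display does not match"},
        {"name": "display", "valueString": "Glucose [Mass/volume] in Serum or Plasma"},
    ],
}
check = parse_validation(sample_response)
print(url)
print(check)
```

In practice the URL would be requested with an HTTP client, and the returned `display` could be used to enrich or correct the local term, with a human reviewing any mismatches.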

Courses

RDM4Researchers: Designing Reproducible Life Science Research Across the Data Lifecycle

Doctoral Studies, Master Studies, PostDoc, Summer Semester (SoSe)

This course is designed for life science researchers who want their data to stay understandable, usable, and trustworthy long after an experiment is finished. Instead of treating Research Data Management (RDM) as paperwork or compliance, the course approaches it as a practical part of doing good research. Working with examples from omics, imaging, microscopy, and molecular biology, participants explore how everyday decisions—how data are named, documented, analysed, and shared—shape reproducibility, reuse, and long-term value across the research data lifecycle.


Open Science Essentials: Tools, Principles, and Practices

Bachelor Studies, Clinical Semesters, Clinicians, Doctoral Studies, Master Studies, PostDoc, Summer Semester (SoSe)

This lecture introduces the principles and practice of Open Science and the FAIR (Findable, Accessible, Interoperable, Reusable) principles in a clear and accessible way for researchers across disciplines. It explores why transparency, collaboration, and responsible data sharing are becoming central to research, particularly in light of funder requirements under programmes such as Horizon Europe. Participants will gain an overview of key concepts, policy context, and practical steps to integrate Open Science and FAIR principles into their everyday research workflows.
