Ekaterina Kutafina (Dr. Dr. rer. medic.)

Research lead for data-driven medicine
ORCID: 0000-0002-3430-5123

Biography

I hold two doctoral degrees: in mathematics (AGH University of Science and Technology, Krakow, Poland) and in theoretical medicine (Uniklinik RWTH Aachen, Aachen, Germany). My expertise lies in developing comprehensive pathways to provide computational support for collaborative and integrative medical research. I translate medical questions into computational terms and build mathematical models, including AI-based decision systems. For many years, I have been working on the analysis of data from medical sensors and wearable devices in the areas of neurology, psychiatry, and physiology, particularly epilepsy and neuropathic pain. At BI-K, I lead the strategic research direction of data-driven medicine, emphasizing FAIR data integration and optimizing data flows for interdisciplinary research. Additionally, as an EOSC (European Open Science Cloud) expert for Open Scholarly Communication, I advocate for open science and the publishing of scientific artifacts beyond manuscripts. Finally, I research and teach the methodologies necessary for successful transdisciplinary collaborations in computational medicine.

Contact

Academic Background

Areas of Expertise

Research Focus

  • Computational Medicine
  • Time-series Physiological Data
  • Digital Human Twins
  • Neuroscience of Pain
  • Transdisciplinarity
  • Medical Data Science

Current Teachings

Medical AI - From basics to pro: Heart Rate Variability & AI symbiosis in personalized medicine

Clinical Semesters, Clinicians, Doctoral Studies, PostDoc, Preclinical Semesters; SoSe, WiSe

[This course is offered in English only] Heart rate variability (HRV) is widely used in clinical settings as a non-invasive marker of autonomic nervous system function. It helps assess cardiovascular health, stress levels, and overall well-being. Clinicians use HRV to monitor conditions such as heart disease, hypertension, and diabetes, as well as to evaluate recovery in post-surgical and critically ill patients. HRV also plays a role in mental health, aiding in the diagnosis and management of anxiety, depression, or PTSD. Additionally, it is used in sports medicine and rehabilitation to track recovery and optimize training. Its broad applications make it a valuable tool in personalized medicine.

AI methods make it possible to produce more complex predictions using multiple HRV parameters simultaneously. This has enabled successful decision support in domains where HRV was not previously prominent in clinical use, such as epileptology.

In this lecture block, we will discuss the technical aspects of HRV assessment, such as different sensors, data quality control, and different HRV measures. We will review various types of clinical applications, as well as the "citizen science" approach and sports coaching. For the practical part, we will take a dataset with precomputed R-to-R intervals and different labels (e.g. RR Interval Time Series Modeling: The PhysioNet/Computing in Cardiology Challenge 2002 v1.0.0). We will test different machine learning approaches to classify the data, e.g. to detect whether the data was recorded in a stressed or relaxed phase. This block lecture does not require any previous coding experience; we will use the graphical low-code platform KNIME. Students are required to bring their own laptop.
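As a rough illustration of what "different HRV measures" can mean in practice, the sketch below computes a few standard time-domain measures (SDNN, RMSSD, pNN50) from a list of R-to-R intervals in milliseconds. The course itself uses KNIME rather than hand-written code, so this is only an assumed example: the function name `hrv_time_domain` and the sample intervals are hypothetical and are not taken from the course material or the PhysioNet dataset.

```python
import math


def hrv_time_domain(rr_ms):
    """Return basic time-domain HRV measures for RR intervals given in ms.

    Hypothetical helper for illustration only; it assumes the RR series
    has already been cleaned of artifacts and ectopic beats.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")

    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n

    # SDNN: sample standard deviation of all RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))

    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]

    # RMSSD: root mean square of successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # pNN50: percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)

    return {
        "mean_rr": mean_rr,
        "mean_hr": 60000.0 / mean_rr,  # beats per minute
        "sdnn": sdnn,
        "rmssd": rmssd,
        "pnn50": pnn50,
    }


if __name__ == "__main__":
    # Made-up RR intervals (ms), purely for demonstration.
    example_rr = [800, 810, 790, 860, 800]
    for name, value in hrv_time_domain(example_rr).items():
        print(f"{name}: {value:.2f}")
```

Measures like these could then serve as input features for a classifier, for example to separate stressed from relaxed recordings.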


Studium Integrale: Interdisciplinary collaboration for digital solutions

Bachelor Studies, Master Studies; WiSe, SoSe

Kolloquium "Advances in Biomedical Informatics Research: Graduate Students"

Doctoral Studies; SoSe, WiSe

[This course is offered in English only] The colloquium is aimed primarily at master's and doctoral students, researchers, and scientific personnel of the Biomedical Informatics Institute. Co-supervised students from other clinics or research institutions in biomedical informatics and medical data science can also enroll. Students will be mentored and will either present the outcomes of their ongoing research or review an article from a peer-reviewed journal.


Publications from Ekaterina Kutafina

Supervised spike sorting feasibility of noisy single-electrode extracellular recordings: Systematic study of human C-nociceptors recorded via microneurography

Alina Troglio, Peter Konradi, Andrea Fiebig, Ariadna Pérez Garriga, Rainer Röhrig, James Dunham, Ekaterina Kutafina, Barbara Namer

Sorting spikes from noisy single-channel in-vivo extracellular recordings is challenging, particularly due to the lack of ground truth data. Microneurography, an electrophysiological technique for studying peripheral sensory systems, employs experimental protocols that time-lock a subset of spikes. Stable propagation speed of nerve signals enables reliable sorting of these spikes. Leveraging this property, we established ground truth labels for data collected in two European laboratories and designed a proof-of-concept open-source pipeline to process data across diverse hardware and software systems. Using the labels derived from the time-locked spikes, we employed a supervised approach instead of the unsupervised methods typically used in spike sorting. We evaluated multiple low-dimensional representations of spikes and found that raw signal features outperformed more complex approaches, which are effective in brain recordings. However, the choice of the optimal features remained dataset-specific, influenced by the similarity of average spike shapes and the number of fibers contributing to the signal. Based on our findings, we recommend tailoring lightweight algorithms to individual recordings and assessing the “sortability feasibility” based on achieved accuracy and the research question before proceeding with sorting of non-time-locked spikes in future projects.

Reinforcement Learning for Large Language Model Fine-Tuning: A Systematic Literature Review

Lingxiao Kong, Qusai Ramdan, Oussama Zoubia, Jahid Hasan Polash, Mayra Elwes, Mehdi Akbari Gurabi, Lu Jin, Ekaterina Kutafina, Roman Matzutt, Yuanbin Wang, Junqi Xu, Oya Deniz Beyan, Cong Yang, Zeyd Boukhers

Large Language Models (LLMs) have been developed for a wide range of language-based tasks, while Reinforcement Learning (RL) has been primarily applied to decision-making problems such as robotics, game theory, and control systems. Nowadays, these two paradigms are integrated through different synergies. In this literature review, we focus on RL4LLM fine-tuning, where RL techniques are systematically leveraged to fine-tune LLMs and align them with various preferences. Our review provides a comprehensive analysis of 230 recent publications, presenting a methodological taxonomy that organizes current research into three primary method domains: Optimization Algorithm, concerning innovation in core RL update rules; Training Framework, regarding innovation in the orchestration of the training process; and Reward Modeling, addressing how LLMs learn and represent preferences and feedback. Within these primary domains, we further analyze methods and innovations through more granular categories to provide an in-depth summary of RL4LLM fine-tuning research. We address three research questions: 1) recent methods overview, 2) methodological innovations, and 3) limitations and future directions. Our analysis comprehensively demonstrates the breadth and impact of recent RL4LLM fine-tuning research while highlighting valuable directions for future investigation.

Spectral changes in electroencephalography linked to neuroactive medications: A computational pipeline for data mining and analysis

Anna Maxion, Arnim Johannes Gaebler, Rainer Röhrig, Klaus Mathiak, Jana Zweerings, Ekaterina Kutafina

From normal to optimal: investigating metabolic and inflammatory parameters as predictors of survival in locally advanced cervical cancer

2025 - Open Access
Mayra Elwes, Bhanu Prasanna Koppolu, Carminia Lapuz, Mark Tacey, Simone Marnitz, Oya Beyan, Maike Trommer, Ekaterina Kutafina

Cervical cancer is the third most common cancer in women, and recent studies have highlighted the importance of body composition markers in predicting patient outcomes. We build upon the data of 83 patients from the Uterus-11 study to explore the relation of pairwise feature combinations to long-term progression-free survival. We propose a framework to identify the parameter combinations with pre-defined thresholds of “normal range” which provide good separation of the survival group. We then optimize the pairwise thresholds to further improve the separation measured by F1 scores. This approach allowed us to improve the statistical significance of hazard ratios in comparison to previous studies. The optimization results suggest that the normal ranges of well-established biomarkers such as body mass index could be shifted in the context of specific diseases to achieve optimal outcome.

Semi-automatic export of electrophysiological metadata to NFDI4Health Local Data Hubs: Use case of microneurography odML-tables: A technical Case Report

2024 - Open Access
Mayra Roxana Elwes, Alina Troglio, Masoud Abedi, Martin Golebiewski, Frank Meineke, Toralf Kirsten, Barbara Namer, Oya Beyan, Ekaterina Kutafina

Introduction:

The Local Data Hub (LDH) is a platform for FAIR sharing of medical research (meta-)data. In order to promote the usage of LDH in different research communities, it is important to understand the domain-specific needs and the solutions currently used for data organization, and to provide support for seamless uploads to an LDH. In this work, we analyze the use case of microneurography, which is an electrophysiological technique for analyzing neural activity.

Methods:

After performing a requirements analysis in dialogue with microneurography researchers, we propose a concept mapping and a workflow for the researchers to transform and upload their metadata. Further, we implemented a semi-automatic upload extension to odMLtables, a template-based tool for handling metadata in the electrophysiological community.

Results:

The open-source implementation enables the odML-to-LDH concept mapping, allows data anonymization from within the tool and the creation of custom-made summaries on the underlying data sets.

Discussion:

This concludes a first step towards integrating improved FAIR processes into the research laboratory’s daily workflow. In future work, we will extend this approach to other use cases to disseminate the usage of LDHs in a larger research community.

Harmonizing Microneurography Metadata with Local Data Hubs: A Concept

2026 - Open Access
Mayra Roxana Elwes, Barbara Namer, Alina Troglio, Toralf Kirsten, Oya Beyan, Ekaterina Kutafina

This work aims to improve the FAIR-ness of microneurography research by integrating local (meta)data into existing research data infrastructures. In previous work, we developed an odML-based solution for local metadata storage of microneurography data. However, this solution is limited to a narrow community. As a next step, we propose the integration into the Local Data Hubs, data-sharing services within the NFDI4Health infrastructure. We outline a first concept that streams chosen data from the established odMLtables GUI.

Reliable detection of focal onset impaired awareness seizures in patients with epilepsy using wearable ECG: Development and validation study

Mohamed Alhaskir, Ekaterina Kutafina, Florian Linke, Florian P. Fischer, Elisabeth Schriewer, Stephan Lauxmann, Kevin Klett, Julian Hofmeister, Florian Lutz, Lukas Burow, Michal Cicanic, Sara Khosrawikatoli, Stefan Wolking, Thomas Mayer, Sandor Beniczky, Josua Kegele, Rainer Röhrig, Henner Koch, Yvonne Weber