

Mehrshad Jaberansary (M.Sc.)

Research Assistant and DA Team Lead
ORCID: 0000-0003-3407-1387

Biography

Mehrshad earned his Bachelor's degree in Computer Science with a focus on Mathematics from Amirkabir University of Technology in Tehran, Iran, before pursuing a Master's degree in Data Science at RWTH Aachen University. During his studies, he became involved in distributed analytics and federated learning, collaborating with experts from RWTH i5 (Databases and Information Systems), the University of Leipzig (Medical Data Science Department), and the Fraunhofer Institute for Applied Information Technology on a distributed analytics infrastructure called PADME (Platform for Analytics and Distributed Machine Learning for Enterprises). After completing his studies, Mehrshad joined the Institute for Biomedical Informatics in Cologne (BI-K), where he currently works as a Research Assistant and leads the DA team. He collaborates closely with colleagues on projects related to distributed analytics and federated learning. His research interests also extend to Generative AI and Large Language Models (LLMs) and their application in healthcare processes. Additionally, he works closely with the Medical Data Integration Center (MeDIC) team at University Hospital Cologne, focusing on designing and developing infrastructure in collaboration with the German Medical Informatics Initiative consortium. Mehrshad’s work spans several collaborative initiatives, including FAIR Data Spaces, PrivateAIM, and the EU-wide BETTER project, all aimed at enhancing healthcare data integration and security.

Contact

  • Email: mehrshad.jaberansary@uk-koeln.de
  • Office
    Mail:
    Kerpener St. 62 - 50937 Köln
    Visiting:
    BI-K - Geb. 705 - Zülpicher Str. 58e - 50672 Köln - 1 OG - Raum 1.010
    MeDIC - Geb. 5 - Gleuler Str. 70 - 50937 Köln - 2 OG - Raum 2.007

Publications from Mehrshad Jaberansary

TrainTracks - federated learning for reproducible research on sensitive medical data

2026 - Open Access

Background Reproducibility of computational algorithms is a challenging but crucial requirement for medical research and an important component of trustworthy training and application of AI algorithms. Federated Learning (FL) is commonly used to enable privacy-preserving AI in medical research. One prerequisite of reproducibility is traceability. Most publications on traceable FL platforms rely on blockchain technology to achieve traceability. In healthcare settings, resource-efficient alternatives to blockchains are possible; however, their traceability features require separate design considerations.

Methods To meet the growing demand for reproducible AI, as outlined in guidelines published by governing bodies such as the European Commission, we propose a novel concept, TrainTracks. TrainTracks extends the established Personal Health Train (PHT) platform PADME (Platform for Analytics and Distributed Machine Learning for Enterprises) to support reproducible and traceable federated learning in medical research. PADME already partially supports tracing the FL process and changes to the analysis algorithms used in FL projects. We extend PADME by adding privacy-preserving change tracing of the data, metadata, and computational experiment execution by integrating it with tools specialized in distributed data management (DataLad and MetaLad). Finally, we evaluate the proposed concept against a detailed list of requirements to analyze the advantages of TrainTracks and the need for further design improvements.

Results Evaluation of the TrainTracks concept’s compliance with a checklist for reproducible AI showed that TrainTracks improves on the original PADME platform in 15 of the 47 points applicable to FL. The greatest improvement was in data reproducibility, where 10 of 12 points moved from no support to full support for automatic information extraction. No improvements were made to method reproducibility, aside from introducing a dedicated reproducibility repository. Experiment reproducibility improved in 5 of 30 applicable points, mainly through workflow and code traceability.
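
The reported counts can be cross-checked with a short tally. This is only an illustrative sketch: the per-category numbers are taken from the Results paragraph, while the number of applicable method-reproducibility points is not stated in the abstract and is inferred here as the remainder (an assumption). The variable and function names are hypothetical.

```python
# Counts reported in the abstract: (improved points, applicable points).
reported = {
    "data": (10, 12),
    "experiment": (5, 30),
}
TOTAL_IMPROVED = 15
TOTAL_APPLICABLE = 47

# Assumption: points that are neither data nor experiment points belong to
# method reproducibility, where the abstract reports no improved points.
method_applicable = TOTAL_APPLICABLE - sum(n for _, n in reported.values())
reported["method"] = (0, method_applicable)


def summarize(counts):
    """Return the percentage of improved points per category."""
    return {
        name: round(100 * improved / applicable, 1)
        for name, (improved, applicable) in counts.items()
    }


shares = summarize(reported)
improved_sum = sum(improved for improved, _ in reported.values())
overall_share = round(100 * TOTAL_IMPROVED / TOTAL_APPLICABLE, 1)

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share}% of applicable points improved")
    print(f"overall: {overall_share}% ({improved_sum} of {TOTAL_APPLICABLE})")
```

Under that assumption the category counts add up to the reported totals: 10 + 5 improved points give the 15 stated overall, and 12 + 30 applicable points leave 5 for method reproducibility.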

Conclusions Our concept of combining FL technology with a data versioning tool provides a structured, automated workflow that traces the FL algorithm itself, the delivered algorithms, and the data used. TrainTracks demonstrated high compliance with recommendations for reproducible AI experiments, methods, and data. Our work emphasizes the importance of complete FL process traceability, as each of the aspects considered contributes to the reproducibility of medical research. Tracking dataset versions in FL is crucial for dynamic application areas such as medical research, where new data, e.g. electronic health records, are continuously recorded and added.

Keywords Reproducibility crisis, Traceability, Trustworthy AI, Federated learning, Personal Health Train