Technical Challenges to Data Sharing

GliMR Webinar - 23 April 2024

Stephan Heunis
@jsheunis jsheunis

Psychoinformatics lab, Institute of Neuroscience and Medicine, Brain & Behavior (INM-7)
Research Center Jülich, Germany

Slides: jsheunis.github.io/glimr-2024

Acknowledgements

DataLad software & ecosystem
  • Psychoinformatics Lab, Research Center Jülich
  • Center for Open Neuroscience, Dartmouth College
  • Joey Hess (git-annex)
  • >100 additional contributors
Funders
Collaborators

Technical challenges to data sharing?

john-travolta

Not much, relative to the social, procedural and legal challenges

Development history


developments

=> As technology develops, so do our tools to share data

Of course, there are still challenges:


challenges

But there are also existing solutions:


solutions.png

All roads lead to data: 1


data_sharing_infra_datatouser_jsheunis.png

=> Free for all

All roads lead to data: 2


data_sharing_infra_usertodata_jsheunis.png

=> So-called data islands

All roads lead to data: 3


data_sharing_infra_codetodata_jsheunis.png

=> Users don't see the actual data, develop code based on samples
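
To make the code-to-data idea concrete, here is a minimal sketch in Python. It assumes a made-up `DataSite` class and a made-up synthetic sample (`SAMPLE`); neither comes from any of the tools discussed in this talk. The user develops an analysis against the sample, and the data holder runs it and releases only a scalar summary.

```python
from statistics import mean

# Hypothetical synthetic sample that the data holder publishes for development.
SAMPLE = [{"age": 30}, {"age": 35}, {"age": 40}]


class DataSite:
    """Hypothetical data holder: raw records never leave this object."""

    def __init__(self, records):
        self._records = records

    def run(self, analysis):
        """Run user-supplied code on the private records, release only a number."""
        result = analysis(self._records)
        if not isinstance(result, (int, float)):
            raise TypeError("only scalar summaries may leave the site")
        return result


def mean_age(records):
    """Analysis developed by the user against SAMPLE only."""
    return mean(r["age"] for r in records)


# Step 1: the user tests the analysis locally on the sample.
sample_result = mean_age(SAMPLE)

# Step 2: the same code is sent to the site holding the real (private) data.
site = DataSite([{"age": 40}, {"age": 50}, {"age": 60}])
site_result = site.run(mean_age)
```

A real deployment would also need sandboxing, review of the submitted code, and stricter checks on what may be released; the scalar check here only stands in for that.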

All roads lead to data: 4


data_sharing_infra_federated_codetodata_jsheunis.png

=> Code from all over, data from all over

Examples of privacy preserving technology and decentralized tools:


  • OpenMined: secure connections, running algorithms without seeing data
  • Vantage6: delivering algorithms to data stations and collecting their results
  • Personal Health Train: federated machine learning in radiomics
  • DataLad: decentralized data storage, access, collaboration, tracking
  • COINstac: decentralized pipelines, results aggregation, differential privacy
  • EBRAINS: GDPR-compliant digital research infrastructure of the Human Brain Project
  • Canadian Open Neuroscience Platform: FAIR data access while keeping personal data private
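
The "deliver algorithms, collect results" pattern behind several of these tools can be sketched as a toy federated mean. This is an illustration only, not the API of Vantage6, COINstac, or any other tool listed above; the function names and site data are invented. Each site computes a local summary, and only those summaries are combined centrally.

```python
def local_summary(values):
    """Runs at each site: returns a count and a sum, never the raw values."""
    return {"n": len(values), "total": sum(values)}


def aggregate(summaries):
    """Runs at the coordinator: combines per-site summaries into a global mean."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


# Hypothetical private measurements held at three separate sites.
sites = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0],
    "site_c": [6.0],
}

# Each site shares only its summary with the coordinator.
summaries = [local_summary(values) for values in sites.values()]
global_mean = aggregate(summaries)
```

The result equals the mean of the pooled data even though no site shares individual records. Summaries from very small sites can still leak information, which is one reason tools in this space add safeguards such as differential privacy.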

What is the future?


Example of decentralized code-to-data


Examples of metadata-based catalogs



What is the future?



Thank you!