OHBM 2022 Educational course:

How to Write a Re-Executable Publication

Stephan Heunis
@fMRwhy jsheunis

Psychoinformatics lab
Institute of Neuroscience and Medicine, Brain & Behavior (INM-7)


Slides: jsheunis.github.io/ohbm-2022/talks/ohbm-2022-educational-jsheunis.html

1. What is DataLad?

2. Accessing data

Clone dataset from GitHub:

datalad clone https://github.com/OpenNeuroDatasets/ds001907.git

Get specific subset or file:
datalad get ds001907/sub-RC4101/ses-1/anat/sub-RC4101_ses-1_T1w.nii.g

Once finished, drop data:
datalad drop ds001907/sub-RC4101/ses-1/anat/sub-RC4101_ses-1_T1w.nii.g
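
The clone, get and drop steps can be strung together in one script. This is a minimal sketch, not taken from the slides: the `run` helper, the `DRY_RUN` variable and the `access_demo` function are hypothetical names introduced here. It fetches the whole `anat` directory of one session rather than a single file, on the assumption that `datalad get` and `datalad drop` accept a directory path. By default it only prints the commands, so nothing is downloaded until `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: clone a dataset, fetch a subset, then free the disk space again.
# DRY_RUN and run() are illustrative helpers, not DataLad features.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DATASET_URL="https://github.com/OpenNeuroDatasets/ds001907.git"
SUBSET="ds001907/sub-RC4101/ses-1/anat"   # a directory instead of a single file

access_demo() {
    run datalad clone "$DATASET_URL"   # metadata and file names only
    run datalad get "$SUBSET"          # download the actual file content
    run datalad drop "$SUBSET"         # remove the local content, keep the record
}

access_demo
```

Running it with `DRY_RUN=0 sh script.sh` would execute the three commands for real; this requires DataLad and network access.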

3. Generating and reproducing results

Install the DataLad extension for working with containers (Singularity is needed as well):

pip install datalad_container
or
conda install -c conda-forge datalad-container

Get and run container:
datalad clone https://github.com/ReproNim/containers containers
datalad containers-run \
         -n containers/repronim-simple-workflow \
         --input 'ds001907/sub-RC4*/ses-1/anat/sub-*_ses-1_T1w.nii.gz' \
         code/simple_workflow/run_demo_workflow.py \
           -o . -w data/workdir --plugin_args 'dict(n_procs=10)' '{inputs}'

Rerun the analysis:
datalad rerun "specific-commit-hash-of-previous-run-command"
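
The hash passed to `datalad rerun` is the Git commit in which an earlier run was recorded. One way to look it up is to search the Git log, as sketched here. The sketch assumes that the run commit's message contains the `[DATALAD RUNCMD]` marker. The function name `last_run_commit` is hypothetical.

```shell
# Print the hash of the most recent commit whose message contains the
# "[DATALAD RUNCMD]" marker (assumed to be written by datalad run and
# datalad containers-run).
# -F makes git treat the pattern as a fixed string, so the square
# brackets are matched literally instead of as a character class.
# Optional first argument: path to the dataset (defaults to ".").
last_run_commit() {
    git -C "${1:-.}" log -F --grep='[DATALAD RUNCMD]' --format='%H' -n 1
}

# Usage, from inside the dataset:
#   datalad rerun "$(last_run_commit)"
```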

4. Publishing data

Create a dataset sibling:
datalad create-sibling-github github-lfs
Create a special remote of type git-lfs:
git annex initremote github-lfs type=git-lfs url="https://github.com/[your-github-username]/github-lfs" encryption=none embedcreds=no
Copy dataset content to LFS:
git annex copy --to github-lfs .
Push DataLad dataset to GitHub:
datalad push --to github
Or add a publication dependency and push once:
datalad siblings configure -s github --publish-depends github-lfs
datalad push --to github
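
The publication-dependency variant can be collected into a reviewable command list with the `[your-github-username]` placeholder filled in. This is only a sketch: `publish_commands` is a hypothetical helper, `alice` is a made-up user name, and the function prints the commands rather than running them. It assumes that the sibling created by `datalad create-sibling-github` is named `github`, as in the push command above.

```shell
# Print the publishing commands for a given GitHub user name.
# Nothing is executed; review the output before running it in the dataset.
publish_commands() {
    user="$1"
    cat <<EOF
datalad create-sibling-github github-lfs
git annex initremote github-lfs type=git-lfs url="https://github.com/${user}/github-lfs" encryption=none embedcreds=no
datalad siblings configure -s github --publish-depends github-lfs
datalad push --to github
EOF
}

# Example with a made-up user name:
publish_commands alice
```

With the dependency configured, the separate `git annex copy --to github-lfs .` step should no longer be needed, since the single push is meant to transfer the annexed content to `github-lfs` first.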